Tech confs

Talk - Anthony Shaw: Write faster Python! Common performance anti patterns

This talk will show small, specific examples of Python code that can be refactored to be faster without compromising on readability. At the start of the talk, I'll explain how to set up a profiler to measure application performance and how to track improvements and regressions.
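
The measure-then-refactor workflow described above can be sketched with the standard library. This is only an illustration: the abstract does not say which profiler the talk uses, so `cProfile` is an assumption, and `count_hits_list`, `count_hits_set`, and `profile` are hypothetical names.

```python
import cProfile
import io
import pstats


def count_hits_list(needles, haystack):
    """Anti-pattern: membership tests against a list scan it every time."""
    haystack_list = list(haystack)
    return sum(1 for n in needles if n in haystack_list)


def count_hits_set(needles, haystack):
    """Refactor: a set gives average constant-time membership tests."""
    haystack_set = set(haystack)
    return sum(1 for n in needles if n in haystack_set)


def profile(func, *args):
    """Run func under cProfile; return its result and a short text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    needles, haystack = range(0, 4000, 2), range(2000)
    for func in (count_hits_list, count_hits_set):
        hits, report = profile(func, needles, haystack)
        print(func.__name__, "hits:", hits)
        print(report)
```

Both functions return the same count, so the two reports can be compared directly to confirm an improvement or spot a regression.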

Talk - Peacock: Getting Started with Statically Typed Programming in Python 3.10

Since 2015, it has been possible to write Python like a statically typed language, using the typing module and other features introduced in Python 3.5. This can significantly improve the development experience and the review process. I have been using type hints in my work for several years and have been studying Haskell and TypeScript. I believe this session will be a stepping stone for "type hints newbies."

What I will talk about in this talk:
- Advantages of using Typing
  - Getting help from editors
  - Facilitating code reviews
- How to get started with Typing
  - Argument and return types for functions
  - Using the standard Collections types
  - The difference between tuple and other types
  - Abstract and concrete types
  - Generics, user-defined types
- Type Hinting Updates in Python 3.9 and 3.10
  - (3.9) Type Hinting Generics In Standard Collections
  - (3.10) Allow writing union types as X | Y
  - (3.10) Parameter Specification Variables
  - (3.10) Explicit Type Aliases
  - (3.10) User-Defined Type Guards

What is not covered in this talk:
- Basic Python 3 syntax
- (Not required): Experience developing in statically typed languages

Related contents:
- A talk at PyCon JP 2020 (JA): https://pycon.jp/2020/en/timetable/?id=203955
- https://docs.python.org/3/library/typing.html

Talk - Francesco Murdaca/Maya Costantini: How to Make Your Python Jupyter Notebook Standalone an...

Even though many developers (including data scientists) focus on their core problems when working on their experiments, one basic aspect can make these projects not reusable. We are not considering anything machine learning-related yet. One of the first steps in developing a project is selecting its libraries, or dependencies. Someone who runs pip install package-name might not be aware that, along with the library being installed (the so-called direct dependency), many other dependencies will be installed on their machine (the so-called transitive dependencies). Any change in one of those dependencies can break the experiment. It is fundamental to have a way to state all the dependencies used, including the operating system, Python interpreter, and hardware used to run a certain experiment. In this session, the speakers will present an open source JupyterLab extension for Python dependency management developed by the Thoth team. Attendees will learn which resolution engines can be used (e.g. Pipenv, Thoth) and the differences between them. They will also work through scenarios that emulate typical Jupyter notebook experiences to learn how to use the new extension. By the end of this session, attendees will understand the importance of reproducibility, how to use the Thoth JupyterLab extension for Python projects, and the benefits of a cloud resolution engine compared with other existing ones. They will be able to run a tutorial using only a GitHub account and a browser, as it runs in a completely open cloud environment.
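
The idea of stating everything an experiment ran on can be sketched with the standard library alone. This is not the Thoth extension or its API; it is a hypothetical `snapshot_environment` helper, shown under the assumption that a plain record of interpreter, operating system, hardware architecture, and installed packages (direct and transitive alike) is a useful starting point.

```python
import json
import platform
from importlib import metadata


def snapshot_environment() -> dict:
    """Record interpreter, OS, machine type, and every installed distribution."""
    packages = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.metadata.get("Version")
        except Exception:
            continue  # skip distributions whose metadata cannot be read
        if name and version:
            packages.add(f"{name}=={version}")
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "machine": platform.machine(),
        "packages": sorted(packages),
    }


if __name__ == "__main__":
    # The JSON could be stored next to the notebook it describes.
    print(json.dumps(snapshot_environment(), indent=2))
```

A snapshot like this only records what happened to be installed; unlike a resolution engine, it does not decide which versions should be installed.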

Talk - Deepak K Gupta: Speed Up Data Access with PyArrow Apache Arrow Data is the new API

Until now, we have been used to accessing data over APIs, and those APIs made sure that we got the data in the desired format. Unfortunately, this requires the data to go through a serialization/deserialization cycle before the API returns it. What if we could arrange the data in such a way that it needs neither an API nor any serialization/deserialization to be accessed and understood, and from multiple programming languages at that? If that sounds interesting, then welcome to the world of Apache Arrow, which defines a language-independent columnar memory format that supports zero-copy reads for lightning-fast data access without serialization overhead. Its Python library is called PyArrow; it integrates with Python-specific libraries such as pandas and NumPy and brings the same benefits to them. In this talk you'll learn about the architecture, use cases, and reasons for using Apache Arrow through PyArrow. I'll share the how-to as well as some interesting statistics on the difference it makes in our day-to-day data access and analytics. I'll also talk about Apache Arrow Flight, a high-performance wire protocol focused on bulk transfer for analytics. This session is NOT a tutorial about PyArrow but a set of interesting improvements, facts, and statistics that can help you decide whether it makes sense to explore it for the work you're doing.
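
The two ideas in the abstract, a columnar layout and zero-copy reads, can be illustrated without PyArrow. The sketch below deliberately swaps in the standard library's `array` and `memoryview` as stand-ins; it is not the Arrow format or the PyArrow API, and the column names are made up.

```python
from array import array

# Columnar layout: one contiguous typed buffer per column, not one object per row.
prices = array("d", [9.5, 12.0, 7.25, 30.0])
quantities = array("q", [3, 1, 4, 2])

# Zero-copy read: the slice refers to the same memory; nothing is copied or serialized.
view = memoryview(prices)[1:3]
first_read = view.tolist()

# A write to the column is visible through the view, which shows the memory is shared.
prices[1] = 13.0
shared_value = view[0]

# Column-wise computation over the two buffers.
total = sum(p * q for p, q in zip(prices, quantities))
```

Arrow applies the same principle across processes and languages by standardizing the buffer layout, which is what removes the serialization step.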

Talk - Ryan Kuhl: GraphQL The Devil's API

While there are advantages to using GraphQL vs. traditional REST APIs such as descriptive queries, there are also a plethora of potential pitfalls, such as the n+1 query problem and idiosyncratic fickleness. We leverage data-loaders, async/await, dynamic query generation, and other performance optimizations in GraphQL to create a flexible, performant interface for our front-end services. Let’s do GraphQL the right way!
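
The n+1 query problem and the data-loader remedy mentioned above can be sketched with `asyncio`. This is not the speaker's implementation, and GraphQL libraries provide their own data loaders; `BatchLoader`, `fetch_authors`, `resolve_posts`, and the in-memory `AUTHORS` table are all hypothetical stand-ins.

```python
import asyncio


class BatchLoader:
    """Collects keys requested in the same event-loop tick and loads them in one call."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn  # async function: list of keys -> list of values
        self._pending = {}         # key -> Future waiting for that key's value
        self._task = None          # dispatch task for the current batch, if any

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._task is None:
            # The task starts on a later loop iteration, after sibling resolvers
            # have had the chance to register their keys.
            self._task = loop.create_task(self._dispatch())
        return self._pending[key]

    async def _dispatch(self):
        pending, self._pending = self._pending, {}
        self._task = None
        keys = list(pending)
        try:
            values = await self._batch_fn(keys)
        except Exception as exc:
            for future in pending.values():
                future.set_exception(exc)
            return
        for key, value in zip(keys, values):
            pending[key].set_result(value)


AUTHORS = {1: "Ada", 2: "Grace", 3: "Linus"}
QUERY_LOG = []  # each entry represents one round trip to the data store


async def fetch_authors(ids):
    """Stand-in for a single `WHERE id IN (...)` query."""
    QUERY_LOG.append(list(ids))
    await asyncio.sleep(0)
    return [AUTHORS[i] for i in ids]


async def resolve_posts(posts):
    loader = BatchLoader(fetch_authors)
    names = await asyncio.gather(*(loader.load(p["author_id"]) for p in posts))
    return [{"title": p["title"], "author": n} for p, n in zip(posts, names)]
```

Resolving the author of each post separately would cost one query per post on top of the query for the posts themselves; with the loader, all author lookups issued in the same tick collapse into a single batched call, and repeated keys are fetched once.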

Talk - Roman Yurchak/Hood Chatham: Pyodide: A Python distribution for the browser

Pyodide is a Python distribution for the browser and Node.js based on WebAssembly. It includes a port of CPython 3.9 to WebAssembly/Emscripten, and makes it possible to install and run Python packages in the browser. Pyodide comes with a robust JavaScript ⟺ Python foreign function interface so that you can mix these two languages in your code with minimal friction. We will walk through simple examples of how to run Python applications in the browser with Pyodide. We will also discuss the process of porting existing Python packages, including what makes a package suitable to port and what challenges are likely to arise.

Some criteria that determine the suitability of a project for porting: Purely computational projects are simple to port to run in the browser. We are missing threading and multiprocessing, so you will need to be able to run single-threaded. File system code mostly works unchanged. However, much of the UI and network access are very different inside the browser. Packages with a clean divide between doing computation and doing UI will be simpler to port: the UI parts may need to be rewritten or shimmed, but the pure computation need not be.

Talk - Graham Bleaney/Pradeep Kumar Srinivasan: Securing Code with the Python Type System

Preventing security vulnerabilities often brings to mind heavyweight security tools. But what if it doesn’t have to be that way? What if you could use the concepts already built into Python to make your code incrementally more secure? In this talk, we'll see how Python types allow you to improve your project's security incrementally. First, we’ll show how simple type annotations by themselves can prevent security-impacting logic errors. Second, we'll see how you can prevent injection vulnerabilities such as SQL injection using a special type in your APIs (PEP 675). Next, we demonstrate how to leverage runtime type validation to securely deal with user-controlled data (such as HTTP requests). Finally, we show how types naturally enable powerful typing-based tools like Pysa and CodeQL to perform static taint flow analysis and catch complex vulnerabilities that span multiple functions. No security tool is a panacea, however, so we’ll also show you where typing and the tools that rely on it can fail. Slides: https://pycon-assets.s3.amazonaws.com/2022/media/presentation_slides/18/2022-04-28T19%3A35%3A09.209346/PyCon_2022_-_Typing_for_Security.pdf
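
The injection-prevention step (PEP 675) can be sketched as follows. This is an illustration under stated assumptions, not the speakers' code: `run_query` and `find_user` are hypothetical helpers, `sqlite3` is used only because it ships with Python, and `LiteralString` is imported for the type checker only, since it lives in `typing` from Python 3.11.

```python
import sqlite3
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the static type checker needs this name (typing.LiteralString, Python 3.11+).
    from typing import LiteralString


def run_query(
    conn: sqlite3.Connection, sql: "LiteralString", params: tuple = ()
) -> list:
    """Accept only literal SQL text; untrusted values must travel through params."""
    return conn.execute(sql, params).fetchall()


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # OK for the checker: the SQL is a literal, and the untrusted value is bound.
    return run_query(conn, "SELECT id, name FROM users WHERE name = ?", (name,))
    # A checker that supports PEP 675 is expected to reject a call such as:
    #   run_query(conn, f"SELECT id, name FROM users WHERE name = '{name}'")


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES ('alice')")
    return conn
```

The annotation has no runtime effect; the protection comes from running a type checker that understands `LiteralString`, combined with parameter binding for user-controlled data.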

Talk - Dustin Ingram: Securing the Open Source Software Supply Chain

Supply Chain Security: so hot right now. With the recently increased focus on securing software systems, there has been an incredible explosion of tools, methodologies, standards, best practices, and more. Given the sheer quantity, it's hard to keep track and stay informed: how can you know what's right for you? The same attributes that make open source software desirable to use also make it challenging to secure. When anyone can publish an open-source library, how can you decide what's safe to use? If anyone can contribute, how can you trust the maintainers? If source code and development are public, how can we identify and respond to vulnerabilities when attackers will know about them as soon as we do? In this talk, we'll explore new tools and best practices that you can use today as an open-source software user to improve the security of your software supply chain and your trust in the ecosystem. We'll show how each of these serves a different purpose and protects you from a distinct way in which your software supply chain could be vulnerable. Finally, we'll discuss upcoming and potential improvements to the entire open-source ecosystem.

Talk - Jes Ford: The Model Review: improving transparency, reproducibility, & knowledge sharing...

Code Review is an integral part of software development, but many teams don’t have similar processes in place for the development and deployment of Machine Learning (ML) models. I will motivate the decision to create a Model Review process, starting from the principles of transparency, reproducibility, and knowledge sharing. MLflow is a useful Python package to help simplify and automate much of the tracking necessary to create detailed records of machine learning experiments. Much of this talk will be spent introducing this tool, and demonstrating the core MLflow Tracking functionality. I’ll discuss how my team is currently running a Model Review process for any ML models that we push to production, and how we use MLflow to streamline this work and learn from each other. Slides: https://pycon-assets.s3.amazonaws.com/2022/media/presentation_slides/68/2022-04-26T03%3A24%3A02.545732/model_review_slides_jesford.pdf
