List of videos

Talk - Sebastiaan Zeeff: Demystifying Python’s Internals: Diving into CPython by implementing...
Diving into the CPython source code can feel daunting. Whether you want to start contributing or just want to get a better understanding of Python by exploring its source code, it’s often difficult to know where to start or what you’re missing. In my talk, I will show you around the CPython source code by implementing a new operator, a pipe operator. While doing so, I will discuss core parts of the internals, such as Python’s grammar, its syntax trees, and the underlying logic that will perform the operation. By the end, you will have a good idea of the moving parts involved in core language features. I will also take you through the steps necessary to make it all work. I’ll show you how I obtained a copy of the source code, regenerated the parser and token files, and how I compiled my modified version of CPython. I will also write and run tests to help me implement my changes. This should give you a mental framework that helps you while diving into more comprehensive resources, like the excellent Python Developer’s Guide. My talk is aimed at everyone who wants to explore CPython’s internals. You don’t have to be an expert in Python, although some affinity with Python helps with understanding the internals. I will also use C to implement some of the operator logic, but knowledge of C is by no means required. In short, if you’re interested in diving into the CPython source code, this talk is for you. Slides: https://pycon-assets.s3.amazonaws.com/2022/media/presentation_slides/71/2022-04-28T15%3A55%3A04.181398/demystifying_cpython_internals_HANDOUT.pdf
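As a small illustration of the syntax trees mentioned in this abstract, the sketch below uses the standard `ast` module to show how Python represents the existing `|` operator. It is not the speaker's implementation of the pipe operator, and the names `data` and `transform` are placeholders chosen for this example.

```python
import ast

# Parse an expression that uses the existing `|` (bitwise or) operator.
# `data` and `transform` are placeholder names; nothing is evaluated here.
tree = ast.parse("data | transform", mode="eval")
node = tree.body

# A binary operator appears in the tree as a BinOp node with a left operand,
# an operator node, and a right operand.
print(ast.dump(node))

operator_name = type(node.op).__name__
left_name = node.left.id
right_name = node.right.id
```

A new operator added to the language would need its own grammar rule and its own node or operator type in this tree, which is why a talk like this starts with the grammar and the syntax tree.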
Talk - Josh Weissbock: Distributed Web Scraping in Python
Web scraping is easy to do in Python, but it quickly becomes tedious when routinely running large batch scraping jobs. This talk looks at how to build a distributed web scraper that reduces batch scraping job times and makes your code more durable, and shares lessons learned and stories along the way. Slides: https://pycon-assets.s3.amazonaws.com/2022/media/presentation_slides/48/2022-04-29T04%3A28%3A21.308613/PyCon_2022_-_Distributed_Web_Scraping.pdf
Talk - Meredydd Luff: Building a Python Code Completer
Code completion is almost magic, and it makes writing code feel so good. But how does it actually work? I built a code completion engine from scratch - and in this talk, I'll tell you its secrets. We'll learn how Python parses and compiles code, what an AST is, and how we can use this knowledge to work out what a programmer might type next. And to prove it's not that complicated, I'll build a little code completer, live on stage, in about five minutes. I'll also talk about how code completion is like games programming, how we should broaden our thinking about "types" in Python, and how we can use information that isn't in your code to make coding even more satisfying.
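The idea of using an AST to guess what a programmer might type next can be sketched with the standard `ast` module. This is a minimal illustration, not the engine described in the talk: the functions `candidate_names` and `complete` are hypothetical, and the sketch only handles source that already parses, which a real completer cannot assume.

```python
import ast
import builtins


def candidate_names(source):
    """Collect names a completer could offer: builtins plus names bound in `source`."""
    names = set(dir(builtins))
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.add(node.id)  # assignment targets
        elif isinstance(node, ast.arg):
            names.add(node.arg)  # function parameters
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                names.add((alias.asname or alias.name).split(".")[0])
    return names


def complete(source, prefix):
    """Return the sorted candidate names that start with `prefix`."""
    return sorted(name for name in candidate_names(source) if name.startswith(prefix))


example_source = "import os\ncounter = 0\ndef count_words(text):\n    return len(text.split())\n"
suggestions = complete(example_source, "cou")
```

This sketch ignores scoping, so a function parameter is offered everywhere. A real completer would track which names are visible at the cursor position.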
Talk - Joseph Lucas: Serialization More than pickling
Have you ever needed to persist an object or instance? You probably researched serialization (converting an object to a byte stream). The default for Python is pickle, but there are other serialization options. In this talk, we'll explore some of those other options as well as their efficiency and security considerations.
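To make the comparison concrete, here is a minimal sketch that round-trips the same object through `pickle` and through `json`, one of the other standard-library options. The `record` dictionary is made up for this example, and the talk may cover different formats.

```python
import json
import pickle

# A made-up record; the tuple is there to show a fidelity difference.
record = {"name": "pycon", "size": (640, 480)}

# pickle preserves Python types, but unpickling can execute arbitrary code,
# so pickle.loads should only ever be given data from a trusted source.
pickled = pickle.dumps(record)
from_pickle = pickle.loads(pickled)

# JSON is text-based and safe to parse, but it only knows a few types:
# the tuple comes back as a list.
encoded = json.dumps(record).encode("utf-8")
from_json = json.loads(encoded)
```

The trade-off shown here is typical: pickle keeps more of the original object, while JSON is portable across languages and does not run code when loaded.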
Talk - Paul Kehrer/Alex Gaynor: Shipping Python Extensions in Rust Two Million Times a Day
For as long as Python has been around, a strength has been the ecosystem of packages written not in Python, but in C -- whether that's PIL, or numpy, or simplejson, or one of the thousands of others. But why C? Why not some other language? In the last several years, Rust has emerged as a serious competitor to C. This talk will explore how we went about the process of using Rust in the pyca/cryptography package, the challenges we faced, the successes we found, and what this means for your projects.
Talk - Ajinkya Rajput/Ashish Bijlani: Bad actors vs our community: detecting software supply chain...
Rapid prototyping and development is one of the most valued features of the Python software ecosystem. This is possible due to efficient reuse of software libraries enabled by package repositories such as PyPI. While PyPI maintainers have streamlined the process of publishing and distributing a package for developers, bad actors evidently exploit this infrastructure to propagate malware. For example, simply by publishing a malicious package with a name similar to a popular package, bad actors can exploit the carelessness or inexperience of developers and elevate a simple installation typo to a remote code execution attack. In this talk, we will present technical details of our large-scale vetting system that analyzes millions of published software package versions for malware and other “risky” attributes, such as sudo access, source inconsistencies, abandonware, and unsafe installation hooks. We will share our experience while building this system, and present examples of new malware we have detected as case studies. Finally, we will introduce our free tool OSSIE, a Python package on PyPI, for developers to audit project dependencies and notify them when dependencies turn malicious. The presented tool is extremely user friendly and is an attempt towards furthering usable security. Slides: https://pycon-assets.s3.amazonaws.com/2022/media/presentation_slides/115/2022-04-30T23%3A12%3A58.090937/pycon22.pdf
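The name-similarity attack described above can be illustrated with a small check based on the standard `difflib` module. This is only a sketch under stated assumptions: the `POPULAR` list, the `possible_typosquats` function, and the 0.85 threshold are all hypothetical, and this is not how the speakers' vetting system or OSSIE works.

```python
from difflib import SequenceMatcher

# Hypothetical allow-list of well-known package names.
POPULAR = ["requests", "numpy", "pandas", "django"]


def possible_typosquats(name, popular=POPULAR, threshold=0.85):
    """Return the popular packages that `name` closely resembles without matching."""
    name = name.lower()
    if name in popular:
        return []  # an exact match is the real package, not a lookalike
    hits = []
    for known in popular:
        similarity = SequenceMatcher(None, name, known).ratio()
        if similarity >= threshold:
            hits.append(known)
    return hits


suspicious = possible_typosquats("reqeusts")
```

A check like this could run before installation to warn about a likely typo. Name similarity alone says nothing about whether a package is malicious, so it complements rather than replaces the kind of analysis the abstract describes.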
Talk - Liran Haimovitch: Effective Protobuf: Everything You Wanted To Know, But Never Dared To Ask
A talk of 40 minutes covering the following topics:
1. Introduction to serialization and its place in software engineering
2. Static typed vs dynamic typed serialization
3. Textual vs binary serialization: pros and cons
4. Popular serialization frameworks
5. Why Protobuf
6. Quick intro to Protobuf (just enough to get by)
7. Protobuf performance challenges and tradeoffs
8. Async synchronization: pros and cons
9. Field encoding: under the hood and what we learn
10. Managing the cost of abstractions
11. Data deduplication and compression
12. Field reuse: the whys and hows
13. gRPC: pros and cons
14. Protobuf over websocket or HTTP
15. Thank you
Slides: https://pycon-assets.s3.amazonaws.com/2022/media/presentation_slides/51/2022-04-29T22%3A04%3A43.392216/Effective_Protobuf.pdf
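For the field-encoding topic, a useful piece of background is the base-128 varint that Protobuf uses for integer values. The sketch below is a plain Python illustration written for this listing, not code from the talk or from the `protobuf` library; `encode_varint` and `decode_varint` are hypothetical helper names.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F  # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low_bits)  # final byte has the continuation bit clear
            return bytes(out)


def decode_varint(data):
    """Decode a single varint from the start of `data`."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


encoded_300 = encode_varint(300)
```

Small values take a single byte and larger values take more, which is one reason field numbers and value ranges affect message size.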
Talk - Nir Barazida: Dock Your Jupyter Notebook
To perfect your Jupyter Notebook craft, you'd want to make your work reproducible and shareable outside your local machine. In this talk, we will learn how to use Docker to build an isolated, pre-defined environment suited to an ML project that runs smoothly on a remote machine.
Talk - Antoine Toubhans: Flexible ML Experiment Tracking System for Python Coders with DVC and St...
There are so many tools to do data science today that it can be difficult to navigate. Many of them are AI platforms that “do everything by clicking on a UI” and do not leverage pre-existing tools, e.g., Git for versioning, or a good old Python IDE instead of Jupyter Notebooks. On the other hand, ML engineering is not classical software engineering: in addition to the code, the data should also be versioned; ML engineering is in essence exploratory work, since one cannot know if the model is going to work before testing it; and there is no clear way to guarantee the quality of the trained model, so the data scientist has to play with it to make it “talk”. In this talk, we will build a fully customizable and complete system in Python to track Machine Learning experiments. For the purpose of this talk, we will train a neural network (TensorFlow) to classify images as cat or dog, though the main focus is on the tooling and not the ML algorithm. We will use: DVC (Data Version Control) to 1) version the data alongside the code with Git 2) build training pipelines to orchestrate the Python scripts 3) version experiments. Streamlit to build data exploration apps to play with the trained models. Both DVC and Streamlit are open-source libraries with Python APIs. In the second part of the talk, we will focus on various ways of combining DVC and Streamlit. For instance, we will see how to build a Streamlit app that allows selecting any trained model tracked with DVC (provided its Git commit), loading it, and testing it on given input images. I will provide code samples and live demos throughout the talk.