PyCon US 2023
2023
List of videos

Mariatta Wijaya: Welcome to PyCon US 2023
Welcome speech from PyCon US 2023 Chair Mariatta Wijaya.
Watch
Keynote Speaker - Ned Batchelder
Ned Batchelder has been active in the Python community for more than 20 years. He is an organizer of Boston Python, and the maintainer of coverage.py and a handful of other tools. He works at 2U/edX on the Open edX project which powers edx.org and thousands of other online learning sites around the world. He blogs at https://nedbatchelder.com, and is on Mastodon as @nedbat@hachyderm.io.
Watch
Keynote Speaker - Python Steering Council
The Python Steering Council is a 5-person elected committee that assumes a mandate to maintain the quality and stability of the Python language and CPython interpreter, improve the contributor experience, formalize and maintain a relationship between the Python core team and the PSF, establish decision making processes for Python Enhancement Proposals, seek consensus among contributors and the Python core team, and resolve decisions and disputes in decision making among the language.
Watch
Keynote Speaker - James Powell
James Powell's hacker name is “dontusethiscode” (or sometimes “dutc”)… which is also the name of the company where he works. Since 2014, Don't Use This Code has been confusing procurement staff, contract attorneys, and even the occasional managing director: “is this the name of a real company?” Yes, yes it is, and the Don't Use This Code team provides consulting and training services in software development, scientific computing, data analysis, and data engineering. James is also a prolific speaker, having spoken at over eighty conferences worldwide. He has worn a suit and tie for every programming talk he's ever given, which is good, because it sets the bar on expectations really low. One time at PyCon, Guido even asked about the suit, to which James replied, “I am the most subversively dressed person here.” James is a dedicated supporter of the open source and open source scientific computing communities. In his volunteer time, he serves as the chairman of the NumFOCUS board of directors, as well as a lead organizer for NYC Python. He has also advised, chaired, and served on the organizing committee for over fifty open source conferences worldwide.
Watch
Keynote Speaker - Margaret Mitchell
Margaret Mitchell is a researcher focused on the ins and outs of machine learning and ethics-informed AI development in tech. She has published over 50 papers on natural language generation, assistive technology, computer vision, and AI ethics, and holds multiple patents in the areas of conversation generation and sentiment classification. She currently works at Hugging Face as Chief Ethics Scientist, driving forward work in the ML development ecosystem, ML data governance, AI evaluation, and AI ethics. She previously worked at Google AI as a Staff Research Scientist, where she founded and co-led Google's Ethical AI group, focused on foundational AI ethics research and operationalizing AI ethics Google-internally. Before joining Google, she was a researcher at Microsoft Research, focused on computer vision-to-language generation; and was a postdoc at Johns Hopkins, focused on Bayesian modeling and information extraction. She holds a PhD in Computer Science from the University of Aberdeen and a Master's in computational linguistics from the University of Washington. While earning her degrees, she also worked from 2005-2012 on machine learning, neurological disorders, and assistive technology at Oregon Health and Science University. She has spearheaded a number of workshops and initiatives at the intersections of diversity, inclusion, computer science, and ethics. Her work has received awards from Secretary of Defense Ash Carter and the American Foundation for the Blind, and has been implemented by multiple technology companies. She likes gardening, dogs, and cats.
Watch
Keynote Speaker - Carol Willing
Carol Willing is the VP of Engineering at Noteable, a three-time Python Steering Council member, a Python Core Developer, PSF Fellow, and a Project Jupyter core contributor. In 2019, she was awarded the Frank Willison Award for technical and community contributions to Python. As part of the Jupyter core team, Carol was awarded the 2017 ACM Software System Award for Project Jupyter's lasting influence. She's also a leader in open science and open-source governance serving on Quansight Labs Advisory Board and the CZI Open Science Advisory Board. She's driven to make open science accessible through open tools and learning materials.
Watch
Keynote Speaker - Deb Nicholson: Community Service Awards & Python Software Foundation Update
Deb Nicholson is the Executive Director at the Python Software Foundation, the non-profit steward of the Python programming language. She is a free software policy expert and a passionate community advocate. After years of local organizing on free speech, marriage equality, government transparency and access to the political process, she joined the free software movement in 2006. She has previously served the open source ecosystem through her work at the Open Source Initiative, Software Freedom Conservancy, and the Open Invention Network. She’s won the O’Reilly Open Source Award and the Award for the Advancement of Free Software for her efforts to broaden the free and open source software movement. She is also a founding organizer of the Seattle GNU/Linux Conference, an annual event dedicated to surfacing new voices and welcoming new people to the free software community. She lives with her husband and her lucky black cat in Cambridge, Massachusetts.
Watch
Keynote Speaker - Guido van Rossum
Guido van Rossum created Python in 1990 while working at CWI in Amsterdam. He was the language's BDFL until he stepped down in 2018. He has held various tech jobs, including Senior Staff Engineer at Google and Principal Engineer at Dropbox. He is currently a Distinguished Engineer at Microsoft, where he is still actively involved in Python's development. Born and raised in the Netherlands, he moved to the US in 1995 and currently lives with his family in the Bay Area.
Watch
Tutorials - Juhi, Dana: Intro to Hugging Face: Fine-tuning BERT for NLP tasks
You’ve heard about ChatGPT’s conversational ability and how DALL-E can create images from a simple phrase. Now, you want to get your hands dirty training some state of the art (SOTA) deep learning models. We will use Jupyter notebooks to fine-tune an NLP model based on BERT to do sentiment analysis. In this hands-on tutorial, we will learn about using HuggingFace models from pre-trained open-source checkpoints and adapting these models to our own specific tasks. We will see that using SOTA NLP and computer vision models has been made easier with a combination of HuggingFace and PyTorch. At the end of this session, you will know how to fine-tune a large public pre-trained model to a particular task and have more confidence navigating the deep learning open source landscape. Gideon's Kitchen | Key & Peele: https://youtu.be/DpaALjnfkcI
Watch
Tutorials - Zac Hatfield-Dodds, Ryan Soklaski: Introduction to Property-Based Testing
Has testing got you down? Ever spent a day writing tests, only to discover that you missed a bug because of some edge case you didn’t know about? Does it ever feel like writing tests is just a formality - that you already know your test cases will pass? Property-based testing might be just what you need! After this introduction to property-based testing, you’ll be comfortable with Hypothesis, a friendly but powerful property-based testing library. You’ll also know how to check and enforce robust properties in your code, and will have hands-on experience finding real bugs. Where traditional example-based tests require you to write out each exact scenario to check - for example, assert divide(3, 4) == 0.75 - property-based tests are generalised and assisted. You describe what kinds of inputs are allowed, write a test that should pass for any of them, and Hypothesis does the rest!

    from hypothesis import given, strategies as st

    @given(a=st.integers(), b=st.integers())
    def test_divide(a, b):
        result = a / b
        assert a == b * result

There’s the obvious ZeroDivisionError, fixable with b = st.integers().filter(lambda b: b != 0), but there’s another bug lurking. Can you see it? Hypothesis can!
Audience: This tutorial is for anybody who regularly writes tests in Python, and would like an easier and more effective way to do so. We assume that you are comfortable with traditional unit tests - reading, running, and writing; as well as familiar with ideas like assertions. Most attendees will have heard "given, when, then" and "arrange, act, assert". You may or may not have heard of pre- and post-conditions - we will explain what "property-based" means without reference to Haskell or anything algebraic.
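One likely candidate for the second bug is floating-point rounding: `a / b` yields a float, and large integers do not survive the round trip. The following is a minimal sketch using only the standard library, not Hypothesis itself; `divide_roundtrips` and `find_counterexample` are hypothetical helper names, and the random search is only a crude stand-in for what Hypothesis does.

```python
import random


def divide_roundtrips(a: int, b: int) -> bool:
    """The property from the abstract: a == b * (a / b)."""
    return a == b * (a / b)


def find_counterexample(trials: int = 10_000, seed: int = 0):
    """Hypothetical stand-in for Hypothesis: try random inputs with b != 0."""
    rng = random.Random(seed)
    nonzero = [n for n in range(-5, 6) if n != 0]
    for _ in range(trials):
        a = rng.randint(-2**64, 2**64)
        b = rng.choice(nonzero)
        if not divide_roundtrips(a, b):
            return a, b  # the property failed for this pair
    return None
```

The example from the abstract, `divide_roundtrips(3, 4)`, holds, while `divide_roundtrips(2**53 + 1, 1)` fails because `2**53 + 1` has no exact float representation.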
Watch
Tutorials - Trey Hunner: Intro to Python for Brand New Programmers
Brand new to programming and want to get some hands-on Python experience? Let's learn some Python together! During this tutorial we will work through a number of programming exercises together. We'll be doing a lot of asking questions, taking guesses, trying things out, and seeking out help from others. In this tutorial we'll cover:
- Types of things in Python: strings, numbers, lists
- Conditionally executing code
- Repeating code with loops
- Getting user input
This tutorial is intended to ease you into Python. Each exercise section is careful not to assume prior programming knowledge. I expect you to have experience typing on computers and to have rudimentary math skills (just arithmetic). I am not expecting you to have experience with programming. We will define new terms as we use them. You'll leave this tutorial having written a couple of small Python programs yourself. Hopefully you'll also leave with a bit of excitement about what Python can do and curiosity to keep diving deeper.
Watch
Tutorials - Reuven M. Lerner: Comprehending comprehensions
Comprehensions are one of the most important — and misunderstood — parts of Python. In this tutorial, I'll walk you through comprehensions, including how to write them, and why you would want to do so. By the time you finish this tutorial, you'll fully understand list, set and dict comprehensions, as well as nested comprehensions and generator expressions. You'll understand the differences between regular "for" loops and comprehensions, and where to use them.
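The forms named in this abstract can be compared side by side. This is a minimal sketch with made-up sample data, not material from the tutorial.

```python
words = ["apple", "Banana", "cherry", "apple"]

# List comprehension: one output item per input item.
lengths = [len(w) for w in words]

# Set comprehension: duplicates collapse.
unique = {w.lower() for w in words}

# Dict comprehension: build key/value pairs.
by_word = {w: len(w) for w in words}

# Nested comprehension: the second "for" runs inside the first.
pairs = [(row, col) for row in range(2) for col in range(2)]

# Generator expression: items are produced lazily, here consumed by sum().
total = sum(len(w) for w in words)

# The regular "for" loop equivalent of the list comprehension above.
lengths_loop = []
for w in words:
    lengths_loop.append(len(w))
```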
Watch
Tutorials - Patrick Arminio: Build a production ready GraphQL API using Python
This workshop will teach you how to create a production ready GraphQL API using Python and Strawberry. We will be using Django as our framework of choice, but most of the concepts will be applicable to other frameworks too. We'll learn how GraphQL works under the hood, and how we can leverage type hints to create end to end type safe GraphQL queries. We'll also learn how to authenticate users when using GraphQL and how to make sure our APIs are performant. If we have enough time we'll take a look at doing realtime APIs using GraphQL subscriptions and how to use GraphQL with frontend frameworks such as React.
Watch
Tutorials - Simon Willison: Data analysis with SQLite and Python
SQLite is the world's most widely used database and has been a part of the Python standard library since 2006. It continues to evolve and offer more capabilities every year. This tutorial will transform you into a SQLite power-user. You'll learn to use SQLite with Python, level up your SQL skills and take advantage of libraries such as sqlite-utils and tools such as Datasette to explore and analyze data in all kinds of shapes and sizes. This hands-on tutorial will cover: - The sqlite3 module in the Python standard library - A review of SQL basics, plus advanced SQL features available in SQLite - Using sqlite-utils for advanced manipulation of SQLite databases - Datasette as a tool for exploring, analyzing and publishing data - Applying the Baked Data architectural pattern to build a data application using Datasette and deploy it to the cloud This tutorial is aimed at beginner-to-intermediate Python users with some previous exposure to basic SQL. Attendees will leave this workshop with a powerful new set of tools for productively exploring, analyzing and publishing data.
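The first item on that list, the standard-library sqlite3 module, can be tried in a few lines. This is a minimal sketch using an in-memory database; the table name and rows are invented for illustration and are not from the tutorial.

```python
import sqlite3

# Hypothetical sample rows: (title, speaker, minutes).
rows = [
    ("Talk A", "Speaker 1", 30),
    ("Talk B", "Speaker 2", 45),
    ("Talk C", "Speaker 1", 25),
]

conn = sqlite3.connect(":memory:")  # nothing is written to disk
conn.execute("CREATE TABLE talks (title TEXT, speaker TEXT, minutes INTEGER)")
# "?" placeholders let sqlite3 handle quoting of the values.
conn.executemany("INSERT INTO talks VALUES (?, ?, ?)", rows)
conn.commit()

# Aggregate in SQL rather than in Python.
totals = conn.execute(
    "SELECT speaker, SUM(minutes) AS total "
    "FROM talks GROUP BY speaker ORDER BY total DESC"
).fetchall()
conn.close()
```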
Watch
Tutorials - Mario Munoz: Web Development With A Python-backed Frontend: Featuring HTMX and Tailwind
Want to bring hypermedia into your web design workflow, ditching the complexity of JSON-over-HTTP for a more RESTful approach? Create and design your web application with htmx and spark joy in your design process. Splash in a little Tailwind CSS, too. (Ssshh. You're a Full-Stack Developer now.)
Watch
Tutorials - Leah Berg, Ray: Feature Engineering is for Everyone!
In Machine Learning, features are the inputs for a machine learning model. Feature Engineering is the process of creating features from raw data and selecting the best features for a model. However, it is not just a tool for Data Scientists - Data Analysts and Developers can use it too. In this tutorial, we will create features that can be used for creating Data Visualizations, Rules Based Automations, and Machine Learning Models. Attendees will learn how to explore, create and select features from various data types such as “discrete/categorical” and “continuous” data. Attendees will learn techniques such as One-hot encodings for categories, text vectorization, date manipulation and more. By the end of this tutorial, attendees will understand how to create features for their projects.
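Two of the techniques mentioned, one-hot encoding and date manipulation, can be sketched with the standard library alone. The helper names `one_hot` and `date_features` are hypothetical; the tutorial itself may well use other libraries for this.

```python
from datetime import date


def one_hot(values):
    """Return the sorted categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[int(v == c) for c in categories] for v in values]
    return categories, rows


def date_features(iso_day):
    """Derive simple features from an ISO date string (YYYY-MM-DD)."""
    d = date.fromisoformat(iso_day)
    return {
        "month": d.month,
        "weekday": d.weekday(),  # Monday is 0, Sunday is 6
        "is_weekend": d.weekday() >= 5,
    }
```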
Watch
Tutorials - Pavithra Eswaramoorthy, Dharhas Pothina: Data of Unusual Size: Interactive Visualization
While most folks aren't at the scale of cloud giants or black hole research teams that analyze petabytes of data every day, you can easily fall into a situation where your laptop doesn't have quite enough power to do the analytics you need. "Big data" refers to any data that is too large to handle comfortably with your current tools and infrastructure. As the leading language for data science, Python has many mature options that allow you to work with datasets that are orders of magnitude larger than what can fit into a typical laptop's memory. In this hands-on tutorial, you will learn the fundamentals of analyzing massive datasets with real-world examples on actual powerful machines on a public cloud – starting from how the data is stored and read, to how it is processed and visualized. You will understand how large-scale analysis differs from local workflows, the unique challenges associated with scale, and some best practices to work productively with your data. By the end, you will be able to answer: What makes some data formats more efficient at scale? Why, how, and when (and when not) to leverage parallel and distributed computation (primarily with Dask) for your work? How to manage cloud storage, resources, and costs effectively? How interactive visualization can make large and complex data more understandable (primarily with hvPlot)? How to comfortably collaborate on data science projects with your entire team? The tutorial focuses on the reasoning, intuition, and best practices around big data workflows, while covering the practical details of Python libraries like Dask and hvPlot that are great at handling large data. It includes plenty of exercises to help you build a foundational understanding within three hours.
Watch
Tutorials - Lisa Carpenter: How to create beautiful interactive GUIs and web apps
In this 3.5 hour tutorial, attendees will learn how to use the streamlit library in Python to create interactive graphical user interfaces (GUIs) for their data science projects. Through a series of hands-on exercises, attendees will gain practical experience using streamlit to build and customize their own interactive GUIs. The tutorial will begin by introducing attendees to the basics of streamlit, including how to install and set up the library, as well as the key concepts and components of streamlit applications. Attendees will then learn how to use streamlit to create simple, yet effective, GUIs for their data science projects, including how to display and interact with data, add text and images, and create custom layouts and widgets. As the tutorial progresses, attendees will have the opportunity to work on more advanced topics, such as using streamlit to create custom interactive plots and charts, and integrating streamlit with other popular libraries such as Pandas and Altair. By the end of the tutorial, attendees will have a solid understanding of how to use streamlit to create effective and engaging interactive GUIs for their own data science projects. The tutorial will be led by an experienced data scientist with a strong background in Python and streamlit, and will include plenty of hands-on exercises to help attendees apply what they learn in a practical setting. Attendees will also have access to detailed tutorial materials and code samples, as well as support from the instructor and other attendees throughout the tutorial.
Watch
Tutorials - Ted Patrick: Writing Serverless Python Web Apps with PyScript
Python web applications running in the browser or on mobile without a server? What seemed to be a dream just a few months ago is now possible thanks to PyScript. You can now write websites, apps, and games that run entirely in the browser. This tutorial is an introduction to PyScript and will walk you through learning about basic concepts, how to set up your project and create amazing applications. More specifically:
- Create your project configuration
- Define a Python environment with all the dependencies to run your code
- Loading and manipulating user data
- Writing your Python code
- Accessing the DOM and other browser features
- Using JavaScript libraries from your Python application
- Optimizing your application
- Look at what’s different between “standard” Python and Python in the browser
- Have a lot of fun and hack together on your ideas!
Watch
Tutorials - Ítalo Epifânio: Write your first package using literate programming
Literate programming is a programming paradigm that embeds explanations in natural language (such as Spanish) alongside traditional code. Literate programming allows developers to tell a story with their code, improving the understanding of the project, focusing on documentation, and making it easier to onboard developers. Although it is a well-regarded concept discussed by respected researchers like Donald Knuth, literate programming tools like Jupyter notebooks are considered inefficient for serious software development. This perception has limited Jupyter notebooks to simple Python scripts and educational materials. The nbdev library has proven that literate programming is useful in developing big and serious projects, like fastai. This tutorial will show attendees how to get the benefits of literate programming while also following software development best practices. We'll get hands-on experience in writing and publishing a Python package while using Jupyter Notebooks. In addition to publishing the package, we'll also learn how to deploy the docs, run simple tests and run these tests on CI/CD, making sure that our package will only get published if the tests pass. Even though this tutorial uses Jupyter Notebooks and nbdev, students don't need previous knowledge of these tools. A simple computer with Python and pip installed is all we'll use. Students should have some minimal Python knowledge and Git experience (simple commands like push, pull, add and commit). A GitHub account will also be necessary.
Watch
Tutorials - Mike Müller: The How and Why of Object-oriented Programming in Python
Python supports multiple programming paradigms. You can write procedural programs, use functional programming features, or use the full power of object-oriented programming. In this tutorial you will learn how to get the most out of object-oriented programming. After this tutorial you will be able to:
- design your own objects
- take advantage of inheritance for code re-use
- implement special methods for pythonic objects
- convert programs from a procedural to an object-oriented approach
This tutorial is based on a small but comprehensive example. Starting from a procedural solution with lists and dictionaries, the tutorial gradually introduces how to create your own objects to solve the same problem. The next steps introduce the concepts of inheritance and special methods to take full advantage of object-oriented programming.
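The progression described here, from dictionaries to objects to inheritance and special methods, might look like the following. This is a minimal sketch with a made-up bank-account example; it is not the example used in the tutorial.

```python
# Procedural starting point: plain dictionaries and functions.
def deposit_dict(account, amount):
    account["balance"] += amount


# Object-oriented version: data and behaviour live together.
class Account:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount

    # Special methods make instances behave like built-in objects.
    def __repr__(self):
        return f"{type(self).__name__}({self.owner!r}, {self.balance!r})"

    def __eq__(self, other):
        if not isinstance(other, Account):
            return NotImplemented
        return (self.owner, self.balance) == (other.owner, other.balance)


# Inheritance: reuse everything above and add one behaviour.
class SavingsAccount(Account):
    def add_interest(self, percent):
        self.deposit(self.balance * percent // 100)
```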
Watch
Tutorials - Geir Arne Hjelle: Introduction to Decorators: Power Up Your Python Code
You can use decorators in your Python code to change the behavior of one or several functions. Many popular libraries are based on decorators. For example, you can use decorators to register functions as web endpoints, mark functions for JIT compilation, or profile your functions. Using decorators makes your code simpler and more readable. However, to unlock the full capability of decorators, you should also be comfortable writing your own. In this tutorial, you'll learn how decorators work under the hood, and you'll get plenty of practice writing your own decorators. You'll be introduced to necessary background information about how functions are first-class objects in Python and how you can define inner functions. You'll learn how to unwrap the @decorator syntactic sugar and how to write solid decorators that you can use in your code. Being comfortable with using and creating decorators will make you a more efficient Python programmer.
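The ideas listed above (functions as first-class objects, inner functions, and the @decorator sugar) fit in one short example. This is a minimal sketch; `count_calls`, `greet` and `shout` are hypothetical names chosen for illustration, not taken from the tutorial.

```python
import functools


def count_calls(func):
    """Decorator that counts how often the wrapped function is called."""

    @functools.wraps(func)  # keep func's name and docstring on the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0  # functions are objects, so they can carry attributes
    return wrapper


@count_calls
def greet(name):
    return f"Hello, {name}!"


# The same thing without the @ syntactic sugar.
def shout(text):
    return text.upper()


shout = count_calls(shout)
```

After one call each, `greet.calls` and `shout.calls` are both 1, and `greet.__name__` is still `"greet"` thanks to `functools.wraps`.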
Watch
Tutorials - Cheuk Ting Ho: Power up your work with compiling and profiling
Have you been troubled by Python code that took too long to run? Do you want to know why and how to improve it? In this workshop, we will introduce Numba, a JIT compiler designed to speed up numerical calculations. To many people it is a mystery: it sounds like magic, but how does it work? Under what conditions does it work? Because of this, new users find it hard to get started and face a steep learning curve. This workshop will provide all the knowledge that you need to make Numba work for you. This workshop is for data scientists or developers who have math-heavy code that they would like to speed up with the benefit of NumPy and Numba.
Watch
Tutorials - Olga Matoula, Aya Elsayed: Automate Documentation with Sphinx & GitHub Actions
You've built an awesome API; time to give it some exposure! But, how do you keep a documentation website up-to-date as your code evolves? This tutorial will teach you how to write, generate, host, automate and version your documentation easily so it becomes part of your software development life cycle.
Watch
Tutorials - Mx Chiin-Rui Tan: Exploring Eco topics with Python
From Deforestation to Wildlife Trade to Carbon Polluters, learn how to use Python to explore current Eco topics! As Earth's sustainability edges ever closer to a tipping point, it has never been more important for us inhabitants to be aware of the impact we have on the environment, and the deteriorating state of our planet. This tutorial will democratize access to practical Python skills for the relevant sciences, applying these skills to pressing Eco issues, and ultimately empower non-subject experts with working proficiency of relevant open-source tools for discovering more facts about our natural world, at a time when disinformation is rife. Key Python takeaways: intro to and/or application of numpy, pandas, matplotlib, networkx, geopandas, xarray, and rioxarray. Format: interactive computer lab, with attendees working hands-on through pre-prepared Jupyter Notebook content at a group pace led by the instructor. Audience: no prior Eco/Scientific domain knowledge or experience with the Python packages being taught required, but attendees must have basic Python programming proficiency and the ability to set up access to JupyterLab with the required mainstream dependencies (as per instructions provided in advance).
Watch
Tutorials - Ron Nathaniel: How To Troubleshoot and Monitor Applications using OpenTelemetry
OpenTelemetry is a free, open-source Observability Protocol. OpenTelemetry sits at the application layer, and exports Traces, Metrics, and Logs to a backend for observing. It helps developers reduce the mean "time-to-detection" and "time-to-resolution" of bugs and issues that occur at the application layer; this ranges from detecting and alerting for errors raised (such as TypeError), to finding that a specific microservice (such as AWS Lambda) ran for twice as long as usual, all the way to seeing the output of a service and comparing it to the expected output to find a bug in the logic of the service. This tutorial is geared towards beginner/intermediate Python developers who have some experience with Python and its syntax; only minimal experience with Requests and Flask is needed (extremely popular libraries, with 50k and 60k stars on GitHub, respectively). No OpenTelemetry experience is needed at all. This is a complete introduction to OpenTelemetry, consisting of instrumenting your first application, viewing your first traces and metrics, and, if time allows, deploying your first Jaeger instance locally (no experience is needed, only Docker Desktop), to allow students of this workshop to build their own in-house observability platform, be it for themselves or their employers. It is important that every developer have at least a solid understanding of Traces, Metrics, and Logs, which we know today as the three pillars of observability. These are the foundational building blocks for monitoring Production environments at the application layer.
Watch
Tutorials - Reka Horvath: Building human-first and machine-friendly CLI applications
Command line tools have two audiences:
* humans using it directly
* other tools, scripts working together with it
In this tutorial, you'll learn how to build CLI applications for both of these user groups. We'll get to know the Command Line Interface Guidelines (https://clig.dev/), an open source, language-agnostic collection of principles and guidelines. We'll build an application following those principles in Python, using typer and Rich. Our short-term goal for this workshop is to build a CLI catalogue of past PyCon talks. The long-term goal is to provide tools (incl. code snippets and checklists) that you can use for your own CLI applications.
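One common way to serve both audiences is to print readable text by default and offer a flag for structured output. The sketch below uses the standard-library argparse as a stand-in for typer and Rich, which the tutorial actually uses; the `talks` program name, the `--json` flag and the catalogue entries are all hypothetical.

```python
import argparse
import json

# Hypothetical catalogue entries, made up for illustration.
TALKS = [
    {"year": 2023, "title": "Example talk A"},
    {"year": 2023, "title": "Example talk B"},
]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="talks", description="List catalogued talks."
    )
    parser.add_argument(
        "--json",
        action="store_true",
        help="emit JSON for scripts instead of plain text for humans",
    )
    return parser


def main(argv=None) -> str:
    """Return the output text; argv=None means 'read sys.argv'."""
    args = build_parser().parse_args(argv)
    if args.json:
        return json.dumps(TALKS)  # machine-friendly
    # Human-friendly: one aligned line per talk.
    return "\n".join(f"{t['year']}  {t['title']}" for t in TALKS)
```

Calling `main([])` returns the plain-text listing, while `main(["--json"])` returns a string that other tools can parse.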
Watch
Tutorials - Ethan Swan: Building a Model Prediction Server
In predictive modeling, training a model is only half the battle; predictions typically need to be “served” to other systems in production via an API or similar interface. In this tutorial we’ll start with a trained scikit-learn model and build a working FastAPI application to deliver its predictions in realtime. No prior experience with API development is expected.
Watch
Tutorials - Matt Harrison: Getting Started with Polars
Have you heard of this Polars thing? How is it different from Pandas? Do you want to check it out? In this workshop you will get exposed to Polars with a real-world dataset. You will see: - Common operations - Differences with Pandas - How to speed up your data pipeline - Feature gaps that you might miss coming from Pandas You will be provided with a notebook and labs to explore Polars.
Watch
Tutorials - Kevin Lacaille, Mansi Shah: Eroding Coastlines: A Geospatial & Computer Vision Analysis
Attendees will gain hands-on experience exploring satellite imagery and using Python tools for geospatial data analysis. They will apply what they’ve learned to identify & analyze instances of coastal erosion, one of the most pressing environmental & humanitarian challenges facing our planet today.
Watch
Tutorials - Dave, Bianca, Valerio, Mahe: Publishing your Python project, the conda way
Conda is an open source, language agnostic package and environment management system for Linux, macOS, and Windows. The conda ecosystem, including the conda-forge package repository, is widely used to install, run and update packages and their dependencies. In this tutorial you will learn how to create a full-fledged and easy to install Python software package using conda. We will start by introducing software packaging and packaging concepts (package dependencies, open source licensing, ...), and an introduction to the conda ecosystem and how conda implements software packaging. Most of our time will be spent doing a hands-on walk through of how to prepare a Python software package for conda, and then how to submit that package to the conda-forge, a widely used community driven package repository. The workshop is a hands-on workshop, where participants use their own laptops to prepare a full-fledged Python software package that is submission-ready for the conda-forge package repository. Participants need to bring a WiFi enabled laptop with a web browser, a command line interface, a text editor program, and git and/or a GitHub client already installed. Workshop participants will gain a basic understanding of software packaging, and how to prepare and publish their packages in the conda ecosystem.
Watch
Sponsor Presentation - Jason Davenport: Developing on Google Cloud with Python and DataFrames
Sponsor: Google
Learn how Google Cloud is creating a better Python experience for developers interested in building applications, pipelines, and analytics. This session will cover how developers can use Python with BigQuery, what's new in DataFrame support in Google Cloud, and ways you can extend these techniques to make great applications or analyses for business impact.
Watch
Sponsor Presentation - S. Ostrowski: Accelerate your workflow from local Python prototype to the cloud
By Savannah Ostrowski
Sponsor: Microsoft
Have you ever struggled taking a Python application from your local machine to the cloud? Had a hard time figuring out what infrastructure you need or how to configure it for your app? Spent too much time researching how to set up your local development environment for cloud development? Learn how to use real-world cloud development application templates via CLI to go from local development environment to the cloud. Scaffold your application, provision resources, deploy code, monitor your application health, and set up a CI/CD pipeline, all in a couple of steps and just a few minutes.
Watch
Sponsor Presentation - Brian McNamara, Dan Furman: Using Python to Power Serverless Applications
Sponsor: Capital One
Capital One uses Python to power a large number of serverless applications to improve developer experience and increase customer value. Because Python enables fast development cycles, engineers and data scientists at Capital One have more time to focus on delighting our customers. Our Python development is also more potent on serverless as it eliminates multiple overhead requirements. In this talk, we will cover best practices we've learned along the way for using Python to build serverless solutions to enable a fast, intuitive and iterative developer experience for:
- API calls
- Streaming data
- Machine learning inference
Attendees will learn the techniques and tools available when using Python to build a production-grade serverless system complete with observability and development practices baked in without ever provisioning a server. The presentation will feature a demonstration of Python-based AWS Lambda functions-as-a-service.
Watch
Sponsor Presentation - Johannes Messner: Modern, typed Python for (multimodal) ML
Full title: Modern, typed Python for (multimodal) ML: From training to deployment
Sponsor: Jina AI
Typing is at the center of “modern Python”, and tools (mypy, beartype) and libraries (FastAPI, SQLModel, Pydantic, DocArray) based on it are slowly eating the Python world. This talk explores the benefits of Python type hints, and shows how they are infiltrating the next big domain: machine learning.
Target audience: Mainly machine learning practitioners who care about improving their code quality and making use of the ever evolving Python ecosystem. This includes people who focus on model training as well as people who focus on model deployment and serving. The secondary target audience is anyone who would like to know more about Python type hints and how they can be helpful in their code base.
Intended takeaways: The audience should leave the talk with three main learnings:
- Why Python type hints are useful
- Why they are particularly useful in the ML domain
- How they can leverage libraries like DocArray in practice
Preliminary outline: The talk can be seen as two parts.
Part 1: Typing in Python
- min 0-5: Introduction
- min 5-15: Typing and type hints in Python: Short history, and why is it useful?
- min 15-25: Tool landscape: Type checkers (mypy, beartype) and other libraries (Pydantic, FastAPI)
Part 2: Python type hints in ML
- min 25-40: Why is typing useful in ML? Tensor shapes, multi-modal data, and more
- min 40-60: How to get the most out of typing focused tools for ML: jaxtyping and DocArray
  - How to organize your data using type hints
  - How to keep track of your tensor shapes using type hints
  - How to bridge the gap between training and deployment thanks to typing focused libraries
Watch
Sponsor Presentation - How to build stunning Data Science Web applications in Python
By Vincent Gosselin and Florian Jacta Sponsor: Taipy This workshop presents Taipy, a new low-code Python package that allows you to create complete Data Science applications, including graphical visualization and managing algorithms, pipelines, and scenarios. It is composed of two main independent components: - Taipy Core - Taipy GUI. In this workshop, participants will learn how to use: - Taipy Core to create scenarios, use models, retrieve metrics easily, version control their application configuration, - Taipy GUI to create an interactive and powerful user interface in a few lines of code. - Taipy Studio, a brand-new pipeline graphical editor inside VS Code that facilitates the creation of scenarios and pipelines. They will be used to build a complete interactive AI application where the end user can explore data and execute pipelines (make predictions) from within the application. With Taipy, the Python developer can transform simple pilots into production-ready end-user applications. Taipy GUI goes way beyond the capabilities of the standard graphical stack: Gradio, Streamlit, Dash, etc. Similarly, Taipy Core is simpler yet more powerful than the standard Python back-end stack.
Watch
Sponsor Presentation - Tuana Celik: Building LLM-based Agents
Full title: Building LLM-based Agents: How to develop smart NLP-driven apps with Haystack Sponsor: deepset Large Language Models (LLMs), like ChatGPT, have taken the Internet by storm. People are amazed at how these models can generate language with the snapshot of general world knowledge they have accumulated. However, in many real-world use cases a snapshot of general world knowledge is not enough. How do you equip them with the relevant tools to solve a task - like SQL, a CRM or web search? Or, for a given task, how do you control which tool is used or which knowledge base is referred to? Using LLMs as so-called "Agents" allows you to use them as decision makers in your application that react to all sorts of user requests. Given a user request, the agents can create an action plan. In this talk, we will learn how to build agent-driven applications with Haystack. We will show how to build an agent and connect it to other tools or knowledge bases. We will illustrate the concept with practical use cases and examples. Each step will be accompanied by code examples. By the end of the talk, you will have seen these concepts applied in practice, and you will be able to build an agent-driven application for your own use case.
Watch
Sponsor Presentation - Paul Everitt: Joyful Django DX with PyCharm
Sponsor: JetBrains Django and Python make fullstack and API web projects a breeze. But as Python has matured, significant tooling has risen to improve the development experience (DX). Can you use this tooling, in a modern editor and IDE, to stay in the flow and make your development…joyful? In this session we’ll put PyCharm to work, at helping us work. Navigation, refactoring, autocomplete – the usual suspects. We’ll also see “test-first” development to stay in the IDE, plus how this can apply to frontends. Finally, we’ll follow along with topics from Adam Johnson’s book “Boost Your Django DX”, with a surprise at the end.
Watch
Sponsor Presentation - Fixing legacy code, one pull request at a time
By Guillaume Dequenne Sponsor: Sonar Dealing with legacy codebases can be a chore. In this workshop, we talk about modernizing an old Flask application one pull request at a time while incorporating Python best practices. We will talk about how to integrate Code Quality tools in your workflow and in your IDE, so that every pull request is checked to good standards. By the end of this Workshop, we hope everyone can set up a workflow that removes technical debt over time and makes their codebases sustainable. This presentation can be followed interactively. In order to do so, make sure that you meet the following prerequisites: - Have an account on GitHub and be logged-in - Fork the workshop GitHub repository in your personal account, with all its branches (by default, only the main branch is forked): https://github.com/SonarSource/pycon-sonar-workshop - Optionally, have Git and an IDE (PyCharm/VSCode) to clone the application locally
Watch
Sponsor Presentation - Python & Bloomberg: An Open Source Duo
Presented by: - Pradyun Gedam - Bernat Gabor - Laszlo Kiss Kollar - Mario Corchero - Matt Wozniski - Pablo Galindo Salgado Sponsor: Bloomberg Join this talk where we will briefly introduce Bloomberg and have some of our engineers discuss their engagement in the Python Open Source ecosystem. We will also present some exciting troubleshooting tools that are widely used at Bloomberg that we are publishing as open source. You will leave this talk having learned about the technical details and new features related to these open source tools, which you might use daily in the future!
Watch
Sponsor Presentation - Advancements in High-Performance AI/ML through PyTorch's Python Compiler
Full title: Breaking Boundaries: Advancements in High-Performance AI/ML through PyTorch's Python Compiler By Sam Gross, Justin Jeffress and Suraj Subramanian Sponsor: Meta As GPUs continue to become faster, PyTorch, one of the most widely used frameworks in AI/ML, has faced challenges keeping up with performance demands. To mitigate this, parts of PyTorch have been moved into C++. This approach goes against the original intent of PyTorch as a Python-based framework and complicates contributions from the community. The PyTorch development team recognized the need to address these challenges while maintaining PyTorch's Python roots and set ambitious goals to improve performance, decrease memory usage, enable state-of-the-art distributed capabilities, and ensure more PyTorch code is written in Python. To achieve these goals, they developed a Python compiler. Attendees of this talk will get an inside look at how the PyTorch development team approached these challenges and implemented their innovative solution to achieve a 43% speedup in performance. We will discuss the benefits and challenges of this approach, as well as the techniques and technologies used to build the PyTorch Python compiler. This talk will provide valuable insights into the development process and offer attendees a deeper understanding of how PyTorch continues to evolve and innovate.
Watch
Sponsor Presentation - Python Profiling State of the World
By Shana Matthews and Indragie Karunaratne Sponsor: Sentry Most Python devs are familiar with built-in profiling tools like cProfile, but the world of profilers has expanded rapidly. In this hands-on workshop we'll explore various open-source profiling technologies with different overhead and accuracy tradeoffs, as well as several ways of visualizing profile data like flow charts, call trees, and flamegraphs.
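As a point of reference for the built-in tooling the abstract mentions, here is a minimal sketch of profiling one call with the standard library's cProfile and pstats. The workload `slow_sum` and the helper `profile_call` are hypothetical names chosen for illustration; they are not taken from the workshop.

```python
import cProfile
import io
import pstats


def slow_sum(n: int) -> int:
    """Deliberately naive workload so the profiler has something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args) -> str:
    """Run func(*args) under cProfile and return the formatted stats as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep only the top five entries.
    stats.sort_stats("cumulative").print_stats(5)
    return buffer.getvalue()


report = profile_call(slow_sum, 100_000)
print(report)
```

cProfile is a deterministic profiler that records every call, so it adds noticeable overhead; sampling profilers make the opposite trade-off, which is the kind of overhead-versus-accuracy choice the abstract refers to.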
Watch
Sponsor Presentation - Python Meets Heterogeneous Computing
Full title: Python Meets Heterogeneous Computing: An Exploration of Distributed Computations By William Cunningham and Santosh Kumar Radha Sponsor: Covalent In this talk, we will delve into the exciting world of advanced hardware and its use for distributed computations. As the heterogeneous computing landscape evolves with the introduction of quantum computers, GPUs, and specialized hardware, we will explore the interesting patterns in Python that allow us to interact with this heterogeneity and maximize performance effectively. The talk will address the challenges of distributed computations in the heterogeneous computing era, including monitoring real-time calculations, rapid iteration, prototyping in complex experiments, and ensuring smooth production runs with long access queues in specialized hardware. However, we will not only highlight the difficulties but also share valuable strategies for overcoming these obstacles and achieving optimal performance in these environments using open-source tools.
Watch
Sponsor Presentation - Flagging in the Backend: Shipping API's with Flask and LaunchDarkly
By Cody De Arkland Sponsor: LaunchDarkly Everyone acts like the party is in the Frontend, but the Backend is what keeps your platforms running. Python has a number of options for building backend APIs, but the most common are Flask and Django. APIs are the backbone of most applications, and shipping new ones is often risky - but it doesn't have to be. In this session, Cody will take you through where he started his coding journey with Python and Flask, go hands-on with examples of backend APIs, and show how you can ship faster and safer using feature flags.
Watch
Sponsor Presentation - Breaking Away from the Empire: Avoiding the Evil Clutches of If-Then Statements
By Jason Koo and Alison Cossette Sponsor: neo4j Become a Rebel and learn how to create a hyperspace navigation app to avoid Imperial patrols using the power of graphs in Python! Say goodbye to complex if-then statements and embrace a more elegant and flexible approach. In this talk we'll introduce graphs as a data structure, how to model this kind of data, and how to use them in place of more complicated logic code. Discover how graphs can simplify decision-making logic in your code for improved readability and extensibility. Join us on a journey to break free from the empire of if-then statements and unlock the full potential of your data. Expand your toolset and liberate your code!
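To illustrate the idea of replacing branching logic with a graph, here is a small sketch that stores routes as an adjacency mapping and finds a path with breadth-first search. The route map, the system names, and `find_route` are all made up for illustration, and the example uses a plain dictionary rather than a graph database such as the sponsor's product.

```python
from collections import deque

# Hypothetical route map: each system lists the systems reachable in one jump.
ROUTES = {
    "Tatooine": ["Alderaan", "Corellia"],
    "Alderaan": ["Tatooine", "Yavin"],
    "Corellia": ["Tatooine", "Hoth"],
    "Hoth": ["Corellia", "Yavin"],
    "Yavin": ["Alderaan", "Hoth"],
}


def find_route(routes, start, goal, patrolled=frozenset()):
    """Breadth-first search for the path with the fewest jumps that avoids patrolled systems."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == goal:
            return path
        for neighbor in routes.get(current, []):
            if neighbor in visited or neighbor in patrolled:
                continue
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None  # no safe route exists


print(find_route(ROUTES, "Tatooine", "Yavin"))
print(find_route(ROUTES, "Tatooine", "Yavin", patrolled={"Alderaan"}))
```

Adding a new system or a new patrol only changes the data, not the control flow, which is the readability and extensibility benefit the abstract describes.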
Watch
Sponsor Presentation - The ChatGPT Privacy Tango: Dancing with Data Security and Large Language Models
By Jason Mancuso and Mike Gardner Sponsor: Cape Privacy In the world of AI and natural language processing, privacy and utility often find themselves engaged in a delicate dance. As users attempt to leverage the power of Large Language Models (LLMs) like ChatGPT for sensitive or confidential data, they face the challenge of maintaining privacy without compromising the value these models bring. Enter the "ChatGPT Privacy Tango" – a metaphor for the intricate steps needed to balance these competing interests. During this talk, we'll delve into a system we've designed to help users navigate the Privacy Tango, striking a balance between preserving privacy and maximizing utility with ChatGPT and LLMs. We'll discuss the significance of protecting personally identifiable information (PII) and maintaining data security while still enjoying the advantages of AI-powered language models. Additionally, we'll cover the pros and cons of this approach and touch on alternative systems that we've considered. We invite you to join us as we explore the fascinating interplay of data privacy, secure enclaves, and PII removal, shedding light on the path towards more privacy-aware applications with Large Language Models. By the end of our session, you'll be better prepared to dance the ChatGPT Privacy Tango, armed with the knowledge and tools needed to safeguard sensitive information while harnessing the power of ChatGPT and LLMs.
Watch
Talks - Nicholas H. Tollervey, Paul Everitt: Build Yourself a PyScript
PyScript and Pyodide have gained a lot of attention, as Python in the browser presents interesting opportunities. And architectural questions as well. What does it mean to write an extensible, friendly web platform targeting Python? In this talk, learn how PyScript works and watch a treatment of key technical issues for writing web apps with the WebAssembly version of Python. What does “file” mean? How do you install something? What are web workers and how do they impact your architecture? PyScript itself is constantly evolving on these topics. Come for a spirited discussion with a fast-paced format.
Watch
Talks - Bert Wagner: Cross-Server Data Joins on Slow Networks with Python
While working from home has its perks, you've found one thing missing in your remote work life: speed of network data transfer. It doesn't matter if you can write the most efficient Python data transformation code when your jobs are bottlenecked by slow data movement happening between your local laptop and remote servers. In this talk we will address techniques for querying and joining data across distant machines efficiently with Python. We will also discuss how to handle scenarios where you need to join datasets that won't fit in your laptop's memory, including several techniques and packages for making cross server joins. This session won't stop you from getting angry when your ISP throttles your home internet connection, but it will teach you ways to work with local and remote datasets as efficiently as possible.
Watch
Talks - Brandt Bucher: Inside CPython 3.11's new specializing, adaptive interpreter
Python 3.11 was released on October 24th, bringing with it a new "specializing, adaptive interpreter." As one of the engineers who works on this ambitious project, my goal is to introduce you to the fascinating way that your code now optimizes itself as it's running, and to explore the different techniques employed under-the-hood to make your programs 25% faster on average. Along the way, we'll also cover many of the challenges faced when optimizing dynamic programming languages, some of the tools you can use to observe the new interpreter in action, and what we're already doing to further improve performance in Python 3.12 and beyond.
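One way to observe the interpreter's specialization yourself is the `adaptive` flag that the standard `dis` module gained in Python 3.11. The sketch below is only an illustration, not material from the talk: `add` and `opnames` are hypothetical helpers, and the exact specialized instruction names vary between Python versions, so no particular output is shown.

```python
import dis
import sys


def add(a, b):
    return a + b


def opnames(func, adaptive):
    """Return the instruction names for func, specialized ones if requested and supported."""
    if sys.version_info >= (3, 11):
        instructions = dis.get_instructions(func, adaptive=adaptive)
    else:
        # Older interpreters have no adaptive mode; fall back to plain bytecode.
        instructions = dis.get_instructions(func)
    return [instruction.opname for instruction in instructions]


before = opnames(add, adaptive=False)

# Call the function repeatedly with ints so the interpreter has a chance to specialize it.
for _ in range(1000):
    add(1, 2)

after = opnames(add, adaptive=True)
print(before)
print(after)
```

On a 3.11 or later interpreter, comparing the two lists typically shows the generic addition instruction replaced by a variant specialized for the observed operand types.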
Watch
Talks - Victor Stinner: Introducing incompatible changes in Python
In the Python 2 era, it was decided to migrate on a D-Day: convert your whole code base to Python 3 at once. It didn't go as well as expected. We learnt lessons from this mistake. Incompatible changes are now introduced differently in Python. Today, changes start with a deprecation warning for at least two Python releases before old functions are removed. We think about how to write a single code base that works on both old and new Python versions. More and more often, instructions for migrating existing code are provided, or even automated tools. Changes that break too many projects are reverted when there is not enough time to update enough projects. Code search helps detect affected projects, notify them, and possibly propose changes to prepare their code. Looking ahead, Python is working on a stable ABI so that C extensions can be built once and used on many Python versions. The HPy project is an interesting candidate for this goal. More and more projects are being tested on the Python version currently under development (Python 3.12).
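The deprecation-first approach described above can be sketched in a few lines with the standard `warnings` module. The functions `old_api` and `new_api` are hypothetical stand-ins for a function being retired and its replacement.

```python
import warnings


def new_api(x):
    """The replacement that callers should migrate to."""
    return x * 2


def old_api(x):
    """Deprecated alias kept for a few releases so callers have time to migrate."""
    warnings.warn(
        "old_api() is deprecated; use new_api() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this wrapper
    )
    return new_api(x)


# Record warnings instead of printing them, so the behavior can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api(21)

print(result, [str(w.message) for w in caught])
```

Because the old name keeps working while it warns, a single code base can run on both the old and the new release during the transition.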
Watch
Talks - Ron Nathaniel: How To Monitor and Troubleshoot Applications using OpenTelemetry
OpenTelemetry is a free, open-source observability protocol. OpenTelemetry sits at the application layer and exports traces, metrics, and logs to a backend for observation. It helps developers reduce the mean "time-to-detection" and "time-to-resolution" of bugs and issues that occur at the application layer; this ranges from detecting and alerting on raised errors (such as TypeError), to finding that a specific microservice (such as AWS Lambda) ran for twice as long as usual, all the way to seeing the output of a service and comparing it to the expected output to find a bug in the logic of the service. This talk is meant as an eye-opening introduction to basic monitoring and troubleshooting of code that may be running in a galaxy far, far away on a cloud provider's computer. It is geared towards complete beginners in the monitoring and observability world, and aims to show them just how easy it is to get set up and running. No experience with OpenTelemetry or similar tools is needed, just a basic understanding of Python syntax to read and understand the minimal code changes required for OpenTelemetry.
Watch
Talks - Erik Tollerud: How Python is Behind the Science of the James Webb Space Telescope
The James Webb Space Telescope (JWST) is one of the largest science projects in history. Its aim is to blow the door open on infrared astronomy: it has already found the earliest galaxies, will reveal the birth of stars and planets, and look for planets that could harbor life outside our solar system. Not to mention it has and will produce a lot of spectacular pictures that help us all understand our place in the cosmos in a way never before possible. And while there were many varied programming languages used for development and operation of JWST, the language used for most of the science is Python. In this talk I will walk through some of the early science of JWST and how it has been made possible by Python and the broad and deep open source Python scientific ecosystem.
Watch
Talks - Uzoma Nicholas Muoh: Improving Efficiency in Transportation Networks using Python
When we think about what Python is for, we often think of things like analytics, machine learning, and web apps, but python is a workhorse that plays a tremendous and often invisible role in our day-to-day lives, from medicine to finance, and even the transportation of goods from manufacturers to the shelves of our neighborhood stores. Transportation networks are highly dynamic, goods are always moving from point A to point B and money is being gained or lost every minute. Improving efficiency in a transportation network is critical to the survival of a business that provides transportation and distribution services as well as ensuring timely delivery of goods to customers. This talk examines three real-world examples of how Python is used to improve the efficiency of transportation networks, particularly we will explore: * Finding the optimal match between a driver and a load at the lowest possible cost using Google's ortools; * Generating recommendations for macro level optimizations to a transportation network using networkX; and * Helping the decision making process by answering the question "Should I accept this work?" using skfuzzy. Key Takeaways include: * Graph analytics and data science concepts that facilitate getting goods from manufacturers to stores more efficiently and at a lower cost to businesses; and * An appreciation of the complexity of the logistics industry and the role Python plays in making the life of drivers better.
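The driver-to-load matching in the first example is an instance of the assignment problem. The sketch below solves a tiny case by brute force with the standard library; it is not the OR-Tools approach from the talk, and the cost table and the name `cheapest_assignment` are invented for illustration.

```python
from itertools import permutations

# Hypothetical cost (for example, empty miles) of sending each driver to each load.
COSTS = {
    "driver_a": {"load_1": 90, "load_2": 75, "load_3": 80},
    "driver_b": {"load_1": 35, "load_2": 85, "load_3": 55},
    "driver_c": {"load_1": 125, "load_2": 95, "load_3": 90},
}


def cheapest_assignment(costs):
    """Try every one-to-one pairing of drivers and loads and keep the cheapest."""
    drivers = list(costs)
    loads = list(next(iter(costs.values())))
    best_total, best_plan = None, None
    for ordering in permutations(loads, len(drivers)):
        total = sum(costs[driver][load] for driver, load in zip(drivers, ordering))
        if best_total is None or total < best_total:
            best_total, best_plan = total, dict(zip(drivers, ordering))
    return best_total, best_plan


total, plan = cheapest_assignment(COSTS)
print(total, plan)
```

Brute force grows factorially with the number of drivers, which is why a dedicated solver is the practical choice at the scale of a real transportation network.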
Watch
Talks - Christopher Ariza: Building NumPy Arrays from CSV Files, Faster than Pandas
Twenty years ago, in 2003, Python 2.3 was released with csv.reader(), a function that provided support for parsing CSV files. The C implementation, proposed in PEP 305, defines a core tokenizer that has been a reference for many subsequent projects. Two commonly needed features, however, were not addressed in csv.reader(): determining type per column, and converting strings to those types (or columns to arrays). Pandas read_csv() implements automatic type conversion and realization of columns as NumPy arrays (delivered in a DataFrame), with performance good enough to be widely regarded as a benchmark. The Pandas implementation, however, does not support all NumPy dtypes. While NumPy offers loadtxt() and genfromtxt() for similar purposes, the former (recently re-implemented in C) does not implement automatic type discovery, while the latter (implemented in Python) suffers poor performance at scale. To support reading delimited files in StaticFrame (a DataFrame library built on an immutable data model), I needed something different: the full configuration options of Python's csv.reader(); optional type discovery for one or more columns; support for all NumPy dtypes; and performance competitive with Pandas read_csv(). Following the twenty-year tradition of extending csv.reader(), I implemented delimited_to_arrays() as a C extension to meet these needs. Using a family of C functions and structs, Unicode code points are collected per column (with optional type discovery), converted to C-types, and written into NumPy arrays, all with minimal PyObject creation or reference counting. Incorporated in StaticFrame, performance tests across a range of DataFrame shapes and type heterogeneity show significant performance advantages over Pandas. Independent of usage in StaticFrame, delimited_to_arrays() provides a powerful new resource for converting CSV files to NumPy arrays.
This presentation will review the background, architecture, and performance characteristics of this new implementation.
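To make the two missing features concrete (per-column type discovery and conversion), here is a pure-Python sketch built on csv.reader(). It is not the C implementation described above and it produces plain lists rather than NumPy arrays; `discover_type` and `columns_from_csv` are hypothetical names, and the discovery rule (try int, then float, else str) is a deliberately simple assumption.

```python
import csv
import io


def discover_type(values):
    """Pick the narrowest of int, float, or str that can represent every value."""
    for caster in (int, float):
        try:
            for value in values:
                caster(value)
        except ValueError:
            continue  # at least one value did not parse; try the next type
        return caster
    return str


def columns_from_csv(text, delimiter=","):
    """Parse delimited text and return {column name: list of typed values}."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, body = rows[0], rows[1:]
    columns = {}
    # zip(*body) transposes rows into columns.
    for name, raw_values in zip(header, zip(*body)):
        caster = discover_type(raw_values)
        columns[name] = [caster(value) for value in raw_values]
    return columns


SAMPLE = "id,price,label\n1,2.5,a\n2,4,b\n"
columns = columns_from_csv(SAMPLE)
print(columns)
```

Every value here passes through a Python object and is parsed twice, which is the kind of overhead the abstract says the C extension avoids.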
Watch
Talks - Hannah, Lalleh, Timothy, Uma: Instrumentation Nightmares: A review of our toughest cases
Ever wonder how companies like New Relic, Datadog, and Sentry instrument your code? In this talk we will briefly review how to hook into the Python import system in order to instrument code. We'll present some useful design patterns and tricks of the trade. Then, we'll launch straight into real-world examples and challenging instrumentation we've done over the years. Take a deep dive with us into some of the most popular Python libraries in use today and learn how they work underneath. We'll talk about proxies, wrapt, async, Python's web server specifications, and more! You will walk away from this talk with an understanding of how instrumentation works under the hood and how to make your own code instrumentation-friendly. You'll also learn about various design patterns: some that are go-tos for instrumentation and some that make instrumentation nightmarishly difficult. We hope you will join us on this instrumentation journey and come away with an understanding of how it all works to make developers' lives easier.
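The basic move behind this kind of instrumentation is wrapping an existing function so that each call is recorded. The sketch below does this with plain `functools.wraps` and attribute replacement; it is a simplified stand-in, not the wrapt-based or import-hook techniques the talk covers, and `instrument` and `CALLS` are hypothetical names.

```python
import functools
import json
import time

CALLS = []  # (qualified name, elapsed seconds) for every instrumented call


def instrument(module, attr):
    """Replace module.attr with a timing wrapper and return the original function."""
    original = getattr(module, attr)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            # Record the call even if the wrapped function raises.
            CALLS.append((f"{module.__name__}.{attr}", time.perf_counter() - start))

    setattr(module, attr, wrapper)
    return original


original_dumps = instrument(json, "dumps")
try:
    payload = json.dumps({"ok": True})
finally:
    json.dumps = original_dumps  # always undo the patch

print(payload, CALLS[0][0])
```

This only affects code that looks up `json.dumps` at call time; callers that imported the function by name before patching keep the original, one reason real agents hook the import system instead.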
Watch
Talks - Paul Ganssle: Working with Time Zones: Everything You Wish You Didn't Need to Know
Time zones are complicated, but they are a fact of engineering life. Time zones have skipped entire days and repeated others. There are time zones that switch to DST twice per year. But not necessarily every year. In Python it's even possible to create datetimes with non-transitive equality (a == b, b == c, a != c). In this talk you'll learn about Python's time zone model and other concepts critical to avoiding datetime troubles. Using the zoneinfo module introduced in Python 3.9 (PEP 615), this talk covers how to deal with ambiguous and imaginary times, datetime arithmetic around a Daylight Saving Time transition, and datetime's new fold attribute, introduced in Python 3.6 (PEP 495).
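The ambiguous and imaginary times mentioned above can be demonstrated with `zoneinfo` and the `fold` attribute. This sketch assumes time zone data is available on the system (or through the `tzdata` package); `is_imaginary` is a hypothetical helper that uses a round trip through UTC, one common way to detect skipped wall-clock times.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NYC = ZoneInfo("America/New_York")


def is_imaginary(dt):
    """True if this wall-clock time was skipped by a DST transition."""
    round_trip = dt.astimezone(timezone.utc).astimezone(dt.tzinfo)
    return round_trip.replace(tzinfo=None) != dt.replace(tzinfo=None)


# 01:30 on 2020-11-01 happened twice in New York: once in EDT, once in EST.
first = datetime(2020, 11, 1, 1, 30, tzinfo=NYC)  # fold=0: the earlier occurrence
second = first.replace(fold=1)                    # fold=1: the later occurrence

# 02:30 on 2020-03-08 never happened: clocks jumped from 02:00 to 03:00.
gap = datetime(2020, 3, 8, 2, 30, tzinfo=NYC)

print(first.utcoffset(), second.utcoffset())
print(is_imaginary(gap), is_imaginary(first))
```

The two ambiguous datetimes show the same wall-clock time but are one hour apart once converted to UTC, which is why arithmetic around a transition is usually safer in UTC.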
Watch
Charlas - Marlene Marchena: Mi viaje personal enseñando programación a alumnos neurodivergentes
This story begins when a nine-year-old boy told me he wanted to learn Python. He also told me that school was boring and that he had no friends. Does this story sound familiar? It does to me, and that is why I decided to teach programming to children. Over the years I have worked with neurodivergent students (which can include autism, ADHD, dyslexia, dyspraxia, etc.). In this talk, I will share my experience of using technology to break down the barrier of isolation and the stigma that weighs on neurodivergent people. By providing an inclusive educational experience adapted to different learning styles, it is possible to change the paradigm of education and employment for neurodivergent people.
Watch
Talks - Juliana Karoline de Sousa: Create interactive games using MicroPython and electronics
Do you want to have fun and learn Python? Let's learn how to use electronics and programming to create games using MicroPython and a micro:bit board. In this talk you'll learn how the micro:bit board works with MicroPython and how you can use push buttons, an accelerometer sensor and a LED display to create interactive games. The game examples will be Chase the Dot, Genius and Car Crash. For each game we'll see how the game works, the source code and a demonstration.
Watch
Talks - Samweli Mwakisambwe: Using Python and PyQgis to make cool maps
QGIS is a free, open source GIS software suite with desktop, mobile, and web components. It is released under a GPL v3 license, so users can download and use it without the licensing concerns that come with some other GIS software. QGIS core supports creating many types of maps. QGIS version 3.14 added the Temporal Controller, a feature that handles all temporal layers inside QGIS. Because temporal support is part of QGIS core, users can now create animated maps from temporal location datasets without any additional plugin. QGIS supports scripting in Python, and its functionality can be extended through plugins written in Python. This use of Python in QGIS (PyQgis) is made possible by SIP and PyQt. Through these bindings, QGIS exposes its core functionality via the PyQgis API, which can be used to create standalone Python applications that use QGIS features to make maps. This talk will showcase how to use Python and QGIS to build map animations from temporal location data using the QGIS Temporal Controller Python API. The session will also provide a guide to the PyQgis Temporal API, Python scripting inside QGIS, building standalone Python applications, and creating QGIS Python plugins that help in making maps. The talk is aimed at Python geospatial programmers and anyone looking to learn how to use open source tools to analyze location data. It aims to raise participants' awareness of, and appreciation for, the work done with open source tools in the geospatial field.
Watch
Talks - Gajendra Deshpande: Three Musketeers: Sherlock Holmes, Mathematics and Python
Mathematics is a science and one of the most important discoveries of the human race on earth. Math is everywhere and around us. It is in nature, music, sports, economics, engineering, and so on. In our daily life, we use mathematics knowingly and unknowingly. Many of us are unaware that forensic experts use mathematics to solve crime mysteries. In this talk, we will explore how Sherlock Holmes, the famous fictional detective character created by Sir Arthur Conan Doyle uses Mathematics and Python programming language to solve crime mysteries. We will solve simple crime puzzles using mathematics and python scripts. Finally, we will solve a few complex hypothetical crime mysteries using advanced python concepts. The participants will learn how to use the concepts of mathematics such as statistics, probability, trigonometry, and graph theory, and python and its packages such as SciPy, NumPy, and Matplotlib to solve the crime puzzles.
Watch
Charlas - Nicole Franco Leon: Jaguares y serpientes
Is it possible to combine the need to conserve the jaguar with Python as a tool to achieve it? Yes: thanks to telemetry, we can track individual animals remotely, obtaining information that would be impossible to collect in person, such as geoposition, speed, heart rate, body temperature, and altitude, among others. But you may wonder where Python comes into all this. In this talk we will learn how to condense, categorize, and question the different data from the environmental domain into a model that humans can understand and that Python can process. We will use ArcPy (a Python package for running geographic functions within ArcGIS Pro) to process the data obtained through telemetry and carry out geographic analyses that will let us understand the jaguar's behavior and whether its conservation is possible. If you are into animals, Python, and a bit of conservation, this talk is a good starting point.
Watch
Charlas: Resolviendo crimenes con Python mediante el Procesamiento del Lenguaje Natural (NLP)
Presented by: Carolina Passarello As an Information Systems Engineer in the Computer Forensics area of the Judiciary in Argentina, I carry out digital forensic examinations relating to all kinds of offenses and crimes: homicides, femicides, robberies, kidnappings, and many more. One of the most commonly reported offenses is grooming (which in my country carries a sentence of 6 months to 4 years in prison), a form of cyber-harassment carried out through electronic communications in which an adult deceives a minor for sexual purposes, through social networks or the WhatsApp application. When someone reports this offense and the courts order a digital forensic examination, I perform a forensic extraction of the mobile device in question, obtaining the conversations between the alleged perpetrator and the minor. The conversations may contain a few sentences or hundreds of them and do not always include explicit content related to the offense, so analyzing the meaning of each one and putting it in context costs justice system staff considerable time and effort. This is where the model I developed with Python comes in: using natural language processing (NLP) together with machine learning and deep learning techniques, it offers a quick way to confirm the offense. In the future this model could be extended to other crimes such as femicide, either before the fact, so that a woman could use it from a web application accessible from anywhere and, for Spanish-language conversations, predict behavior related to machismo or misogyny, among others, in a conversation with her possible aggressor; or after the fact, so that those administering justice can confirm the act that constitutes the crime.
Watch
Charlas - Marina Moro López: Biohacking con Python: cómo convertirse en el señor Burns fluorescente
The talk will open with a brief introduction to biohacking, followed by a mini lesson (super light, I promise) on genetic theory so that the methodology of the case study is perfectly clear. The case study is the real heart of the talk and will consist of editing our own DNA with CRISPR (a biological cut-and-splice tool) and a Python script (which will design the genetic sequences needed for the experiment) to biohack certain genes and turn ourselves into the fluorescent Mr. Burns. All this will let us see the tremendous potential of the synergy between genetic engineering and Python, not only in comic examples like the one just mentioned, but also in healthcare as a treatment for diseases.
Watch
Charlas - Oscar Cortez: Modernizando tu paquete Python con pyproject y hatch
Python keeps evolving as the years go by, and so do the tools that surround our language. In this talk we will look at the past (distutils, setuptools), the present (flit, poetry, build, twine), and the future (pyproject.toml, hatch) of Python packaging, the most crucial part of the growth of an entire ecosystem. We will take a holistic view to analyze the current state and how we can improve the workflow for packaging and distributing applications in Python. Is this talk for me? This talk is intended for anyone, with or without Python experience, who wants to learn or improve how they package Python projects.
Watch
Charlas - Elena Guidi: Salvemos los pingüinos con el green computing
"Green computing" is a term that was coined in 1992 and that seeks to reduce the environmental impact of digital activities. In this talk we will see what green computing is and what this area of computer science studies, with some examples of data center improvements. We will also look at some things we can do in our day-to-day lives and an introduction to green programming (or eco-friendly coding) with Python. Technology has very high potential to help the environment, and the goal of this talk is for all of us to know it! We will also see what we can do with Python to find out how green our code is. (There are no prerequisites; this talk is for all audiences.)
Watch
Charlas: Introduction to FastAPI
Learn how to create a production-ready API in very little time using FastAPI... explained with memes. Your API will have automatic data validation and documentation, based on standards, with high performance and several other features. You can write all the code with editor support, including autocompletion and type error checks everywhere, even for your own data. In this talk you will learn what FastAPI is, what it can do, and how it could benefit you. You will see how to declare the data you want to receive in each request (each HTTP message) using standard Python type annotations, including path parameters, query parameters, and JSON body payloads. You will also see how to use simple, standard Python type annotations to declare complex JSON body payloads with deeply nested structures and, even with very simple code, get automatic data validation, serialization (data conversion), and documentation for your whole API, all following standards.
Watch
Charlas - Sofía Denner: Unit Testing with Pytest
So, how do I test this? When writing tests, it is not always easy to know what to test or how to do it. I’m going to talk about mocks and good practices, give a few tips, and show examples of all of this using pytest. The talk is split into three parts: - Why (and for whom) do we write tests? - How do we write unit tests? (Some test examples, and how to write code that is ready to be unit tested.) - How do we get the most out of pytest? (Examples of fixtures, parametrization, etc.)
Watch
Talks - Joongi Kim: Improving debuggability of complex asyncio applications
The keys to debugging are observability and reproducibility. Despite a series of asyncio stdlib improvements over the last few years, it is still challenging to see what’s happening in complex real-world asyncio applications. In particular, when multiple asyncio libraries and your own code are composed together, it is hard to track down silently swallowed cancellations and resource-hogging floods of tasks triggered by the internals of 3rd-party callbacks. Moreover, such misbehaviors are often observed only in production environments, where the app faces the actual workloads and I/O patterns, making them even harder to reproduce. In this talk, I present an improved version of aiomonitor, called aiomonitor-ng (next generation). The original aiomonitor provides live access to a running asyncio process using a telnet socket and a basic REPL to inspect the list of tasks and their current stacks. After it helped me several times in production debugging, I added more features to help track the above issues in asyncio apps running in production: a task creation tracker and a termination tracker. These trackers keep the stack traces whenever a new task is created or terminated, and provide a holistic view of chained stack traces when the tasks are nested to arbitrary depths. aiomonitor-ng also demonstrates a rich async TUI (terminal UI) based on prompt toolkit and Click, with auto-completion of commands and arguments, greatly enhancing the original version’s simple REPL. With the improved aiomonitor-ng, I was able to debug several production bugs. I hope this talk will help our fellow asyncio developers build more complex yet stable applications at scale.
Watch
Talks - Nina Zakharenko: Why You Should Care About Open Source Supply Chain Security
Over the past several years, large-scale hacks triggered by compromised software supply chains have dominated the news. The aftermath has inspired the creation of new organizations, tools, and systems to help prevent and respond to similar lines of attack in the future. In this talk, you'll learn about the insidious nature of supply chain attacks, common points of intrusion, and why the open source ecosystem is especially vulnerable. Next, you’ll learn about the basic concepts and terms involved in supply chain security and about open source projects and frameworks you can apply to protect the integrity of your own software. Lastly, you’ll learn about ways that you can evaluate the supply chain security practices of the dependencies you rely on. You’ll leave the talk understanding how supply chain attacks happen and why they’re so difficult to detect, and you'll take away actionable solutions that leave you better prepared for the next wave of supply chain attacks.
Watch
Talks - William Woodruff: Ergonomic codesigning for the Python ecosystem with Sigstore
Code signing is coming to the Python packaging ecosystem, in the form of Sigstore: individual package maintainers and users will be able to sign for and verify the authenticity of their Python packages, respectively, without the historical and technical baggage of PGP. This talk will serve two purposes: (1) as an introduction to Sigstore and its security model for Python developers, and (2) as a technical overview of ongoing efforts to integrate Sigstore into Python packaging. Attendees will be introduced to the cryptographic fundamentals of codesigning, how Sigstore accomplishes codesigning without long-term key material (a critical downside to PGP), as well as the guarantees they can derive from strong codesigning in the Python packaging ecosystem. They'll also be introduced to the technical aspects of Sigstore's integration into Python packaging, including a peek behind the scenes at the standardization process and other foundational efforts required to introduce a new codesigning format to one of the world's largest packaging ecosystems.
Watch
Talks - Zac Hatfield-Dodds: Async: scaling structured concurrency with static and dynamic analysis
Async Python is a relatively recent addition to Python’s longstanding concurrency options of processes and threads - and offers a very different programming experience. Where processes run independently and threads switch at the whim of the kernel scheduler, async tasks take a different tradeoff: managing shared state is as easy as in single-threaded synchronous Python, but it’s on you to ensure that there are enough await, async for, and async with statements where tasks can switch to make steady progress. In this talk, we’ll explore the advantages of structured concurrency - especially error handling, timeouts, cancellation, and readable code - and both convenient and reliable ways to mitigate the problems of cooperative concurrency (when one uncooperative slow task can bring your whole program to a halt). I’ll introduce you to static analysis with flake8-trio and explain how to write your own AST-based tools, and show how dynamic analysis can help us catch anything that slips past that quick and convenient check. With a system like this in place, you don’t have to be an experienced or paranoid software engineer to write beautiful async code - to serve or scrape a website, control a bundle of processes, or write a game - it just reads like normal Python, and your tools will catch you if you fall.
Watch
Talks - Reuven M. Lerner: Generators, coroutines and nanoservices
Generator functions have been a part of Python for many years already, and are a well known technique for creating iterators. But generators have a few lesser-known aspects, including their “send” method and the “yield from” syntax. Many Python developers shy away from using them, unsure of what they would do, or how they would be useful — seeing coroutines as a solution looking for a problem. In this talk, I’ll tell you why coroutines can be useful, and how thinking about them as in-process “nanoservices” puts us in the right frame of mind to determine when they would and wouldn’t be appropriate.
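As a rough illustration of the two lesser-known features named in this abstract, the sketch below shows a generator that receives values through `send` and a wrapper that delegates to it with `yield from`. The names `running_average` and `averaging_service` are hypothetical and are not taken from the talk.

```python
def running_average():
    """Coroutine-style generator: receives numbers via send(), yields the mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # Hand the current average to the caller, then wait for the next value.
        value = yield average
        total += value
        count += 1
        average = total / count


def averaging_service():
    """Thin wrapper that delegates every send() to running_average()."""
    yield from running_average()


service = averaging_service()
next(service)  # prime the generator: run it up to the first yield
results = [service.send(x) for x in (10, 20, 60)]
service.close()
print(results)
```

The generator keeps its state (`total`, `count`) between calls, which is what makes it behave like a tiny in-process service.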
Watch
Talks - Valerio Maggio: Pythonic `functional` (`iter`)tools for your data challenges
Nowadays Python is very likely to be the first choice for developing machine learning or data science applications. Reasons for this are manifold, but very likely to be found in the fact that the Python language is amazing (⚠️ opinionated), and the open source community in the PyData ecosystem is absolutely fantastic (💙 that's a fact). In this context, one of the most remarkable features of the Python language is its ability to support multiple programming styles (from imperative to OOP and also functional programming). Thanks to this versatility, developers are free to choose whichever programming style they prefer. Functional programming is indeed very fascinating, and it is great for in-demand tasks such as data filtering or data processing. Of course, this doesn't say anything about other paradigms, but sometimes the solution to a data problem could be more naturally expressed using a functional approach. In this talk, we will discuss Python's support for functional programming, understanding the meaning of pure functions (also why mutable function parameters are always a bad idea), and the Python classes and modules that help with this style, namely itertools, functools, and the map-reduce data processing pattern. As reference data challenges, we will discuss functional-style solutions to Advent of Code coding puzzles, to make it fun and interactive.
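A minimal sketch of the ideas mentioned above, under the assumption that "mutable function parameters" refers to functions that modify their arguments in place. The function names and the sample data are made up for illustration.

```python
from functools import reduce
from itertools import groupby


def double_in_place(values):
    """Impure: mutates the caller's list as a side effect."""
    for i, v in enumerate(values):
        values[i] = v * 2
    return values


def doubled(values):
    """Pure: returns a new list and leaves the input untouched."""
    return [v * 2 for v in values]


data = [1, 2, 3]
pure_result = doubled(data)            # data is still [1, 2, 3]
impure_result = double_in_place(data)  # data itself has now changed

# Map-reduce pattern: map each word to its length, then reduce to a total.
words = ["pear", "fig", "plum", "kiwi", "apple"]
total_length = reduce(lambda a, b: a + b, map(len, words), 0)

# groupby only groups adjacent items, so the input is sorted by the same key first.
by_length = {k: list(g) for k, g in groupby(sorted(words, key=len), key=len)}
```

After the impure call, `data` and `impure_result` are the same list object, which is the kind of hidden coupling a pure function avoids.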
Watch
Talks - Łukasz Langa: Working Around the GIL with asyncio
You've heard it many times: the GIL is a problem for using all your CPU cores in one program. Among the generally accepted solutions there's multiprocessing, a way to orchestrate a group of worker processes to spread CPU load over many cores. This solves the problem for many use cases but if you have a lot of data to pass around there and back again, it's much less efficient. In this short talk we'll go through two examples of data processing with Python 3.11 and how asyncio with shared memory helps speed things up. To cover all bases, one example will run on macOS, the other on Windows Subsystem for Linux. You'll see how the built-in building blocks of Python allow you to compose scalable systems. Our focus is on the base programming language. We won't be reimplementing data pipelines or covering any MLops best practices.
Watch
Talks - Maria Jose Molina Contreras: Next level Machine Learning with TinyML and Python
We usually associate the future of computing with large clusters able to perform tasks in a fraction of a second, but is that really the only scenario for how computational hardware will evolve? Machine learning has become an important component of our societies: we see people, communities, and global companies focusing their resources on improving their technological stack and on becoming the leader of the next generation of AI. At the same time that we see clusters getting larger, GPUs getting more powerful, and phones becoming practically computers capable of doing almost everything, we also see some smart devices becoming smaller. The Internet of Things has been flourishing for many years, and Python has been playing an important role in making many devices "easy to automate", but can Python help us in all scenarios? One of the challenges for the next generation of ML is to think small. You read that right: "thinking small". It’s time to start running very well-trained ML models on small devices: ML on microcontrollers. We are going to dive into TinyML and evaluate different setups for interacting with sensors on microcontrollers. We will discuss the different hardware options and frameworks to start with, while looking at use cases that TinyML can address, such as agriculture, conservation, detection of health issues, ecology monitoring, autonomous vehicles, etc. In this talk, you will learn about Tiny Machine Learning (TinyML), an approach that explores machine learning deployed in embedded systems on microcontrollers. I will also talk about MicroPython and CircuitPython, and how they have been conquering the microcontroller scene. Lastly, we will discuss a real use case: a machine learning model that predicts anomalies for predictive maintenance.
Watch
Talks - Dave Aronson: Kill All Mutants! (Intro to Mutation Testing)
How good is your test suite? Would it all still pass if the tested code was changed? If so, there may be problems with your code, your tests, or both! Mutation Testing reveals these cases. It makes lots of slightly altered versions of your code, called "mutants." If any mutants let all of the code's tests pass, you probably have gaps in your test suite, ineffective code, or both. This talk will tell you what mutation testing is, how it works, how to use it, and its benefits, drawbacks, inner workings, and history. There will be several examples, and a list of tools for many popular languages. You will come away equipped with a powerful new technique for making sure your tests are strict and your code is meaningful!
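The idea of a surviving mutant can be sketched with nothing but the standard library. This is a toy illustration, not one of the tools the talk lists: the function `is_adult`, the single mutation operator, and both test functions are assumptions made for the example.

```python
import ast

SOURCE = """
def is_adult(age):
    return age >= 18
"""


class FlipGtE(ast.NodeTransformer):
    """One mutation operator: replace every '>=' with '>'."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Gt() if isinstance(op, ast.GtE) else op for op in node.ops]
        return node


def load(tree):
    """Compile a module AST and return the is_adult function it defines."""
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace["is_adult"]


def weak_tests(func):
    """A test suite that never checks the boundary value."""
    return func(30) is True and func(5) is False


def strict_tests(func):
    """The same suite plus a boundary check at exactly 18."""
    return weak_tests(func) and func(18) is True


original = load(ast.parse(SOURCE))
mutant = load(ast.fix_missing_locations(FlipGtE().visit(ast.parse(SOURCE))))

mutant_survives_weak = weak_tests(mutant)      # True: the gap goes unnoticed
mutant_survives_strict = strict_tests(mutant)  # False: the mutant is "killed"
```

A mutant that passes the whole suite, as with `weak_tests` here, points at a missing test case; real tools automate many such operators across a codebase.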
Watch
Talks - A. Jesse Jiryu Davis: Consistency and isolation for Python programmers
When you use a SQL database like Postgres, you have to understand the subtleties of isolation levels from "read committed" to "serializable." And distributed databases like MongoDB offer a range of consistency levels, from "eventually consistent" to "linearizable" and many options in between. Plus, non-experts usually confuse "isolation" with "consistency!" If we don't understand these concepts we risk losing data, or money, or worse. So what's the bottom line? Isolation: in a simple world, your database runs on one machine and executes each request one-at-a-time. In reality, databases execute requests in parallel, leading to weird phenomena called "anomalies". To see why anomalies happen, we'll look at Python code that simulates how a database executes operations. The various isolation levels make different tradeoffs between the anomalies they allow, versus the parallelism they can achieve. Consistency: distributed databases keep copies of your data on several machines, but these copies go out of sync. This leads to new anomalies: weird phenomena that reveal the out-of-sync data, and make your application feel like it's in a time warp. The various consistency levels make tradeoffs between anomalies versus latency. It depends on how long you're willing to wait for your data changes to be synced across all the machines. Again, we'll look at a Python simulation to understand these anomalies. You don't need to know all the names and details of every consistency and isolation level. You can refer to this handy chart. And you don't need to read all the academic papers, but I'll name four or five that are worth your time. Now, make informed decisions about consistency and isolation, and use your database with confidence!
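To make the notion of an isolation anomaly concrete, here is a small simulation of a "lost update". It is an illustrative sketch only, not the simulation used in the talk; the `run` function and the schedule format are assumptions.

```python
def run(schedule):
    """Execute (transaction, step) pairs against a tiny in-memory 'database'."""
    db = {"balance": 100}
    local = {}  # what each transaction saw when it read the balance
    for txn, step in schedule:
        if step == "read":
            local[txn] = db["balance"]
        elif step == "write":
            # Each transaction deposits 10 based on the value it read earlier.
            db["balance"] = local[txn] + 10
    return db["balance"]


# One transaction at a time: both deposits are applied.
serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]

# Interleaved: T2 reads before T1 writes, so T2 overwrites T1's deposit.
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

serial_balance = run(serial)
interleaved_balance = run(interleaved)
print(serial_balance, interleaved_balance)
```

The interleaved schedule ends with one deposit missing, which is the kind of outcome stricter isolation levels are designed to rule out at the cost of parallelism.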
Watch
Talks - Rob de Wit: Transforming a Jupyter Notebook into a reproducible pipeline for ML experiments
Jupyter Notebooks are part of every data scientist's arsenal and for good reason. But while they're great for prototyping in data science projects, they are not ideal for experimenting with different configurations. I have been guilty of running experiments with changing parameters while keeping track on a notepad, and the result has always been messy. In this session, we will explore how we can transform our notebook prototype into a reproducible pipeline. We will discuss what goes wrong without proper experiment tracking, why reproducibility is the key to solving this, and how we can achieve that with Git and DVC. I will discuss this topic using a text2image project with Stable Diffusion. I'll show how to break up a notebook into modules, create a pipeline from them, run experiments through the pipeline, and compare their results to find the best possible outcomes. The target audience will be data scientists that don't have a strong engineering background but would like to move beyond messing about in notebooks. Much like myself a year or two ago.
Watch
Talks - Dan Craig: Testing Spacecraft with Pytest
Much of the industry discussion around software testing over the last couple of decades has been focused on web services, but there are lots of different types of software systems that have different testing needs. This talk will first explore the differences and similarities between testing web services and testing safety-critical and mission-critical software systems, such as those used on spacecraft. We will then consider a rubric for thinking about the verification needs of different types of software based on attributes of the software and the environments in which it runs. Finally, we will examine a real-world example of using pytest to test Varda Space Industries' spacecraft software, showcasing many of pytest's power features, such as its fixtures and extensive hook system, as well as Python language features such as generators, context managers, and threading, that enable easy-to-use tools for testing against real-time telemetry streams and generating rich test report output.
Watch
Talks - Cheuk Ting Ho: Trying No GIL on Scientific Programming
Last year, Sam Gross, the author of the nogil fork of Python 3.9, demonstrated that the GIL can be removed. For scientific programs that run heavy CPU-bound workloads, this could be a huge performance improvement. In this talk, we will see whether this is true by comparing the no-gil version to the original. We will look at what no-gil Python is and how it may improve the performance of some scientific calculations. First of all, we will touch upon the background of the Python GIL: what it is and why it is needed, and, on the other hand, why it stops multi-threaded CPU-bound programs from taking advantage of multi-core machines. After that, we will look at no-gil Python, a fork of CPython 3.9 by Sam Gross, and how it provides a way to use Python without the GIL, demonstrating what could be the future of newer versions of Python. With that, we will try out this version of Python on some popular yet calculation-heavy algorithms in scientific programming and data science, e.g. PCA, clustering, categorization, and data manipulation with Scikit-learn and Pandas. We will compare the performance of this no-gil version with the standard CPython distribution. This talk is for Pythonistas who have intermediate knowledge of Python and are interested in using Python for scientific programming or data science. It may shed some light on more efficient ways of using Python for their tasks and spark interest in trying the no-gil version of Python.
Watch
Talks - Iván Pulido: Reproducible molecular simulations with Python
In this talk the audience will be briefly introduced to the field of molecular dynamics simulations and its challenges. Special attention will be given to how the features found in Python and its scientific ecosystem are boosting research in the area, especially at a time when Machine Learning and AI methods are revolutionizing the field. Examples using OpenMM and its ecosystem (openmmtools, perses, among others) will be featured.
Watch
Talks - Eric Snow: A Per-Interpreter GIL: Concurrency and Parallelism with Subinterpreters
We live in a world of concurrent code and multi-core computing, so come learn about a new solution for both in Python 3.12. We'll quickly explain the new feature (and an old one), and then show you how to take advantage of it, for simpler concurrency and faster code.
Watch
Talks - Russell Keith-Magee: You *can* take it with you: Packaging your Python code with Briefcase
Once you've written your amazing new application using Python, the next problem you'll face is how to get that application into the hands of users. If your users are familiar with pip and venv, you can put pip install instructions into a README, and leave it at that. But what if your audience aren't Python programmers? What if your app needs to be used by people who don't write Python at all? How do you distribute your code so that others can use it? In this talk, you'll learn about Briefcase, a tool that can convert a Python project into platform-native apps on macOS, Windows, Linux - and can also target iOS, Android, and the web. You'll learn how to use Briefcase to start a new project, or convert an existing project for distribution. You'll learn about the features of Briefcase that can support you while you develop your application. Finally, you'll learn how to generate installers and standalone applications for multiple platforms from a single Python codebase.
Watch
Talks - Adrian Garcia Badaracco: Inside web framework: intro to the ASGI spec, middleware and apps
What do FastAPI and Django have in common? They both use ASGI under the hood. ASGI, which stands for Asynchronous Server Gateway Interface, is a specification and API for asynchronous, event-driven web applications. The goal of this talk is to peel back the curtain on the internals of this specification and empower you to debug ASGI apps, write custom ASGI middleware, and simplify application lifecycles and serving. We will begin by discussing the basics of the ASGI specification and how it works. Then, we will move on to writing a simple ASGI app using pure, hand-crafted Python, without any frameworks or libraries. After that, we will cover ASGI middleware, which is a powerful tool that allows us to modify the behavior of our ASGI apps without changing the underlying code. We will show how to write custom middleware and how to use it to add features such as authentication or request body processing. Finally, we will discuss the serving of ASGI applications, focusing on how to use Uvicorn programmatically and take control of your event loop.
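For readers who have not seen the interface, the sketch below shows roughly what a framework-free ASGI HTTP app and a small middleware can look like. The `call` helper is a hypothetical stand-in for a real server such as Uvicorn, and the header name `x-demo` is invented for the example; none of this is taken from the talk itself.

```python
import asyncio


async def app(scope, receive, send):
    """Minimal ASGI HTTP application: replies with a plain-text greeting."""
    assert scope["type"] == "http"
    body = f"Hello from {scope['path']}".encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": body})


def add_header_middleware(inner, name, value):
    """Wrap an ASGI app so the response start message carries one extra header."""
    async def wrapped(scope, receive, send):
        async def send_with_header(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append((name, value))
                message = {**message, "headers": headers}
            await send(message)
        await inner(scope, receive, send_with_header)
    return wrapped


async def call(asgi_app, path):
    """Hypothetical test driver: plays the server's role and records sent messages."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await asgi_app(scope, receive, send)
    return sent


messages = asyncio.run(call(add_header_middleware(app, b"x-demo", b"1"), "/hi"))
```

The middleware never touches `app` itself; it only intercepts the `send` callable, which is why the same wrapper can be reused around any ASGI application.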
Watch
Talks - Pradeep Kumar Srinivasan: Catching Tensor Shape Errors without Running Your Code
ML developers are often slowed down by errors because of long iteration times and difficulty in debugging ML code. Tensor shape mismatches are some of the most common errors for both new and experienced ML developers, occurring when an operation is fed a multi-dimensional array (tensor) with the wrong dimensions (shape). In this talk, we will show that it is possible to catch Tensor shape mismatches without running your code by (a) representing the symbolic shape of a tensor (e.g., H x W x B) with explicit type annotations, called shape types, and (b) using a type checker to catch mismatches. We will also show how shape types can help us understand code faster by allowing us to see the shape of a tensor variable right in the IDE. Finally, we will describe how shape types can be adopted gradually in an existing ML project, talk about support for features such as broadcasting (in NumPy, PyTorch, etc.), and walk through the limitations of this new concept of shape types.
Watch
Talks - Pablo Galindo Salgado: How memory profilers work
These days, it is very easy for applications to run out of memory due to the vast amounts of data they need to process. While Python makes it very easy to get something up and running, the highly dynamic nature of the language abstracts memory management away from us and makes it very difficult to understand what is going on when we run out of memory or when we have memory leaks. This is where memory profilers come into play. Memory profilers are tools that allow us to understand how our applications are using memory. Not only can they help us diagnose why our programs are using so much memory, but they can also help us optimize our code to be faster by using smarter allocation patterns. Being able to understand how to use memory profilers effectively is an essential skill for any Python developer, especially those working on programs that involve the transformation of large amounts of data, large-scale applications, or long-running processes. This talk will cover the basics of memory profilers, how they work, and how to use them effectively. We will cover the different types of memory profilers, the different kinds of allocations a Python program can perform, and how to use memory profilers effectively to understand what is going on in our programs.
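As a small taste of what a memory profiler reports, the sketch below uses the standard library's `tracemalloc` module. This is only one kind of profiler and is not necessarily among the tools the talk covers; the allocation size is arbitrary.

```python
import tracemalloc

tracemalloc.start()

# Allocate roughly 1 MB of small objects on a single source line.
data = [bytes(1000) for _ in range(1000)]

current, peak = tracemalloc.get_traced_memory()  # bytes currently traced, and the peak
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group the traced allocations by source line, largest first.
top_stats = snapshot.statistics("lineno")
print(f"current={current} bytes, peak={peak} bytes")
print(top_stats[0])
```

The top entry should point at the list comprehension line, since that is where nearly all traced memory was allocated.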
Watch
Talks - Bruce Eckel: Rethinking Objects
This presentation revisits two core concepts of Object-Oriented programming: encapsulation and code reuse. Using a series of examples, we'll ask whether these concepts have satisfied their promises, and how functional approaches can do a better job. We'll also see that objects still have value in making library use easy.
Watch
Talks - Algorithmic ideas, engineering tricks, and trivia behind CPython's new sorting algorithm
Writing a sorting function is easy - coding a fast and reliable reference implementation less so. In this talk, I tell the story behind CPython's latest updates of the list sort function. Aims: entertain people with twists of history and algorithmic puzzles, which tell a lovely story of how a seemingly useless piece of theory led to the fastest and most elegant solution of a practical challenge. Target audience: geeks believing in the power of solid algorithmic thinking; programmers interested in engineering performance-critical code; all Python enthusiasts curious about what makes (sorting lists in) Python fast. Content: After using Quicksort for a long while, Tim Peters invented Timsort, a clever Mergesort variant, for the CPython reference implementation of Python. Timsort is both effective in Python and a popular export product: it is used in many languages and frameworks, notably OpenJDK, the Android runtime, and the V8 JavaScript engine. Despite this success, algorithms researchers eventually pinpointed two flaws in Timsort's underlying algorithm: The first could lead to a stack overflow in CPython (and Java); although it has meanwhile been fixed, it is curious that 10 years of widespread use didn't bring it to the surface. The second flaw is related to performance: the order in which detected sorted segments, the “runs” in the input, are merged can be 50% more costly than necessary. Based on ideas from the little-known puzzle of optimal alphabetic trees, the Powersort merge policy finds nearly optimal merging orders with negligible overhead, and is now (Python 3.11.0) part of the CPython implementation.
Watch
Charlas - Laura Funderburk: XGBoost for classification: building accurate and efficient models
In this talk, we will focus on using XGBoost for classification problems. We will start by explaining the basic concepts of classification and how it differs from regression. Then we will demonstrate how to use XGBoost to build and evaluate classification models, and discuss some of the key features and advantages of XGBoost for classification tasks. By the end of the talk, attendees will have a solid understanding of how to use XGBoost to build accurate and efficient classification models.
Watch
Charlas - Daniel Hernández Méndez: Low-code interfaces with Qt and their integration with Python
This talk covers how to design low-code interfaces for the Qt framework visually with Qt Designer, and how to either convert the files generated by that program (.ui) into Python code (.py), in case something needs to be modified at the code level, or simply integrate those UI files with Python code. This is shown through the example of a business performance tracking application programmed 100% in Python. The talk is dedicated to everyone who, like me, needs fully visual control of their interfaces and finds it very hard to program them at the code level (because of pixel control, buttons, functionality, and so on). The talk is for all audiences, since it takes very little code to get satisfactory results, but it will also include modifications to that code for slightly more advanced users.
Watch
Charlas - Iván Pulido: Reproducible molecular simulations with the help of Python
This talk will quickly introduce a non-expert audience to the world of molecular dynamics simulations and some of its challenges. Special emphasis will be placed on how the features and capabilities of Python and its scientific ecosystem accelerate research in the area, especially now that the application of Machine Learning techniques is revolutionizing the field. This will be demonstrated with examples that use the OpenMM simulation tool and its ecosystem of libraries and tools (openmmtools, perses, among others).
Watch
Talks - natalie serebryakova: Manage your SCM security using Python Open Policy Agent (OPA) Client
This talk explains how to use Open Policy Agent (OPA) to ensure that governance, compliance, and security controls are implemented in the development process. The domain-agnostic nature of Open Policy Agent makes it well suited to policy management and evaluation for tasks like these. The implementation example develops a solution for managing SCM (Source Control Management) security across an organization's or project's whole CI/CD pipeline. This part of the talk aims to demonstrate how to use the Python Open Policy Agent (OPA) Client and build policies to verify the security of SCM (GitLab or GitHub) organizations, repositories, and user accounts. The presentation will also cover good practices for automating those policies to satisfy common concerns.
Watch
Talks - Bianca Henderson: Plug life into your codebase: Making established Python codebase pluggable
You will learn about the pluggy Python framework and how it can be used to make your codebase plugin-friendly. As a real-life example, you will also learn how the 10-year-old conda codebase has recently had new life injected into it via a plugin API.
Watch
Talks - Antonio Cuni: The CPU in your browser: WebAssembly demystified
In recent years we have seen an explosion in the use of Python in the browser: Pyodide, CPython on WASM, PyScript, etc. All of this is possible thanks to the powerful functionality of the underlying platform, WebAssembly. In this talk we will examine what exactly WebAssembly is, what its strong and weak points are, what its limitations are, and what the future will bring us. We will also see why and how WebAssembly is useful and used outside the browser. This talk is targeted at an intermediate/advanced audience: no prior knowledge of WebAssembly is required, but you do need a basic understanding of what a compiler and an interpreter are, and of the concept of bytecode. The introduction will cover the basics to make sure that the talk is understandable even to people who are completely new to the WebAssembly world, but after that we will dive into the low-level technical details, with a special focus on those that are relevant to the Python world, such as WASI vs Emscripten, dynamic linking, JIT compilation, interoperability with other languages, etc.
Watch
Talks - C.A.M. Gerlach, Erlend Aasland: Iteration Toward Transformation of the Python Documentation
With the tremendous growth of the Python ecosystem, attracting an ever-wider audience of users with a variety of backgrounds and experience levels, it is more critical than ever that its documentation better serve the needs of its diverse array of readers. We formally introduce the Python Docs Community—the self-organized, Python Steering Council-endorsed collective working toward this goal—and provide a look at the major user-facing improvements implemented, underway and coming soon for the core documentation, devguide, PEPs and more. Along the way, we'll share the key insights and lessons learned from our ongoing projects, and how they can help you improve the documentation of your own projects. And if this sounds like something you might want to be a part of, we'll share how you can engage with us and your fellow documentarians through our community platforms and resources.
Watch
Talks - Moshe Zadka: pyproject.toml, packaging, and you
What is pyproject.toml? What is it good for? The talk will cover the basic format and extensibility of pyproject.toml. It will show how it is extensible by showing how a couple of tools integrate with it. Then the talk will cover how to use pyproject.toml as the source of truth for packaging your Python project with setuptools. Special attention will be given to integration with setuptools plugins.
Watch
Talks - Shai Geva: 10 Ways To Shoot Yourself In The Foot With Tests
Tests are great. Except when they’re not. Almost every developer who’s worked with tests has encountered a test suite that caused a lot of pain. Some of them just don’t protect us when we need them, some are flaky, some keep breaking because of unrelated changes, some take hours to debug whenever they fail. And while every company is different, there are definitely patterns. A lot of these problems are the result of some common pitfalls that trap many teams. These pitfalls might be common, but they're not easy to spot - I’ve seen all of them happen in strong, capable, experienced teams. Most of these I fell into myself at least once. In this session, we'll take a look at a selection of problematic testing choices, with examples that show these in the context of common Python frameworks and libraries. We'll discuss how to identify them, what problems they might cause and what alternatives we have so we can save ourselves the pain.
Watch
Talks - Jimmy Lai: Python Linters at Scale
Black, Flake8, isort, and Mypy are useful Python linters but it’s challenging to use them effectively at scale in the case of multiple codebases, in a large codebase, or with many developers. Manually managing consistent linter versions and configurations across codebases requires endless effort. Linter analysis on large codebases is slow. Linters may slow down developers by asking them to fix trivial issues. Running linters in distributed CI jobs makes it hard to understand the overall developer experience. To handle these scale challenges, we developed a reusable linter framework that releases new linter updates automatically, reuses consistent configurations, runs linters on only updated code to speed up runtime, collects logs and metrics to provide observability, and builds auto fixes for common linter issues. Our linter runs are fast and scalable. Every week, they run 10k times on multiple millions of lines of code in over 25 codebases, generating 25k suggestions for more than 200 developers. Its autofixes also save 20 hours of developer time every week. In this talk, we’ll walk you through popular Python linters and configuration recommendations, and we will discuss common issues and solutions when scaling them out. Using linters more effectively will make it much easier for you to apply best practices and more quickly write better code.
Watch
Talks - Brett Cannon: Python's syntactic sugar
Did you know that it only takes 11 pieces of syntax and some special functions to implement all the rest of the syntax of Python 3.8? It turns out you can take something like + and unravel it into Python code, letting you implement what Python does for a certain piece of syntax all on your own! This talk will cover what the minimum bits of Python syntax are needed to implement all the other pieces of syntax that Python supports. We will also cover how various pieces of syntax unravel into code to help you have a better understanding of how Python actually works.
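To make the idea of unravelling `+` concrete, here is a minimal sketch, not the speaker's own code, of roughly what `lhs + rhs` does in terms of `__add__` and `__radd__` lookups on the operand types. The function name `add` is an illustrative choice, and the sketch simplifies some details (for example, it uses plain `getattr` on the type, which is only an approximation of how the interpreter looks up special methods).

```python
_MISSING = object()


def add(lhs, rhs):
    """Approximate `lhs + rhs` using only attribute lookups and calls."""
    lhs_type, rhs_type = type(lhs), type(rhs)
    lhs_method = getattr(lhs_type, "__add__", _MISSING)
    lhs_rmethod = getattr(lhs_type, "__radd__", _MISSING)
    rhs_rmethod = getattr(rhs_type, "__radd__", _MISSING)

    call_lhs = (lhs_method, lhs, rhs)
    call_rhs = (rhs_rmethod, rhs, lhs)

    if lhs_type is rhs_type:
        # Same type on both sides: the reflected method is never tried.
        calls = [call_lhs]
    elif issubclass(rhs_type, lhs_type) and lhs_rmethod is not rhs_rmethod:
        # A subclass on the right that overrides __radd__ gets the first try.
        calls = [call_rhs, call_lhs]
    else:
        calls = [call_lhs, call_rhs]

    for method, first, second in calls:
        if method is _MISSING:
            continue
        result = method(first, second)
        if result is not NotImplemented:
            return result
    raise TypeError(
        f"unsupported operand type(s) for +: "
        f"{lhs_type.__name__!r} and {rhs_type.__name__!r}"
    )


print(add(1, 2))
print(add(1, 2.5))
```

In the second call, `int.__add__(1, 2.5)` returns `NotImplemented`, so the sketch falls back to `float.__radd__(2.5, 1)`.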
Watch
Talks - Hynek Schlawack: Subclassing, Composition, Python, and You
Ever seen a code base where understanding a simple method meant jumping through tangled class hierarchies? We all have! And while "Favor composition over inheritance!" is almost as old as object-oriented programming, strictly avoiding all types of subclassing leads to verbose, un-Pythonic code. So, what to do? The discussion on composition vs. inheritance is so frustrating because far-reaching design decisions like this can only be made with the ecosystem in mind – and because there's more than one type of subclassing! Let's take a dogma-free stroll through the types of subclassing through a Pythonic lens and untangle some patterns and trade-offs together. By the end, you'll be more confident in deciding when subclassing will make your code more Pythonic and when composition will improve its clarity.
Watch
Charlas - Débora Azevedo: International cooperation in the Python community
As Brett Cannon's famous phrase goes, some people “come for the language, but stay for the community”. We usually start by taking part in local meetups or focused groups such as PyLadies, and sometimes we want to help but are not quite sure how. Going further, there is also the possibility of extending our work beyond the local level by helping communities around the world. In this talk, we will discuss ways to cooperate within the Python community through volunteer initiatives that can be carried out both locally and outside our own country. To begin, we will discuss what contributing to the community means and the different ways you can contribute. We will also explain a little about the Python Software Foundation and its working groups, the role of these groups, and how to proceed if you are interested in getting in touch and helping. We will highlight the work of the Diversity and Inclusion group and of the translation working group, and its importance to the community at large. We will also present some work in progress, such as the massive effort of our Latin American colleagues with Python en Español, which has a Discord group and a Telegram group for studying and cooperating together. Another success story to be presented is how the Brazilian Python conference, Python Brasil, pushed ahead during 2020 and 2021 thanks to international cooperation: a Brazilian woman cooperating with EuroPython 2020 opened the way for us. We will talk about how important it is to see someone like us, who speaks the same language as we do, occupying these spaces and bringing our concerns to other discussion tables. And if there is no one who looks like us, there is a place we can occupy.
Watch
Charlas - Javier, Miguel: Orcha 🐳: Massively Parallel Processing (MPP) and API design
CI is fundamental to product development today, and one of its basic pillars is running tests. However, as a product matures, the number of tests grows, and with it the time they take to complete. To get feedback as early as possible, how can the tests be arranged efficiently? The problem is even more acute when there are many devices and versions under development. In particular, with two servers sharing the test load, total execution time reached 15 hours while testing only two versions. It is imperative to have a tool that can parallelize tests massively, making the most of the available resources. The tool must also be flexible enough to support the current infrastructure and to allow the range of test infrastructures to be expanded. In this talk we will explore the design of Orcha (the orchestration tool), the API for extending its functionality, and the need for a dedicated user. The talk is aimed at intermediate to advanced users who are familiar with the multiprocessing module.
Watch
Talks - Sumana, Jacob: Argument Clinic: What Healthy Professional Conflict Looks Like
What does healthy disagreement look like? Many of us have never experienced healthy conflict at work, and so assume our only options are to either avoid conflict or have a nasty fight. But it doesn't have to be that way: professional disagreement can be direct without being nasty. We want to show what that looks like. In this model argument, presented as a play, watch two engineering managers disagree about something. How do they work through their disagreement -- politely and effectively? Watch our characters figure out what they're really clashing about, learn about each other's perspectives, and come to a better decision than either could alone.
Watch
Talks - Blake Rayfield: Pyscript for Education
Python is one of the more accessible programming languages and has been adopted by a broad community of users. For educators of all levels, Python has become a go-to programming language. However, while there are ways to distribute creations in Python, they tend to be notoriously complex, unreliable, or require additional services like web hosting. With the creation of Pyscript, Python projects can be distributed with little to no web hosting or even internet connectivity. This change can potentially bring previously inaccessible topics or tools to a broader community while increasing the popularity of Python. This talk will describe and demonstrate Python and Pyscript's potential opportunities in the education space. We will talk about what makes these tools different than those previously available and how the future development of Pyscript can drive additional education changes in the near future.
Watch
Talks - Neeraj Pandey and Aashka Dhebar: Python Meets UX: Enhancing User Experience with Code
The intersection of UX and Python programming is a powerful combination for building great products and enhancing user experience. Python is a versatile and popular programming language that is widely used for a variety of tasks, including web development, data analysis, and machine learning. UX, or user experience, is the process of designing products that provide a seamless and intuitive experience for users. Learn about this powerful intersection of UX design and Python programming by understanding how Python can be used to enhance the user experience and provide practical examples on how UX designers can automate tasks, gather and analyze data, develop personalized experiences, and continually improve their own skills and processes.
Watch
Talks - Andrew Godwin: Reconciling Everything
Queues. The backbone of distributed systems, our old friends that we can rely on, and the cause of a lot of grief and on-call worries as they inevitably back up, overflow, replay, or duplicate items. There is a different (and sometimes better) way to build distributed systems, though - the reconciliation loop, a system where stateless programs talk to a central datastore and try to progress the state in small, incremental actions. We'll take a look at what reconciliation loops are, exactly, how they compare to both queues and other distributed system messaging options, and then dive into their active use as part of the Takahē ActivityPub/Fediverse server - and see the good, the bad, and the strange behaviours that can result.
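A toy sketch of the reconciliation-loop idea may help: a stateless function repeatedly reads records from a central store and advances each one by a single small step until nothing is left to do. The `Post` record, the state names, and the in-memory dict standing in for the datastore are all hypothetical and are not Takahē's actual model.

```python
from dataclasses import dataclass


@dataclass
class Post:
    id: int
    state: str = "new"  # hypothetical lifecycle: new -> fanned_out -> delivered


# Hypothetical state machine: each state maps to the next small step.
TRANSITIONS = {"new": "fanned_out", "fanned_out": "delivered"}


def reconcile_once(store):
    """One pass: advance every unfinished record by a single step."""
    progressed = 0
    for post in store.values():
        next_state = TRANSITIONS.get(post.state)
        if next_state is not None:
            post.state = next_state
            progressed += 1
    return progressed


def run_until_settled(store, max_passes=10):
    """Keep reconciling until a pass makes no progress (or a pass limit hits)."""
    for _ in range(max_passes):
        if reconcile_once(store) == 0:
            break


store = {1: Post(1), 2: Post(2, state="fanned_out")}
run_until_settled(store)
print(store)
```

Because the worker keeps no state of its own, running it again after a crash or in a second process simply continues from whatever the store says, which is the property that distinguishes this design from replaying a queue.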
Watch
Talks - Fabio Pliger: PyScript and the magic of Python in the browser
This presentation revisits two core concepts of Object-Oriented programming: encapsulation and code reuse. Using a series of examples, we'll ask whether these concepts have satisfied their promises, and how functional approaches can do a better job. We'll also see that objects still have value in making library use easy.
Watch
Talks - Dustin: Software Security and Slippery Slopes: How to elevate an entire ecosystem at scale
Software security is a critical aspect of developing and maintaining reliable and safe systems. In the case of large and popular open source ecosystems, such as Python, ensuring security across a wide and diverse set of users and use cases can be a daunting task. In this talk, we will discuss the challenges of applying security improvements to a widely used open source ecosystem like Python, and explore strategies for addressing these challenges at scale. We will discuss the importance of community involvement and collaboration, and the role of automation and tools in facilitating the adoption of security best practices. By the end of this talk, attendees will have a better understanding of the challenges with and opportunities for improving software security in the Python ecosystem, and will have some practical takeaways for adopting and facilitating these changes in their own work.
Watch
Talks - Michał Gałka: Creating USB gadgets with Python
USB has been with us for 26 years. Connecting USB devices to our computers, TVs, phones and many other devices has become as natural as breathing. Throughout these years several mechanisms have been developed in Linux to facilitate the process of creating USB devices. In this talk I'd like to take you to the other end of the USB plug and show you how to create your own USB device with Python. I'll take you through the process of turning a Raspberry Pi Zero into a USB keyboard. I'll show you how to use Python to interact with Linux system internals. We'll find out how to use Python to facilitate and automate the process of device creation and configuration. Finally, I'll present the implementation of the logic of a Linux-based USB keyboard-like device in Python.
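As a small illustration of the keyboard logic, the sketch below builds the 8-byte reports that a USB boot keyboard sends: one modifier byte, one reserved byte, and up to six key codes, with letter usage IDs starting at 0x04 for "a". The helper names are hypothetical, and the sketch writes to an in-memory buffer; on a configured Linux HID gadget the target would typically be a device node such as /dev/hidg0, which is an assumption about the setup rather than something stated in the abstract.

```python
import io


def key_report(char):
    """Build an 8-byte boot-keyboard report that presses one lowercase letter."""
    if len(char) != 1 or not ("a" <= char <= "z"):
        raise ValueError("this sketch only handles single lowercase letters")
    usage_id = 0x04 + ord(char) - ord("a")  # HID usage IDs: 'a' is 0x04
    # byte 0: modifier keys, byte 1: reserved, bytes 2-7: pressed key codes
    return bytes([0, 0, usage_id, 0, 0, 0, 0, 0])


RELEASE = bytes(8)  # an all-zero report means "no keys pressed"


def type_text(text, device):
    """Send a press report followed by a release report for each character."""
    for char in text:
        device.write(key_report(char))
        device.write(RELEASE)


# Stand-in for the gadget device file so the sketch runs anywhere.
fake_device = io.BytesIO()
type_text("hi", fake_device)
print(fake_device.getvalue().hex(" "))
```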
Watch
Talks - Mark Shannon: How we are making CPython faster. Past, present and future.
Many of you will have heard that Python 3.11 is considerably faster than 3.10. How did we do that? How are we going to make 3.12 and following releases even faster? In this talk, I will present a high level overview of the approach we are taking to speeding up CPython. Starting with a simple overview of some basic principles, I will show how we can apply those to streamline and speed up CPython. I will try to avoid computer science and software engineering terminology, in favor of diagrams, a few simple examples, and some high-school math. Finally, I will make some estimates about how much faster the next few releases of CPython will be, and how much faster Python could go.
Watch
Talks - Pandy Knight: What framework should I use for web testing?
Web apps these days are bigger than ever! With added complexity comes higher risk for bugs and poor user experience. One of the best ways to improve quality during development is to build feedback loops with automated test suites that run continuously. These days, there are three major open source test frameworks for automating web UI tests:
- Selenium, the old-school browser automation tool
- Cypress, the darling framework for frontend developers
- Playwright, the dark horse rising in popularity
These three can test any kind of web app, including ones developed in Python, but which one is best? In this talk, I’ll give a brief overview of each one, including example code for a basic search engine test. We will compare and contrast their features head-to-head so that you can make the right decision for your team.
Watch
Talks - Guillaume and Quazi: Oh no! My regex is causing a Denial of Service! What can I do about it?
Every modern programming language supports regular expressions. Python uses a backtracking engine to match developer-defined expressions against a wide range of input. Under certain circumstances, backtracking can lead to performance issues, and in extreme cases a denial of service (ReDoS). We will use descriptive examples to demonstrate the core issue, what to look for to detect problematic expressions, as well as how static analysis can help in this context. We will look at techniques to improve regular expression performance and defend against malicious inputs.
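A classic way to see the problem is a pattern with nested quantifiers, as in the sketch below. The two patterns and the input are illustrative choices rather than examples from the talk. Both accept the same strings, but on a near-miss input the nested version gives a backtracking engine many ways to split the run of "a" characters, so each extra "a" roughly doubles the work.

```python
import re
import time

# Nested quantifiers: many ways to divide a run of 'a' between the two loops.
VULNERABLE = re.compile(r"^(a+)+$")
# Accepts the same strings with a single quantifier, so there is no ambiguity.
SAFER = re.compile(r"^a+$")


def timed_match(pattern, text):
    """Return (matched, seconds) for one match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


# A near miss: the trailing 'b' forces the engine to try every split and fail.
attack = "a" * 18 + "b"

for name, pattern in (("vulnerable", VULNERABLE), ("safer", SAFER)):
    matched, seconds = timed_match(pattern, attack)
    print(f"{name}: matched={matched} in {seconds:.4f}s")
```

The input length is kept small so the demonstration finishes quickly; lengthening the run of "a" characters makes the gap between the two patterns grow rapidly.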
Watch
Talks - Angana Borah: Approaches to Fairness and Bias Mitigation in Natural Language Processing
With the advent of large pre-trained language models like GPT, BERT, etc., and their usage in almost all natural language understanding and generation applications, it is important that we evaluate the fairness and mitigate biases of these models. Since these models are fed with human-generated data (mostly from the web), they are exposed to human biases. Hence, they carry forward and also amplify these biases in their results. In this talk, we will discuss the motivation for fairness and bias research in NLP and discuss different approaches used to detect and mitigate biases. We will also explore some available tools to include in your models to ensure fairness.
Watch
Talks - Mario Munoz: so much depends upon... your python app's dependencies
How do you keep track of your project's building blocks? Is it enough just pinning your dependencies in a requirements.txt file? Or is there any reason to learn one (or any) in a myriad of dependency management tools? It depends. Untangling the complexity of this topic might not be worth it for certain small projects. But there are a lot of reasons why learning about (and using) a dependency management tool will help you in the future. Find out why embracing proper dependency management can help your project's predictability, sustainability, security, and yes, even simplicity. Learn how you can use a tool like pdm to help accomplish these goals.
Watch
Talks - Allan Campopiano: It might look normal but this distribution will ruin your stats
Some refer to the normal distribution as "God's curve" because of its supposed presence in nature when enough observations are collected. But what if I told you that there is a non-normal distribution that looks so normal that even experts can't see the difference? And beyond looks, it's a curve that is both prevalent in nature and likely to cause false negatives when testing hypotheses. 📊 Interactive notebook app: https://deepnote.com/workspace/allan-campopiano-4ca00e1d-f4d4-44a2-bcfe-b2a17a031bc6/project/robust-stats-simulator-7c7b8650-9f18-4df2-80be-e84ce201a2ff/notebook/notebook-d33fc4dd05304e8c8bca8d6e2a8ca3c9 🐍 Hypothesize (robust statistics library for Python): https://github.com/Alcampopiano/hypothesize
Watch
Talks - Marie Roald, Yngve Mardal Moe: The creative art of algorithmic embroidery
For thousands of years, people have created beautiful patterns through intricate needlework. Many of these patterns utilize algorithmic concepts like repetition, recursion and variation to build complex motifs from simple rules. In this talk, we explore the art of embroidery through Python programming and show how you can create your own patterns with code. We will turn straightforward commands into elaborate and intricate artworks with loops, randomness and recursive functions using only the built-in turtle library in Python. Then, we will show how you can turn your art into embroidery patterns that are readable by an embroidery machine using the TurtleThread library and how you can use Python to create decorative ornaments for your Christmas tree. This talk is for anyone interested in the intersection between Python programming, creative coding and arts and crafts!
Watch
Talks - Tadeh Hakopian: The Lost Art of Diagrams: Making Complex Ideas Easy to See with Python
This talk is about communicating with visuals to make complex ideas simple to understand. Over the years I have produced diagrams, charts, illustrations and other graphics to help people understand sophisticated project concepts. This includes project architecture, standard operating procedures, coding concepts, sprints and roadmaps. You will be guided through ways to turn your project code and workflows into stylized, easy-to-follow examples. By using common illustration software along with some simple guidelines, you too can make easy-to-follow visual content for your next project. Key Takeaways:
- Learn methods to visually communicate with your team, including with color, shapes, images, gifs and even memes, to help get a point across.
- Understand how to turn your technical documentation into visual graphics with diagram design style guides.
- See examples of how to take technical documentation and create an intuitive diagram of it to share.
- Come away with an ability to execute a simple (or sophisticated) graphic with essential steps and key requirements.
Watch
Talks - Glyph: How To Keep A Secret
API keys, passwords, auth tokens, cryptographic secrets… in the era of cloud-based development, we've all got a bunch of them. But where do you put them? How do you keep them safe? And how can you access them conveniently from your Python code, both in development and production, without putting them at risk? In this talk, I'll review information security best practices for managing secrets as well as Python-specific tips and tricks.
Watch
Talks - Jodie Burchell: Vectorize using linear algebra and NumPy to make your Python code fast
Have you found that your code works beautifully on a few dozen examples, but leaves you wondering how to spend the next couple of hours after you start looping through all of your data? Are you only familiar with Python, and wish there was a way to speed things up without subjecting yourself to learning C? In this talk, you'll see some simple tricks, borrowed from linear algebra, which can give you significant performance gains in your Python code, and how you can implement these in NumPy. We'll start exploring an inefficient implementation of an algorithm that relies heavily on loops and lists. Throughout the talk, we'll iteratively replace bottlenecks with NumPy vectorized operations. At each stage, you'll learn the linear algebra behind why these operations are more efficient so that you'll be able to utilize these concepts in your own code. You'll see how straightforward it can be to make your code many times faster, all without losing readability or needing to understand complex coding concepts.
Watch
Talks - Calvin Hendryx-Parker: Too Big for DAG Factories?
You’re working on a project that needs to aggregate petabytes of data, and it doesn’t make sense to manually hard-code thousands of tables, DAGs (Directed Acyclic Graphs) and pipelines. How can you transform, optimize and scale your data workflow? Developers around the world (especially those who love Python) are using Apache Airflow, a platform created by the community to programmatically author, schedule and monitor workflows without limiting the scope of your pipelines. In this talk, we’ll review use cases, and you’ll learn best practices for how to:
- use Airflow to transfer data, manage your infrastructure and more;
- implement Airflow in practical use cases, including as a:
  - workflow controller for ETL pipelines loading big data;
  - scheduler for a manufacturing process; and/or
  - batch process coordinator for any type of enterprise;
- scale and dynamically generate thousands of DAGs that come from JSON configuration files;
- automate the release of both the DAGs and infrastructure updates via a CI/CD pipeline;
- run all tasks simultaneously using Airflow.
Both beginner and intermediate developers will benefit from this talk, and it is ideal for developers wanting to learn how to use Airflow for managing big data. Beginners will learn about dynamic DAG factories, and intermediate developers will learn how to scale DAG factories to thousands of DAGs, which is something Airflow can’t do out of the box. After this talk and live demo, people will learn best practices (including access to a code repo) that will allow them to scale to thousands of DAGs and spend more time having fun with big data.
Watch
Talks - Ludovico Bianchi: Using Python's import machinery to handle API deprecations
For any software project with an established user base, introducing breaking changes in its API can be daunting. To minimize disruptions for users, projects are incentivized to plan these transitions carefully, which may include API deprecations, where messages warning users of upcoming changes are added to the affected APIs while they’re still functional. However, this imposes extra workload on the project’s maintainers, as both old and new versions of the API must be kept functional throughout the transition period. As a maintainer of a software project undergoing preparations for a major version release, I recently found myself in a similar situation: our goal was to provide backward compatibility with the previous version for as long as possible, without impacting the development of new features. Practically, this included dealing with a radical restructuring of the Python codebase, resulting in hundreds of modules being relocated, split, or removed. Was there any way to ensure that the deprecated import paths could still be used without errors, without having to maintain two separate versions of the package? Fortunately, the answer to “can you do that in Python?” is more often than not “yes!”; for this particular case, the path to success turned out to be through the importlib package of the standard library. For something so close to Python’s internals, importlib is both accessible and extensible, allowing ordinary code to customize almost completely how and what modules can be imported, including modules that are not there anymore! This intermediate-level talk will present a complete solution based on Python’s importlib machinery that makes it possible to redirect modules or module attributes with deprecations in a simple, robust, and scalable way. While the context of the solution is especially relevant for project maintainers, the focus is on importlib techniques that are generally applicable.
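One way such a redirect can be sketched, which is not necessarily the approach presented in the talk, is a custom meta path finder that recognizes deprecated module names, emits a DeprecationWarning, and hands back the module at its new location. The class name and the `legacy_json` alias below are hypothetical; the standard-library `json` module stands in for a relocated module so the example is self-contained.

```python
import importlib
import importlib.abc
import importlib.util
import json
import sys
import warnings

# Hypothetical mapping of relocated modules: old import path -> new import path.
RELOCATED = {"legacy_json": "json"}


class DeprecatedAliasFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Resolve deprecated module names to their new locations, with a warning."""

    def __init__(self, relocated):
        self.relocated = dict(relocated)

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.relocated:
            return None  # let the regular finders handle everything else
        return importlib.util.spec_from_loader(fullname, self)

    def exec_module(self, module):
        old_name = module.__name__
        new_name = self.relocated[old_name]
        warnings.warn(
            f"{old_name!r} is deprecated; import {new_name!r} instead",
            DeprecationWarning,
        )
        # Replace the placeholder module with the real one; the import system
        # returns whatever is stored in sys.modules once loading has finished.
        sys.modules[old_name] = importlib.import_module(new_name)


sys.meta_path.insert(0, DeprecatedAliasFinder(RELOCATED))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = importlib.import_module("legacy_json")

print(legacy is json, [str(w.message) for w in caught])
```

Redirecting individual attributes rather than whole modules could be handled separately, for example with a module-level `__getattr__` function; that part is not shown here.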
Watch
Talks - Sanskar Jethi: Robyn: An async Python web framework with a Rust runtime
With the rise of Rust bindings being used in the Python ecosystem, we know that throughput efficiency is one of the top priority items in the Python ecosystem. Inspired by the extensibility and ease of use of the Python Web ecosystem and the increase of performance by using Rust as a core, Robyn was created. Robyn is one of the fastest Python web frameworks in the current Python web ecosystem. With a runtime written in Rust, Robyn achieves near-native Rust performance while still having the ease of writing Python code. This talk will focus on the increased involvement of Rust in the Python ecosystem. It will also demonstrate why Robyn was created, the technical decisions behind Robyn, the increased performance by using the Rust runtime, how to use Robyn to develop web apps, and most importantly, how the community is helping Robyn grow! I will briefly describe my experience and the challenges of building a community around the project and how it allowed Robyn to ensure a smooth sail even in turbulent situations. I shall also share my future plans for Robyn.
Watch
Talks - Alireza Farhidzadeh: Getting Around the GIL: Parallelizing Python for Better Performance
One of the ever-present banes of a data scientist’s life is the constant wait for the data processing code to finish executing. Slow code affects almost every step of a typical data pipeline: data collection, data pre-processing/parsing, feature engineering, etc. Many times, the lengthy execution times force data scientists to work with only a subset of data, depriving them of the insights and performance improvements that could be obtained with a larger dataset. One of the tools that can mitigate this problem and speed up data science pipelines (and CPU-bound programs) is parallelization. Parallelization is a useful way to work around the limitations of the Global Interpreter Lock (GIL), a key feature of Python that prevents code from fully utilizing multiple processor cores and can impact performance. In this session, we’ll walk through several ways to parallelize Python code, depending on the specific needs of your program and the type of parallelism you want to achieve.
Watch
Talks - Josh Weissbock, Sheila Flood: Using Python to Help the Unhoused
This talk describes how a group of volunteers from around the globe uses Python to support an NGO in Victoria, BC, Canada, that helps the unhoused. The volunteers built a tool that finds social media activity about unhoused people in the Capital Region, and the NGO uses a dashboard of the results to decide where to direct its limited resources.
Watch
Talks - E. Johnson: Skynet 101 How to Keep Your Machine Learning Code From Getting Away From You
Machine learning can feel pretty mysterious at times, but as Python developers you have so many of the tools you need to be a part of it! With basic Python experience you can use libraries like pandas and tools like Jupyter Notebooks to analyze and manipulate data sets. By applying Test-Driven Development practices to your analysis you can feel confident about what you're building. You can build well-developed and well-tested cleaning scripts and functions using pytest and use these functions in your notebooks and scripts. You can even build simple recommendation engines using libraries such as scikit-learn! As a part of this talk we will walk through the process of data analysis, data cleaning, feature preparation, and building a simple movie recommendation engine. As we move through those steps, my main focus is to teach engineers how they can incorporate Test-Driven Development into the data cleaning process and the building of our engine. I will also walk through strategies for data analysis and explain at a high level a couple of ML concepts that we can use. As participants get the chance to see live examples of how to use Test-Driven Development in data analysis and machine learning, they can get a handle on some core concepts and learn how to ensure quality in the code that they produce.
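To give a flavour of test-driven data cleaning, here is a minimal sketch in plain Python. The `clean_rating` function, its 0 to 5 rating scale, and the test cases are all hypothetical and not taken from the talk; the tests are written as plain functions with bare assertions, the style that pytest collects, and would normally be written before the function they exercise.

```python
def clean_rating(raw):
    """Convert a raw rating value to a float in [0, 5], or None if unusable."""
    if raw is None:
        return None
    text = str(raw).strip()
    if not text:
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    # Out-of-range values (and NaN, which fails both comparisons) are dropped.
    return value if 0.0 <= value <= 5.0 else None


def test_clean_rating_parses_padded_numbers():
    assert clean_rating(" 4.5 ") == 4.5


def test_clean_rating_rejects_missing_or_garbage_values():
    assert clean_rating(None) is None
    assert clean_rating("") is None
    assert clean_rating("n/a") is None


def test_clean_rating_rejects_out_of_range_values():
    assert clean_rating("11") is None
    assert clean_rating("-1") is None
```

Once the tests pass, the same function can be imported into a notebook and applied to a column of raw ratings, so the notebook and the test suite share one implementation.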
Watch
Talks - Paolo Melchiorre: A pythonic full-text search
A full-text search on a website is the best way to make its contents easily accessible to users because it returns better results and is in fact used in online search engines or social networks. The implementation of full-text search can be complex and many adopt the strategy of using dedicated search engines in addition to the database, but in most cases this strategy turns out to be a big problem of architecture and performance. In this talk we'll see a pythonic way to implement full-text search on a website using only Django and PostgreSQL, taking advantage of all the innovations introduced in recent years, and we'll analyze the problems of using additional search engines with examples deriving from my experience on djangoproject.com. Through this talk you can learn how to add a full-text search on your website, if it's based on Django and PostgreSQL, or you can learn how to update the search function of your website if you use other search engines.
Watch
Talks - Malcolm Smith: Python on Android
By many measures, Android is the most widely-used operating system in the world. But Python development on the platform remains quite rare. Fortunately there are several active projects working to improve this. In this talk, you'll learn about:
- Why Android support is important for the future of Python.
- How Android compares to other platforms, and the unique challenges it presents.
- What's needed to make mobile Python development practical, including build tools, GUI libraries, and binary package support.
- The available options for running Python on Android, and how to choose which one is best for you.
Watch
Talks - Boaz Wiesner, Keren Meron: Supercharging Pipeline Efficiency with ML Performance Prediction
To process our customers' data, Singular's pipeline runs hundreds of thousands of daily tasks, each with a different processing time and resource requirements. We deal with this scale by using Celery and Kubernetes as our tasks infrastructure, letting us allocate dedicated workers and queues to each type of task based on its requirements. Originally, this was configured manually. As our customer base grew, we noticed that heavier and longer tasks were grabbing all the resources and causing unacceptable queues in our pipeline. Moreover, some of the heavier tasks required significantly more memory, leading to OOM kills and infrastructure issues. If we could classify tasks by their expected duration and memory requirements, we could have segregated tasks in Celery based on these properties and thus minimized interruptions to the rest of the pipeline. However, the variance in the size and granularity of the fetched data made it impossible to classify if a task was about to take one minute or one hour. Our challenge was: how do we categorize these tasks, accurately and automatically? To solve the issue we implemented a machine-learning model that could predict the expected duration and memory usage of a given task. Using Celery’s advanced task routing capabilities, we could then dynamically configure different task queues based on the model's prediction. This raised another challenge - how could we use the classified queues in the best way? Configuring workers statically for each queue would be inadequate at scale. We utilized Kubernetes’ vertical and horizontal autoscaling capabilities to dynamically allocate workers for each classified queue based on its length. This improved our ability to respond to pipeline load automatically, increasing performance and availability. Additionally, we were able to deploy shorter-lived workers on AWS Spot instances, giving us higher performance while lowering cloud costs.
Watch
Talks - Laszlo Kiss Kollar: The wheelhouse of horrors
You might be surprised to learn that, besides naming and cache invalidation problems, building a binary wheel for a Python extension is one of the hardest problems in computer science. Or more precisely, building that binary wheel correctly. Lucky for us, a few amazing community-led projects hide all that complexity from us, so we can instead focus on shipping and using Python code. One of Python's strong suits is its ability to use native C and C++ code, which is a big reason why it’s the number one language for data science and machine learning applications. However, distributing native code in Python libraries is far from trivial: subtle issues in the build process can result in runtime issues that are extremely difficult to track down. This talk will showcase some notable examples of how things can go wrong, while also helping users and maintainers recognise these typical error scenarios. We will learn how to avoid these issues and what users can do when they encounter them in a library. The audience will learn about the manylinux standard and its role in standardizing Linux platform wheels. We will also take a look at the cibuildwheel project, which offers library authors a simple solution to automate the building and distribution of manylinux wheels.
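As a small illustration of where manylinux shows up in practice, the sketch below splits a wheel filename into its tags, following the naming convention for binary wheels (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper names and the example filenames are hypothetical, and real tooling should prefer a maintained parser over this hand-rolled one.

```python
def split_wheel_filename(filename):
    """Split a wheel filename into its components (hypothetical helper).

    Expected form: name-version[-build]-python-abi-platform.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of components: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_manylinux(filename):
    """True if any of the dot-separated platform tags is a manylinux tag."""
    platform_tag = split_wheel_filename(filename)["platform"]
    return any(tag.startswith("manylinux") for tag in platform_tag.split("."))
```

For example, a file named `example_pkg-1.0.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl` carries two manylinux platform tags, while `example_pkg-1.0.0-cp311-cp311-linux_x86_64.whl` carries only a plain `linux` tag and is not a portable manylinux wheel.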
Watch
Charlas - Alison Orellana Rios: OCR, Reconocimiento y obtención de información a través de imágenes
This talk covers pattern and text recognition in a variety of images, and the processing required to capture, decode, and analyze them in order to ultimately extract text from images or digital files. Building on this, we will look at the OpenCV library and how it is complemented by Tesseract (together with Python), since both make it easy to obtain visual data and then generate textual information that is highly useful for complex functions in the automotive industry, autonomous driving, activity logging, signage and sensors, robotics, and many other fields of application. Recovering text from images is a fundamental pillar of many kinds of data processing, which shows its importance as a foundation for a wide variety of applications. Using Python libraries lets us see how easily graphical information can be handled, and combining them will give a somewhat better understanding of the application areas of image analysis and computer vision.
Watch
Talks - Al Sweigart: An Overview of the Python Code Tool Landscape 2023
Linters, type checkers, style formatters, package linters, security analysis, dead code removers, docstring formatters, code complexity analyzers: There is a wealth of static code analysis tools in the Python ecosystem. It's intimidating to start looking at them and easy to get lost. What's the difference between Pyflakes, flake8, and autoflake? Or between autopep8 and pep8-naming? This overview explains the different kinds of static code analysis tools, what tools are out there (as of 2023), and how beginners can get started using these tools to write code like pros. This talk also provides a beginner's introduction to type hints in Python and the type checker tools available. There are too many tools to describe in detail, but this talk does introduce the promising newcomer Ruff, an extremely fast Python linter written in Rust.
Watch
Charlas - Judite Cypreste: Cómo Python puede ayudar a monitorear gobiernos
With the imminent risk of democracies falling and constant attacks on the media, access to information has become increasingly difficult. As a result, civil society and journalists have been looking for ways to ensure that society is not left in the dark and that government monitoring continues. As Python has grown popular across professional fields, the language has become increasingly present in the fight for a more open government in Brazil, whether in building monitoring tools or in analyzing data from a government agency. Initiatives from government entities are also helping to make transparency possible. In this talk, we will look at examples of using Python to monitor the Brazilian government and how the language was fundamental in keeping Brazilian society from remaining in the darkness of disinformation.
Watch
Talks - Samuel Colvin: How Pydantic V2 leverages Rust's Superpowers
Pydantic is a data validation library for Python that has seen massive adoption over the last few years - it is estimated that Pydantic is now used by about 10% of professional web developers! Over the last year I've been working full time to rebuild Pydantic from the ground up, using Rust for virtually all the validation and serialization logic. Pydantic V2, with these changes included, has recently been released. In this talk I will give a brief introduction to Pydantic and the new features in Pydantic V2 before diving into how the use of Rust has allowed us to completely change the architecture of Pydantic to make it easier to extend and maintain while also improving performance significantly. The majority of the talk will be devoted to using examples from the Pydantic V2 codebase to demonstrate the advantages (and disadvantages) of writing libraries like Pydantic in Rust. I'll cover the real-life trade-offs and design decisions you might face while implementing logic in Rust rather than Python. This talk should be interesting to any Python developer who's interested in combining Python and Rust - no knowledge of Rust or Pydantic is required. However, if you'd like to get some context or learn more about the topics discussed, here are some useful resources: Pydantic V2 Plan - blog post about the plan for Pydantic V2; pydantic-core - the Python package that provides Rust logic in Pydantic; PyO3 docs - the amazing library that allows Rust to be embedded in Python; Build your Python Extensions with Rust! by Paul Ganssle - good intro to building Python extensions in Rust.
Watch
Talks - Dawn Wages: Supercharge your Python Development Environment with VS Code + Dev Container
Watch
Talks - Jonas Neubert: MQTT: A Pythonic introduction to the protocol for connected devices
MQTT is to connected devices what HTTP is to web applications. It is a publish-subscribe protocol specifically designed for devices with limited bandwidth and CPU. MQTT is widely used in home automation, industrial automation, remote monitoring, and everywhere else where machines talk to each other or to a server. This talk is an introduction to MQTT for Pythonistas. I’ll start with a brief overview of basic concepts of the protocol. The rest of the presentation will be a sequence of code examples in CPython and MicroPython/CircuitPython, building up to a demo with several devices publishing data to each other. Along the way, you will see a few of the most common tools for debugging MQTT communications. After attending this talk you will have a high-level idea of Python use cases in automation, will have seen some examples of coding Python for microcontrollers, and will know a whole lot more about four letters that look like an acronym but aren’t actually one. No prior experience with automation or microcontrollers is assumed.
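To make the publish-subscribe idea concrete, here is a minimal in-memory sketch. It is not taken from the talk and is not an MQTT client: `InMemoryBroker` and `topic_matches` are hypothetical names, and no network or real broker is involved. The matching follows the MQTT topic-filter rules, where `+` matches exactly one topic level and a trailing `#` matches any remaining levels; special handling of topics that start with `$` is left out.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT-style topic filter matches a concrete topic."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard is only valid as the last filter level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class InMemoryBroker:
    """Toy stand-in for a broker: delivers each message to matching subscribers."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, topic_filter, callback):
        self._subscriptions.append((topic_filter, callback))

    def publish(self, topic, payload):
        for topic_filter, callback in self._subscriptions:
            if topic_matches(topic_filter, topic):
                callback(topic, payload)
```

A subscriber registered with the filter `home/+/temperature` would receive a message published to `home/kitchen/temperature` but not one published to `home/kitchen/fridge/temperature`. Publishers and subscribers never reference each other directly, which is the decoupling that makes the pattern a good fit for constrained devices.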
Watch
PyCon 20 Years Throwback
To highlight the past 20 years, we asked you to submit your favorite memories and stories of PyCon US or something you'd hope to learn during your time with us this year. Check out the complete memory recap video highlighting our community members' favorite memories of PyCon US!
Watch