List of videos

Lim H. - Reproducible and Deployable Data Science with Open-Source Python

Reproducible and Deployable Data Science with Open-Source Python [EuroPython 2021 - Talk - 2021-07-30 - Parrot [Data Science]] [Online] By Lim H. Data scientists, data engineers and machine-learning engineers often have to team up to create data science code that scales. Data scientists typically prefer rapid iteration, which can cause friction if their engineering colleagues prefer observability and reliability. In this talk, we'll show you how to achieve consensus using three open-source industry heavyweights: Kedro, Apache Airflow and Great Expectations. We will explain how to combine rapid iteration with reproducible, maintainable and modular data science code in Kedro, orchestrate it using Apache Airflow with Astronomer, and ensure consistent data quality with Great Expectations. Kedro is a Python framework for creating reproducible, maintainable and modular data science code. Apache Airflow is an extremely popular open-source workflow management platform. Workflows in Airflow are modelled and organised as DAGs, making it a suitable engine to orchestrate and execute a pipeline authored with Kedro. And Great Expectations helps data teams eliminate pipeline debt through data testing, documentation, and profiling. License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/ Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/
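To make the DAG idea in the abstract concrete, here is a minimal sketch that orders the steps of a pipeline by their dependencies. It uses only the standard-library graphlib module (Python 3.9+), not the Kedro or Airflow APIs, and the step names and the execution_order helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
dependencies = {
    "clean": {"load"},
    "validate": {"clean"},
    "train": {"validate"},
    "report": {"train", "validate"},
}


def execution_order(deps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(execution_order(dependencies))
```

An orchestrator solves the same ordering problem, then adds scheduling, retries and monitoring on top of it.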

Watch
Miki Lombardi - A crowdsourced map for checking supermarket wait times worldwide

A crowdsourced map for checking supermarket wait times worldwide [EuroPython 2021 - Talk - 2021-07-30 - Ni] [Online] By Miki Lombardi In March 2020 the world was in lockdown and people were queuing at supermarkets and pharmacies to buy basic necessities. Among the many initiatives that followed, I created a worldwide map that lets anyone check the estimated waiting times at supermarkets, pharmacies and other places of interest. I also gave people the opportunity to check waiting times and correct them through a crowdsourcing mechanism. To keep development fast and responses to requests quick, the project relies on Redis and its geospatial indexes. The open-source project received more than 2 million visits in about 3 months, until June 2020 when the pandemic slowed down. In this talk we will look at the architecture, the problems I encountered and solved with Redis, uWSGI and Flask, and how I scaled the project from a Raspberry Pi to 4 VPSs located in Europe and America. The main objective of this talk is to introduce problem solving with Python technologies and other tools (such as Redis), along with the logical and creative process I had to go through. Other objectives are to encourage collaboration and the sharing of ideas and open-source projects, so that different communities and people can work together without fear of making mistakes or showing their code. It is thanks to the open-source community that this project has been possible. License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/ Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/
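The kind of query a geospatial index answers ("which places lie within N km of this point?") can be sketched in plain Python. This is only an illustration of the idea, not the project's code and not the Redis API: Redis exposes this through commands such as GEOADD and GEORADIUS, whereas the sketch below does a linear scan with the haversine formula. The place names, coordinates, wait times and helper names are all made up.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def places_within(places, lat, lon, radius_km):
    """Return (name, wait_minutes) for places inside the radius, nearest first."""
    hits = []
    for name, (plat, plon, wait) in places.items():
        distance = haversine_km(lat, lon, plat, plon)
        if distance <= radius_km:
            hits.append((distance, name, wait))
    return [(name, wait) for _, name, wait in sorted(hits)]


if __name__ == "__main__":
    # Hypothetical data: name -> (latitude, longitude, estimated wait in minutes)
    places = {
        "market_a": (45.4642, 9.1900, 15),
        "pharmacy_b": (45.4700, 9.2000, 5),
        "market_c": (41.9000, 12.5000, 30),
    }
    print(places_within(places, 45.4642, 9.1900, radius_km=5))
```

A real geospatial index avoids the linear scan by encoding coordinates so that nearby points sort close together, which is what keeps such lookups fast at scale.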

Watch
Roberto Polli - Designing secure APIs

Designing secure APIs [EuroPython 2021 - Talk - 2021-07-30 - Brian] [Online] By Roberto Polli Goal: improve the security design of APIs using provided tools and guidelines. Audience: developers and designers with a basic knowledge of HTTP and OpenAPI. Agenda: a 2-slide introduction to API security; API security rules overview: a short JSON is not simple (I-JSON, structured fields, ...); look at that (JSON) schema; What The ... JWT; rate-limiting. Enforcing rules with OpenAPI and static analysis tools. License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/ Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/

Watch
Harshit Prasad - High Performance Data Processing with Python, Kafka and Elasticsearch

High Performance Data Processing with Python, Kafka and Elasticsearch [EuroPython 2021 - Talk - 2021-07-30 - Parrot [Data Science]] [Online] By Harshit Prasad In the current technology era, all kinds of applications work on data. Data is used to represent a set of information. Healthcare apps, e-commerce apps and many others work on data. Sometimes this data needs to be updated to reflect new changes across the platform. This can be done manually, but what if platform data is updated in real time or, say, every hour? This kind of problem can be solved by implementing a service based on the Producer Consumer model. In this talk, I will cover how Producer Consumer models work and how this design pattern can be implemented with Python. I will explain the whole implementation process using other tools such as Kafka as the data streamer and Elasticsearch as the data store.
Talk Outline:
1. Problem Statement (2 mins): introduction to the problem statement.
2. Introduction to Producer Consumer Model (3 mins): basics of the Producer Consumer model; applications.
3. Deep-dive explanation of Producer Consumer model using example (5 mins): Elasticsearch; Kafka.
4. Explaining parts of our Producer Consumer model (5 mins): What kind of data are we updating in our data store? Why is it a high performance solution? Implementation in Python as an end-to-end framework.
5. Code walkthrough (5 mins): produce data; stream data; consume data.
6. Conclusion and Learnings (5 mins): learnings; performance; pros and cons.
7. Q/A Session (5 mins)
Target Audience - Beginner / Intermediate
Proposal Section - Web based Systems
Prerequisites - Python & System Design
License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/ Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/
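As a rough illustration of the Producer Consumer model the abstract describes, here is a minimal in-process sketch. It is not the speaker's implementation: a bounded queue.Queue stands in for the Kafka stream and a plain dict stands in for the Elasticsearch data store, and all function names and sample records are hypothetical.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def producer(updates, channel):
    """Push each update onto the channel, then signal completion."""
    for update in updates:
        channel.put(update)
    channel.put(SENTINEL)


def consumer(channel, store):
    """Pull updates off the channel and merge them into the store."""
    while True:
        update = channel.get()
        if update is SENTINEL:
            break
        doc_id, fields = update
        store.setdefault(doc_id, {}).update(fields)


def run_pipeline(updates):
    """Run one producer and one consumer thread and return the final store."""
    # Bounded queue: a fast producer blocks instead of growing memory without limit.
    channel = queue.Queue(maxsize=100)
    store = {}
    workers = [
        threading.Thread(target=producer, args=(updates, channel)),
        threading.Thread(target=consumer, args=(channel, store)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return store


if __name__ == "__main__":
    sample = [("p1", {"price": 10}), ("p1", {"stock": 3}), ("p2", {"price": 7})]
    print(run_pipeline(sample))
```

The producer and consumer only share the channel, so either side can be replaced or scaled independently; that decoupling is the property a message broker provides across processes and machines.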

Watch
James Nightingale - PyAutoFit: A Classy Probabilistic Programming Language For Data Science

PyAutoFit: A Classy Probabilistic Programming Language For Data Science [EuroPython 2021 - Talk - 2021-07-30 - Optiver] [Online] By James Nightingale A major trend in data science is the rapid adoption of Bayesian statistics for data analysis and modeling. With modern datasets growing by orders of magnitude in size, the focus is now on developing methods capable of applying contemporary inference techniques to extremely large datasets. To this end, I present PyAutoFit (https://github.com/rhayes777/PyAutoFit), an open-source probabilistic programming language for automated Bayesian inference that was recently published in the Journal of Open Source Software (https://joss.theoj.org/papers/10.21105/joss.02550). I will begin by giving an overview of PyAutoFit’s core features, in particular how it: makes it simple to compose and fit probabilistic models using a range of Bayesian inference libraries, such as emcee (https://github.com/dfm/emcee) and dynesty (https://github.com/joshspeagle/dynesty); handles the 'heavy lifting' that comes with model-fitting, including model composition and customization, outputting results, model-specific visualization and posterior analysis; and is built for big-data analysis, whereby results are output as a sqlite database which can be queried after model-fitting is complete. PyAutoFit was developed by astronomers seeking to fit large libraries of galaxy images to better understand the nature of dark matter. Using this science case, I will describe PyAutoFit’s advanced features, such as multi-level models, automated model-fitting pipelines and support for massively parallel computing infrastructures. The goal of this talk is to introduce the audience to PyAutoFit so they can adopt it for their own use case. The only prerequisite is a basic understanding of object-oriented programming in Python.
License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/ Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/

Watch
Pierre Clisson - Building Brain-Computer Interfaces with Timeflux

Building Brain-Computer Interfaces with Timeflux [EuroPython 2021 - Talk - 2021-07-30 - Ni] [Online] By Pierre Clisson Brain-Computer Interfaces (BCIs) enable people to interact with physical devices by using their mind only. For a long time, building such systems has been the prerogative of the academic world, and has required cumbersome tools and expensive hardware. It is now time to get BCIs out of the lab and into the hands of programmers. The field of BCIs is currently gaining momentum, attracting not only researchers, but also companies and hackers. At the same time, a growing number of people rely on the thriving Python data science and machine learning ecosystem. Yet, until recently, there was no real Python solution for building BCIs. Timeflux (https://timeflux.io), a free and open-source framework for the acquisition and real-time processing of biosignals, aims to fill this gap. During this talk, we will cover all you need to get started: you will learn the basics of neurophysiology and how BCIs actually work, where to get the hardware to measure the electrical activity of your brain, the general architecture of Timeflux and how you can use this framework to build processing pipelines and interfaces available from a web browser. Finally, we will demonstrate a virtual keyboard on which you can type with nothing but your mind. Beginners are welcome at this presentation. Data scientists, machine learning enthusiasts and web developers will be able to leverage their skills. More broadly, we invite anyone curious to know why the future of BCIs is written in Python. License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/ Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/

Watch
Marc Päpper - Speeding up the deep learning development life cycle for cancer diagnostics

Speeding up the deep learning development life cycle for cancer diagnostics [EuroPython 2021 - Talk - 2021-07-30 - Parrot [Data Science]] [Online] By Marc Päpper An important but often overlooked aspect of developing a high-quality deep learning model is iteration speed. If you can iterate faster, you can try out more ideas and, over time, get better results. In this talk, you will learn about the different tricks you can use to train a great machine learning model in less time. In particular, I will discuss how we optimized our deep learning development life cycle at Mindpeak to create robust deep learning models for cancer diagnostics that work in vastly different laboratory settings. The goal of this talk is to point to the most important aspects you can adjust to shorten the time it takes to go from idea to validated result. I will talk about many different aspects, like task prioritization, data processing, communication, GPU parallelization, code quality, unit tests, continuous integration, data fit and profiling for speed. After the talk, you should be able to identify some things you could do to improve your iteration speed when developing machine learning models. There are no strict requirements for the talk, but you will probably benefit most if you have some initial experience in developing machine learning models. License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/ Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/

Watch
Luke Leighton - The Libre-SOC Project

The Libre-SOC Project [EuroPython 2021 - Talk - 2021-07-30 - Brian] [Online] By Luke Leighton The Libre-SOC Project aims to bring to market a mass-volume System-on-a-Chip suitable for use in smartphones, netbooks, tablets and chromebooks, which is end-user programmable right to the bedrock. No spying backdoors, no treacherous DRM. Python and standard Libre Project Management are used throughout:
* nmigen, a Python-based HDL, is a fundamental and critical strategic choice in creating the hardware.
* An IEEE754 FP hardware library has been developed using nmigen/Python, as have hundreds of thousands of unit tests.
* An OpenPOWER ISA simulator is written in Python, and is actually a PLY compiler based on the GardenSnake example.
* Several thousand unit tests for the HDL and simulator are written in Python.
* coriolis2, the VLSI ASIC layout toolchain, is a mixed C++/Python application.
* cocotb, a hardware co-simulation system, is used to verify the HDL is correct, down to the gate level.
* A JTAG "remote" system has been written in Python which allows openocd to connect to the simulated processor.
* Even the Standard Cell Library being used, called FlexLib, by Chips4Makers, is in Python.
To say that Python is critical to the project would be a massive understatement. This talk will give a brief overview of the above areas and give a glimpse into why Python was chosen for each. License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/ Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/

Watch
Vincent D. Warmerdam - Why Transformers Work

"Why Transformers Work EuroPython 2020 - Talk - 2020-07-24 - Microsoft Online By Vincent D. Warmerdam This will be a technical talk where I'll explain the inner workings of the machine learning algorithms inside of Rasa. In particular I'll talk about why the transformer has become a part in many of our algorithms and has replaced RNNs. These include use-cases in natural language processing but also in dialogue handling. You'll see a live demo of a typical error that an LSTM would make but a transformer wouldn't. The algorithms are explained with calm diagrams and very little maths. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2020.europython.eu/events/speaker-release-agreement/ "

Watch