List of videos

EuroPython 2021 - Lightning Talks 07/28

Lightning Talk 1 [EuroPython 2021 - Talk - 2021-07-28 - Optiver] [Online] License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/ Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/


Claudia Comito - Connecting Communities: the Helmholtz Analytics Framework and the making of Heat

Connecting Communities: the Helmholtz Analytics Framework and the making of Heat [EuroPython 2021 - Keynote - 2021-07-29 - Optiver] [Online] By Claudia Comito HPC, Scientific Big Data, co-design, Python: beneath the buzzwords, bringing together academics from the most disparate research fields to work on a common product is no easy feat. What worked, what didn't, and lessons learned from the Helmholtz Analytics Framework experience.


Chin Hwee Ong - Designing Functional Data Pipelines for Reproducibility and Maintainability

Designing Functional Data Pipelines for Reproducibility and Maintainability [EuroPython 2021 - Talk - 2021-07-29 - Parrot [Data Science]] [Online] By Chin Hwee Ong When building data pipelines at scale, it is crucial to design pipelines that are reliable, scalable and extensible as business needs evolve. Designing data pipelines for reproducibility and maintainability is a challenge, as testing and debugging across compute units (threads/cores/computes) is often complex and time-consuming due to dependencies and shared state at runtime. In this talk, Chin Hwee shares common challenges in designing reproducible and maintainable data pipelines at scale, and explores the use of functional programming in Python to build scalable, production-ready data pipelines. Through analogies and realistic examples inspired by data pipeline designs in production environments, you will learn about:
- What functional programming is, and how it differs from other programming paradigms
- Key principles of functional programming
- How "control flow" is implemented in functional programming
- Functional design patterns for data pipeline design in Python, and how they improve reproducibility and maintainability
- Whether it is possible to write a purely functional program in Python
This talk assumes a basic understanding of building data pipelines with functions and classes/objects. While the main target audience is data scientists/engineers and developers building data-intensive applications, anyone with hands-on experience in imperative programming (including Python) should be able to understand the key concepts and use cases in functional programming.
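
As a rough illustration of the functional style the abstract describes, a pipeline can be expressed as a chain of pure functions that each return new data instead of modifying shared state. This is a minimal sketch, not code from the talk: the step names (`drop_missing`, `to_cents`, `total_by_user`), the `compose` helper and the sample records are all hypothetical.

```python
from functools import reduce
from typing import Callable, Dict, List

Records = List[Dict]
Step = Callable[[Records], Records]


def compose(*steps: Step) -> Step:
    """Chain steps left to right into one pipeline function."""
    return lambda records: reduce(lambda acc, step: step(acc), steps, records)


def drop_missing(records: Records) -> Records:
    # Returns a new list; the input list is left untouched.
    return [r for r in records if r.get("amount") is not None]


def to_cents(records: Records) -> Records:
    # Builds new dicts instead of updating the originals in place.
    return [{**r, "amount": round(r["amount"] * 100)} for r in records]


def total_by_user(records: Records) -> Records:
    # The accumulator is local to this call, so no state leaks between runs.
    totals: Dict[str, int] = {}
    for r in records:
        totals[r["user"]] = totals.get(r["user"], 0) + r["amount"]
    return [{"user": u, "amount": a} for u, a in sorted(totals.items())]


pipeline = compose(drop_missing, to_cents, total_by_user)

raw = [
    {"user": "a", "amount": 1.5},
    {"user": "b", "amount": None},
    {"user": "a", "amount": 2.25},
]
result = pipeline(raw)
print(result)
```

Because no step mutates its input, running `pipeline(raw)` twice gives the same result, and each step can be tested in isolation.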


Griffith Rees - From Research Project to PyPI Release

From Research Project to PyPI Release [EuroPython 2021 - Talk - 2021-07-29 - Brian] [Online] By Griffith Rees Halfway through my first postdoc it was clear it would be very difficult to submit a paper to a journal before my contract ended. How do I make something useful in the time allotted that keeps me motivated enough to finish a paper after my contract ends (and is useful on a CV)? Answer: package my code into a tested library via GitHub, the Python Package Index (PyPI) and Zenodo for citations.
Goals:
- Pros and cons of rearranging a project for public release (5 min)
- Python cookiecutter templates (5 min)
- Options for testing (standard library unittest vs pytest) (5 min)
- Continuous Integration (Travis vs GitHub Actions) (5 min)
- Documentation (5 min)
- Release on Zenodo for citation (5 min)
Prerequisites:
- Intermediate Python
- Command line


Eyal Kazin - A Gentle Introduction To Causal Inference

A Gentle Introduction To Causal Inference [EuroPython 2021 - Talk - 2021-07-29 - Parrot [Data Science]] [Online] By Eyal Kazin Correlation does not imply causation. It turns out, however, that with some simple, ingenious tricks one can unveil causal relationships within standard observational data, without having to resort to expensive randomized controlled trials. I introduce the basic concepts of causal inference, demonstrating them in an accessible manner using visualisations. My main message for data analysts is that by adding causal inference to your statistical toolbox you are likely to conduct better experiments and ultimately get more from your data. For example, by introducing Simpson’s Paradox, a situation where the outcome for all entries combined conflicts with the outcome within each of its cohorts, I shine a light on the importance of using graphs to model the data, which enables identifying and managing confounding factors. This talk is targeted towards anyone making data-driven decisions. The main takeaway is that the story behind the data is as important as the data itself. My ultimate objective is to whet your appetite to explore the topic further, as I believe it will enable you to go beyond correlation calculations and extract more insights from your data, as well as avoid common misinterpretation pitfalls like Simpson’s Paradox.
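
To make Simpson's Paradox concrete, the sketch below compares success rates within cohorts against the pooled rate. It is not taken from the talk: the counts are the figures commonly quoted in textbook presentations of the kidney-stone treatment example and are used here purely as an illustration, and the function names are hypothetical.

```python
# (treatment, cohort) -> (successes, patients); illustrative textbook figures.
data = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}


def cohort_rate(treatment: str, cohort: str) -> float:
    """Success rate of one treatment within a single cohort."""
    successes, patients = data[(treatment, cohort)]
    return successes / patients


def overall_rate(treatment: str) -> float:
    """Success rate of one treatment with all cohorts pooled together."""
    rows = [v for (t, _), v in data.items() if t == treatment]
    return sum(s for s, _ in rows) / sum(n for _, n in rows)


for cohort in ("small", "large"):
    print(cohort, round(cohort_rate("A", cohort), 3), round(cohort_rate("B", cohort), 3))
print("overall", round(overall_rate("A"), 3), round(overall_rate("B"), 3))
```

Treatment A has the higher success rate in each cohort, yet B looks better once the cohorts are pooled, because A was applied far more often to the harder (large) cases. Cohort membership acts as a confounder, which is the kind of structure a causal graph makes explicit.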


Alyona Galyeva - We build a ML pipeline after we deploy

We build a ML pipeline after we deploy [EuroPython 2021 - Talk - 2021-07-29 - Parrot [Data Science]] [Online] By Alyona Galyeva This talk covers the importance of building end-to-end machine learning pipelines from day one.
What you will learn:
- why we need a machine learning pipeline and when to use it;
- ML pipeline building blocks covering training and inference;
- engineering around failures and engineering for performance;
- ML pipeline debugging and monitoring;
- open-source Python libraries to save you time.
For whom:
- data scientists, data analysts, data engineers, machine learning engineers, data product owners and Python developers working, or willing to work, with machine learning.
Prerequisites: to get the most out of this talk, Data Science, ML, and Python experience is recommended.


Kautilya Katariya - Computational Complexity Theoretical Foundation on How Long Will Program Run

Computational Complexity Theoretical Foundation on How Long Will Program Run [EuroPython 2021 - Talk - 2021-07-29 - Parrot [Data Science]] [Online] By Kautilya Katariya Online teaching materials enable learning programming at any age. Kautilya, a 7-year-old programmer who has learned his way through multiple courses in Python and Artificial Intelligence, shares his takeaways and examples so that we can learn what he has learned about computational complexity. In this talk, he focuses on time complexity, one aspect of computational (algorithmic) complexity. As he lays out the theoretical foundations of how we formally measure how fast a program or algorithm runs, he teaches us to analyse how our program design choices affect how long a program takes to finish. In computer science, time complexity is the computational complexity that describes the time it takes to run an algorithm. With a strong theoretical computer science foundation, he walks us through Big-O, Big-Θ and Big-Ω notations for complexity, helps us make sense of them and shows, with examples of searching (linear, binary, exponential) and sorting (merge, insertion, selection), how our algorithm design choices affect the time a program takes to execute. Learn the foundations and draw inspiration for learning from the Guinness World Record holder for ‘Youngest Computer Programmer’ at the age of 6.
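
The gap between linear search, which is O(n), and binary search, which is O(log n), can be seen by counting loop iterations rather than measuring wall-clock time. The sketch below is an illustration written for this listing, not code from the talk; both functions return the index found together with the number of steps taken.

```python
from typing import List, Tuple


def linear_search(items: List[int], target: int) -> Tuple[int, int]:
    """Scan left to right: O(n) steps in the worst case."""
    steps = 0
    for index, value in enumerate(items):
        steps += 1
        if value == target:
            return index, steps
    return -1, steps


def binary_search(items: List[int], target: int) -> Tuple[int, int]:
    """Halve the search range each step: O(log n). Requires sorted input."""
    low, high, steps = 0, len(items) - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, steps


items = list(range(1_000_000))  # already sorted
target = items[-1]              # worst case for the linear scan

linear_index, linear_steps = linear_search(items, target)
binary_index, binary_steps = binary_search(items, target)
print(linear_steps, binary_steps)
```

For a million sorted items the linear scan needs a million steps to reach the last element, while binary search needs at most 20, since 2^20 is slightly more than one million.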


Oliver Cobb - Protecting Your Machine Learning Against Drift: An Introduction

Protecting Your Machine Learning Against Drift: An Introduction [EuroPython 2021 - Talk - 2021-07-29 - Parrot [Data Science]] [Online] By Oliver Cobb Deployed machine learning models can fail spectacularly in response to seemingly benign changes to the underlying process being modelled. Concerningly, when labels are not available, as is often the case in deployment settings, this failure can occur silently and go unnoticed. This talk is a practical introduction to drift detection, the discipline focused on detecting such changes. We will start by building an understanding of how drift can occur, why it pays to detect it and how it can be detected in a principled manner. We will then discuss the practicalities and challenges of detecting it as quickly as possible in machine learning deployment settings, where high-dimensional, unlabelled data arrives continuously. We will finish by demonstrating how the theory can be put into practice using the alibi-detect Python library. There are no hard prerequisites for understanding this talk, although background knowledge of machine learning and statistical hypothesis testing might be useful.
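
One principled way to detect drift without labels is a two-sample statistical test between reference data and newly arriving data. The sketch below uses a two-sample Kolmogorov-Smirnov statistic on a single feature, implemented with the standard library only. It is a simplified assumption-laden illustration: it does not use the alibi-detect API, it ignores the high-dimensional case the talk addresses, and the function names, the synthetic data and the asymptotic 5% critical constant (1.358) are choices made for this example.

```python
import math
import random
from bisect import bisect_right
from typing import List


def ks_statistic(reference: List[float], new: List[float]) -> float:
    """Largest gap between the two empirical cumulative distribution functions."""
    ref_sorted, new_sorted = sorted(reference), sorted(new)
    n, m = len(ref_sorted), len(new_sorted)
    gap = 0.0
    for x in ref_sorted + new_sorted:
        cdf_ref = bisect_right(ref_sorted, x) / n
        cdf_new = bisect_right(new_sorted, x) / m
        gap = max(gap, abs(cdf_ref - cdf_new))
    return gap


def drift_detected(reference: List[float], new: List[float], c_alpha: float = 1.358) -> bool:
    """Flag drift when the KS statistic exceeds the asymptotic critical value.

    c_alpha = 1.358 corresponds to a significance level of about 5%.
    """
    n, m = len(reference), len(new)
    threshold = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, new) > threshold


rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]  # data seen at training time
shifted = [rng.gauss(1.5, 1.0) for _ in range(500)]    # same feature after a mean shift

print(drift_detected(reference, shifted))
```

No labels are involved: the test only compares the distribution of inputs, which is why this family of methods remains usable in deployment settings where ground truth arrives late or never.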


Ondrej Urban - Automated Machine Learning With Keras

Automated Machine Learning With Keras [EuroPython 2021 - Talk - 2021-07-29 - Parrot [Data Science]] [Online] By Ondrej Urban During my first steps in the field I was promised that machine learning would be automated from the beginning. Unfortunately, once I had outsourced the search for the parameters that best matched my data to the machines, I was left having to look for the hyperparameters that define the best model architecture, all by myself. This often ends up being a lengthy manual process. Is there a way to outsource this bit too? In this talk we will take a look at the automated machine learning libraries Keras Tuner and AutoKeras, which allow the user to create high-level templates of deep learning models and use them in an automated search for the best hyperparameters. They not only enable speedier development of better models but also make deep learning accessible to a wider pool of people thanks to the abstractions they offer. In the presentation we will go through several iterations of pretending to know progressively less and less about both our data and machine learning in general, and see how these libraries come to our aid in creating highly performant deep learning models with a fraction of the effort. The talk is aimed at a general audience familiar with Python. Knowledge of Keras is a plus but not a requirement - that is kind of the whole point! The repository with a notebook with code examples is available at: https://github.com/ondrejiayc/ep2021automlkeras
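
The idea of an automated hyperparameter search can be sketched without any deep learning library. The example below is a plain random search over a small search space and deliberately does not use the Keras Tuner or AutoKeras APIs. Everything in it is hypothetical: `validation_loss` is a toy stand-in for training and evaluating a model, and the search space values are arbitrary.

```python
import random
from typing import Callable, Dict, List, Tuple


def validation_loss(units: int, learning_rate: float) -> float:
    # Toy stand-in for "train a model with these settings and evaluate it".
    return (units - 96) ** 2 / 1000 + (learning_rate - 0.01) ** 2 * 1000


def random_search(
    objective: Callable[..., float],
    space: Dict[str, List],
    trials: int,
    seed: int = 0,
) -> Tuple[Dict, float]:
    """Sample random configurations and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(trials):
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


space = {
    "units": [32, 64, 96, 128],
    "learning_rate": [0.1, 0.01, 0.001],
}
best_params, best_score = random_search(validation_loss, space, trials=50)
print(best_params, best_score)
```

Libraries such as the ones named in the abstract wrap this loop around real model building and training, and offer search strategies beyond uniform random sampling.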
