List of videos

Native Packaging of GUI Apps on Windows and macOS - presented by Tiago Montes

EuroPython 2022 - Native Packaging of GUI Apps on Windows and macOS - presented by Tiago Montes [Liffey Hall 1 on 2022-07-15]

Distributing Python GUI applications to end users is a challenge: will they need to install Python? If so, which version? If not, how do they install the application? From a random ZIP file? How native does the process feel? Will their system trust your code? For a fluid experience, it needs to be signed and (on macOS) notarized beforehand.

Welcome to `pup` (https://pypi.org/project/pup/), the tool that the Mu Editor (https://codewith.mu/) development team created to package and distribute it in platform-native formats to Windows and macOS users around the world. In this session I will show how `pup` can be used to package GUI applications for distribution: natively on Windows and macOS and, at an earlier stage of development, as distribution-agnostic Linux artifacts. In short, if it's `pip`-installable, it is `pup`-packageable!

I will then describe the way `pup` works (and how it differs from comparable tools), leading to a call-to-action moment where I'll share its current state of development, what's good, what's bad, and where I'd like it to be headed. I'll wrap up the talk with a set of forward-looking thoughts that `pup` has helped identify, not only on the specifics of CPython's distribution, but also on the Python ecosystem as a whole.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/

Watch
Packaging Python in 2022 - presented by Jeremiah Paige

EuroPython 2022 - Packaging Python in 2022 - presented by Jeremiah Paige [Liffey Hall 1 on 2022-07-15]

The Python packaging landscape is experiencing a renaissance, but along with new standards and new tools come many new choices when publishing. setup.cfg or pyproject.toml? Do you need a setup.py instead, or in addition? There can be a lot of confusion, but understanding modern trends can make sharing your code easier than ever before.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/
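To make the modern option concrete, here is a minimal `pyproject.toml` of the kind the talk weighs against setup.cfg and setup.py; the project name and metadata are hypothetical placeholders, not taken from the talk:

```toml
# Build backend declaration (PEP 517/518); setuptools >= 61 understands
# the PEP 621 [project] table below, so no setup.py is required.
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "example-package"        # hypothetical placeholder
version = "0.1.0"
description = "A minimal package with no setup.py or setup.cfg"
requires-python = ">=3.8"
dependencies = []
```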

Watch
Machine Translation engines evaluation framework - presented by Anton Masalovich & Sahil Manchanda

EuroPython 2022 - Machine Translation engines evaluation framework - presented by Anton Masalovich & Sahil Manchanda [Liffey Hall 1 on 2022-07-15]

Evaluating Machine Translation engines can be very challenging. The quality of Machine Translation varies greatly depending on the domain and language pair. Different MT engines may have different interfaces or APIs and different requirements to run. To add to that, even the definition of a good translation may be debatable, with any automatic MT quality metric providing only an approximation of actual translation quality. That's why having a universal evaluation framework for this task is very important. In our work we tried to create such a framework:

1) We defined a base translation class that unifies all file handling, batch creation and result processing. As a result, the only work needed to support a new MT engine is creating a small child class that implements a couple of simple functions. That allows us to easily extend our framework to new MT engines and new language pairs.

2) We defined a set of test datasets and provided a way to add new datasets to this set. Our aim was to create test data that covers both the general and healthcare domains: the EMEA dataset (https://opus.nlpl.eu/EMEA.php), OPUS-100 (https://opus.nlpl.eu/opus-100.php), ParaCrawl (https://paracrawl.eu/) and several others. Our data preparation scripts can easily be extended to other domains and datasets as well.

3) We defined a set of quality metrics to evaluate the results of MT engines: BLEU (https://github.com/mjpost/sacrebleu), BERTScore (https://github.com/Tiiiger/bert_score), ROUGE (https://github.com/pltrdy/rouge), TER and CHRF (both also from the sacrebleu implementation).

Besides the evaluation framework itself, we will present our own evaluation results. We used cloud-based engines - Azure Translator (https://azure.microsoft.com/en-us/services/cognitive-services/translator/) and Google Translate (https://cloud.google.com/translate/) - as well as open-source engines - Marian MT (https://huggingface.co/transformers/model_doc/marian.html), NVIDIA's NeMo (https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/machine_translation.html), Facebook's MBart 50 (https://huggingface.co/facebook/mbart-large-50-one-to-many-mmt) and Facebook's M2M100 (https://huggingface.co/facebook/m2m100_418M). For the open-source engines we used Hugging Face's Transformers implementation whenever possible, but, as mentioned, our framework is designed to be easily extendable to other MT engines and underlying frameworks.

We will also present evaluation results for NeMo and Marian MT engines that we fine-tuned specifically for the healthcare domain. While these particular results may be rather specific to our use case, they help highlight how our framework can be extended to custom MT engines as well.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/
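A minimal sketch of the base-class design described in point 1 might look like the following; the class and method names are illustrative assumptions, not the authors' actual code:

```python
from abc import ABC, abstractmethod

class BaseTranslator(ABC):
    """Hypothetical base class: unifies file handling, batching and results."""

    def translate_file(self, path, src_lang, tgt_lang, batch_size=32):
        # Shared plumbing: read the file, split it into batches, collect output.
        with open(path, encoding="utf-8") as f:
            lines = [line.strip() for line in f if line.strip()]
        results = []
        for i in range(0, len(lines), batch_size):
            batch = lines[i:i + batch_size]
            results.extend(self.translate_batch(batch, src_lang, tgt_lang))
        return results

    @abstractmethod
    def translate_batch(self, texts, src_lang, tgt_lang):
        """The only method a new engine's child class has to implement."""

class EchoTranslator(BaseTranslator):
    """Toy child class standing in for a real MT engine wrapper."""

    def translate_batch(self, texts, src_lang, tgt_lang):
        # A real engine would call its API or model here.
        return [f"[{tgt_lang}] {t}" for t in texts]
```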

Watch
AI for Content Moderation at PayPal - presented by Raghotham Sripadraj & Ryan Roggenkemper

EuroPython 2022 - AI for Content Moderation at PayPal - presented by Raghotham Sripadraj & Ryan Roggenkemper [Liffey Hall 1 on 2022-07-15]

Online content moderation at scale is a non-trivial task, especially with an ever-changing landscape of hate and hate speech under shifting geopolitical scenarios. Moderation platforms need to support multiple typologies - hate, sexually explicit content, violence, bullying, spam and other toxic material. Add multi-language support for all typologies and it becomes an uphill task. In this talk we will cover the following topics:

1. Why is text content moderation hard? Why do we need AI?
2. What open-source datasets are available to train models?
3. What pre-trained models are available for content moderation? (See the sketch below.)
4. Why do pre-trained models not always work?
5. What data labelling strategies exist, and how can we leverage open data and models?
6. How do we build multi-language support, and what are the challenges?

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/
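For point 3, a publicly available pre-trained model can be tried in a few lines with Hugging Face Transformers; the model below is one public example of a toxicity classifier, not necessarily anything PayPal uses in production:

```python
from transformers import pipeline

# Load a public pre-trained toxicity classifier (illustrative choice only).
classifier = pipeline("text-classification", model="unitary/toxic-bert")

print(classifier("Have a wonderful day!"))
# Returns a label and score, e.g. [{'label': 'toxic', 'score': 0.0007}]
# (values illustrative). Point 4 of the talk covers why such off-the-shelf
# scores often break down across domains and languages.
```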

Watch
Handling Errors the Graceful Way in Python - presented by Riya Bansal

EuroPython 2022 - Handling Errors the Graceful Way in Python - presented by Riya Bansal [Liffey B on 2022-07-15]

In the process of programming, we are always going to encounter various errors. Things rarely go as planned, especially in the world of programming. Errors are unavoidable when writing code, which can be frustrating at times. Every single one of us has faced this issue and emerged from it a better programmer. Dealing with bugs and errors is what builds our confidence in the long run and teaches us valuable lessons along the way. So, in this talk, we'll discuss different ways of handling errors and making our lives a little easier. We'll talk about how code written with effective exception-handling strategies can help us catch bugs early in the software development cycle.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/
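As a taste of the topic, one graceful pattern is to catch only the exceptions you can recover from and re-raise the rest with added context, so bugs surface early; this is a generic sketch, not code from the talk:

```python
import json
import logging

logger = logging.getLogger(__name__)

def load_config(path):
    """Read a JSON config file, handling only the errors we can recover from."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # Recoverable: fall back to defaults, but leave a trace for debugging.
        logger.warning("Config %s not found, using defaults", path)
        return {}
    except json.JSONDecodeError as exc:
        # Not recoverable here: re-raise with context instead of silencing it.
        raise ValueError(f"Config {path} is not valid JSON") from exc
```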

Watch
Secure Python ML: The Major Security Flaws in the ML Lifecycle - presented by Alejandro Saucedo

EuroPython 2022 - Secure Python ML: The Major Security Flaws in the ML Lifecycle (and how to avoid them) - presented by Alejandro Saucedo [Liffey B on 2022-07-15]

- Abstract

Every phase of the end-to-end machine learning lifecycle exposes a plethora of security risks that often go unnoticed by machine learning practitioners. In this talk we uncover the most critical (and common) security risks in the machine learning lifecycle, covering in-depth concepts as well as practical examples of ways in which these can be exploited, resolved and mitigated (analogous to the OWASP Top 10 industry standard). Throughout the talk we will use a hands-on example in which we train, package and deploy a model from scratch, outlining the key risk areas of each step together with tools and practices that can be used to mitigate those risks. By the end of this talk, machine learning practitioners will have a robust intuition for the importance of security best practices throughout the machine learning lifecycle, together with tools and frameworks that can help mitigate undesirable outcomes due to security flaws.

- Overview

The operation and maintenance of large-scale production machine learning systems has uncovered new challenges which require fundamentally different approaches to those of traditional software. The area of security in MLOps has seen a rise in attention as machine learning infrastructure expands to further critical use cases across industry. In this talk we introduce the conceptual and practical topics around MLSecOps that data science practitioners will be able to adopt or advocate for. We will also provide an intuition for the key security challenges that arise in production machine learning systems, as well as best practices and frameworks that can be adopted to help mitigate security risks in ML models, ML pipelines and ML services. We will cover a practical example showing how to secure a machine learning model, showcasing the security risks and best practices that apply during the feature engineering, model training, model deployment and model monitoring stages of the machine learning lifecycle.

- Benefits to the ecosystem

This talk will give practitioners the intuition and tools to secure production machine learning systems, and will further the discussion around best practices, reinforcing SecOps into MLOps. It provides best practices in a critical area of machine learning operations that is of paramount importance in production.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/
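One concrete instance of the packaging-stage risks the talk covers is unsafe deserialization: Python's `pickle`, still a common model format, executes arbitrary code on load. A minimal, self-contained demonstration (a generic sketch, not the talk's exact example):

```python
import pickle

class NotAModel:
    # __reduce__ tells pickle how to reconstruct the object; an attacker can
    # make it run any callable, here os.system, when the bytes are loaded.
    def __reduce__(self):
        import os
        return (os.system, ("echo arbitrary code ran on model load",))

payload = pickle.dumps(NotAModel())

# A service that naively unpickles an untrusted "model" executes the payload:
pickle.loads(payload)  # never call this on untrusted artifacts
```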

Watch
Rapid prototyping in BBC News with Python and AWS - presented by Ben Nuttall

EuroPython 2022 - Rapid prototyping in BBC News with Python and AWS - presented by Ben Nuttall [Liffey B on 2022-07-15]

BBC News Labs is an innovation team within BBC R&D, working with journalists and production teams to build prototypes that demonstrate and trial new ways to help journalists or bring new experiences to audiences. We work in short project cycles to research and build prototypes. We have worked with the production team of the BBC's flagship radio news programme to enrich programme timelines with metadata, providing enhanced experiences to the audience. We are currently working with local radio teams around the UK to provide the means to capture highlights in live radio for re-use and for social media, reducing the workload for producers and getting more mileage from linear broadcast programmes.

Working in short cycles, it's important for us to be able to quickly build processing pipelines connected to BBC services, test and iterate on ideas, and demonstrate working prototypes. We make use of modern cloud technologies to accelerate delivery and reduce friction. In this talk I will share our ways of working, our ideation and research methods, the tools we use to build, deploy and iterate quickly, the BBC's cloud deployment platform, and our use of serverless AWS services such as Lambda, Step Functions and serverless Postgres.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/
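As an illustration of how small such a pipeline step can be, a Lambda handler in Python is a single function; everything in this sketch (the event shape, bucket layout and highlight heuristic) is a hypothetical stand-in, not BBC code:

```python
import boto3  # AWS SDK for Python

s3 = boto3.client("s3")

def handler(event, context):
    """Hypothetical Step Functions task: scan a transcript for highlights."""
    bucket = event["bucket"]          # assumed input shape
    key = event["transcript_key"]
    obj = s3.get_object(Bucket=bucket, Key=key)
    text = obj["Body"].read().decode("utf-8")
    # Toy heuristic standing in for a real highlight-detection step.
    highlights = [line for line in text.splitlines() if "[HIGHLIGHT]" in line]
    return {"count": len(highlights), "highlights": highlights}
```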

Watch
WIP: Implementing PEP 458 to Secure PyPI downloads - presented by Kairo de Araujo, Lukas Pühringer

EuroPython 2022 - Work in Progress: Implementing PEP 458 to Secure PyPI downloads - presented by Kairo de Araujo, Lukas Pühringer [Liffey B on 2022-07-15]

Attacks on software repositories are extremely common and can have a vast impact: a single successful compromise of the content distribution infrastructure can affect millions of users who voluntarily install the infected packages. PEP 458 (https://peps.python.org/pep-0458/) was designed to protect PyPI against a variety of possible attacks on PyPI's own content distribution network and on PyPI mirrors, while giving administrators a mechanism to recover from a compromise if it happens. In addition, PEP 458 is a fundamental stepping stone towards the more advanced protection described in PEP 480 (https://peps.python.org/pep-0480/). Both PEP 458 and PEP 480 implement a specification called "The Update Framework" (TUF) (http://theupdateframework.io/), which introduces a series of roles, keys and metadata formats that are published along with the packages they protect and can be verified by client software such as pip.

Over the past couple of months we have made an effort to integrate the latest version of the Python TUF reference implementation with PyPI/Warehouse; see the draft PR (https://github.com/pypa/warehouse/pull/10870). In this talk we will give an introduction to PEP 458 and TUF: how it works and what it is good for. We will report on the work-in-progress integration with Warehouse, the challenges we face, how Python developer and user workflows are affected, and the expected timeline for the integration. Last but not least, we want to give an outlook on what comes after PEP 458: full developer-to-user end-to-end protection of Python packages as described by PEP 480.

With our talk we also hope to spark interest in software supply chain security and to encourage the community to get involved by reviewing, commenting on and contributing to the PEP 458 and PEP 480 integration efforts.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/
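On the client side, the python-tuf reference implementation exposes an Updater that runs the TUF verification workflow; a minimal sketch, with placeholder URLs, paths and target name, looks roughly like this:

```python
from tuf.ngclient import Updater  # python-tuf reference implementation

# Placeholder endpoints; metadata_dir must already hold a trusted root.json.
updater = Updater(
    metadata_dir="/tmp/tuf-metadata",
    metadata_base_url="https://example.org/metadata/",
    target_base_url="https://example.org/targets/",
    target_dir="/tmp/tuf-targets",
)

updater.refresh()  # fetch and verify root/timestamp/snapshot/targets metadata
info = updater.get_targetinfo("sampleproject-1.0.tar.gz")  # placeholder target
if info is not None:
    path = updater.download_target(info)  # also verifies length and hashes
    print("verified download at", path)
```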

Watch
Tales of Python Security - presented by Steve Dower

EuroPython 2022 - Tales of Python Security - presented by Steve Dower [Liffey B on 2022-07-15]

In this session, you'll learn about recent security issues in CPython and the core parts of our ecosystem. You'll hear about the process by which they were filed, and how they were reviewed, analysed, shared (when appropriate), resolved and ultimately disclosed to the public. Alongside real stories of security vulnerabilities, you'll learn how you can help by responsibly reporting potential issues, how to protect yourself against common risks, and the best ways to find out about major issues and respond to them.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/

Watch