List of videos

Paolo Galeone - Dissecting tf.function to discover AutoGraph strengths and subtleties

Dissecting tf.function to discover AutoGraph strengths and subtleties [EuroPython 2019 - Talk - 2019-07-10 - Singapore [PyData track]] [Basel, CH] By Paolo Galeone. AutoGraph is one of the most exciting new features of TensorFlow 2.0: it transforms a subset of Python syntax into a portable, high-performance, language-agnostic graph representation, bridging the gap between TensorFlow 1.x and the 2.0 release based on eager execution. Using AutoGraph through the @tf.function decorator seems easy, but in practice, writing efficient and correctly graph-convertible code requires knowing in detail how AutoGraph and tf.function work. In particular, knowing how a graph is created and when it is re-used, how to deal with functions that create a state, and how to correctly use the TensorFlow tf.Tensor object instead of Python native types to speed up the computation defines the minimum skill set required to write correct graph-accelerable code. The talk will guide you through AutoGraph and tf.function, highlighting all the peculiarities that are worth knowing to build the right skill set. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2019.europython.eu/events/speaker-release-agreement/

Gael Varoquaux - Machine learning on non curated data

Machine learning on non curated data [EuroPython 2019 - Talk - 2019-07-11 - Singapore [PyData track]] [Basel, CH] By Gael Varoquaux. According to industry surveys [1], the number one hassle of data scientists is cleaning the data to analyze it. Textbook statistical modeling is sufficient for noisy signals, but errors of a discrete nature break standard tools of machine learning. I will discuss how to easily run machine learning on data tables with two common dirty-data problems: missing values and non-normalized entries. On both problems, I will show how to run standard machine-learning tools such as scikit-learn in the presence of such errors. The talk will be didactic and will discuss simple software solutions. It will build on the latest improvements to scikit-learn for missing values and the DirtyCat package [2] for non-normalized entries. I will also summarize theoretical analyses in recent machine learning publications. This talk targets data practitioners. Its goal is to help data scientists be more efficient at analysing data with such errors and at understanding their impact. With missing values, I will use simple arguments and examples to outline how to obtain asymptotically good predictions [3]. Two components are key: imputation and adding an indicator of missingness. I will explain theoretical guidelines for these, and I will show how to implement these ideas in practice, with scikit-learn as a learner, or as a preprocessor. For non-normalized categories, I will show that using their string representations to “vectorize” them, creating vectorial representations, gives a simple but powerful solution that can be plugged into standard statistical analysis tools [4]. [1] Kaggle, the state of ML and data science 2017 https://www.kaggle.com/surveys/2017 [2] https://dirty-cat.github.io/stable/ [3] Josse Julie, Prost Nicolas, Scornet Erwan, and Varoquaux Gaël (2019). “On the consistency of supervised learning with missing values”. https://arxiv.org/abs/1902.06931 [4] Cerda Patricio, Varoquaux Gaël, and Kégl Balázs. “Similarity encoding for learning with dirty categorical variables.” Machine Learning 107.8-10 (2018): 1477 https://arxiv.org/abs/1806.00979 License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2019.europython.eu/events/speaker-release-agreement/
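
The abstract above names two key components for handling missing values: imputation and an indicator of missingness. As a rough illustration only, the sketch below does both in plain Python. It is not code from the talk and not the scikit-learn implementation; the function name impute_with_indicator, the choice of mean imputation, and the toy table are assumptions made for this example. In scikit-learn itself, SimpleImputer accepts an add_indicator=True option that serves a similar purpose.

```python
from statistics import mean


def impute_with_indicator(rows):
    """Hypothetical helper: mean-impute missing values (None) per column
    and append one 0/1 missingness flag per original column."""
    n_cols = len(rows[0])

    # Column means are computed from the observed (non-None) values only.
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(mean(observed) if observed else 0.0)

    out = []
    for row in rows:
        # Replace each missing entry with its column mean.
        filled = [col_means[j] if v is None else v for j, v in enumerate(row)]
        # Keep track of which entries were missing before imputation.
        flags = [1 if v is None else 0 for v in row]
        out.append(filled + flags)
    return out


# Toy table with two numeric columns; None marks a missing value.
table = [[1.0, 10.0], [None, 20.0], [3.0, None]]
result = impute_with_indicator(table)
for row in result:
    print(row)
```

Each output row holds the two imputed values followed by the two flags, so a downstream learner can still tell which entries were originally missing.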

Sven-Hendrik Haase - Become a command line wizard

Become a command line wizard [EuroPython 2019 - Talk - 2019-07-10 - Boston] [Basel, CH] By Sven-Hendrik Haase. There are many modern terminal tools with vastly improved user experiences as compared to their traditional alternatives. This talk aims to show off some of those modern terminal tools and compare them side by side with the traditional ones. Python is not only used by software developers with fancy IDEs but also by DevOps engineers, administrators, and on remote development machines where using a GUI is impractical. Therefore, many people are stuck with a terminal interface only and have to use tools like vim, grep, find, wc, cloc, less and many others to explore their way around their Python programs. However, thanks to the advent of many new and improved tools, we can do many of the same tasks better, faster and with nicer ergonomics. This talk will show off effective use of vim as an IDE with completions and linting provided by LSP, fd (instead of find) for finding files, ripgrep (instead of grep) for searching strings, tokei (instead of cloc) for counting lines of code, bat (instead of cat) for looking at files, hyperfine for microbenchmarking, httpie (instead of curl) for making HTTP requests, and sd (instead of sed) for text replacement. This talk should make terminal work more approachable for all attendees by showing off how to do some everyday tasks on the terminal. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2019.europython.eu/events/speaker-release-agreement/

Katherine Kampf - Building a Powerful Pet Detector in Notebooks

Building a Powerful Pet Detector in Notebooks [EuroPython 2019 - Talk - 2019-07-10 - MongoDB [PyData track]] [Basel, CH] By Katherine Kampf. Ever wondered what breed that dog or cat is? Let’s build a pet detector service to recognize them in pictures! In this talk, we will walk through the training, optimizing, and deploying of a deep learning model using Azure Notebooks. We will use transfer learning to retrain a MobileNet model using TensorFlow to recognize dog and cat breeds using the Oxford IIIT Pet Dataset. Next, we’ll optimize the model and tune our hyperparameters to improve the model accuracy. Finally, we will deploy the model as a web service in. Come to learn how you can quickly create accurate image recognition models with a few simple techniques! License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2019.europython.eu/events/speaker-release-agreement/

Elizaveta Shashkova - Visual debugger for Jupyter Notebooks: Myth or Reality?

Visual debugger for Jupyter Notebooks: Myth or Reality? [EuroPython 2019 - Talk - 2019-07-10 - MongoDB [PyData track]] [Basel, CH] By Elizaveta Shashkova. Many Python developers like Jupyter Notebooks for their flexibility: they are very useful for interactive prototyping, scientific experiments, visualizations and many other tasks. There are different development tools which make working with Jupyter Notebooks easier and smoother, but all of them lack a very important feature: a visual debugger. Since a Jupyter kernel is an ordinary Python process, it seems reasonable to use one of the existing Python debuggers with it. But is it really possible? In this talk we’ll try to understand how a Python debugger should be changed to work with Jupyter cells and how these changes are already implemented in the PyCharm IDE. After that we’ll look into the whole Jupyter architecture and try to understand which bottlenecks in it currently prevent the creation of a universal Jupyter debugger. This talk requires a basic knowledge of Jupyter Notebooks and an understanding of Python functions and objects. It will be interesting for people who want to learn the internals of the tools they use every day. It might also be an inspiration for people who want to implement a visual debugger in their favourite IDE. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2019.europython.eu/events/speaker-release-agreement/

Cheuk Ho - Do we have a diversity problem in Python community?

Do we have a diversity problem in Python community? [EuroPython 2019 - Talk - 2019-07-10 - PyCharm] [Basel, CH] By Cheuk Ho. The diversity statement reads as follows: “The Python Software Foundation and the global Python community welcome and encourage participation by everyone. Our community is based on mutual respect, tolerance, and encouragement, and we are working to help each other live up to these principles. We want our community to be more diverse: whoever you are, and whatever your background, we welcome you.” Diversity, big deal! As active members and event organisers (and also on the minority side of the gender) in the Python community, we have always been concerned by the question: do we truly have a problem with diversity, especially gender diversity? We would like to find out the truth through data science, and see if we can find a clue as to why and how we can fix it. First, we will show the research others did regarding the representation of women in the R and Python communities [1]. Then, we will show the research that we did based on our experience and statistics, including a static analysis of speaker diversity (regarding gender) at major PyCon and PyData conferences. Finally, as we all care about diversity and want improvements, we would like to find out the reason and what we can do about it. We will propose what we, the minorities and allies, could do about this seemingly unbalanced situation to make the community better. This talk is for all who care about diversity in our community. [1] https://reshamas.github.io/why-women-are-flourishing-in-r-community-but-lagging-in-python/ Update: slides at https://slides.com/cheukting_ho/do-we-have-a-diversity-problem-in-python-community License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2019.europython.eu/events/speaker-release-agreement/

Leonardo Rochael Almeida - From days to minutes, from minutes to milliseconds with SQLAlchemy

From days to minutes, from minutes to milliseconds with SQLAlchemy [EuroPython 2019 - Talk - 2019-07-10 - Boston] [Basel, CH] By Leonardo Rochael Almeida. Object Relational Mappers (ORMs) are awesome enhancers of developer productivity. The freedom of having the library write that SQL and give you back a useful, rich model instance (or a bunch of them) instead of just a tuple or a list of records is simply amazing. But if you forget you have an actual database behind all that convenience, then it'll bite you back, usually when you've been in production for a while, after you've accumulated enough data that your once speedy application starts slowing down to a crawl. Databases work best when you ask them once for (or to do) a bunch of stuff, instead of asking them lots of times for small stuff. We'll discuss how innocent-looking attribute accesses on your model instances translate to sequential queries (the infamous N+1 problem). Then we'll go through some practical solutions, taken from real cases, that resulted in massive speed-ups. We'll cover how changes in Python code resulted in changes to the resulting SQL queries. We'll see solutions not only for queries, but also for inserts and updates, which tend to be less well documented. Though this talk focuses on SQLAlchemy, the lessons should be applicable to most ORMs in most programming languages. The ideas discussed and solutions proposed are also valid for any storage back-end, not only SQL databases. This talk is geared towards Python developers with systems that talk to databases. It should be accessible to anyone who already programs in Python (early intermediate level), but will be most useful for developers with projects talking to SQL databases, especially using an ORM like SQLAlchemy. Attendees will learn to detect how N+1 query situations arise and how to work around them effectively. They will also learn how to do mass inserts and mass updates with SQLAlchemy. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2019.europython.eu/events/speaker-release-agreement/
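
To make the N+1 problem mentioned in the abstract above concrete, here is a minimal sketch that counts queries for the same task done two ways. It deliberately uses the standard-library sqlite3 module rather than SQLAlchemy, so it is not the speaker's code; the author/book schema, the run helper, and the function names are all hypothetical and exist only for this illustration. In SQLAlchemy, eager-loading options such as joinedload or selectinload are the usual tools for collapsing such per-row queries.

```python
import sqlite3

# In-memory database with a hypothetical author/book schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO book VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")

counter = {"queries": 0}


def run(sql, params=()):
    """Execute one query and count it, standing in for one database round trip."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    """One query for the authors, then one more query per author (N+1)."""
    result = {}
    for author_id, name in run("SELECT id, name FROM author ORDER BY id"):
        rows = run(
            "SELECT title FROM book WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_single_query():
    """The same data fetched with a single LEFT JOIN."""
    result = {}
    rows = run(
        "SELECT a.name, b.title FROM author a "
        "LEFT JOIN book b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # authors without books yield a NULL title
            result[name].append(title)
    return result


counter["queries"] = 0
slow = titles_n_plus_one()
n_plus_one_queries = counter["queries"]

counter["queries"] = 0
fast = titles_single_query()
single_queries = counter["queries"]

print(n_plus_one_queries, single_queries, slow == fast)
```

Both functions return the same mapping, but the first issues one query per author on top of the initial one, which is the pattern that grows with the data.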

Emmanuelle Gouillart - Image processing with scikit-image and Dash

Image processing with scikit-image and Dash [EuroPython 2019 - Talk - 2019-07-10 - MongoDB [PyData track]] [Basel, CH] By Emmanuelle Gouillart. Images are a ubiquitous form of data in various fields of science and industry. Images often need to be transformed and processed, for example to help medical diagnosis by extracting regions of interest or measures, or to build training sets for machine learning. In this talk, I will present and discuss several tools for automatic and interactive image processing with Python. I will start with a short introduction to scikit-image (https://scikit-image.org/), the open-source image processing toolkit of the PyData ecosystem, which aims at processing images from a large class of modalities (2-D, 3-D, etc.) and strives to have a gentle learning curve with pedagogical example-based documentation. scikit-image provides users with a simple API based on a large number of functions, which can be used to build pipelines of image processing workflows. In the second part, I will explain how to use Dash for building interactive image processing operations. Dash (https://dash.plot.ly/) is an open-source Python web application framework developed by Plotly. Written on top of Flask, Plotly.js, and React.js, Dash is meant for building data visualization apps with highly custom user interfaces in pure Python. The dash-canvas component library of Dash (https://dash.plot.ly/canvas) is an interactive component for annotating images with several tools (freehand brush, lines, bounding boxes, ...). It also provides utility functions for using user-provided annotations for several image processing tasks such as segmentation, transformation, measures, etc. The latter functions are based on libraries such as scikit-image and OpenCV. A gallery of examples showcases some typical uses of Dash for image processing on https://dash-canvas.plotly.host/. Also, other components of Dash can be leveraged easily to build powerful image processing applications, such as widgets to tune parameters or data tables for inspecting object properties. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2019.europython.eu/events/speaker-release-agreement/

Christian Barra - How software can feed the world 🌱

How software can feed the world 🌱 [EuroPython 2019 - Talk - 2019-07-11 - Osaka / Samarkand [PyData track]] [Basel, CH] By Christian Barra. Infarm is a FaaS, Farming as a Service, and whether you believe it or not, our business is in-house farming at scale. We design and build our farms, grow vegetables and sell them, and the backbone of our infrastructure is based on Python. You can check this video to see what we do - https://twitter.com/christianbarra/status/1096399602159439874 More than 10 million observations are recorded from our farms, feeding our farm management system that allows operators, plant scientists, and supervisors to monitor each farm in real time. During this talk I will briefly introduce the world problems we are trying to solve at Infarm and then talk about our IoT farms, our infrastructure, how we use Python and how we plan to improve the capabilities of our farms by adding edge machine learning. Agenda: what are the problems we are trying to solve at Infarm; our 4 tech pillars; how we started with Python; issues we are facing while scaling our Python infrastructure to support > 400 farms; how we plan to evolve our software and infrastructure on 4 different levels: consolidate, architecture, cloud native and observability; how Python is going to support our automated farms and its role in making the farms smarter (edge computing with AI). License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2019.europython.eu/events/speaker-release-agreement/
