List of videos

Harald Armin Massa - PostgreSQL - The Database for Industry 4.0 and IOT
"PostgreSQL - The Database for Industry 4.0 and IOT [EuroPython 2017 - Talk - 2017-07-12 - PyCharm Room] [Rimini, Italy] Industry 4.0 - the current trend to make more use of data technology and analysis in manufactring. IOT - The Internet of Things, where many ""things"" currently just loosing their information will transfer and store them within central systems. There are aspects of those trends most do agree on: There will be orders of magnitude more data to store and analyze. More agents will need to connect and interact with databases. This talk will explore what makes PostgreSQL an excellent candidate to be the database for managing all that data. Strengths in development, culture and community, extensibility and robustnest will be presented. Selected features of current Version 9.6 and the soon-to-be-released PostgreSQL Version 10 will be discussed for their value in those trends. There will be an explanation of their technical realisation, and special pointers how to use those features from PostgreSQL. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2017.europython.eu/en/speaker-release-agreement/
David Liu - Infrastructure design patterns with Python, Buildbot, and Linux Containers
"Infrastructure design patterns with Python, Buildbot, and Linux Containers [EuroPython 2017 - Talk - 2017-07-12 - PyCharm Room] [Rimini, Italy] In today’s world of fast-paced development, infrastructure can get left behind quickly, leading to a potential increase in technical debt. Buildbot is normally known to be a continuous integration (CI) framework built in Python, but can be refashioned to solve infrastructure design patterns that arise in enterprise or production and deployment situations. Using Python and native Buildbot components paired with Linux Containers, patterns such as license management, resource allocation, load balancing, and enterprise application deployment can be architected quickly with room for expansion as one’s needs grow. Learn how to move past the CI mindset and construct infrastructure needs with Buildbot and popular Linux Containers such as Docker and ClearContainers. Attendees will learn the best known methods of configuring Buildbot in non-CI implementations, and how to utilize the framework components for future needs. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2017.europython.eu/en/speaker-release-agreement/
Alice Harpole - Sustainable Scientific Software Development
"Sustainable Scientific Software Development [EuroPython 2017 - Talk - 2017-07-12 - PyCharm Room] [Rimini, Italy] In the experimental Sciences, new theories are developed by applying the Scientific method to produce results which are accurate, reproducible and reliable. This involves testing the experimental setup to show that it is working as designed and thoroughly documenting the progress of the experiment. Results will not be trusted unless the experiment has been carried out to a suitable standard. In computational Science, we should aim to apply the same principles. Results should only be trusted if the code that has produced it has undergone rigorous testing which demonstrates that it is working as intended, and any limitations of the code (e.g. numerical errors) are understood and quantified. The code should be well documented so that others can understand how it works and run it themselves to replicate results. Unfortunately, this can be quite challenging. By their very nature, scientific codes are built to investigate systems where the behaviour is to some extent unknown, so testing them can be quite difficult. They can be very complex, built over a number of years (or even decades!) with contributions from many people. However, even for the most complicated of codes there are a number of different tools we can use to build robust, reliable code. In this talk, I shall look at techniques and tools you can use to build more sustainable scientific code, including testing, continuous integration and documentation. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2017.europython.eu/en/speaker-release-agreement/
Andreas Heider, Robert Wall - Taking the Hipster out of Streaming
"Taking the Hipster out of Streaming [EuroPython 2017 - Talk - 2017-07-12 - Arengo] [Rimini, Italy] Winton ingests data continually from the world's financial markets. We track millions of individual timeseries, with divergent formats, from disparate time zones, and whose frequencies vary from months to milliseconds. We go beyond simply reading and storing it - we stitch distinct and vast data sets together and subject them to intricate calculations in real-time. This talk will focus on the way we use Python to achieve these ends, and how we are creating tools to further commoditise streaming as a service. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2017.europython.eu/en/speaker-release-agreement/
Alexander Lourenco - Realtime Distributed Computing At Scale: Storm And Streamparse
"Realtime Distributed Computing At Scale (in pure Python!): Storm And Streamparse [EuroPython 2017 - Talk - 2017-07-12 - Arengo] [Rimini, Italy] Realtime distributed computing is tough, especially at scale: managing a large data pipeline is tough, and it’s even tougher to keep latency low and availability high when processing tens of thousands of items per second. Many people turn in despair to Java or Scala when it comes time to scale up, but we can do it in Python: Apache Storm is a distributed realtime computation system that can let you scale up- and no need to reach for a new language! This talk will walk the audience through the basics of Apache Storm and how it’s an elegant, useful solution to realtime distributed computing, as well as how streamparse can let you write your storm components in Python by writing some code and a basic storm topology in Python. We’ll also look at how Parsely uses Storm in production to handle billions of realtime events a month. If we have time, we’ll go a bit into how Storm has several advantages over other common Python computing data streaming solutions, like Spark’s microbatching. Goals: At the end of the talk, ideally you should be able to understand: What Apache Storm is, how it works generally, and what scenarios it’s useful for How streamparse can be used to write your Storm topologies How Storm + streamparse is used in an actual high-availability, low-latency production environment License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2017.europython.eu/en/speaker-release-agreement/
Giuseppe Di Bernardo - Big Data Analytics at the MPCDF: GPU Crystallography with Python
"Big Data Analytics at the MPCDF: GPU Crystallography with Python [EuroPython 2017 - Talk - 2017-07-12 - Anfiteatro 1] [Rimini, Italy] In close collaboration with scientists from MPG, the Max Planck Computing and Data Facility is engaged in the development and optimization of algorithms and applications for high performance computing, as well as in the design and implementation of solutions for data-intensive projects. Python is now used at MPCDF in the emerging area of “atom probe crystallography” (APT): a Fourier spectral analysis in 3D reciprocal space can be simulated in order to reveal both composition and crystallographic structure at the atomic scale of billions APT experimental data sets. The Python data ecosystem has proved to be well suited to this, as it has grown beyond the confines of single machines to embrace scalability. This talk aims to describe our approach to scaling across multiple GPUs, and the role of our visualization methods too. Our data workflow analysis relies on the GPU-accelerated Python software package called PyNX, an open source Python library which provides fast parallel computation scattering. The code is well suited for GPU computing, using both the pyCUDA and pyOpenCL libraries. Exploratory data analysis and performance tests are initially carried on through Jupyter notebooks and Python packages e.g., pandas, matplotlib, plotly. In production stage, interactive visualization is realized by using standard scientific tool, e.g. Paraview, an open-source 3D visualization program which e.g. requires Python modules to generate visualization components within VTK files. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2017.europython.eu/en/speaker-release-agreement/
Pietro Battiston - How to use pandas the wrong way
"How to use pandas the wrong way [EuroPython 2017 - Talk - 2017-07-12 - Anfiteatro 1] [Rimini, Italy] UPDATE: slides and materials can be found at http://pietrobattiston.it/python:pycon#europython_rimini_july_2017 The pandas library represents a very efficient and convenient tool for data manipulation, but sometimes hides unexpected pitfalls which can arise in various and sometimes unintelligible ways. By briefly referring to some aspects of the implementation, I will review specific situations in which a change of approach can make code based on pandas more robust, or more performant. Some examples: inefficient indexing multiple dtypes and efficiency implicit type casting HDF5 storage overhead GroupBy.apply()... when you don't actually need it License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2017.europython.eu/en/speaker-release-agreement/
Francesco Nazzaro - Facing the challenge of climate change with xarray and Dask
"Facing the challenge of climate change with xarray and Dask [EuroPython 2017 - Talk - 2017-07-12 - Anfiteatro 1] [Rimini, Italy] In the last years climate change has become one of the most important topic. For any period longer than a few days science is not able to provide comparable forecasts, but still a lot of useful information about future climate conditions can be gained on time scale of a few months to even several years. Climate forecast and climate projections data are quite complex to analyse and represent. The Python science ecosystem proves extremely effective as a platform to retrieve, analyse, process and present this type of data. The backbone of the platform is the n-dimensional array library xarray that provides the perfect mix between pandas data structures and dask performance and parallelization. Reliable climate forecasts and climate projections are now available from the Copernicus Climate Change Service, operated by ECMWF, that will become the central hub for European effort in study and mitigate climate change impacts. The service also provides access to an open cloud platform, the CDS Toolbox, that is based on the Python 3 xarray/dask/pandas stack. In this talk I will present how to retrieve, analyse, process and display climate data in a generic use case with xarray and with the Copernicus CDS Toolbox. slides: http://slides.com/francesconazzaro/europython-2017 License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2017.europython.eu/en/speaker-release-agreement/
Martin Christen - Rendering complex 3D-Geodata using pyRT
"Rendering complex 3D-Geodata using pyRT [EuroPython 2017 - Talk - 2017-07-12 - Arengo] [Rimini, Italy] PyRT (pronounced ""pirate"") is a rather new open source project creating a ray tracer in pure Python and some optional CPU/GPU acceleration using bindings. Ray tracing is a technique for generating an image by tracing the path of light. PyRT was created to render large 3D City models. In this talk, the possibilities and experiences of ray tracing in Python using pyRT are shown. pyRT also runs in the Jupyter Notebook. Rendering complex 3D-Geodata, such as 3D-City models with an extremely high polygon count and a vast amount of textures at interactive framerates is still a very challenging task, especially on mobile devices. This talk presents an approach for processing, caching and serving massive geospatial data in a cloud-based environment for large scale, out-of-core, highly scalable 3D scene rendering in a web-based solution. PyRT is used for rendering large amounts of geospatial data. The approach for processing, rendering and caching 3D-City Models is shown. Screenshots: https://github.com/martinchristen/pyRT/raw/master/jupyter/img/sponza.png https://github.com/martinchristen/pyRT/blob/master/jupyter/img/Berlin_AO_small.PNG?raw=true License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2017.europython.eu/en/speaker-release-agreement/