List of videos

Anselm Kruis - Post-Mortem Debugging with Heap-Dumps

Anselm Kruis - Post-Mortem Debugging with Heap-Dumps [EuroPython 2014] [25 July 2014]

UNIX core dumps, Windows minidumps and Java heap dumps are well-established technologies for post-mortem defect analysis. I'll present a similar technology for Python. An improved pickling mechanism makes it possible to serialise the state of a Python program for subsequent analysis with a conventional Python debugger.

-----

Post-Mortem Debugging with Heap-Dumps
===

UNIX core dumps, Windows minidumps and analogous solutions on other operating systems are well-established technologies for post-mortem defect analysis of native-code processes. In principle those dumps can be used to analyse "interpreted" programs running within a native-code interpreter process. In practice, however, this approach is tedious and not always successful [1]. Therefore operating-system-independent dump methods were developed for some "interpreted" languages [2]. A prominent example is the Java heap dump [3].

Unfortunately, up to now there was no practically usable dump method for Python. Various attempts were made to utilise OS-level dump methods [4, 5]. In 2012 Eli Finer published the Python module *pydump* [6]. This module pickles the traceback of an exception and subsequently uses the pdb debugger to analyse the unpickled traceback. Unfortunately *pydump* fails on PicklingErrors.

In my talk I'll present the Python package [*pyheapdump*](https://pypi.python.org/pypi/pyheapdump). It has the same operating principle as Eli's *pydump*, but is an independent implementation. *pyheapdump* uses an extended pickler ([sPickle](https://pypi.python.org/pypi/sPickle)) to serialise all relevant objects of a Python process to a file. Later on, a fault-tolerant unpickler recreates the objects and a common Python debugger can be used to analyse the dump. The pickler extensions make it possible to:

* pickle and unpickle many objects that are normally not pickleable [7].
* replace the remaining unpickleable objects with surrogate objects, so that the resulting object graph is almost isomorphic to the original object graph.

Which objects are relevant? In its default operation mode *pyheapdump* uses the frame stacks of all threads as the starting point for pickling. Following the usual rules for pickling, the dump includes all local variables, all objects reachable from a local variable, and so on. That is usually enough for a successful defect analysis.
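The following is a minimal, standard-library-only sketch of that dump-and-analyse principle. It is not pyheapdump's actual API; the function names and the toy failure are invented for illustration, and a real heap dump keeps far more state (sPickle serialises or substitutes the objects that plain pickle rejects).

```python
# Conceptual sketch only: pyheapdump/sPickle handle far more object types
# and hand the recreated state to a real debugger; this just shows the
# dump principle. Function names here are illustrative, not pyheapdump's.
import pickle
import pprint
import sys
import traceback


def save_dump(filename):
    """Serialise the locals of every frame of the current exception."""
    _, exc, tb = sys.exc_info()
    dump = {"exception": repr(exc),
            "traceback": traceback.format_exc(),
            "frames": []}
    while tb is not None:
        frame = tb.tb_frame
        picklable_locals = {}
        for name, value in frame.f_locals.items():
            try:
                pickle.dumps(value)          # keep only what pickles cleanly
                picklable_locals[name] = value
            except Exception:
                picklable_locals[name] = "<unpicklable: %r>" % type(value)
        dump["frames"].append({"function": frame.f_code.co_name,
                               "locals": picklable_locals})
        tb = tb.tb_next
    with open(filename, "wb") as f:
        pickle.dump(dump, f)


def inspect_dump(filename):
    """Load a dump (possibly on another machine) and inspect it."""
    with open(filename, "rb") as f:
        dump = pickle.load(f)
    print(dump["traceback"])
    pprint.pprint(dump["frames"])


if __name__ == "__main__":
    try:
        broken = {"answer": 42}
        broken["missing"]                    # raises KeyError
    except KeyError:
        save_dump("crash.dump")
    inspect_dump("crash.dump")
```

Because the dump is an ordinary file, the analysis can happen later and on a different computer, which is exactly the workflow pyheapdump supports, only with a real debugger instead of a pretty-printer.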
Compared with other Python post-mortem debugging methods, *pyheapdump* has several advantages:

* It is a pure Python solution and independent of the operating system.
* Creation of the heap dump and the fault analysis can be performed on different computers.
* It is not intrusive. It does not modify, monkey-patch or disturb the dumped process in any way, with the exception of loading additional modules.
* If used with the Pydev debugger, it supports multi-threaded applications.
* If used with the Pydev debugger and Stackless Python, it supports tasklets.

The implementation of *pyheapdump* is fairly small, because it draws most of its functionality from the underlying sPickle package and from the new Stackless support [8] of the Pydev debugger. Therefore it is - despite its short history - already a useful piece of software.

Outline of the talk
---

1. Introduction to the problem
2. Previous work
3. The concept of *pyheapdump*
4. Live demonstration
5. Open problems and further development
6. Questions and answers

References
---

1. Andraz Tori, Python, 2011-01-16: *gdb and a very large core dump*, blog at <http://www.zemanta.com/blog/python-gdb-large-core-dump/>
2. David Pacheco, ACM Queue - Programming Languages, Volume 9, Issue 10, October 2011: *Postmortem Debugging in Dynamic Environments*, PDF at <http://dl.acm.org/ft_gateway.cfm?id=2039361&ftid=1050739&dwn=1&CFID=290171300&CFTOKEN=95099236>
3. Chris Bailey, Andrew Johnson, Kevin Grigorenko, IBM developerWorks, 2011-03-15: *Debugging from dumps - Diagnose more than memory leaks with Memory Analyzer*, PDF at <http://www.ibm.com/developerworks/library/j-memoryanalyzer/j-memoryanalyzer-pdf.pdf>
4. Brian Curtin, 2011-09-29: *minidumper - Python crash dumps on Windows*, blog at <http://blog.briancurtin.com/posts/20110929minidumper-python-crash-dumps-on-windows.html>
5. David Malcolm, Fedora Feature, 2010-04-06: *Easier Python Debugging* at <http://fedoraproject.org/wiki/Features/EasierPythonDebugging>
6. Eli Finer, GitHub project, 2012: *pydump* at <https://github.com/gooli/pydump>
7. Anselm Kruis, EuroPython 2011: *Advanced Pickling with Stackless Python and sPickle*, archived talk at <https://ep2013.europython.eu/conference/talks/advanced-pickling-with-stackless-python-and-spickle>
8. Fabio Zadrozny, 2013-12-12: *PyDev 3.1.0 released*, blog at <http://pydev.blogspot.de/2013/12/pydev-310-released.html>

Adriana Vasiu - Cutting-edge APIs using hypermedia at BSkyB

Adriana Vasiu - Cutting-edge APIs using hypermedia at BSkyB [EuroPython 2014] [22 July 2014]

In this talk I will explain what a hypermedia-enabled API means, give an example of such an API, and take you through the implementation details and the usage of Flask, dougrain and HAL in this context. I will also present a brief comparison with an API that is not hypermedia-enabled and walk through the advantages of the hypermedia approach.

-----

In the technology community there is currently a lot of talk about hypermedia-enabled APIs and the web as an architecture model. More and more applications are adopting a loosely coupled, distributed, web-like architecture by using hypermedia as the engine of application state. At Sky we are successfully implementing this approach for some of our components, and we have learnt that its major benefit for us is the scalability it offers: as an expanding business with a constantly growing product portfolio, scalability of all our systems is crucial.

In this talk I will share some of the things we learnt. I will explain what a hypermedia-enabled API means, give an example of such an API, and take you through the implementation details and the usage of Flask, dougrain and HAL in this context. I will also present a brief comparison with an API that is not hypermedia-enabled and walk through the advantages of the hypermedia approach.
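A minimal sketch of what a HAL-style, hypermedia-enabled response can look like in Flask. The HAL document is built by hand here (the talk uses dougrain for this), and the "order" resource with its fields is invented purely for illustration:

```python
# Sketch of a hypermedia-enabled (HAL) endpoint in Flask.
# The "order" resource and its fields are made up for illustration.
from flask import Flask, jsonify, url_for

app = Flask(__name__)


@app.route("/orders/<int:order_id>")
def get_order(order_id):
    # A HAL document embeds "_links" so clients follow state transitions
    # from the response instead of hard-coding URLs.
    return jsonify({
        "id": order_id,
        "status": "pending",
        "_links": {
            "self": {"href": url_for("get_order", order_id=order_id)},
            "payment": {"href": url_for("pay_order", order_id=order_id)},
        },
    })


@app.route("/orders/<int:order_id>/payment", methods=["POST"])
def pay_order(order_id):
    return jsonify({
        "status": "paid",
        "_links": {"order": {"href": url_for("get_order", order_id=order_id)}},
    })


if __name__ == "__main__":
    app.run(debug=True)
```

A client that understands HAL discovers the payment URL from `_links` rather than constructing it itself, which is what keeps client and server loosely coupled.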

Peter Hoffmann - log everything with logstash and elasticsearch

Peter Hoffmann - log everything with logstash and elasticsearch [EuroPython 2014] [22 July 2014]

When your application grows beyond one machine you need a central place to log, monitor and analyze what is going on. Logstash and Elasticsearch let you store your logs in a structured way. Kibana is a web frontend for searching and aggregating your logs.

-----

The talk will give an overview of how to add centralized, structured logging to a Python application running on multiple servers. It will focus on useful patterns and show the benefits of structured logging.
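A minimal sketch of structured logging using only the standard library: log records are emitted as JSON documents that a Logstash shipper can forward and Elasticsearch can index. The formatter, field names and host value below are illustrative assumptions, not taken from the talk:

```python
# Emit log records as JSON lines that Logstash can ingest and
# Elasticsearch can index as structured documents.
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    def format(self, record):
        doc = {
            "@timestamp": time.strftime("%Y-%m-%dT%H:%M:%S",
                                        time.gmtime(record.created)),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "host": "web-01",            # illustrative static field
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in ("user_id", "request_id", "duration_ms"):
            if hasattr(record, key):
                doc[key] = getattr(record, key)
        return json.dumps(doc)


handler = logging.StreamHandler()        # a file or socket handler in practice
handler.setFormatter(JsonFormatter())
log = logging.getLogger("shop.checkout")
log.setLevel(logging.INFO)
log.addHandler(handler)

# Structured fields instead of values baked into the message string:
log.info("order placed", extra={"user_id": 42, "duration_ms": 17.3})
```

Keeping the interesting values as separate fields, instead of formatting them into the message text, is what makes the logs searchable and aggregatable in Kibana.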

Francesc Alted - Out-of-Core Columnar Datasets

Francesc Alted - Out-of-Core Columnar Datasets [EuroPython 2014] [25 July 2014]

Tables are a very handy data structure for storing datasets and performing data analysis (filters, groupings, sortings, alignments...). But it turns out that how the tables are actually implemented has a large impact on how they perform. Learn what you can expect from the current tabular offerings in the Python ecosystem.

-----

It is a fact: we have just entered the Big Data era. More sensors and more computers, distributed more evenly throughout space and time than ever, are forcing data analysts to navigate through oceans of data before getting insights into what this data means.

Tables are a very handy and widely used data structure for storing datasets and performing data analysis (filters, groupings, sortings, alignments...). However, the actual table implementation - especially whether data in tables is stored row-wise or column-wise, whether the data is chunked or sequential, and whether the data is compressed or not, among other factors - can make a lot of difference depending on the analytic operations to be performed.

My talk will provide an overview of different libraries/systems in the Python ecosystem that are designed to cope with tabular data, and how the different implementations perform for different operations. The libraries and systems discussed are designed to operate either with on-disk data ([PyTables] [1], [relational databases] [2], [BLZ] [3], [Blaze] [4]...) or with in-memory data containers ([NumPy] [5], [DyND] [6], [Pandas] [7], [BLZ] [3], [Blaze] [4]...). Special emphasis will be put on the on-disk (also called out-of-core) databases, which are the most commonly used ones for handling extremely large tables.

The hope is that, after this lecture, the audience will have a better insight and a more informed opinion on the different solutions for handling tabular data in the Python world, and especially on which ones adapt better to their needs.

[1]: http://www.pytables.org
[2]: http://en.wikipedia.org/wiki/Relational_database
[3]: http://blz.pydata.org
[4]: http://blaze.pydata.org
[5]: http://www.numpy.org/
[6]: https://github.com/ContinuumIO/dynd-python
[7]: http://pandas.pydata.org/
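As a taste of the on-disk approach, here is a small sketch of an out-of-core table with PyTables: the table is chunked and compressed on disk, and an in-kernel query filters it without loading the whole dataset into memory. The column names, sizes and values are invented for illustration:

```python
# Sketch: create a chunked, compressed on-disk table with PyTables and
# query it out-of-core. Column names and values are illustrative only.
import tables


class Reading(tables.IsDescription):
    sensor = tables.StringCol(16)
    value = tables.Float64Col()
    timestamp = tables.Int64Col()


with tables.open_file("readings.h5", mode="w") as h5:
    filters = tables.Filters(complevel=5, complib="blosc")  # compressed chunks
    table = h5.create_table("/", "readings", Reading, filters=filters)

    row = table.row
    for i in range(100000):
        row["sensor"] = b"s%03d" % (i % 100)
        row["value"] = i * 0.001
        row["timestamp"] = i
        row.append()
    table.flush()

    # In-kernel query: evaluated chunk by chunk on disk, not in memory.
    hot = [r["timestamp"] for r in table.where("value > 99.0")]
    print(len(hot), "rows matched")
```

Whether such a filter touches whole rows or only the queried column, and whether the chunks decompress quickly, is exactly the kind of implementation detail the talk compares across libraries.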

Schlomo Schapiro - DevOps Risk Mitigation: Test Driven Infrastructure

Schlomo Schapiro - DevOps Risk Mitigation: Test Driven Infrastructure [EuroPython 2014] [23 July 2014]

The (perceived) risk of DevOps is that too many people get the right to "break" the platform. Test Driven Infrastructure is about adapting proven ideas from our developer colleagues to the development and operation of infrastructure services like virtualization, OS provisioning, Postfix configuration, httpd configuration, ssh tuning, SAN LUN mounting and others. This talk shows how ImmobilienScout24 uses more and more test-driven development in IT operations to increase quality and to mitigate the risk of opening up infrastructure development to all developers.

-----

Common wisdom has it that the test effort should be related to the risk of a change. The reality, however, is different: developers build elaborate automated test chains to test every single commit of their application, while admins regularly "test" changes on the live platform in production. But which change carries the higher risk of taking the live platform down?

What about the software that runs at the "lower levels" of your platform, e.g. systems automation, provisioning, proxy configuration, mail server configuration, database systems etc.? An outage of any of those systems can have a financial impact that is as severe as a bug in the "main" software!

One of the biggest lessons any Ops person can learn from a Dev person is Test Driven Development. Easy to say, difficult to apply - that is my personal experience with the TDD challenge. This talk throws some light on recent developments at ImmobilienScout24 that help us develop the core of our infrastructure services with a test-driven approach:

* How to do unit tests, integration tests and systems tests for infrastructure services?
* How to automatically verify proxy, DNS and Postfix configurations before deploying them on live servers? (A small sketch of such a check follows below.)
* How to test "dangerous" services like our PXE boot environment or the automated SAN mounting scripts?
* How to add a little bit of test coverage to everything we do.
* Test driven: first write a failing test and then the code that fixes it.

The tools that we use are Bash, Python, unit test frameworks and TeamCity for build and test automation. See http://blog.schlomo.schapiro.org/2013/12/test-driven-infrastructure.html for more about this topic.
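To give a flavour of such infrastructure tests, here is a minimal sketch of a system test that checks DNS resolution and a proxy port before a change is rolled out to live servers. The hostnames, port and expected address range are invented examples; the actual ImmobilienScout24 test suite is not shown in this abstract:

```python
# Illustrative infrastructure tests: fail the build pipeline before a
# broken DNS or proxy configuration reaches the live servers.
# Hostnames, ports and expectations below are invented examples.
import socket
import unittest


class DnsTest(unittest.TestCase):
    def test_internal_hostname_resolves(self):
        ip = socket.gethostbyname("repo.internal.example.com")
        self.assertTrue(ip.startswith("10."), "expected an internal address")


class ProxyTest(unittest.TestCase):
    def test_proxy_port_is_reachable(self):
        sock = socket.create_connection(("proxy.internal.example.com", 3128),
                                        timeout=2)
        sock.close()


if __name__ == "__main__":
    unittest.main()
```

Run from a CI server such as TeamCity, tests like these turn "we will notice in production" into a failing build.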

ssc - Event discrete simulation with SimPy

ssc - Event discrete simulation with SimPy [EuroPython 2014] [25 July 2014]

Often, experiments with real-world systems are high-risk, accompanied by high costs, or not even possible at all. That's when simulations come into play. This talk will give a brief introduction to the topic of simulation. By means of simple examples, it will demonstrate how you can use SimPy to implement discrete-event simulations and which features SimPy offers to help you do that.

-----

Simulation is important for the analysis of complex systems and of the impact of certain actions on those systems. Simulations are especially useful if the actions are potentially harmful or expensive. Simulation is used in various scientific and economic areas, e.g. for the modelling and study of biological or physical systems, for resource scheduling and optimization, or in research on the integration of renewable energies into the power grid (my personal background). The simulated time can thereby be treated as either continuous or discrete (discrete time or discrete event).

In this talk, I want to show why Python is a good choice for implementing simulation models and how SimPy can help here. Structure of the talk (20 min talking + 5 min discussion + 5 min buffer):

- Why simulation? (5 min)
- History of SimPy (3 min)
- How does SimPy work? (9 min)
- Conclusion (3 min)

In the introduction, I'll briefly explain what simulation is and motivate why it is a useful tool. The main part will consist of an introduction to and demonstration of SimPy. Since SimPy is now more than ten years old, I'll first give a quick overview of its history and development. Afterwards, I'll explain SimPy's concepts and features by means of simple examples. In the conclusion, I'll give a short outlook on the future development of SimPy.

The main goal of this talk is to create awareness that simulation is a powerful tool in a lot of domains and to give the audience enough information to ease their first steps.
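A tiny discrete-event example in the spirit of the talk, written against the SimPy 3 API. The charging-station scenario and all timings are made up for illustration:

```python
# A minimal SimPy 3 model: cars arrive at a charging station with two
# charging points and occupy one for a fixed time. Times are invented.
import simpy


def car(env, name, station):
    print("%5.1f  %s arrives" % (env.now, name))
    with station.request() as slot:      # queue for a free charging point
        yield slot
        print("%5.1f  %s starts charging" % (env.now, name))
        yield env.timeout(30)            # charging takes 30 time units
        print("%5.1f  %s leaves" % (env.now, name))


def arrivals(env, station):
    for i in range(4):
        env.process(car(env, "car %d" % i, station))
        yield env.timeout(10)            # a new car every 10 time units


env = simpy.Environment()
station = simpy.Resource(env, capacity=2)
env.process(arrivals(env, station))
env.run(until=120)
```

Processes are plain Python generators and the clock only jumps between events, which is what makes this style of modelling both compact and fast.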

Matt - Full Stack Python

Matt - Full Stack Python [EuroPython 2014] [22 July 2014]

There has been a lot of noise about being a "full stack developer" recently. What does the full web stack look like for Python, and how do you go about learning each piece? This talk will guide you up the layers, from the server that handles the web request through to the JavaScript that executes in a user's browser.

-----

This talk distills information from the open source guide [Full Stack Python](http://www.fullstackpython.com/) I wrote into a 30-minute talk on web stack layers. An approximate timeline for this talk:

* 5 min: intro story
* 5 min: what web developers need to know about virtual servers, web servers, and WSGI servers
* 5 min: what do web frameworks provide?
* 5 min: what are the most important parts of your web application to analyze and monitor?
* 5 min: static files and execution in the user's browser
* 5 min: concluding story and resources to learn more

This is a high-level overview intended for developers who are new to Python web development and need to understand what the web stack layers are and how they fit together.
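For the WSGI layer mentioned in the timeline above, here is a minimal standard-library sketch of the interface that sits between the web server and a Python framework; the greeting and port are arbitrary:

```python
# The WSGI contract in its smallest form: a callable that receives the
# request environment and a start_response function, and returns the body.
from wsgiref.simple_server import make_server


def application(environ, start_response):
    body = "Hello from %s" % environ.get("PATH_INFO", "/")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body.encode("utf-8")]


if __name__ == "__main__":
    # wsgiref stands in for a production WSGI server such as Gunicorn or uWSGI.
    server = make_server("", 8000, application)
    print("Serving on http://localhost:8000 ...")
    server.serve_forever()
```

Every Python web framework ultimately exposes a callable like this, which is why you can swap WSGI servers without touching application code.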

Stefanie Lück - RISCy Business: Development of a RNAi design and off-target prediction software

Stefanie Lück - RISCy Business: Development of a RNAi design and off-target prediction software [EuroPython 2014] [24 July 2014]

RNA interference (RNAi) is a biological mechanism for the targeted inhibition of gene expression. It has also been used routinely to discover genes involved in the interaction of plants with pathogenic fungi. To minimize the mis-targeting of unrelated genes and to maximize the RNAi efficiency, we have developed a PyQt-based cross-platform software tool called "si-Fi". One aim of this talk is to show that hobby programmers, too, can use Python in a very useful way.

Tom Christie - Documenting your project with MkDocs.

Tom Christie - Documenting your project with MkDocs. [EuroPython 2014] [22 July 2014]

MkDocs is a new tool for creating documentation from Markdown. The talk will cover: how to write, theme and publish your documentation; the background and motivation for MkDocs; and choosing between MkDocs and Sphinx.

-----

This talk will be a practical introduction to MkDocs, a new tool for creating documentation from Markdown:

* The background behind MkDocs and the motivation for creating a new documentation tool.
* Comparing it against Sphinx - what benefits each tool provides.
* Getting started with MkDocs - how to write, theme and publish your documentation.
* Under the covers - how MkDocs works, and some asides on a couple of the neat Python libraries that it uses.
