List of videos

Elisabetta Bergamini - Bad hotel again? Find your perfect match!

Bad hotel again? Find your perfect match! [EuroPython 2018 - Talk - 2018-07-25 - PyCharm [PyData]] [Edinburgh, UK] By Elisabetta Bergamini For most travellers, online reviews play a major role when it comes to choosing which hotel to stay in. But can we actually trust a hotel review? And if yes, how can we select the most meaningful and interesting ones for us among the billions available on platforms such as Booking.com, Tripadvisor, Facebook (just to mention a few)? For 10 years now, at TrustYou we have built processes that analyze terabytes of hotel reviews at a global scale, and strive to understand what people complain about or like in hotels worldwide. Dealing with a huge amount of reviews written in tens of different languages, each having its own subtle shades of meaning, is the challenge we work on every day. In this talk, we will show what goes on behind the scenes of the TrustYou Metareview and dive into the technologies and the algorithms that allow us to provide travellers with all the information they need to find the perfect hotel. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2018.europython.eu/en/speaker-release-agreement/


Cheuk Ting Ho - Fuzzy Matching - Smart Way of Finding Similar Names Using Fuzzywuzzy

Fuzzy Matching - Smart Way of Finding Similar Names Using Fuzzywuzzy [EuroPython 2018 - Talk - 2018-07-25 - PyCharm [PyData]] [Edinburgh, UK] By Cheuk Ting Ho Matching strings must be one of the first natural language processing problems that humans encountered once we started using computers to handle data. Unlike numerical values, which can be compared with exact logic, it is very hard for a computer to say how alike two strings are. One may compare them character by character and get an idea of how many characters in the pair of strings are the same. Unfortunately, in most applications we need the computer to perceive strings as we do, and therefore we have to use fuzzy matching. Fuzzy matching on names is never straightforward, though: the definition of how “different” two names are really depends on the case. For example, with restaurant names, matches on words like “cafe”, “bar” and “restaurant” are considered less valuable than matches on other, less common words. Also, do we consider company names that match partly (like “Happy Unicorn company” and “Happy Unicorn co.”) to be the same? In the first half of the talk, Levenshtein distance, a measure of the similarity between two strings, will be explained. Different functions in Fuzzywuzzy like “partial_ratio” and “token_sort_ratio” will also be explored and compared. It is very important to understand our tools and choose the right one for our task. Then, in the second half, we will start tackling the example problem: matching company names. We will show that besides using Fuzzywuzzy, we also have to handle problems like finding and avoiding matches on common words, and speeding up the matching process by grouping the names. Combining all the tricks and techniques that we demonstrate, we will also evaluate how efficient this method is and what its advantages are.
This talk is for people at all levels of Python experience who would like to learn a trick or two and be able to solve similar problems in the future. The theory of how the library works will be explained, and it is easy to pick up even for beginners. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2018.europython.eu/en/speaker-release-agreement/
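
To make the idea of Levenshtein distance and token-sorted comparison concrete, here is a minimal pure-Python sketch. It is not taken from the talk and is not Fuzzywuzzy's implementation; the function names `levenshtein`, `similarity` and `token_sort_similarity` are hypothetical, and the 0..1 normalisation is an assumption chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the distance to a 0..1 score; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def token_sort_similarity(a: str, b: str) -> float:
    """Lower-case and sort the words first, so that word order is ignored."""
    def normalise(text: str) -> str:
        return " ".join(sorted(text.lower().split()))
    return similarity(normalise(a), normalise(b))


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(similarity("Happy Unicorn company", "Happy Unicorn co."))
    print(token_sort_similarity("Unicorn Happy", "happy unicorn"))
```

Sorting tokens before comparing is one simple way to stop reordered names from being penalised; down-weighting common words such as “cafe” or “co.” would need an extra step that this sketch does not attempt.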


Anmol Krishan Sachdeva - Understanding and Implementing Recurrent Neural Networks using Python

Understanding and Implementing Recurrent Neural Networks using Python [EuroPython 2018 - Talk - 2018-07-25 - PyCharm [PyData]] [Edinburgh, UK] By Anmol Krishan Sachdeva Recurrent Neural Networks (RNNs) have become famous over time due to their property of retaining internal memory. These neural nets are widely used for recognizing patterns in sequences of data, such as numerical time series data, images, handwritten text, spoken words, genome sequences, and much more. Since these nets possess memory, there is a certain analogy that we can make to the human brain in order to learn how RNNs work. RNNs can be thought of as networks of neurons with feedback connections, unlike the feedforward connections which exist in other types of Artificial Neural Networks. The flow of the talk will be as follows:
- Self Introduction
- Introduction to Deep Learning
- Artificial Neural Networks (ANNs)
- Diving DEEP into Recurrent Neural Networks (RNNs)
- Comparing Feedforward Networks with Feedback Networks
- Quick walkthrough: Implementing RNNs using Python (Keras)
- Understanding Backpropagation Through Time (BPTT) and Vanishing Gradient Problem
- Towards more sophisticated RNNs: Gated Recurrent Units (GRUs)/Long Short-Term Memory (LSTMs)
- End of talk
- Questions and Answers Session
License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2018.europython.eu/en/speaker-release-agreement/
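
The feedback connection mentioned in the abstract can be illustrated with the standard vanilla RNN recurrence, h_t = tanh(W_x x_t + W_h h_(t-1) + b). The sketch below is a pure-Python illustration under that assumption; it is not the Keras code from the talk, and the function name `rnn_forward` and its parameter names are hypothetical.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0):
    """Run a vanilla RNN over a sequence and return every hidden state.

    inputs: list of input vectors (each a list of floats)
    w_x:    hidden_size x input_size weight matrix (list of rows)
    w_h:    hidden_size x hidden_size recurrent weight matrix
    b:      bias vector of length hidden_size
    h0:     initial hidden state of length hidden_size
    """
    h = list(h0)
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state;
        # that dependence on h is the feedback connection.
        h = [
            math.tanh(
                sum(w_x[i][k] * x[k] for k in range(len(x)))
                + sum(w_h[i][j] * h[j] for j in range(len(h)))
                + b[i]
            )
            for i in range(len(h))
        ]
        states.append(h)
    return states


if __name__ == "__main__":
    # One hidden unit that ignores its input and only feeds back on itself.
    for state in rnn_forward([[0.0], [0.0]], [[0.0]], [[1.0]], [0.0], [0.5]):
        print(state)
```

Because each state passes through tanh again at every step, repeated application shrinks the signal, which hints at why gradients can vanish over long sequences.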


Raniere Silva, Tania Sanchez Monroy - A Jupyter Enhancement Proposal Story

A Jupyter Enhancement Proposal Story [EuroPython 2018 - Talk - 2018-07-25 - Fintry [PyData]] [Edinburgh, UK] By Raniere Silva, Tania Sanchez Monroy Python users should be familiar with the concept of Python Enhancement Proposals (PEPs), the way that the Python language evolves over time. In a similar fashion, the Jupyter project has Jupyter Enhancement Proposals (JEPs). This talk will cover the proposers' first-hand experience of submitting JEP 23 - Add Template as Metadata, from its beginning during EuroPython 2017 up to its current status. We will, in addition, present efforts made as part of the OpenDreamKit project to perform Jupyter notebook conversions programmatically using custom metadata, templates, and exporters. Outline:
- 0:00 - 0:05 Who are we? We are impostors!
- 0:05 - 0:10 Our previous experience with Jupyter Notebook. We will talk about the time that Software Carpentry used Jupyter Notebook for their lesson creation, and about OpenDreamKit's programmatic conversion of Jupyter notebooks.
- 0:10 - 0:15 You are not alone. We will talk about how the idea for the Jupyter Enhancement Proposal was born at the EuroPython 2017 Help Desk.
- 0:15 - 0:20 Writing our first Jupyter Enhancement Proposal. We will cover our steps to create the pull request required by the Jupyter Project.
- 0:20 - 0:25 What is the current status of the Jupyter Enhancement Proposal? We will cover any progress between the submission of this talk proposal and the date of its presentation.
- 0:25 - 0:30 Time for questions
License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2018.europython.eu/en/speaker-release-agreement/


Peter Hoffmann - Using Pandas and Dask to work with large columnar datasets in Apache Parquet

Using Pandas and Dask to work with large columnar datasets in Apache Parquet [EuroPython 2018 - Talk - 2018-07-25 - Fintry [PyData]] [Edinburgh, UK] By Peter Hoffmann Apache Parquet Data Format: Apache Parquet is a binary, efficient columnar data format. It uses various techniques to store data in a CPU- and I/O-efficient way, such as row groups, compression for pages in column chunks, or dictionary encoding for columns. Index hints and statistics that make it possible to quickly skip over chunks of irrelevant data enable efficient queries on large amounts of data. Apache Parquet with Pandas & Dask: Apache Parquet files can be read into Pandas DataFrames with the two libraries fastparquet and Apache Arrow. While Pandas is mostly used to work with data that fits into memory, Dask allows us to work with data larger than memory and even larger than local disk space. Data can be split up into partitions and stored in cloud object storage systems like Amazon S3 or Azure Storage. Using metadata from the partition filenames, Parquet column statistics and dictionary filtering allows faster performance for selective queries without reading all data. This talk will show how to use partitioning, row group skipping and general data layout to speed up queries on large amounts of data. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2018.europython.eu/en/speaker-release-agreement/
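
The row group skipping described in the abstract can be sketched without any Parquet library. The example below is an in-memory stand-in, not the Parquet file format or the API of fastparquet, Apache Arrow, Pandas or Dask; `RowGroup`, `make_row_group` and `scan_between` are hypothetical names. It assumes each row group stores the minimum and maximum of one column, which is the kind of statistic a reader can check before deciding whether to read the group at all.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """A chunk of column values plus the min/max statistics stored alongside it."""
    min_value: int
    max_value: int
    values: List[int]


def make_row_group(values: List[int]) -> RowGroup:
    """Compute the statistics once, at write time."""
    return RowGroup(min(values), max(values), list(values))


def scan_between(row_groups: List[RowGroup], low: int, high: int) -> Tuple[List[int], int]:
    """Return values in [low, high] and the number of row groups actually read."""
    matches: List[int] = []
    groups_read = 0
    for group in row_groups:
        # If the group's value range does not overlap [low, high],
        # the statistics prove that no value in it can match: skip it.
        if group.max_value < low or group.min_value > high:
            continue
        groups_read += 1
        matches.extend(v for v in group.values if low <= v <= high)
    return matches, groups_read


if __name__ == "__main__":
    groups = [make_row_group(list(range(start, start + 10))) for start in (1, 11, 21)]
    print(scan_between(groups, 12, 14))
```

Skipping works best when the data is sorted or partitioned on the filtered column, because the min/max ranges of the groups then overlap as little as possible.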


Michele Simionato - Python in scientific computing: what works and what doesn't

Python in scientific computing: what works and what doesn't [EuroPython 2018 - Talk - 2018-07-25 - Fintry [PyData]] [Edinburgh, UK] By Michele Simionato There is no want of technologies for doing scientific calculations in Python. In this talk I will share some hard-learned knowledge about what works and what doesn't with the libraries we are using at GEM (the Global Earthquake Model foundation). I will show how the following libraries fare with respect to our main concerns of performance, simplicity, reliability and portability and I will talk about several library bugs we found and had to work around. I will also talk about some libraries that we do not use (such as cython, numba, dask, pytables, ...) and the reason why we do not use them. Hopefully this will be useful to people using or planning to use a similar software stack. My slides are here: https://gitpitch.com/micheles/papers/europython2018 License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2018.europython.eu/en/speaker-release-agreement/


Marco Buttu - White Mars: living far away from any form of life

White Mars: living far away from any form of life [EuroPython 2018 - Keynote - 2018-07-25 - Smarkets] [Edinburgh, UK] By Marco Buttu Concordia Station is a French/Italian facility located in Antarctica, on a plateau called Dome-C, in the middle of nowhere. A dark and cold place: no Sun from May to August, temperatures around -80 degrees Celsius, no life. Here I am living and performing scientific research with 12 other colleagues from Italy, France and Austria. We are the most isolated people on Earth, more so than the astronauts on the International Space Station. There is no way to leave Concordia until November, and no one can come. It is like living on another planet, and that is why the European Space Agency is interested in conducting biomedical research on us, in order to better understand how the human body behaves in such an extraterrestrial environment. We will introduce our studies, describe this place and our life here, and of course also speak about Python. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2018.europython.eu/en/speaker-release-agreement/


EuroPython 2018 - Lightning talks on Wednesday, July 25

Lightning talks [EuroPython 2018 - - 2018-07-25 - Smarkets] [Edinburgh, UK] License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2018.europython.eu/en/speaker-release-agreement/


Nicole Harris - PyPI: Past, Present and Future

PyPI: Past, Present and Future [EuroPython 2018 - Talk - 2018-07-26 - Smarkets] [Edinburgh, UK] By Nicole Harris The Python Package Index (PyPI) is the principal repository of software packages for the Python programming language. In May 2018, PyPI served 12.3 billion HTTP requests, with 1.4 million people visiting pypi.org via their web browser. The Python community depends on PyPI for the ongoing functioning of the entire Python ecosystem. License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/ Please see our speaker release agreement for details: https://ep2018.europython.eu/en/speaker-release-agreement/
