Talk (Data - Day 1) - Architecture for the extraction, automation and massive data processing

Abstract: This talk presents a solution whose architecture integrates computational resources, databases, our own Python applications, and other open source tools. It shows the problems and challenges posed by traditional scraping and how we built solutions that reduce them, especially when the goal is to scrape at scale and in parallel. It also covers building an automated flow for post-processing and transforming the data with machine learning services such as NLP and classification.

For more details: https://pretalx.com/pycon-sweden-2021/talk/EGMFSZ/

Speaker: Alfonso de la Guarda
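
As a rough illustration of the flow the abstract describes (parallel scraping followed by automated post-processing), here is a minimal sketch using only the Python standard library. It is not the speaker's architecture: the names `fetch`, `classify`, and `scrape_all` are hypothetical, and the keyword check in `classify` is only a placeholder for a real NLP or classification service.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def classify(text):
    """Hypothetical stand-in for an NLP/classification service."""
    return "python" if "python" in text.lower() else "other"


def scrape_all(urls, fetcher=fetch, max_workers=8):
    """Fetch all URLs in parallel threads, then label each page.

    A failure on one URL is recorded in the result instead of
    aborting the whole batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every download up front so they run concurrently.
        futures = {url: pool.submit(fetcher, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = classify(future.result())
            except Exception as exc:
                results[url] = f"error: {exc}"
    return results
```

Passing the fetcher in as a parameter keeps the download step replaceable, for example by a headless browser or a cached store, without touching the post-processing step.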