Talk (Data - Day 1) - Fullstack datascientist v.2021

Abstract: What are the essential software engineering skills a data scientist needs to successfully bring their own work to production? We are Sergei Beilin, Ph.D., software engineering consultant in AI/ML, and his wife Natalia Beylina, Ph.D., data scientist. Drawing on our own experience, we will go through the most important things a modern data scientist needs to know about software engineering, from both the software engineer's and the data scientist's point of view.

We will discuss:
* programming language(s): how much of the language should one know?
* execution models, orchestration, containerization: Kubernetes, Kubeflow, Airflow, Spark/Databricks, etc.
* storage, network protocols/APIs, file formats: from CSV to Delta, from JSON to Avro
* modern systems architecture concepts to understand
* how the whole system architecture and infrastructure landscape dictate the way you deploy and run your work
* tools and DevOps practices
* processes: integrating the data scientists' workflow into a typical agile process
* bad practices to avoid: a few examples we've seen ourselves

For more details: https://pretalx.com/pycon-sweden-2021/talk/KR99KF/

Speakers: Sergei Beilin, Natalia Beylina