PySyft: Data Science on data you are not allowed to see — Valerio Maggio
[EuroPython 2024 - North Hall on 2024-07-11]

PySyft: Data Science on data you are not allowed to see by Valerio Maggio

https://ep2024.europython.eu/session/pysyft-data-science-on-data-you-are-not-allowed-to-see

In today's data-driven world, privacy is an essential requirement for the ethical and effective practice of data science. Moreover, implementing robust privacy guarantees in data analysis not only protects sensitive information, but also unlocks the potential for unprecedented democratisation of models and datasets.

PySyft (https://github.com/OpenMined/PySyft) is a stack of open source tools designed to help organisations collaborate securely with external (untrusted) individuals. By using PySyft, organisations can enable external auditors (e.g. data scientists) to use their assets, such as datasets or models, to conduct studies with a specific, known purpose. Data scientists can run their analyses on those assets through PySyft without seeing or obtaining a copy of the assets themselves. We call this process _Remote Data Science_. PySyft is a framework for Remote Data Science.

In the first part of my talk I will introduce the problem of privacy in Data Science, PETs (Privacy Enhancing Technologies), and OpenMined's mission to democratise access to data and information. Afterwards, I will demonstrate how PySyft works, and how it can be used to run a machine learning experiment with privacy guarantees.

---

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License: https://creativecommons.org/licenses/by-nc-sa/4.0/