Nabanita Roy - Leveraging Linked Data using Python and SPARQL
Leveraging Linked Data using Python and SPARQL
[EuroPython 2021 - Talk - 2021-07-28 - Parrot [Data Science]] [Online]
By Nabanita Roy

Wikipedia is the digital encyclopedia that we use daily to look up facts and information. What could be better than being able to extract Wikipedia's wealth of crowd-sourced knowledge without using traditional web scrapers? Various community-driven projects extract knowledge from Wikipedia and store it in structured form, retrievable using SPARQL. This data can be mined for a range of Data Science projects. In this talk, I will walk through the basics of the Open Web and how to use Python to query this huge open database.

The agenda includes the following:
• Why Wikipedia?
• Introduction to DBpedia and Wikidata
• Introduction to Linked Data
• How to query DBpedia/Wikidata
  o Build a SPARQL query
  o Use Python's SPARQLWrapper
• Python code walkthrough to create
  o A tabular dataset using SPARQL
  o A corpus for language models using Wikipedia and BeautifulSoup
  o A use case leveraging both SPARQLWrapper and Wikipedia to create a domain-specific corpus

Prerequisites: Basic knowledge of Python programming, Natural Language Processing, and SQL

License: This video is licensed under the CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/
Please see our speaker release agreement for details: https://ep2021.europython.eu/events/speaker-release-agreement/
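
For readers who want a feel for the "How to query DBpedia/Wikidata" agenda item before watching, here is a minimal sketch of building a SPARQL query and turning its results into tabular rows. It is not taken from the talk. The function names (build_query, bindings_to_rows, run_query) are hypothetical, the class dbo:ProgrammingLanguage is chosen purely for illustration, and the sketch assumes the public DBpedia endpoint at https://dbpedia.org/sparql and an installed SPARQLWrapper package (pip install sparqlwrapper).

```python
def build_query(dbpedia_class, limit=10):
    """Build a SPARQL query listing resources of a DBpedia ontology class
    together with their English labels. (Hypothetical helper.)"""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?item ?label WHERE {{
  ?item a dbo:{dbpedia_class} ;
        rdfs:label ?label .
  FILTER (lang(?label) = "en")
}}
LIMIT {limit}
""".strip()


def bindings_to_rows(results):
    """Flatten a SPARQL JSON result document into a list of dicts,
    one dict per result row, keyed by variable name.
    Variables left unbound in a row map to None."""
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append({v: binding.get(v, {}).get("value") for v in variables})
    return rows


def run_query(query, endpoint="https://dbpedia.org/sparql"):
    """Send the query to a SPARQL endpoint and return the parsed JSON.
    Needs network access; the endpoint URL is an assumption."""
    # Imported lazily so the helpers above work without the package installed.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()


# Example usage (requires network access and SPARQLWrapper):
#   rows = bindings_to_rows(run_query(build_query("ProgrammingLanguage", limit=5)))
#   for row in rows:
#       print(row["item"], row["label"])
```

The resulting list of dicts can be passed directly to pandas.DataFrame to obtain the kind of tabular dataset the agenda mentions. Keeping query construction and result flattening separate from the network call makes both easy to check offline.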