Big Data Pipeline Design and Tuning in PySpark by Rockie Yang

Conference: PyCon SE 2019

Year: 2019

PySpark is a great tool for building big data ETL pipelines. However, designing a big data pipeline that is easy to maintain, offers a holistic view, and makes bottlenecks simple to spot is difficult, not to mention enabling analytics on the ETL pipeline itself. Rockie Yang will share his experience of building effective ETL pipelines with PySpark.

Audience level: Intermediate

Speaker: Rockie Yang