Go Concurrency powering Gigabyte scale real-world data pipeline | Chinmay Naik | Conf42 Golang 2024

Conference: Conf42 Golang 2024

Year: 2024

Read the abstract ➤ https://www.conf42.com/Golang_2024_Chinmay_Naik_concurrency_gigabyte_realworld
Other sessions at this event ➤ https://www.conf42.com/golang2024
Support our mission ➤ https://www.conf42.com/support
Join Discord ➤ https://discord.gg/DnyHgrC7jC

Chapters
0:00 intro
0:20 preamble
0:27 about chinmay naik
1:04 mongodb to rdbms data migration
2:00 student collection (mongodb)
2:39 student table (postgresql)
3:06 student - address and phone relationships
3:43 data migration - mongodb to postgresql
4:45 how mongodb json data maps to sql
5:14 inserts are cool, what about updates and deletes in mongodb?
6:04 how do we migrate data?
7:07 mongo oplog (operation log)
7:37 what does oplog record look like?
8:35 when are we getting to the golang concurrency?
8:41 sequential data pipeline
10:10 mongo oplog / two oplogs / postgresql
11:33 sequential pipeline performance
11:51 perf improvement - let's add worker pool
12:20 worker pool
13:21 worker pools v2.0
13:53 worker pool v2.0 performance
14:25 can you guess the problem?
15:01 worker pools v2.0 - the problem
16:08 back to drawing board?
16:18 fan-out for each database
19:18 concurrent data pipeline
21:06 performance comparison
21:27 resource utilization
22:46 concurrent data pipeline - improvement
23:42 16 databases and 128 collections per db
24:20 performance comparison
24:45 final concurrent data pipeline
25:30 key takeaways
27:38 keep learning