Data Exposed  - Channel 9

Spark Performance Tuning - Part 2


This week's Data Exposed show welcomes back Maxim Lukiyanov to talk more about Spark performance tuning with Spark 2.x. Maxim is a Senior PM on the big data HDInsight team and is in the studio today to present Part 2 of his 4-part series.

Topics in today's video:
[01:40] - DataSets vs. DataFrames vs. RDDs
[10:45] - Garbage Collection Overhead and Executor Size
[18:20] - Data Formats
[22:35] - Data Partitioning
[26:25] - Caching

Be sure to follow the Data Exposed show on Twitter at @DataExposed!

Data Exposed - Channel 9, by Microsoft