Data Exposed - Channel 9

Spark Performance Tuning - Part 3



This week's Data Exposed show welcomes back Maxim Lukiyanov to talk more about Spark performance tuning with Spark 2.x. Maxim is a Senior PM on the big data HDInsight team and is in the studio today to present Part 3 of his 4-part series.

Topics in today's video:
[00:45] - Recap and overview of the first two videos
[03:40] - Join Types (SortMerge and Broadcast)
[09:30] - Cost-based Optimizer
[21:35] - Outliers and Data Skew

Spark 2.2 rc4 on Azure HDInsight: Script action
https://github.com/hdinsight/script-actions/tree/master/install-spark2-2

Be sure to follow the Data Exposed show on Twitter at @DataExposed!

Data Exposed - Channel 9, by Microsoft