
This is the third in a four-part series in which Anders Larson and Shea Parkes discuss predictive analytics with high cardinality features. In this episode they focus on feature engineering via the hashing trick. The hashing trick is most applicable to features with extremely high cardinality, and at first glance it can seem almost ridiculous. In many ways it is the same as bucketing values at random. But there are times when including randomly engineered buckets is more valuable than excluding the original high cardinality feature entirely.
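For readers who want to see the idea concretely, here is a minimal sketch of the hashing trick in Python. It is an illustration under stated assumptions, not the approach the hosts necessarily use: the feature values (`provider_0`, `provider_1`, ...), the function name `hash_bucket`, the choice of MD5, and the bucket count of 8 are all hypothetical.

```python
import hashlib
from collections import Counter


def hash_bucket(value: str, n_buckets: int = 8) -> int:
    """Map a category label to one of n_buckets using a stable hash.

    hashlib is used instead of the built-in hash() because Python
    randomizes string hashes per process, which would change the
    bucket assignments between runs.
    """
    digest = hashlib.md5(value.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then fold into the bucket range.
    return int.from_bytes(digest[:8], "big") % n_buckets


# Hypothetical high cardinality feature: 10,000 distinct provider IDs.
provider_ids = [f"provider_{i}" for i in range(10_000)]

# Each ID lands in an effectively random, but repeatable, bucket.
buckets = [hash_bucket(p) for p in provider_ids]
bucket_counts = Counter(buckets)

for bucket, count in sorted(bucket_counts.items()):
    print(f"bucket {bucket}: {count} provider IDs")
```

The 10,000 distinct values collapse into at most 8 engineered levels, so unrelated values share a bucket. That collision is the "bucketing at random" effect described above; the model receives a small, fixed-width feature instead of losing the original one entirely.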