In the 10th episode of the StackToHeap podcast, we speak with Lordson from Arjira Tech about the patterns, challenges, and solutions that come up when setting up a data ingestion and processing platform with Spark.
We discuss data ingestion issues, handling PII, a DSL for onboarding new data sources, and using notebooks to orchestrate Spark jobs.