
A large social network needs to develop systems for ingesting, storing, and processing large volumes of data.
Data engineering at scale requires multiple engineering teams that are responsible for different areas of the infrastructure.
Data needs to be structured coherently to minimize the work of cleaning it. Machine learning models need to be developed, deployed, and iterated on at scale. The parts of the company that produce data need to be decoupled from the parts that consume it, so that engineers throughout the company can reliably build tools on top of these large data sets.
In our previous episodes about LinkedIn, we covered two major components of LinkedIn’s data engineering systems: the Kafka infrastructure and the LinkedIn data platform used by engineers to productively build data applications.
Kapil Surlaker is a senior director of engineering at LinkedIn, and he joins the show to discuss the bigger picture of LinkedIn’s data infrastructure. Kapil works with teams across LinkedIn to understand the requirements for the products and internal tools, and translate those requirements into team structures and software platforms that let LinkedIn use data more productively.
We discuss a wide range of topics, including engineering management, the modern data platform, and LinkedIn’s adoption of the public cloud.
Full disclosure: LinkedIn is a sponsor of Software Engineering Daily.