Commitconf 2018

Defining data pipelines - Juan Riaza



If you want to watch the video with slides: https://youtu.be/J7xiNsiKs1U
Apache Airflow is a workflow automation and scheduling system that can be used to author and manage data pipelines. Workflows are defined programmatically in Python as directed acyclic graphs (DAGs) of tasks.
At Idealista we use it daily for data ingestion pipelines. We'll take a thorough look at managing dependencies, handling retries, alerting, and more, as well as its drawbacks.
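
To make the idea of a DAG of tasks with dependencies and retries concrete, here is a minimal sketch in plain Python. It does not use Airflow's API and is not the pipeline shown in the talk; the task names (`extract`, `transform`, `load`) and the helpers `run_with_retries` and `run_pipeline` are hypothetical, chosen only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_with_retries(task, retries=2):
    """Call task(); if it raises, retry up to `retries` more times."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                # Out of retries: surface the failure.
                # A real scheduler would typically send an alert here.
                raise


def run_pipeline(tasks, upstream):
    """Run each task after all of its upstream tasks; return the order used."""
    order = list(TopologicalSorter(upstream).static_order())
    for name in order:
        run_with_retries(tasks[name])
    return order


# Hypothetical three-step ingestion pipeline.
log = []
attempts = {"extract": 0}


def flaky_extract():
    """Fails on the first call to simulate a transient error."""
    attempts["extract"] += 1
    if attempts["extract"] == 1:
        raise RuntimeError("transient failure")
    log.append("extract")


tasks = {
    "extract": flaky_extract,
    "transform": lambda: log.append("transform"),
    "load": lambda: log.append("load"),
}

# Each key maps to the set of tasks that must finish before it starts.
upstream = {"transform": {"extract"}, "load": {"transform"}}

order = run_pipeline(tasks, upstream)
print(order)  # ['extract', 'transform', 'load']
```

The dependency map is what makes the workflow a graph rather than a script: the run order is derived from it, and `graphlib` raises `CycleError` if the dependencies form a cycle, which is why the graph has to be acyclic.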

Commitconf 2018, by Autentia