FAQs about Data Archives - Software Engineering Daily:
How many episodes does Data Archives - Software Engineering Daily have?
The podcast currently has 380 episodes available.

February 17, 2020
Great Expectations: Data Pipeline Testing with Abe Gong
A data pipeline is a series of steps that takes large data sets and creates usable results from them. At the beginning of a data pipeline, a data set might be pulled from a database, a distributed file system, or a Kafka topic. Throughout a data pipeline, different data sets are joined, filtered, and statistically...
1h 3min

February 14, 2020
Data Warehouse ETL with Matthew Scullion
A data warehouse provides low latency access to large volumes of data. A data warehouse is a crucial piece of infrastructure for a large company, because it can be used to answer complex questions involving a large number of data points. But a data warehouse usually cannot hold all of a company’s data at any...
52min

February 12, 2020
Flink and BEAM Stream Processing with Maximilian Michels
Distributed stream processing systems are used to read large volumes of data and perform operations across those data streams. These stream processing systems often build off of the MapReduce algorithm for collecting and aggregating large volumes of data, but instead of processing a calculation over a single large batch of data, they process data on...
44min

February 11, 2020
Druid Analytics with Jad Naous
Large companies generate large volumes of data. This data gets dumped into a data lake for long-term storage, then pulled into memory for processing and analysis. Once it is in memory, it is often read into a dashboard, which presents a human with a visualization of the data. The end-user who is consuming this data...
50min

February 10, 2020
The Data Exchange with Ben Lorica
Data infrastructure has been transformed over the last fifteen years. The open source Hadoop project led to the creation of multiple companies based around commercializing the MapReduce algorithm and Hadoop distributed file system. Cheap cloud storage popularized the usage of data lakes. Cheap cloud servers led to wide experimentation for data tools. Apache Spark emerged...
1h 3min

February 07, 2020
Presto with Justin Borgman
A data platform contains all of the data that a company has accumulated over the years. Across a data platform, there is a multitude of data sources: databases, a data lake, data warehouses, a distributed queue like Kafka, and external data sources like Salesforce and Zendesk. A user of the data platform often has a...
1h 10min

February 06, 2020
Nubank Data Engineering with Sujith Nair
Nubank is a popular bank that is based in Brazil. Nubank has more than 20 million customers, and has accumulated a high volume of data over the six years since it was started. Mobile computing and cloud computing have given rise to “challenger banks” that operate more like software companies. When a software company reaches...
59min

January 30, 2020
Alpaca: Stock Trading API with Yoshi Yokokawa
Stock trading takes place across a variety of software platforms. Etrade and Schwab have allowed individual traders to buy securities for decades. Robinhood built a business around a similar model, but also removed the commission. Wealthfront and Betterment provide “roboadvisor” services that abstract away the underlying securities and focus on managing a risk profile. Each...
1h 5min

January 17, 2020
Apollo GraphQL with Geoff Schmidt
GraphQL has become a core piece of infrastructure for many software applications. GraphQL is used to make requests that are structured as GraphQL queries and responded to through a GraphQL server. The GraphQL server processes the query and fetches the response from the necessary databases, APIs, and backend services. Around 2016, when GraphQL was becoming...
1h 5min

January 13, 2020
Data Infrastructure Go-To-Market with Sean Knapp
Every large company generates large amounts of data. Data engineering is the process of storing, transforming, and leveraging that data. Data infrastructure companies provide tools and platforms for performing data engineering. The last fifteen years have seen a rise in modern data management companies built in a time of decreasing storage costs, an increased volume...
49min
