FAQs about Data Archives - Software Engineering Daily:
How many episodes does Data Archives - Software Engineering Daily have?
The podcast currently has 383 episodes available.
May 11, 2016: Spark and Cassandra with Tim Berglund (58 min)
Apache Spark is a framework for fast, distributed, in-memory analysis. Apache Cassandra is a distributed database management system that provides high availability and fast throughput. Today, we are collecting fast, big data streams from user behavior, smartphones, and sensors, and the disk checkpointing and query language of Hadoop MapReduce are no longer adequate...
April 26, 2016: Azure Event Hubs and Kafka with Dan Rosanova (53 min)
Apache Kafka has become the most popular open-source solution for persistent replicated messaging in the Hadoop ecosystem. But some software engineers who are working with “big data” don’t want to deal with the configuration and setup of Kafka. One way to sidestep this problem is to go with a managed solution, like Microsoft...
April 14, 2016: CockroachDB with Ben Darnell (56 min)
“Eventual consistency is really kind of a marketing term from some of these NoSQL systems – it’s not really consistent in any strong sense of the term.” Google has published papers on distributed systems such as BigTable, Chubby, and the Google File System. During this episode, we focus on a product that takes inspiration from...
April 04, 2016: Stream Processing at Uber with Danny Yuan (47 min)
“Be aggressive in vision, but conservative in operation.” Uber is a transportation company with a high volume of temporal and spatial data, constantly being collected from the devices of its users. At any given time, the engineers and data scientists at Uber need to be able to query the system and understand what is going on...
March 14, 2016: Data Visualization and Mapping with Aurelia Moser (57 min)
“I’m always worried that if you teach too much magic, people don’t learn the basics – they don’t know why something is working, they just know the documentation said it should work that way.” On Software Engineering Daily, we often discuss big data in terms of data engineering and data science. Data engineering is the...
March 12, 2016: FiloDB with Evan Chan (55 min)
“The world is becoming more and more interactive, and people want answers right away, so you’re seeing the rise of stream processing and real-time.” Big data is yesterday – fast data is now. FiloDB is a reactive columnar OLAP database that is built on Cassandra and Spark. Today’s guest is Evan Chan, creator of FiloDB. In our...
March 11, 2016: Cassandra with Tim Berglund (1 h)
“There isn’t any central node in Cassandra. Every node is a peer, there is no master – there is no single point of failure.” Apache Cassandra can serve as both the real-time data store for online transactional applications, as well as the read-intensive database for data warehousing operations. In order to combine these two use...
March 10, 2016: Hadoop: Past, Present and Future with Mike Cafarella (58 min)
“HDFS is going to be a cockroach – I don’t think it’s ever going away.” Hadoop was created in 2003. In the early years, Hadoop provided large-scale data processing with MapReduce, and distributed fault-tolerant storage with the Hadoop Distributed File System. Over the last decade, Hadoop has evolved rapidly, with the support of a...
March 09, 2016: Data Engineering at Airbnb with Maxime Beauchemin (56 min)
“One big transformation we’re seeing right now is the slow agonizing death of MapReduce.” When a company gets big enough, there is so much data to be processed that an entire data engineering team becomes responsible for managing this data and making it available to other teams. Airbnb is one such company. Max Beauchemin works...
February 26, 2016: Computational Neuroscience with Jeremy Freeman (54 min)
“You want to take a scientist who knows a little bit of MATLAB programming and try to teach them MapReduce, and write a MapReduce program in Java to do image processing? It’s a disaster!” Apache Spark is replacing MATLAB in the domain of computational neuroscience. The constraints of running MATLAB on a single machine can’t...