FAQs about Data Archives - Software Engineering Daily:
How many episodes does Data Archives - Software Engineering Daily have? The podcast currently has 380 episodes available.
August 21, 2015: Time-Series Database with InfluxDB CEO Paul Dix (56 min)
InfluxDB is an open-source time-series database. Time-series data can be used for metrics and analytics. Paul Dix is the CEO of InfluxDB. Questions: What differentiates InfluxDB from a regular database with a timestamp on every entry? What is the full-stack architecture of a typical user of InfluxDB? Why are distributed time series databases so...
August 20, 2015: Streaming SQL with PipelineDB CEO Derek Nelson (54 min)
PipelineDB is a streaming SQL database. Derek Nelson is the CEO of PipelineDB. Questions: What are continuous views? Why is PipelineDB a good fit for the Kafka+Storm+HBase-type architecture? How does PipelineDB affect the application tier or the browser tier? What are the latency guarantees for how long it takes raw data streams to be converted into the refined queries provided by a continuous view? What probabilistic data structures does PipelineDB...
August 19, 2015: Push Databases with RethinkDB CEO Slava Akhmechet (58 min)
RethinkDB is an open-source database for the realtime web. RethinkDB pushes changes to the application rather than waiting for a request. Slava Akhmechet is the CEO of RethinkDB. Questions: RethinkDB supports a “push” model rather than request handling. Why? What are some use cases for pushing data? What does the full-stack architecture look like when the database has push? What did you learn from the Meteor team? Is RethinkDB like a...
August 18, 2015: MemSQL with Nikita Shamgunov (58 min)
MemSQL is a high-performance, in-memory database that combines the horizontal scalability of distributed systems with the familiarity of SQL. Nikita Shamgunov is co-founder and CTO of MemSQL. Questions: What types of data does a user want to keep on disk versus in an in-memory database? How does MemSQL compare to MySQL? How do MemSQL users leverage Apache Spark? How does a user onboard with MemSQL? What are the engineering difficulties...
August 08, 2015: Hortonworks Data Platform with Venkatesh Seetharam (47 min)
Hortonworks Data Platform is a managed Hadoop architecture for enterprises. Venkatesh Seetharam is a software engineer at Hortonworks. He has worked on several Apache projects, including Hadoop, Falcon, and Atlas. Questions include: Will Hadoop ever be so big we will have to start over from scratch? What is the YARN data operating system? How are customers of Hortonworks dealing with numerous managed Big Data providers? How do customers use Apache...
August 08, 2015: Facebook Presto with Christopher Berner (57 min)
Presto is a low-latency SQL query engine built for interactive analysis. Christopher Berner works on Presto at Facebook. Questions: Is Presto for data scientists, developers, or everyone? What are the problems with Hive? How does Hive break a query into MapReduce jobs? How do the clients, coordinators, and workers interact? Is Presto both fast and cheap? How does Presto tune Java to get speed improvements? What are the advantages to...
August 06, 2015: Apache Kafka with Guozhang Wang (58 min)
Apache Kafka is a publish-subscribe messaging system rethought as a distributed commit log. Kafka serves as the central repository for data streams in a distributed system. Guozhang Wang is an engineer at Confluent, which offers a stream data platform built using Kafka. Questions include: What is a central repository for data streams? How does Kafka improve transportation between systems? How does Kafka allow for richer analytical processing? What are the roles...
August 03, 2015: Apache Spark Creator Matei Zaharia Interview (54 min)
Apache Spark is a fast and general engine for big data processing. Matei Zaharia created Spark, and is the co-founder of Databricks, a company using Spark to power data science. Questions: What was the motivation behind creating Spark? How much faster is a Spark job than a Hadoop job? What is the relationship between streaming and batch processing? Is Spark’s core advantage over Storm and Samza its usability? How useful...
August 02, 2015: Cloudera Chief Technologist Eli Collins Discusses Streaming, Batch, Business, and Open-Source (57 min)
Cloudera allows enterprises to leverage their data through its Hadoop platform. Eli Collins is the Chief Technologist at Cloudera. Topics include: changes to Hadoop since Cloudera’s founding; Cloudera’s usage of Spark, Docker, and other open-source technologies; how enterprises use batch and streaming together; Cloudera’s open-source policy; Should Frito Lay open source its chip-making abilities?; how collaboration occurs between big, competing companies; the growth of increasingly vertical managed big data platforms like...
July 30, 2015: MongoDB with Bryan Reinero (1 h 7 min)
MongoDB is a cross-platform document-oriented database. Bryan Reinero is a developer advocate at MongoDB. Questions include: How are isomorphic JavaScript applications using NoSQL? What is the joke behind the “MongoDB is web scale” meme? Is Mongo used primarily for scalability, modular schema, or simply the first-class JSON objects? What is MongoDB’s impact on the movement towards single-page web applications? How can a developer choose between different NoSQL databases? Links: MongoDB...