O'Reilly Data Show Podcast

Structured streaming comes to Apache Spark 2.0

05.19.2016 - By O'Reilly MediaPlay

Download our free app to listen on your phone

Download on the App StoreGet it on Google Play

With the release of Spark version 2.0, streaming starts becoming much more accessible to users. By adopting a continuous processing model (on an infinite table), the developers of Spark have enabled users of its SQL or DataFrame APIs to extend their analytic capabilities to unbounded streams.

Within the Spark community, Databricks Engineer, Michael Armbrust is well-known for having led the long-term project to move Spark’s interactive analytics engine from Shark to Spark SQL. (Full disclosure: I’m an advisor to Databricks.) Most recently he has turned his efforts to helping introduce a much simpler stream processing model to Spark Streaming (“structured streaming”).

More episodes from O'Reilly Data Show Podcast