Data Engineering Podcast

Decoupling Data Operations From Data Infrastructure Using Nexla



Summary

The technological and social ecosystem of data engineering and data management has recently been reaching a stage of maturity. As part of this stage in our collective journey, the focus has been shifting toward operation and automation of the infrastructure and workflows that power our analytical workloads. It is an encouraging sign for the industry, but it is still a complex and challenging undertaking. To make this world of DataOps more accessible and manageable, the team at Nexla has built a platform that decouples the logical unit of data from the underlying mechanisms so that you can focus on the problems that really matter to your business. In this episode Saket Saurabh (CEO) and Avinash Shahdadpuri (CTO) share the story behind the Nexla platform, discuss the technical underpinnings, and describe how their concept of a Nexset simplifies the work of building data products for sharing within and between organizations.

Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management.
  • When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Schema changes, missing data, and volume anomalies caused by your data sources can happen without any advance notice if you lack visibility into your data-in-motion. That leaves DataOps reactive to data quality issues and can make your consumers lose confidence in your data. By connecting to your pipeline orchestrator like Apache Airflow and centralizing your end-to-end metadata, Databand.ai lets you identify data quality issues and their root causes from a single dashboard. With Databand.ai, you’ll know whether the data moving from your sources to your warehouse will be available, accurate, and usable when it arrives. Go to dataengineeringpodcast.com/databand to sign up for a free 30-day trial of Databand.ai and take control of your data quality today.
  • We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to dataengineeringpodcast.com/census today to get a free 14-day trial.
  • Your host is Tobias Macey and today I’m interviewing Saket Saurabh and Avinash Shahdadpuri about Nexla, a platform for powering data operations and sharing within and across businesses.

Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what Nexla is and the story behind it?
  • What are the major problems that Nexla is aiming to solve?
    • What are the components of a data platform that Nexla might replace?
  • What are the use cases and benefits of being able to publish data sets for use outside and across organizations?
  • What are the different elements involved in implementing DataOps?
  • How is the Nexla platform implemented?
    • What have been the most complex engineering challenges?
    • How has the architecture changed or evolved since you first began working on it?
    • What are some of the assumptions that you had at the start which have been challenged or invalidated?
  • What are some of the heuristics that you have found most useful in generating logical units of data in an automated fashion?
  • Once a Nexset has been created, what are some of the ways that it can be used or further processed?
  • What are the attributes of a Nexset? (e.g. access control policies, lineage, etc.)
    • How do you handle storage and sharing of a Nexset?
  • What are some of your grand hopes and ambitions for the Nexla platform and the potential for data exchanges?
  • What are the most interesting, innovative, or unexpected ways that you have seen Nexla used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Nexla?
  • When is Nexla the wrong choice?
  • What do you have planned for the future of Nexla?

Contact Info
  • Saket
    • LinkedIn
    • @saketsaurabh on Twitter
  • Avinash
    • LinkedIn
    • @avinashpuri on Twitter

Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links
  • Nexla
  • Nexsets
  • The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
