Data Engineering Podcast

Closing The Loop On Event Data Collection With Iteratively


Summary

Event-based data is a rich source of information for analytics, but only when the event structures are consistent. The team at Iteratively is building a platform to manage the end-to-end flow of collaboration around what events are needed, how to structure their attributes, and how they are captured. In this episode, founders Patrick Thompson and Ondrej Hrebicek discuss the problems that they have experienced as a result of inconsistent event schemas, how the Iteratively platform integrates the definition, development, and delivery of event data, and the benefits of elevating the visibility of event data for improving the effectiveness of the resulting analytics. If you are struggling with inconsistent implementations of event data collection, a lack of clarity on what attributes are needed, or uncertainty about how the data is being used, then this is definitely a conversation worth following.

Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • What are the pieces of advice that you wish you had received early in your data engineering career? If you hand a book to a new data engineer, what wisdom would you add to it? I’m working with O’Reilly on a project to collect the 97 things that every data engineer should know, and I need your help. Go to dataengineeringpodcast.com/97things to add your voice and share your hard-earned expertise.
  • When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • If you’ve been exploring scalable, cost-effective and secure ways to collect and route data across your organization, RudderStack is the only solution that helps you turn your own warehouse into a state of the art customer data platform. Their mission is to empower data engineers to fully own their customer data infrastructure and easily push value to other parts of the organization, like marketing and product management. With their open-source foundation, fixed pricing, and unlimited volume, they are enterprise ready, but accessible to everyone. Go to dataengineeringpodcast.com/rudder to request a demo and get one free month of access to the hosted platform along with a free t-shirt.
  • You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data platforms. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to dataengineeringpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host is Tobias Macey and today I’m interviewing Patrick Thompson and Ondrej Hrebicek about Iteratively, a platform for enforcing consistent schemas for your event data

Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you start by describing what you are building at Iteratively and your motivation for creating it?
  • What are some of the ways that you have seen inconsistent message structures cause problems?
  • What are some of the common anti-patterns that you have seen for managing the structure of event messages?
  • What are the benefits that Iteratively provides for the different roles in an organization?
  • Can you describe the workflow for a team using Iteratively?
  • How is the Iteratively platform architected?
    • How has the design changed or evolved since you first began working on it?
  • What are the difficulties that you have faced in building integrations for the Iteratively workflow?
  • How is schema evolution handled throughout the lifecycle of an event?
  • What are the challenges that engineers face in building effective integration tests for their event schemas?
  • What has been your biggest challenge in messaging for your platform and educating potential users of its benefits?
  • What are some of the most interesting or unexpected ways that you have seen Iteratively used?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while building Iteratively?
  • When is Iteratively the wrong choice?
  • What do you have planned for the future of Iteratively?

Contact Info
  • Patrick
    • LinkedIn
    • @Patrickt010 on Twitter
    • Website
  • Ondrej
    • LinkedIn
    • @ondrej421 on Twitter
    • ondrej on GitHub

Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, Podcast.__init__, to learn about the Python language, its community, and the innovative ways it is being used.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
  • To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat

Links
  • Iteratively
  • Syncplicity
  • Locally Optimistic
  • DBT
    • Podcast Episode
  • Snowplow Analytics
    • Podcast Episode
  • JSON Schema
  • Master Data Management
    • Podcast Episode
  • SDLC == Software Development Life Cycle
  • Amplitude
  • Mixpanel
  • Mode Analytics
  • CRUD == Create, Read, Update, Delete
  • Segment
    • Podcast Episode
  • Schemaver (JSON Schema Versioning Strategy)
  • Great Expectations
    • Podcast.init Interview
    • Data Engineering Podcast Interview
  • Confluence
  • Notion
  • Confluent Schema Registry
    • Podcast Episode
  • Snowplow Iglu Schema Registry
  • Pulsar Schema Registry

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA



