Data Engineering Weekly

What Happened at Data Council 2023?


Listen Later

Hey folks, have you heard about the Data Council conference in Austin? The three-day event was jam-packed with exciting discussions and innovative ideas on data engineering and infrastructure, data science and algorithms, MLOps, generative AI, streaming infrastructure, analytics, and data culture and community. 

"People are so nice in the data community. Meeting them and brainstorming with many ideas and various thought processes is amazing. It was an amazing experience; The conference is mostly like a jam of different thought processes, ideas, and entrepreneurship.

The keynote by Shrishanka from AcrylData talked about how data catalogs are becoming the control center for pipelines, a game-changer for the industry.

I also had a chance to attend a session on Malloy, a new way of thinking about SQL queries. It was experimental but had some cool ideas on abstracting complicated SQL queries. ChatGPT will change the game in terms of data engineering jobs and productivity. Charge GPT, for example, has improved my productivity by 60%. And generative AI is becoming so advanced that it can produce dynamic SQL code in just a few lines.

But of course, with all this innovation and change, there are still questions about the future. Will Snowflake and Databricks outsource data governance experience to other companies? Will the modern data stack become more mature and consolidated? These are the big questions we need to ask as we move forward in the world of data. 

The talk by Uber on their Ubermetric system migrating from ElasticSearch to Apache Pinot - which, by the way, is an incredibly flexible and powerful system. We also chatted about Pinot's semi-structured storage support, which is important in modern data engineering. 

Now, let's talk about something (non)controversial: the idea that big data is dead. DuckDB brought up three intriguing points to back up this claim. 

* Not every company has Big Data.

* The availability of instances with higher memory is becoming a commodity

* Even with the companies have big data; they do only incremental processing, which can be small enough 

Abhi Sivasailam presented a thought-provoking approach to metric standardization. He introduced the concept of "metric trees" - connecting high-level metrics to other metrics and building semantics around them. The best part? You can create a whole tree structure that shows the impact of one metric on another. Imagine the possibilities! You could simulate your business performance by tweaking the metric tree, which is mind-blowing!

Another amazing talk was about cross-company data exchange, where Pardis discussed various ways companies share data, like APIs, file uploads, or even Snowflake sharing. But the real question is: How do we deal with revenue sharing, data governance, and preventing sensitive data leaks? Pardis's startup General Folders, is tackling this issue, becoming the "Dropbox" of data exchange. How cool is that?

To wrap it up, three key learnings from the conference were:

* The intriguing idea is that "big data is dead" and how it impacts data infrastructure architecture.

* Data Catalog as a control plane for modern data stack? Is it a dream or reality?

* The growing importance of data contracts and the fascinating idea of metric trees.

Overall, the Data Council conference was an incredible experience, and I can't wait to see what they have in store for us next year.



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.dataengineeringweekly.com
...more
View all episodesView all episodes
Download on the App Store

Data Engineering WeeklyBy Ananth Packkildurai

  • 2.7
  • 2.7
  • 2.7
  • 2.7
  • 2.7

2.7

3 ratings


More shows like Data Engineering Weekly

View all
Software Engineering Radio - the podcast for professional software developers by team@se-radio.net (SE-Radio Team)

Software Engineering Radio - the podcast for professional software developers

271 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

290 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

623 Listeners

Talk Python To Me by Michael Kennedy

Talk Python To Me

585 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

301 Listeners

Data Engineering Podcast by Tobias Macey

Data Engineering Podcast

146 Listeners

Y Combinator Startup Podcast by Y Combinator

Y Combinator Startup Podcast

226 Listeners

DataFramed by DataCamp

DataFramed

269 Listeners

Tech Brew Ride Home by Morning Brew

Tech Brew Ride Home

960 Listeners

Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields

Kubernetes Podcast from Google

181 Listeners

Practical AI by Practical AI LLC

Practical AI

207 Listeners

The Real Python Podcast by Real Python

The Real Python Podcast

141 Listeners

Big Technology Podcast by Alex Kantrowitz

Big Technology Podcast

500 Listeners

The Data Engineering Show by The Firebolt Data Bros

The Data Engineering Show

8 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

93 Listeners