Data Engineering Weekly

DEW #133: How to Implement Write-Audit-Publish (WAP), Vector Database - Concepts and examples & Data Warehouse Testing Strategies for Better Data Quality


Listen Later

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives.

On DEW #133, we selected the following article

LakeFs: How to Implement Write-Audit-Publish (WAP)

I wrote extensively about the WAP pattern in my latest article, An Engineering Guide to Data Quality - A Data Contract Perspective. Super excited to see a complete guide on implementing the WAP pattern in Iceberg, Hudi, and of course, with LakeFs.

https://lakefs.io/blog/how-to-implement-write-audit-publish/

Jatin Solanki: Vector Database - Concepts and examples

Staying with the vector search, a new class of Vector Databases is emerging in the market to improve the semantic search experiences. The author writes an excellent introduction to vector databases and their applications.

https://blog.devgenius.io/vector-database-concepts-and-examples-f73d7e683d3e

Policy Genius: Data Warehouse Testing Strategies for Better Data Quality

Data Testing and Data Observability are widely discussed topics in Data Engineering Weekly. However, both techniques test once the transformation task is completed. Can we test SQL business logic during the development phase itself? Perhaps unit test the pipeline?

The author writes an exciting article about adopting unit testing in the data pipeline by producing sample tables during the development. We will see more tools around the unit test framework for the data pipeline soon. I don’t think testing data quality on all the PRs against the production database is not a cost-effective solution. We can do better than that, tbh.

https://medium.com/policygenius-stories/data-warehouse-testing-strategies-for-better-data-quality-d5514f6a0dc9



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.dataengineeringweekly.com
...more
View all episodesView all episodes
Download on the App Store

Data Engineering WeeklyBy Ananth Packkildurai

  • 2.7
  • 2.7
  • 2.7
  • 2.7
  • 2.7

2.7

3 ratings


More shows like Data Engineering Weekly

View all
Software Engineering Radio - the podcast for professional software developers by team@se-radio.net (SE-Radio Team)

Software Engineering Radio - the podcast for professional software developers

274 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

288 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

624 Listeners

Talk Python To Me by Michael Kennedy

Talk Python To Me

583 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

301 Listeners

Data Engineering Podcast by Tobias Macey

Data Engineering Podcast

145 Listeners

Y Combinator Startup Podcast by Y Combinator

Y Combinator Startup Podcast

225 Listeners

DataFramed by DataCamp

DataFramed

266 Listeners

Tech Brew Ride Home by Morning Brew

Tech Brew Ride Home

963 Listeners

Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields

Kubernetes Podcast from Google

180 Listeners

Practical AI by Practical AI LLC

Practical AI

213 Listeners

The Real Python Podcast by Real Python

The Real Python Podcast

139 Listeners

Big Technology Podcast by Alex Kantrowitz

Big Technology Podcast

507 Listeners

The Data Engineering Show by The Firebolt Data Bros

The Data Engineering Show

8 Listeners

Latent Space: The AI Engineer Podcast by Latent.Space

Latent Space: The AI Engineer Podcast

100 Listeners