The Cloudcast

Improving AI Through Data Quality



Elliot Shmukler (@eshmu, Co-Founder/CEO @anomalo_hq) talks about the impact of data quality on AI, how unstructured data can be improved, and how monitoring of data lakes can help prevent model drift and give organizations confidence with predictable results.

SHOW: 945

SHOW TRANSCRIPT: The Cloudcast #945 Transcript

SHOW VIDEO: https://youtube.com/@TheCloudcastNET 

CLOUD NEWS OF THE WEEK:  http://bit.ly/cloudcast-cnotw

NEW TO CLOUD? CHECK OUT OUR OTHER PODCAST:  "CLOUDCAST BASICS"

SPONSORS:

  • [DoIT] Visit doit.com (that’s d-o-i-t.com) to unlock intent-aware FinOps at scale with DoiT Cloud Intelligence.
  • [VASION] Vasion Print eliminates the need for print servers by enabling secure, cloud-based printing from any device, anywhere. Get a custom demo to see the difference for yourself.
  • [FCTR] Try FCTR.io (that’s F-C-T-R dot io) free for 60 days. Modern security demands modern solutions. Check out Fctr’s Tako AI, the first AI agent for Okta, on their website.

SHOW NOTES:

  • Anomalo website
  • The Cloudcast #598 - Data Quality
  • Snowflake invests in Anomalo

Topic 1 - Elliot, welcome back! It’s hard to believe it has been 3 years since we spoke! Give everyone a brief introduction.

Topic 2 - Here’s the problem I see when it comes to AI adoption today. There isn’t an “off the shelf” AI model with an organization's data built in; that’s impossible. So, you must bring this data, often unstructured, to the model, often with mixed results. Do you agree?

Topic 3 - I see data quality in two ways. The first is the quality of the data before ingestion: we want the data to be clean going in. But we also need a way to detect, mitigate, and do root cause analysis on quality issues along the way, correct? Give everyone an idea of what this life cycle looks like.

Topic 4 - What are you seeing as the barriers to adoption? Is it the tools, the models, the need for RAG pipelines, or the lack of data scientists and AIOps?

Topic 5 - We have this crossroads where proprietary data makes an organization unique, but exposing that unique data puts the organization at risk. How much of a factor does this play, and how do you advise organizations around this complex intersection?

Topic 6 - There is always this concept of predictable results: an answer should be consistent and repeatable. We’ve seen things like model/data drift and hallucinations hinder this concept, leading to a lack of confidence in the results. How do you advise organizations to tackle this lifecycle management and predictability over time?


FEEDBACK?

  • Email: show at the cloudcast dot net
  • Bluesky: @cloudcastpod.bsky.social
  • Twitter/X: @cloudcastpod
  • Instagram: @cloudcastpod
  • TikTok: @cloudcastpod

The Cloudcast by Massive Studios


4.6

147 ratings


More shows like The Cloudcast

  • Software Engineering Radio by se-radio@computer.org (273 listeners)
  • The Changelog: Software Development, Open Source by Changelog Media (290 listeners)
  • a16z Podcast by Andreessen Horowitz (1,101 listeners)
  • Software Engineering Daily by Software Engineering Daily (625 listeners)
  • Talk Python To Me by Michael Kennedy (585 listeners)
  • Soft Skills Engineering by Jamison Dance and Dave Smith (288 listeners)
  • Thoughtworks Technology Podcast by Thoughtworks (42 listeners)
  • Data Engineering Podcast by Tobias Macey (145 listeners)
  • Syntax - Tasty Web Development Treats by Wes Bos & Scott Tolinski - Full Stack JavaScript Web Developers (982 listeners)
  • CoRecursive: Coding Stories by Adam Gordon Bell - Software Developer (189 listeners)
  • Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields (182 listeners)
  • Practical AI by Practical AI LLC (209 listeners)
  • AWS Podcast by Amazon Web Services (202 listeners)
  • The Stack Overflow Podcast by The Stack Overflow Podcast (63 listeners)
  • The Real Python Podcast by Real Python (141 listeners)