Screaming in the Cloud

Engineering Around Extreme S3 Scale with R. Tyler Croy



R. Tyler Croy, a principal engineer at Scribd, joins Corey Quinn to explain what happens when simple tasks cost $100,000. Checking whether files are damaged? $100K. Using newer S3 tools? Way too expensive. Normal solutions no longer work. Tyler explains that with this much data you can't just throw money at the problem; you have to engineer your way out.

About R. Tyler: 

R. Tyler Croy leads infrastructure architecture at Scribd and has been an open source developer for over 14 years. His work spans the FreeBSD, Python, Ruby, Puppet, Jenkins, and Delta Lake communities. Under his leadership, Scribd’s Infrastructure Engineering team built Delta Lake for Rust to support a wide variety of high-performance data processing systems. That experience led Tyler to develop the next big iteration of storage architecture to power the large-scale full-text compute challenges facing the organization.

Show Highlights:
01:48 Scribd's 18-Year History

04:00 One Document Becomes Billions of Files

05:47 When Normal Physics Stop Working

08:02 Why S3 Metadata Costs Too Much

10:50 How AI Made Old Documents Valuable

13:30 From 100 Billion to 100 Million Objects

15:05 The Curse of Retail Pricing 

19:17 How Data Scientists Create Growth

21:18 De-Normalizing Data Problems

25:29 Evolving Old Systems

27:45 Billions Added Since Summer

29:29 Underused S3 Features

31:48 Where to Find Tyler


Links: 

Scribd: https://tech.scribd.com
Mastodon: https://hacky.town/@rtyler
GitHub: https://github.com/rtyler

Sponsored by:
duckbillhq.com


Screaming in the Cloud, by Corey Quinn


4.7

92 ratings

