AWS Bites

153. LLM Inference with Bedrock



If you’re curious about building with LLMs, but you want to skip the hype and learn what it takes to ship something reliable in production, this episode is for you.

We share our real-world experience building AI-powered apps and the gotchas you hit after the demo: tokens and cost, quotas and throttling, IAM and access friction, marketplace subscriptions, and structured outputs that do not break your JSON parser.

We focus on Amazon Bedrock as AWS’s managed inference layer: how to get started with the current access model, how to choose models, how pricing works, and what to watch for in production.

We also go deep on structured outputs: constrained decoding, schema design that improves output quality, and how to avoid “grammar compilation timed out”.


In this episode, we mentioned the following resources:

  • fourTheorem: Bedrock structured outputs guide https://fourtheorem.com/amazon-bedrock-structured-outputs/
  • Amazon Bedrock https://aws.amazon.com/bedrock/
  • Bedrock docs https://docs.aws.amazon.com/bedrock/latest/userguide/
  • Bedrock pricing https://aws.amazon.com/bedrock/pricing/
  • Structured outputs https://docs.aws.amazon.com/bedrock/latest/userguide/structured-outputs.html
  • Cross-region inference https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html
  • Quotas https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html
  • Throttling help https://repost.aws/knowledge-center/bedrock-throttling-error
  • Prompt caching https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
  • Troubleshooting error codes https://docs.aws.amazon.com/bedrock/latest/userguide/troubleshooting-api-error-codes.html


Do you have any AWS questions you would like us to address?

Leave a comment here or connect with us on X/Twitter, Bluesky or LinkedIn:


- https://twitter.com/eoins | https://bsky.app/profile/eoin.sh | https://www.linkedin.com/in/eoins/

- https://twitter.com/loige | https://bsky.app/profile/loige.co | https://www.linkedin.com/in/lucianomammino/



