I built Limpa, a simple Django and htmx-powered web app that generates ad-free podcast feeds by transcribing the latest episode, using a large language model to detect ad segments, and cutting them out with ffmpeg. It runs on Modal with NVIDIA’s Parakeet v3 for transcription and uses Django’s new Tasks framework for background jobs. I’m not hosting it as a service, but the project is documented so you can run it yourself.
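The last step of that pipeline, cutting detected ad segments out with ffmpeg, can be sketched roughly as follows. This is a minimal illustration, not Limpa's actual code: the function names `merge_segments` and `build_ffmpeg_command` are hypothetical, the ad segments are assumed to arrive as `(start, end)` pairs in seconds, and using ffmpeg's `aselect` and `asetpts` filters is just one possible way to drop time ranges.

```python
from typing import List, Tuple

# Hypothetical representation of an ad segment: (start_seconds, end_seconds).
Segment = Tuple[float, float]


def merge_segments(segments: List[Segment]) -> List[Segment]:
    """Sort ad segments and merge any that overlap or touch."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if end <= start:
            continue  # skip empty or inverted ranges
        if merged and start <= merged[-1][1]:
            # Overlaps the previous segment: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def build_ffmpeg_command(src: str, dst: str, ads: List[Segment]) -> List[str]:
    """Build an ffmpeg argument list that drops the given ad segments.

    Assumption: audio samples inside an ad range are discarded with `aselect`,
    and `asetpts` rewrites timestamps so the remaining audio plays without gaps.
    """
    ads = merge_segments(ads)
    if not ads:
        # Nothing to cut: copy the stream without re-encoding.
        return ["ffmpeg", "-y", "-i", src, "-c", "copy", dst]
    drop = "+".join(f"between(t,{s:.3f},{e:.3f})" for s, e in ads)
    audio_filter = f"aselect='not({drop})',asetpts=N/SR/TB"
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]
```

The returned list could be passed to `subprocess.run(cmd, check=True)` from a background task. Merging overlapping ranges first keeps the filter expression short when the model reports the same ad twice with slightly different timestamps.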
Relevant links:
- Original article
- Limpa (GitHub)
- Pi-hole
- uBlock Origin
- SponsorBlock
- RSS feed finder
- Django
- htmx
- NVIDIA Parakeet v3 model
- Modal
- Transcript with timestamps (code)
- LLM ad extraction prompt (code)
- ffmpeg
- Django Tasks framework
- Celery (Django first steps)
- Flower
- OpenCode.ai