Injecting ads into a podcast at exact timestamps sounds easy until your sponsor read lands mid-sentence. This tool finds natural break points using word-level timestamps and semantic scoring, then stitches in your rolls cleanly. It also transcribes episodes and generates show notes - and that transcription is what powers the midroll engine.
My podcast tool does two things. The flagship is the AI podcast generator: describe a show, and it researches, writes, and voices every episode on a schedule.
This post is about the other half: the midroll injection engine, for podcasters who already record their own episodes and want to inject sponsor reads, intros, and outros without doing it by hand - and without the result sounding like garbage.
Upload your episode audio plus your rolls (sponsor reads, intros, outros, whatever), and the tool injects them automatically.
Smart break-point detection is what makes it not suck. If you say "every 20 minutes," it doesn't blindly cut at 20:00. It searches a window of roughly 5 minutes either side of 20:00 for the best break point, based on sentence boundaries and context shifts. The transcription gives us word-level timestamps, and a semantic scoring model identifies where you've actually finished a thought. So your sponsor read doesn't land in the middle of "and that's why transformers use atten--THIS EPISODE IS BROUGHT TO YOU BY--tion mechanisms."
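To make the idea concrete, here is a minimal sketch of picking a break point near a target time. It is not the tool's actual scoring model: the semantic score is swapped for a simple heuristic (prefer long pauses after sentence-ending punctuation, lightly penalise drifting from the target). The `Word` class, `candidate_breaks`, `best_break`, and the 2-second pause cap are all names and values I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def candidate_breaks(words):
    """Yield (time, pause) for every gap that follows sentence-ending punctuation."""
    for prev, nxt in zip(words, words[1:]):
        if prev.text.rstrip().endswith((".", "?", "!")):
            pause = max(0.0, nxt.start - prev.end)
            # Cut in the middle of the silence between the two sentences.
            yield (prev.end + nxt.start) / 2, pause


def best_break(words, target, window=300.0):
    """Pick the best cut within `window` seconds of `target`, or None if there is none."""
    best_time, best_score = None, float("-inf")
    for time, pause in candidate_breaks(words):
        offset = abs(time - target)
        if offset > window:
            continue
        # Heuristic stand-in for a semantic score (an assumption, not the real model):
        # longer pauses score higher, distance from the target scores lower.
        score = min(pause, 2.0) - offset / window
        if score > best_score:
            best_time, best_score = time, score
    return best_time
```

For "every 20 minutes" you would call something like `best_break(words, target=20 * 60)`, then repeat for 40:00, 60:00, and so on. The default `window=300.0` mirrors the roughly 5-minute search range described above.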
After a midroll, the tool replays a few seconds of audio from before the break so listeners get re-oriented. You know that feeling when an ad ends and you've completely lost the thread? This fixes that.
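The replay is easiest to see as an edit list. The sketch below is an assumption about how such a timeline could be laid out, not the tool's real stitching code: `build_timeline`, `total_duration`, and the 3-second default are hypothetical. Each entry is `(source, start, end)` in seconds, and the segment after the roll starts slightly before the cut.

```python
def build_timeline(episode_duration, break_time, roll_duration, replay_seconds=3.0):
    """Return (source, start, end) segments for one midroll with a short recap after it."""
    if not 0 < break_time < episode_duration:
        raise ValueError("break_time must fall inside the episode")
    # Never rewind past the start of the episode.
    resume_at = max(0.0, break_time - replay_seconds)
    return [
        ("episode", 0.0, break_time),               # content up to the break
        ("roll", 0.0, roll_duration),               # the sponsor read
        ("episode", resume_at, episode_duration),   # rewind slightly, then continue
    ]


def total_duration(timeline):
    """Length of the stitched output in seconds."""
    return sum(end - start for _, start, end in timeline)
```

With a one-hour episode, a break at 20:00, and a 30-second roll, the output runs 3633 seconds: the original hour, the roll, and 3 replayed seconds.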
Change your rolls and every affected episode gets flagged as "stale." Re-process them in bulk. Swap out a sponsor, re-roll everything, done.
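One way to detect staleness, sketched under my own assumptions (the post doesn't say how the tool does it): store a fingerprint of the rolls each episode was processed with, and compare it against the current rolls. `rolls_fingerprint` and `stale_episodes` are hypothetical helpers, and the slot names are made up.

```python
import hashlib


def rolls_fingerprint(rolls):
    """Hash a mapping of slot name -> audio bytes into one stable hex digest."""
    digest = hashlib.sha256()
    for slot in sorted(rolls):  # sorted so dict ordering does not change the result
        digest.update(slot.encode("utf-8"))
        digest.update(b"\0")
        digest.update(hashlib.sha256(rolls[slot]).digest())
    return digest.hexdigest()


def stale_episodes(episodes, current_rolls):
    """Return ids of episodes whose stored fingerprint no longer matches the current rolls."""
    current = rolls_fingerprint(current_rolls)
    return [ep_id for ep_id, fingerprint in episodes.items() if fingerprint != current]
```

This version treats every episode as using every roll. To flag only the episodes that are actually affected, you could fingerprint just the rolls each episode uses.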
The tool also transcribes your episode (via faster-whisper), then an LLM generates a title, teaser, and full description from the transcript. Paste those straight into your podcast host or website. No more staring at a blank description field for 20 minutes after you just spent an hour recording.
The transcription also feeds the midroll engine - word-level timestamps are what make the smart break-point detection possible.
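For reference, this is roughly what pulling word-level timestamps out of faster-whisper looks like. The model size, device, and compute type are placeholder choices, and `flatten_words` and `transcribe_words` are names I picked for the sketch; the tool's actual settings may differ.

```python
def flatten_words(segments):
    """Collect (text, start, end) for every word across all segments."""
    words = []
    for segment in segments:
        for word in segment.words or []:
            words.append((word.word.strip(), word.start, word.end))
    return words


def transcribe_words(audio_path, model_size="small"):
    """Transcribe with faster-whisper and return word-level timestamps."""
    # Imported lazily so flatten_words works without the library installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # word_timestamps=True attaches a `words` list to each segment.
    segments, _info = model.transcribe(audio_path, word_timestamps=True)
    return flatten_words(segments)
```

`model.transcribe` returns a lazy generator of segments, so the transcription work actually happens while `flatten_words` iterates over it.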
The tool has a free tier for uploads (no TTS credits needed). On that tier, system ads get injected for monetization; on the paid tier, you can use your own rolls.
Upload your episode and rolls. The tool finds natural break points and stitches everything together - plus transcription and show notes.