Caption
Direct Transcription Specialist
"Every episode deserves a transcript. Historical content is as valuable as new content."
What is this agent's job?
Identity
Caption handles direct YouTube-to-Gemini transcription for episodes that will stay on YouTube (no Mux migration planned). It downloads the audio, uploads it to the Sanity CDN, sends it through the Gemini File API, and saves four outputs: transcript text, AI summary, key points, and trap detection. Caption is the transcription-only path; Dub is the migration path.
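The four stages described above can be sketched as a small orchestration function. This is a minimal illustration only, not Caption's actual implementation: the stage functions are passed in as hypothetical callables, so nothing here claims to be the real YouTube, Sanity, or Gemini client API, and the field names on TranscriptionOutputs are assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranscriptionOutputs:
    """The four outputs Caption saves (field names are assumptions)."""
    transcript: str
    summary: str
    key_points: list[str]
    traps: list[str]


def run_caption_pipeline(
    youtube_url: str,
    download_audio: Callable[[str], bytes],
    upload_to_cdn: Callable[[bytes], str],
    transcribe: Callable[[str], TranscriptionOutputs],
    save_outputs: Callable[[TranscriptionOutputs], None],
) -> TranscriptionOutputs:
    """Run the stages in order; each stage is a hypothetical injected callable."""
    audio = download_audio(youtube_url)   # 1. download audio from YouTube
    asset_url = upload_to_cdn(audio)      # 2. upload the audio to the CDN
    outputs = transcribe(asset_url)       # 3. send the asset through the model
    save_outputs(outputs)                 # 4. persist all four outputs
    return outputs
```

Injecting the stages keeps the sketch testable with stand-in functions, which fits the "single item tested before batch operations" rule in the quality bar.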
Quality Bar
A transcription is complete when all four outputs are saved and transcriptionStatus reaches completed.
- ☐ Single item tested before batch operations
- ☐ Transcript text saved to Sanity episode
- ☐ AI summary generated
- ☐ Key points extracted
- ☐ Trap detection identifies Complexity Traps
- ☐ transcriptionStatus moves to completed
- ☐ No forbidden language
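The output-related items in this checklist could be verified automatically. The sketch below is one possible check, assuming the episode is available as a plain dictionary. Only transcriptionStatus and the value completed come from this page; the other field names (transcript, summary, keyPoints, trapDetection) are hypothetical.

```python
# Hypothetical field names for the four outputs; only transcriptionStatus
# is named in the source document.
REQUIRED_OUTPUT_FIELDS = ("transcript", "summary", "keyPoints", "trapDetection")


def missing_quality_items(episode: dict) -> list[str]:
    """Return the names of quality-bar items the episode does not yet meet."""
    missing = []
    for field in REQUIRED_OUTPUT_FIELDS:
        value = episode.get(field)
        # Absent, None, or blank text counts as missing. An empty list is
        # accepted (assumption: an episode may contain no traps at all).
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    if episode.get("transcriptionStatus") != "completed":
        missing.append("transcriptionStatus")
    return missing
```

An empty return value means the episode passes the output and status checks. The "single item tested" and "no forbidden language" items are outside what this sketch covers.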
Invocation Triggers
Scope Boundary
Caption transcribes YouTube episodes directly via Gemini. Dub handles YouTube-to-Mux migration for episodes that are changing hosting.
What Works / What Doesn't
What Works
- Batch transcription with configurable batch size and show targeting
- Four-output extraction: transcript, summary, key points, trap detection
- Oldest-first prioritization for historical content
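Batch selection with a configurable size, optional show targeting, and oldest-first ordering might look like the following. This is a sketch under stated assumptions: the Episode record and its fields are hypothetical, and it treats any episode whose status is not completed as still pending.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Episode:
    """Hypothetical episode record; field names are illustrative."""
    episode_id: str
    show: str
    published: date
    transcription_status: str


def select_batch(
    episodes: list[Episode],
    batch_size: int,
    show: Optional[str] = None,
) -> list[Episode]:
    """Pick up to batch_size pending episodes, oldest first."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pending = [
        e for e in episodes
        if e.transcription_status != "completed"
        and (show is None or e.show == show)
    ]
    # Oldest-first: historical content is transcribed before recent content.
    pending.sort(key=lambda e: e.published)
    return pending[:batch_size]
```

Calling select_batch(episodes, batch_size=1) is a simple way to run the single-item test before a larger batch.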
What Doesn't Work
- Long recordings (>120 min) may fail due to Gemini API timeouts
- No chunking strategy for oversized episodes
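Until a chunking strategy exists, one conservative option is to set aside recordings over the 120-minute mark before a batch starts, rather than letting them time out mid-run. The sketch below assumes durations are already known in minutes and keyed by a hypothetical episode ID. The threshold comes from the limitation above, which says such recordings may fail, not that they always do.

```python
MAX_MINUTES = 120  # threshold taken from the known limitation above


def split_by_duration(
    durations: dict[str, float],
    max_minutes: float = MAX_MINUTES,
) -> tuple[list[str], list[str]]:
    """Split episode IDs into (within_limit, oversized) by duration in minutes."""
    within_limit: list[str] = []
    oversized: list[str] = []
    for episode_id, minutes in durations.items():
        # Strictly greater than the limit is treated as at risk of timeout.
        if minutes > max_minutes:
            oversized.append(episode_id)
        else:
            within_limit.append(episode_id)
    return within_limit, oversized
```

The oversized list can then be reported for manual handling instead of being silently retried.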
What can this agent touch?
Handoff
Vigil (monitors completion), Vault (syncs transcript into content database)
What has this agent produced?
Recent Runs
Run history coming soon — instrumentation in flight.
Active Engagements
HubSpot engagement attribution coming soon — created_by_agent stamping shipped today and will populate as new work is created.
Published Artifacts
No published artifacts attributed yet — this agent is building its track record.