Capability Report: Mirror Content Quality Protocol

Date: April 10, 2026
Built by: Marquee + Baldwin + Mirror
Status: Operational


What It Is

Mirror can now evaluate whether a page serves its visitors, not just whether it renders correctly.

The previous Mirror checked visual rendering (layout, typography, branding), interactive behavior (clicks, state changes, keyboard navigation), and technical health (HTTP status, console errors, broken images). It had no concept of content quality. A page could have an unformatted 33,701-character wall of text, null descriptions, no host attribution, and an empty show badge -- and Mirror would report PASS because every element was technically present.

The new Mirror adds a third evaluation mode: Content Quality Audit. This mode applies the Stranger Test, readability checks, and page-type-specific completeness checklists before any page can receive a PASS verdict. Content quality is no longer optional -- it is part of the quality bar.


Why It Matters

The gap this closes

Mirror was the last checkpoint before content reached visitors. If Mirror said PASS, the team moved on. But Mirror's definition of PASS was "nothing is broken," not "this is useful." The gap between those two definitions is where visitor trust erodes -- pages that look functional but communicate nothing about why someone should stay.

What triggered this

On April 9, 2026, Mirror certified an episode page as PASS despite the entire transcript being a single <p> element (unreadable wall of text), no description visible anywhere on the page, no hosts or guests attributed, no show name displayed, and no duration shown. Chris caught the wall of text himself during /media-prep. The defect was obvious to a human in under two seconds but invisible to Mirror's automation.

The principle

"Content exists" is not "content works." A 33,000-character transcript with no paragraph breaks is present but useless. A page with a video player and no description is functional but uninformative. Quality means the visitor gets value, not that the server returns 200.


How It Works

Mode 3: Content Quality Audit

Mode 3 runs alongside, or independently of, Mode 1 (Visual Review) and Mode 2 (Interactive Audit). It has three components.

Component 1: The Stranger Test

Before any page receives PASS, Mirror confirms that a first-time visitor arriving from a search engine can answer three questions within 10 seconds of landing:

| Question | What Mirror Checks |
| --- | --- |
| What is this about? | Clear title, description, or summary visible above the fold |
| Who created this? | Authors, hosts, guests, or contributors attributed on the page |
| Is it worth my time? | Duration, takeaways, topic indicators, or other engagement signals visible |

If any answer is "no," the page cannot PASS. The Stranger Test is mandatory for every review, regardless of which modes are active.
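The three-question gate above can be sketched as a small pure function. This is a minimal illustration, not Mirror's actual implementation: the `PageSignals` shape, its field names, and the evidence strings are all hypothetical, and it assumes some earlier step has already extracted what is visible above the fold.

```typescript
// Hypothetical shape for what has been extracted from the rendered page.
// Field names are illustrative assumptions, not Mirror's real schema.
interface PageSignals {
  aboutText: string | null;    // title, description, or summary judged to explain the topic
  creators: string[];          // authors, hosts, guests, or contributors shown on the page
  engagementSignals: string[]; // duration, takeaways, topic indicators, and similar
}

interface StrangerAnswer {
  question: string;
  answer: "YES" | "NO";
  evidence: string;
}

// Treat rendered "null"/"undefined" strings as missing (an assumption about template output).
function isBlank(value: string): boolean {
  const trimmed = value.trim();
  return trimmed === "" || trimmed === "null" || trimmed === "undefined";
}

function toAnswer(question: string, evidenceFound: string[], missingNote: string): StrangerAnswer {
  const found = evidenceFound.filter((e) => !isBlank(e));
  return found.length > 0
    ? { question, answer: "YES", evidence: found.join("; ") }
    : { question, answer: "NO", evidence: missingNote };
}

function runStrangerTest(page: PageSignals): StrangerAnswer[] {
  return [
    toAnswer(
      "What is this about?",
      page.aboutText === null ? [] : [page.aboutText],
      "no title, description, or summary above the fold"
    ),
    toAnswer(
      "Who created this?",
      page.creators,
      "no authors, hosts, guests, or contributors attributed"
    ),
    toAnswer(
      "Is it worth my time?",
      page.engagementSignals,
      "no duration, takeaways, or topic indicators visible"
    ),
  ];
}

// Any single "NO" blocks a PASS verdict.
function strangerTestPassed(answers: StrangerAnswer[]): boolean {
  return answers.every((a) => a.answer === "YES");
}
```

Keeping the evidence string next to each answer means the same structure can feed the report's Stranger Test section directly.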

Component 2: Readability Checks

Apply to all page types:

| Check | FAIL Condition |
| --- | --- |
| Wall of text | Any unbroken text block exceeding approximately 500 characters without a visual break (paragraph, heading, speaker label, or list) |
| Heading structure | No headings below h1, or headings that do not create scannable structure |
| Empty sections | Section heading visible but content area is blank or shows "null"/"undefined" |
| Placeholder content | Lorem ipsum, template text, or auto-generated filenames as titles |
| Missing attribution | Content with no author, host, or source identified |
| No context above fold | Video/audio player with no description of what it contains |
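Two of these checks are mechanical enough to sketch in code. The example below is an assumption-laden illustration rather than Mirror's implementation: it presumes the page text has already been split into visually unbroken blocks (`TextBlock` is a hypothetical type), and it takes the "approximately 500 characters" threshold literally.

```typescript
// Threshold taken from the wall-of-text row; the source calls it approximate.
const WALL_OF_TEXT_LIMIT = 500;

// Hypothetical extraction unit: one visually unbroken run of text.
interface TextBlock {
  kind: "paragraph" | "heading" | "list-item" | "speaker-turn";
  text: string;
}

interface ReadabilityFinding {
  check: string;
  detail: string;
}

// Flags any non-heading block longer than the limit.
function findWallsOfText(blocks: TextBlock[], limit: number = WALL_OF_TEXT_LIMIT): ReadabilityFinding[] {
  const findings: ReadabilityFinding[] = [];
  blocks.forEach((block, index) => {
    const length = block.text.trim().length;
    if (block.kind !== "heading" && length > limit) {
      findings.push({
        check: "Wall of text",
        detail: `block ${index} (${block.kind}) has ${length} chars, limit ${limit}`,
      });
    }
  });
  return findings;
}

// Flags blocks that are blank, literally "null"/"undefined", or contain lorem ipsum.
function findPlaceholders(blocks: TextBlock[]): ReadabilityFinding[] {
  const placeholderPattern = /lorem ipsum|^\s*(null|undefined)?\s*$/i;
  const findings: ReadabilityFinding[] = [];
  blocks.forEach((block, index) => {
    if (placeholderPattern.test(block.text)) {
      findings.push({
        check: "Empty or placeholder content",
        detail: `block ${index} (${block.kind}) is blank or placeholder text`,
      });
    }
  });
  return findings;
}
```

A real check would need a judgment call on what counts as a "visual break"; here that decision is pushed into whatever produces the blocks.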

Component 3: Page-Type Checklists

Each page type has a tiered completeness checklist separating FAIL (must have) from WARN (should have) conditions.

Episode pages require: descriptive title, visible description or summary, working player, host/guest names, air date, and show name on badge. Duration, key takeaways, formatted transcript with speaker labels, timestamps, tags, and SEO metadata are WARN-level.

Article pages require: headline that communicates the insight, author attribution, publish date, and body content with proper paragraph structure. Featured image, category/tags, SEO metadata, and related content links are WARN-level.

Portal sections require: section title and purpose clear, data is current (not stale placeholder), and navigation works. Empty state handling and client-specific data accuracy are WARN-level.

Transcript-specific checks (when a page contains a transcript): speaker labels must be visually distinct, paragraph breaks must exist at topic transitions, timestamps must be present for transcripts over 10 minutes, and a speaker map or legend must exist if there are multiple speakers. A 33,000-character transcript rendered as a single <p> element is a FAIL.
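The tiered checklists lend themselves to a plain data table. The sketch below encodes the three page types described above; the field keys (`hostsOrGuests`, `seoMetadata`, and so on) are hypothetical labels chosen for illustration, and how each field is detected on the page is left out.

```typescript
type PageType = "episode" | "article" | "portal";

interface Checklist {
  fail: string[]; // must-have fields: any missing one blocks PASS
  warn: string[]; // should-have fields: missing ones downgrade to WARN
}

// Keys are illustrative names for the requirements listed in the prose.
const CHECKLISTS: Record<PageType, Checklist> = {
  episode: {
    fail: ["title", "description", "player", "hostsOrGuests", "airDate", "showName"],
    warn: ["duration", "keyTakeaways", "formattedTranscript", "timestamps", "tags", "seoMetadata"],
  },
  article: {
    fail: ["headline", "author", "publishDate", "structuredBody"],
    warn: ["featuredImage", "categoryOrTags", "seoMetadata", "relatedLinks"],
  },
  portal: {
    fail: ["sectionTitleAndPurpose", "currentData", "workingNavigation"],
    warn: ["emptyStateHandling", "clientDataAccuracy"],
  },
};

interface ChecklistResult {
  missingFail: string[];
  missingWarn: string[];
}

// `present` lists the checklist keys that were found on the page.
function auditChecklist(pageType: PageType, present: string[]): ChecklistResult {
  const list = CHECKLISTS[pageType];
  const isMissing = (field: string) => present.indexOf(field) === -1;
  return {
    missingFail: list.fail.filter(isMissing),
    missingWarn: list.warn.filter(isMissing),
  };
}
```

Adding a new page type then means adding one entry to `CHECKLISTS`, which matches the expansion pattern where a domain specialist supplies the criteria.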

Report Format

Mirror reports now use three severity levels:

| Status | Meaning |
| --- | --- |
| PASS | Stranger Test passed, all FAIL-tier fields present, readability acceptable, interactions working |
| WARN | Stranger Test passed, no FAIL-tier issues, but WARN-tier fields missing or minor readability concerns |
| FAIL | Stranger Test failed, or any FAIL-tier field missing, or wall-of-text detected, or critical interaction broken |

Every report includes a mandatory Stranger Test section with explicit YES/NO for each of the three questions and evidence for the answer.
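The three statuses reduce to a short decision function. This is a sketch under stated assumptions: `AuditSummary` is a hypothetical roll-up of the checks described in this report, and the precedence (FAIL conditions first, then WARN, otherwise PASS) is read directly from the status definitions above.

```typescript
type Verdict = "PASS" | "WARN" | "FAIL";

// Hypothetical roll-up of the audit results; field names are illustrative.
interface AuditSummary {
  strangerTestPassed: boolean;
  missingFailFields: string[];
  missingWarnFields: string[];
  wallOfTextDetected: boolean;
  minorReadabilityConcerns: boolean;
  criticalInteractionBroken: boolean;
}

function decideVerdict(summary: AuditSummary): Verdict {
  // Any FAIL-tier condition decides the verdict on its own.
  if (
    !summary.strangerTestPassed ||
    summary.missingFailFields.length > 0 ||
    summary.wallOfTextDetected ||
    summary.criticalInteractionBroken
  ) {
    return "FAIL";
  }
  // No FAIL-tier issues: WARN-tier gaps downgrade an otherwise clean page.
  if (summary.missingWarnFields.length > 0 || summary.minorReadabilityConcerns) {
    return "WARN";
  }
  return "PASS";
}
```

Because FAIL is checked first, a page missing its description can never come back as PASS with a suggestion attached.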


Files Changed

| File | Change |
| --- | --- |
| .claude/agents/mirror.md | Added Mode 3: Content Quality Audit (Stranger Test, readability checks, page-type checklists, transcript-specific checks). Updated report format from pass/issues binary to PASS/FAIL/WARN with mandatory Stranger Test section. Updated delegation contract quality bar to require content quality checks before PASS. Added transcript quality checks to quality bar checklist. Updated "CRITICAL" callout: content existence is not a pass. |

Validation

Retroactive test: the incident episode

Applying the new protocol to the April 8 episode page that Mirror previously passed:

| Check | Result |
| --- | --- |
| Stranger Test: What is this about? | NO -- no description or summary visible. FAIL. |
| Stranger Test: Who created this? | NO -- no hosts or guests listed. FAIL. |
| Stranger Test: Is it worth my time? | NO -- no duration, no takeaways above fold. FAIL. |
| Readability: Wall of text | FAIL -- 33,701 chars in single <p> element |
| Episode checklist: Description | FAIL -- null |
| Episode checklist: Host attribution | FAIL -- missing |
| Episode checklist: Show badge | FAIL -- show title null |
| Overall | FAIL (7 FAIL conditions) |

The old Mirror: PASS. The new Mirror: FAIL on all seven checks listed above. The protocol catches the exact class of defect that was invisible before.

Forward test criteria

Any future Mirror review should be validated against these conditions:

  1. A page with a working video player but no description should FAIL (not PASS with a suggestion)
  2. A page with an unformatted transcript (no paragraph breaks, no speaker labels) should FAIL
  3. A page with all fields populated and properly formatted should PASS
  4. A page with all FAIL-tier fields present but missing WARN-tier fields should receive WARN status

What's Next

Immediate (open items from the incident)

  1. Episode template fallback. The episode page template should render fullSummary when summary and description are both null. This is a template change, not a Mirror change, but Mirror will catch the gap on future reviews.

  2. Transcript re-generation. The April 8 episode needs its transcript regenerated with formatting (speaker labels, paragraph breaks, timestamps). The transcribe-pending.ts complex prompt intermittently produces 0-char responses from Gemini 2.0 Flash -- this needs investigation.

  3. Speaker map population. The speakerMap field in Sanity is never populated by the transcription pipeline. Mirror's transcript checks include speaker map verification, so this gap will now be visible on every review.
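The template fallback in item 1 can be sketched as a small selector. The field names `summary`, `description`, and `fullSummary` come from the item itself; the `EpisodeDoc` type, the function name, the preference order between `summary` and `description`, and treating blank strings as missing are all assumptions made for illustration.

```typescript
// Hypothetical subset of the episode document; only the three text fields matter here.
interface EpisodeDoc {
  summary?: string | null;
  description?: string | null;
  fullSummary?: string | null;
}

// Returns the first usable text, falling back to fullSummary when
// summary and description are both missing. Order of the first two is assumed.
function pickEpisodeBlurb(episode: EpisodeDoc): string | null {
  const candidates = [episode.summary, episode.description, episode.fullSummary];
  for (const candidate of candidates) {
    if (typeof candidate === "string" && candidate.trim() !== "") {
      return candidate.trim();
    }
  }
  // Nothing to render: the page would still fail Mirror's description check.
  return null;
}
```

Returning `null` rather than an empty string keeps the "no description" case explicit, so the template can decide how to render it and Mirror can still flag it.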

Expansion

  1. Additional page types. The current checklists cover episodes, articles, and portal sections. As Mirror reviews additional surface types (show landing pages, course pages, assessment pages, collective pages), new checklists should be added using the same cross-agent pattern: domain specialist defines what quality means, Mirror codifies the checks.

  2. Cross-agent quality review pattern. Baldwin provided the content criteria for this protocol. This pattern -- technical QA agent informed by domain specialist -- should be repeated whenever quality gates are established for new page types. Pavilion for portal sections. Showcase for walkthrough apps. Provost for course pages. The specialist knows what quality means in context; Mirror knows how to check it at scale.