Verification Protocol
**Owner:** Q (Quality System Manager)
**Approved:** 2026-04-12 by Chris Carolan (Founder, Advisory Committee)
**Review Cycle:** Semi-annually or on protocol change
**Status:** Active
Purpose
This protocol defines how process steps, agent capabilities, and system integrations move between verification statuses. It establishes what counts as evidence, who can verify, and when verification expires.
Verification Statuses
| Status | Definition | Evidence Required |
|---|---|---|
| VERIFIED | Tested and confirmed working in production with real data | Date, what was tested, what was confirmed, how to reproduce |
| UNTESTED | Code exists, builds, deploys -- but has never executed in production with real data | Code path identified, file path documented |
| ASSUMED | Believed to work based on code reading, documentation, or analogy -- not execution | Reasoning documented, what would prove or disprove |
| BLOCKED | Known issue prevents this step from working | Blocker described, owner identified, workaround (if any) |
| NOT BUILT | Designed or referenced in documentation, but implementation does not exist | Design intent documented, where it would be implemented |
| DEPRECATED | Replaced by a different approach | Replacement identified, migration status |
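The status set and its per-status evidence requirements can be sketched in code. This is a minimal illustration, not official tooling; the field names (`date`, `tested`, `code_path`, etc.) are shorthand for the requirements in the table above, not a fixed schema.

```python
from enum import Enum

class VerificationStatus(Enum):
    VERIFIED = "verified"
    UNTESTED = "untested"
    ASSUMED = "assumed"
    BLOCKED = "blocked"
    NOT_BUILT = "not_built"
    DEPRECATED = "deprecated"

# Required evidence fields per status, paraphrasing the table above.
# Field names are illustrative shorthand, not a mandated schema.
REQUIRED_EVIDENCE = {
    VerificationStatus.VERIFIED: {"date", "tested", "confirmed", "reproduce"},
    VerificationStatus.UNTESTED: {"code_path", "what_exists", "why_untested"},
    VerificationStatus.ASSUMED: {"reasoning", "would_verify", "would_disprove", "risk_if_wrong"},
    VerificationStatus.BLOCKED: {"blocker", "owner"},
    VerificationStatus.NOT_BUILT: {"design_intent", "location"},
    VerificationStatus.DEPRECATED: {"replacement", "migration_status"},
}
```

A tooling pass over operating procedures could use a mapping like this to flag markers that are missing required fields.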
Verification Lifecycle
```
                          code written
NOT BUILT ──────────────────────────────> UNTESTED
                                              |
                        +---------------------+---------------------+
                        |                     |                     |
                  tested with           code reviewed          known issue
                   real data              (not run)               found
                        |                     |                     |
                        v                     v                     v
                    VERIFIED               ASSUMED               BLOCKED
                        |
                  +-----+-----+
                  |           |
            system changes  time passes
            (code modified) (decay window)
                  |           |
                  v           v
              UNTESTED     UNTESTED
```
Key rule: VERIFIED is not permanent. It decays.
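The lifecycle above can be expressed as a transition table. This is a sketch: the outgoing edges from ASSUMED and BLOCKED are not drawn in the diagram, so the transitions shown for them here are assumptions (an assumption gets tested, a blocker gets cleared).

```python
# Legal transitions from the lifecycle diagram above.
# ASSUMED and BLOCKED exits are assumed, not drawn in the diagram.
ALLOWED_TRANSITIONS = {
    "NOT BUILT": {"UNTESTED"},                       # code written
    "UNTESTED": {"VERIFIED", "ASSUMED", "BLOCKED"},  # tested / reviewed / issue found
    "VERIFIED": {"UNTESTED"},                        # decay: code change or time
    "ASSUMED": {"VERIFIED", "UNTESTED", "BLOCKED"},  # assumption tested or invalidated
    "BLOCKED": {"UNTESTED"},                         # blocker cleared
    "DEPRECATED": set(),                             # terminal
}

def can_transition(current: str, new: str) -> bool:
    """Return True if the lifecycle permits moving current -> new."""
    return new in ALLOWED_TRANSITIONS.get(current, set())
```

Note that VERIFIED has exactly one outgoing edge, to UNTESTED: there is no path that keeps a step VERIFIED through a code change or past its decay window.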
Evidence Requirements
VERIFIED Evidence
A VERIFIED marker must include ALL of the following:
- Date -- When the verification occurred (ISO 8601)
- What was tested -- Specific action taken (e.g., "called POST /api/studio/stream with show slug 'ai-daily'")
- What was confirmed -- Specific result observed (e.g., "Mux returned streamId, streamKey, and RTMP URL")
- Reproducibility -- How someone else could repeat this test (e.g., "login as admin, select show, click Go Live")
Example of sufficient evidence:
**Status:** VERIFIED 2026-04-12
**Evidence:** Called `POST /api/auth/login` with `chris@valuefirstteam.com`.
Received 200 with JWT token. Cookie `vf_media_token` set with 30-day expiry.
Redirect to `/studio` successful. Dashboard loaded with show list from Sanity.
**Reproduce:** Navigate to media.valuefirstteam.com, enter admin email, verify redirect.
Example of insufficient evidence:
**Status:** VERIFIED 2026-04-12
**Evidence:** Login works.
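A sufficiency check distinguishing the two examples above could look like the following sketch. The field names and the length heuristic are illustrative assumptions, not part of the protocol; the point is that "Login works." fails while the full evidence block passes.

```python
import re

# The four fields a VERIFIED marker must carry. Names are shorthand
# for the protocol's Date / What was tested / What was confirmed /
# Reproducibility requirements.
VERIFIED_FIELDS = ("date", "tested", "confirmed", "reproduce")

def verified_evidence_is_sufficient(evidence: dict) -> bool:
    """Reject markers missing any required field or carrying a bare
    one-liner like 'Login works.'"""
    if any(not evidence.get(f, "").strip() for f in VERIFIED_FIELDS):
        return False
    # Date must be ISO 8601 (YYYY-MM-DD).
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", evidence["date"]):
        return False
    # Heuristic (illustrative): a specific claim names what was done
    # and what was observed, not just an assertion of success.
    return len(evidence["tested"]) > 10 and len(evidence["confirmed"]) > 10
```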
UNTESTED Evidence
An UNTESTED marker must include:
- Code path -- Where the code lives (file path + function/endpoint name)
- What exists -- Brief description of what the code does
- Why untested -- What is preventing testing (no trigger, no test data, dependency unmet)
ASSUMED Evidence
An ASSUMED marker must include:
- Reasoning -- Why it is believed to work (e.g., "standard Mux API pattern, used in other projects")
- What would verify -- Specific test that would move this to VERIFIED or UNTESTED
- What would disprove -- Specific observation that would indicate it does not work
- Risk if wrong -- Impact if the assumption is incorrect
Verification Decay
Verification does not last forever. A VERIFIED step returns to UNTESTED when any of the following occur:
Automatic Decay Triggers
| Trigger | Detection Method | Action |
|---|---|---|
| Code implementing the step is modified | Git diff on the file path referenced in the evidence | Status → UNTESTED, note: "Code changed since verification" |
| External dependency version changes | Dependency update detected (package.json, API version) | Status → UNTESTED for all steps using that dependency |
| Infrastructure change | Service migration, credential rotation, environment change | Status → UNTESTED for affected steps |
Time-Based Decay
| Risk Tier | Decay Window | Rationale |
|---|---|---|
| Tier 1: Critical | 90 days | Production-touching processes need frequent revalidation |
| Tier 2: High | 180 days | External-system processes need periodic revalidation |
| Tier 3: Standard | 365 days | Internal processes change less frequently |
| Tier 4: Support | No time decay | Infrastructure processes are stable unless changed |
The decay window is an upper bound on how long verification lasts, not a guarantee of validity for that long. Automatic triggers override it -- if code changes on day 10, the step is UNTESTED on day 10 regardless of the decay window.
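The combined decay rules (automatic triggers first, then the tier's time window) can be sketched as a single check. This is illustrative: detection of the trigger booleans (git diff, dependency update, infra change) is assumed to happen elsewhere.

```python
from datetime import date, timedelta

# Decay windows per risk tier, in days; Tier 4 has no time decay.
DECAY_WINDOWS = {1: 90, 2: 180, 3: 365, 4: None}

def current_status(verified_on: date, tier: int, today: date,
                   code_changed: bool = False,
                   dependency_changed: bool = False,
                   infra_changed: bool = False) -> str:
    """Apply decay: any automatic trigger decays immediately;
    otherwise the tier's time window applies."""
    if code_changed or dependency_changed or infra_changed:
        return "UNTESTED"  # triggers override the time window
    window = DECAY_WINDOWS[tier]
    if window is not None and today - verified_on > timedelta(days=window):
        return "UNTESTED"
    return "VERIFIED"
```

For example, a Tier 1 step verified 120 days ago has decayed, while the same age is fine for Tier 3; and a code change decays any tier immediately.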
Who Can Verify
Verification Levels
| Level | Who | Acceptable For | Method |
|---|---|---|---|
| Self-verification | The owning agent tests its own process | Tier 3-4 processes | Agent executes and documents |
| Independent verification | Q or another agent tests | Tier 1-2 processes (required) | Q executes per audit schedule |
| Human verification | Chris or a contributor tests | Client-facing outputs, security-sensitive processes | Human executes and reports |
Verification Authority
| Role | Can Verify | Cannot Verify |
|---|---|---|
| Q | Any process (independent authority) | Q's own processes (V reviews) |
| Process owner (agent) | Own process (self-verification) | Other agents' processes |
| Aegis | Agent registration completeness | Operational agent capabilities |
| Hone | Cross-layer reference integrity | Process execution |
| Chris / Contributors | Any process (human authority) | N/A |
Independence Requirement
For Tier 1 (Critical) processes, at least one verification must be independent -- performed by someone other than the process owner. This prevents the "I wrote it, I verified it, it works" pattern that led to the April 12 discovery.
Verification Scoring
Per-Process Score
Verification Score = (VERIFIED steps / Total steps) * 100%
Steps with status NOT BUILT or DEPRECATED are excluded from the denominator (they are gaps, not verification failures).
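The scoring rule, including the denominator exclusion, reduces to a few lines. A minimal sketch:

```python
def verification_score(statuses: list[str]) -> float:
    """Score = VERIFIED / (total minus NOT BUILT and DEPRECATED) * 100."""
    counted = [s for s in statuses if s not in ("NOT BUILT", "DEPRECATED")]
    if not counted:
        return 0.0  # no scoreable steps
    return 100.0 * counted.count("VERIFIED") / len(counted)
```

Note the exclusion matters: a process with 1 VERIFIED, 1 UNTESTED, and 2 NOT BUILT steps scores 50%, not 25%.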
Operational Readiness Thresholds
| Score | Status | Meaning |
|---|---|---|
| 80-100% | Operational | Process can be relied upon |
| 50-79% | Partially Verified | Process works but has untested paths |
| 25-49% | Draft | More untested than tested -- treat as a plan, not a procedure |
| 0-24% | Unverified | The procedure documents intent, not reality |
A process MUST NOT be described as "operational" unless its verification score is >= 80%.
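The threshold table maps directly to a lookup; the boundaries below follow the table above (the >= 80% rule for "operational" falls out of the first branch).

```python
def readiness(score: float) -> str:
    """Map a verification score (0-100) to its readiness label."""
    if score >= 80:
        return "Operational"
    if score >= 50:
        return "Partially Verified"
    if score >= 25:
        return "Draft"
    return "Unverified"
```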
Verification in Practice
When Writing Operating Procedures
Use /process-documentation (Q's command). Every step gets a verification marker. The command enforces this.
When Updating Agent Definitions
If an agent definition claims a capability (e.g., "processes transcripts via Upstash"), the claim should reference:
- The operating procedure step where this is verified, OR
- An explicit "unverified capability" note
When Reporting Status
In daily ops, weekly reviews, and leadership meetings:
- Report verification scores, not capability counts
- "Media production: 5 of 20 steps verified (25%)" not "Media production: operational"
- Trend direction matters: "Up from 3/20 last week" shows progress
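A report line in the format above can be generated rather than hand-written, which keeps counts and percentages consistent. A sketch (the trend phrasing is illustrative):

```python
from typing import Optional

def status_line(process: str, verified: int, total: int,
                prev_verified: Optional[int] = None) -> str:
    """Render a score-based report line, e.g.
    'Media production: 5 of 20 steps verified (25%)'."""
    pct = round(100 * verified / total)
    line = f"{process}: {verified} of {total} steps verified ({pct}%)"
    if prev_verified is not None:
        line += f", up from {prev_verified}/{total} last week"
    return line
```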
When Building New Capabilities
The design-to-operational lifecycle:
- 5P Plan / PRD -- Define what to build (no verification markers needed)
- Build -- Implement the capability
- Operating Procedure -- Q documents with UNTESTED markers on new steps
- Test -- Execute steps with real data
- Verify -- Update markers to VERIFIED with evidence
- Operational -- Score >= 80%, process enters audit rotation
Integration with Existing Systems
| System | How Verification Protocol Integrates |
|---|---|
| Enforcement layer | Enforcement prevents known bad patterns. Verification confirms good patterns work. Complementary, not overlapping. |
| CARs | A CAR may downgrade a step from VERIFIED to BLOCKED. The CAR's permanent prevention should restore it to VERIFIED. |
| Capability reports | Capability reports document what was built. Operating procedures verify it works. Capability report = "we built X." Operating procedure = "X works (or doesn't)." |
| Agent readiness (Aegis) | Agent readiness levels should align with the verification scores of the processes the agent operates. An agent cannot be "operational" if its core processes are unverified. |
| Hone consistency | Hone checks that references are valid (file exists, function exists). Verification checks that the referenced thing works. Hone = structural integrity. Verification = functional integrity. |
Revision History
| Date | Change | Author |
|---|---|---|
| 2026-04-12 | Initial protocol. Defines statuses, evidence, decay, authority, and scoring. | Q, V |