Capability Report: Content Vault
Date: 2026-04-08 Author: Q (Quality System) Project: Unified Canonical Architecture (Layers 2-4) Status: Operational
What Was Built
The Content Vault is a 56MB SQLite database at /mnt/d/data/content-vault.db that serves as the canonical record of everything in the Value-First system. It stores 4,478 records across 42 tables representing 23 entity types, organized through the Four Unified Views (Customer Health, Company Health, Context Health, Community Health).
What Problem It Solves
Before the Content Vault, agents answered questions like "how many episodes does AI Daily have?" or "which shows is Chris a host on?" by querying multiple systems independently -- Sanity for CMS records, the filesystem for session files, HubSpot for contacts, YouTube for episodes. Each system held a partial picture. No system could answer cross-system questions ("does this show exist in Sanity AND YouTube AND the filesystem?"). Agents guessed at file paths, grepped the codebase for operational knowledge, and produced conflicting counts depending on which system they queried first.
The Content Vault eliminates this by providing a single queryable surface across all systems. An agent queries one database instead of five APIs.
Schema: 42 Tables, 23 Entity Types
Core Entity Tables (31 tables serving the data model)
Unified Customer View -- Customer Health (5 entity types):
| Table | Entity | Records | Source Systems |
|---|---|---|---|
contributors |
Friend | 116 | Sanity contributors, .claude/agents/, people/ configs |
clients |
Organization | 34 | clients/ filesystem, HubSpot Companies |
signals |
Signal | 0 | HubSpot-primary (live queries) |
sessions |
Session | 351 | clients/*/sessions/ filesystem, Upstash |
events |
Event | 0 | HubSpot-primary (live queries) |
Unified Revenue View -- Company Health (4 entity types):
| Table | Entity | Records | Source Systems |
|---|---|---|---|
engagements |
Engagement | 0 | HubSpot-primary (Deals, Services) |
investments |
Investment | 0 | HubSpot-primary (custom object 2-53994804) |
offerings |
Offering | 0 | HubSpot-primary (Products) |
deliverables |
Deliverable | 0 | HubSpot-primary (custom object 2-18484424) |
Unified Business Context -- Context Health (7 entity types):
| Table | Entity | Records | Source Systems |
|---|---|---|---|
methodology |
Methodology Content | 52 | Sanity (valueReality, trap, valuePathStage), Obsidian, skills/methodology/ |
episodes |
Episode (legacy) | 1,086 | YouTube API, Sanity, Mux |
canonical_episodes |
Episode (canonical) | 1,579 | YouTube (VF + Profoundly channels), Sanity, HubSpot Events |
articles |
Article | 127 | Sanity blogPost documents |
plans |
Plan | 51 | docs/plans/ filesystem |
reports |
Report | 3 | Leadership/reports/ filesystem |
architecture_decisions |
Architecture Decision | 0 | To be populated from decision records |
reference_content |
Reference Content | 177 | Sanity (tool, persona, sponsor, presentation, event, course, playbook) |
educational_content |
Educational Content | 152 | Sanity (course, lesson, goldenPathQA) |
Unified Team Enablement -- Community Health (6 entity types):
| Table | Entity | Records | Source Systems |
|---|---|---|---|
skills |
Skill | 157 | skills/ filesystem |
commands |
Command | 73 | .claude/commands/ filesystem |
scripts |
Script | 86 | scripts/ filesystem |
shows |
Show | 42 | Sanity show documents, YouTube playlists |
assets |
Asset | 0 | To be populated from filesystem + Sanity CDN |
tasks |
Task | 0 | HubSpot-primary (Tasks) |
Relationship and Index Tables (5 tables)
| Table | Purpose | Records |
|---|---|---|
contributor_shows |
Many-to-many: which contributors appear on which shows | 96 |
episode_methodology |
Many-to-many: which episodes discuss which methodology concepts | 0 |
cross_references |
Generic cross-entity references | 0 |
transcripts |
Raw transcript index (client session transcripts) | 296 |
hubspot_contacts / hubspot_listings / hubspot_projects |
HubSpot record caches | 0 |
Infrastructure Tables (6 tables)
| Table | Purpose |
|---|---|
content_fts |
FTS5 virtual table for full-text search (1,987 indexed records) |
content_fts_config |
FTS5 internal |
content_fts_content |
FTS5 internal |
content_fts_data |
FTS5 internal |
content_fts_docsize |
FTS5 internal |
content_fts_idx |
FTS5 internal |
vault_meta |
Schema version tracking (currently v3) |
sync_log |
Sync operation audit trail |
youtube_comments |
YouTube comment data |
youtube_live_chat |
YouTube live chat data |
sqlite_sequence |
SQLite auto-increment tracking |
Population: How Data Gets In
Data enters the Content Vault through two paths:
Batch population scripts --
scripts/canonical/populate-methodology.jsand similar scripts scan source systems (Sanity CMS, YouTube API, filesystem directories) and write records to the Vault. This was the initial seeding mechanism during Layer 4 (FR-5) of the architecture.Vault agent (ongoing) -- The Vault agent owns all SQLite writes. When agents process new content (transcripts, episodes, articles), Vault creates or updates the corresponding records.
Tables with zero records (signals, events, engagements, investments, offerings, deliverables, tasks, assets) are intentionally empty. These entities are HubSpot-primary -- their source of truth is live HubSpot data queried through MCP or CLI. The Vault tables exist as the schema for future caching or batch intelligence, not as replacements for the live system.
FTS5 Search Capability
The content_fts virtual table provides full-text search across 1,987 records using SQLite's FTS5 extension. The table indexes five columns:
source_type-- Entity type (episode, session, transcript, skill, article)source_id-- Primary key reference back to the source tabletitle-- Searchable titlebody_text-- Full content texttags-- Associated tags/categories
Current FTS distribution:
| Source Type | Indexed Records |
|---|---|
| episode | 1,056 |
| session | 351 |
| transcript | 296 |
| skill | 157 |
| article | 127 |
FTS5 enables queries like "find all content mentioning NetSuite" across every entity type in a single query, ranked by relevance. This complements Upstash Vector (semantic search over transcripts/sessions) with structured full-text search over the entire Vault.
Consumers: Who Uses the Content Vault
9 Canonical Query Tools (scripts/canonical/)
Every query tool reads from the Content Vault as its primary data source:
show-query.js-- Show roster, cross-system alignmentepisode-query.js-- Episode lifecycle, recording pipeline statusarticle-query.js-- Publishing cadence, author attributionfriend-query.js-- People intelligence (human + AI, cross-system)methodology-query.js-- Framework content alignment, drift detectionorganization-query.js-- Client/company cross-system viewcustomer-health-query.js-- Session history, relationship signalscompany-health-query.js-- Revenue entity healthunified-audit.js-- Full 23-entity census
25+ Slash Commands (rewired in Layer 5)
Commands across all three leaders now consume Vault data through canonical query tools instead of direct system queries. These include daily operations (/daily-ops, /daily-recap), media commands (/media-prep, /media-recap, /show-prep, /show-health), relationship commands (/relationship-brief, /relationship-pulse, /sentinel-check, /meeting-prep), revenue commands (/revenue-brief, /investment-brief), planning commands (/weekly-plan, /weekly-review), BU briefs (/academy-brief, /collective-brief, /apps-brief, /commerce-brief), and content commands (/content-multiply, /media-brief).
Agents
Any agent can query the Vault through canonical tools or direct SQLite queries. Key consumers include Vault (write operations), Catalog (indexing), Beacon (operational output), Prelude/Encore (media intelligence), Sentinel (relationship monitoring), and Pulse (health scoring).
How to Verify It Works
# Check database exists and has current schema
sqlite3 /mnt/d/data/content-vault.db "SELECT * FROM vault_meta;"
# Expected: schema_version|3
# Check record counts
sqlite3 /mnt/d/data/content-vault.db "SELECT 'Total: ' || COUNT(*) FROM (
SELECT id FROM articles UNION ALL SELECT id FROM episodes
UNION ALL SELECT id FROM sessions UNION ALL SELECT slug FROM methodology
UNION ALL SELECT id FROM skills UNION ALL SELECT id FROM commands
);"
# Check FTS5 works
sqlite3 /mnt/d/data/content-vault.db "SELECT source_type, title FROM content_fts WHERE content_fts MATCH 'value path' LIMIT 5;"
# Run unified audit for full census
node /mnt/d/Projects/value-first-operations/scripts/canonical/unified-audit.js
What This Enables
Cross-system alignment detection -- Every canonical query tool has an
--auditflag that compares Vault records against their source systems (Sanity, YouTube, filesystem). Discrepancies surface immediately.Single-query intelligence -- "How many active shows do we have?" is one query, not three API calls to Sanity + YouTube + filesystem.
Entity lifecycle tracking -- Episodes have
recording_lifecycle(scheduled through distributed). Sessions havesynthesis_status(raw through intelligence_extracted). These lifecycle columns enable pipeline health monitoring.Foundation for remaining FRs -- FR-7 (Rendered Surface Query Layer), FR-8 (Agent Query Interface), FR-9 (Surface Refactoring), and FR-11 (Functional Validation) all depend on the Content Vault as the canonical data source.
Ownership
- Write authority: Vault agent (all SQLite writes)
- Schema authority: V (architecture decisions)
- Audit authority: Q (quality verification)
- Location:
/mnt/d/data/content-vault.db - Schema documentation:
/mnt/d/Projects/value-first-operations/docs/plans/unified-canonical-data-model.md