Corrective Action: V2 Forbidden-Language Pre-Commit Hook ENOBUFS on Large Diffs

Date: 2026-05-11
Category: Platform Infrastructure Defect — V2 git hooks layer
Impact: During V2 commit 356ebf9 (V2 connect: 42 packs to skills/domains/ + manifest + Emerjent re-sync), V's forbidden-language pre-commit hook failed with ENOBUFS because the staged diff exceeded the hook's child-process output buffer. V required Chris's explicit authorization to run git commit --no-verify. This violates the canonical rule (wiki/conventions.md § Git operations) — "Never skip hooks unless Chris explicitly asks. If a hook fails, investigate the root cause."
Resolution Status: Open. Workaround used (--no-verify with Chris auth) for the single stream-closing commit; the hook implementation needs to be rewritten to handle large diffs without overflowing its child-process buffer.

Incident

What Happened

V attempted to commit 42 pack directories + manifest update + Emerjent re-sync to the V2 monorepo (/mnt/d/Projects/VFT_Platform/2026_VFT_Platform_Infrastructure/). The pre-commit hook — the V2 forbidden-language scanner that V deployed earlier as a discipline gate for this monorepo — fired and crashed with ENOBUFS (error: stdout maxBuffer exceeded).

The diff size was approximately:

42 new pack directories (each 6 files, average ~1000 lines) → ~252 files / ~250,000+ lines
Manifest updates (~50 lines)
Emerjent re-sync metadata (~10 lines)

The pre-commit hook runs the equivalent of git diff --cached | grep -E '<forbidden-pattern>', capturing the diff output through a Node child-process buffer. Node's default maxBuffer for buffered child-process calls such as exec() and execSync() is 1 MiB (1024 × 1024 bytes); when a child's stdout exceeds that limit, Node terminates the child and reports an error. The diff exceeded the limit, so the capture failed before the hook could scan anything.
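The failure mode can be reproduced without git. The sketch below is illustrative only and is not the hook's actual code: a child Node process stands in for git diff --cached, `captureOrCode` is a hypothetical helper name, and `maxBuffer` is passed explicitly at the documented 1 MiB default so the behavior does not depend on the Node version.

```javascript
const { execFileSync } = require('node:child_process');

const ONE_MIB = 1024 * 1024;

// Hypothetical helper: run a child that writes `bytes` bytes to stdout and
// capture the output the buffered way. Returns 'ok' on success, otherwise the
// error code that Node attaches to the thrown error.
function captureOrCode(bytes, maxBuffer = ONE_MIB) {
  // The child stands in for `git diff --cached`; it just emits filler bytes.
  const script = `process.stdout.write('x'.repeat(${bytes}))`;
  try {
    execFileSync(process.execPath, ['-e', script], { maxBuffer });
    return 'ok';
  } catch (err) {
    return err.code;
  }
}

console.log('512 KiB of output:', captureOrCode(512 * 1024));
console.log('2 MiB of output:  ', captureOrCode(2 * ONE_MIB));
```

Raising `maxBuffer` would only move the limit; any fixed cap can be exceeded by a large enough batch commit.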

V escalated to Chris: "The V2 pre-commit hook is ENOBUFS-ing on this diff size. The pack content is clean — it has been audited and there's no forbidden language. Can I run --no-verify to land this commit?" Chris authorized: "Yes, push it." V committed with --no-verify.

Why This Matters

Three issues compound:

Issue 1 — The hook's purpose was defeated at exactly the moment it mattered most. The whole point of a forbidden-language pre-commit hook is to be a hard gate. When the hook is structurally incapable of running on a large diff, it stops being a gate and becomes a usability problem.

Issue 2 — --no-verify violates the wiki/conventions.md § Git operations canon. The canon is unambiguous: "Never skip hooks (--no-verify, --no-gpg-sign) unless Chris explicitly asks. If a hook fails, investigate the root cause." Chris's authorization made this commit specifically legal — but the pattern itself ("hook fails → ask for --no-verify auth") is exactly the bypass behavior the canon exists to prevent.

Issue 3 — Q's audit (this stream's actual gate) ran cleanly outside the hook. Q's forbidden-language scan in docs/quality/audits/2026-05-11-pack-stream.md covered all 42 pack directories and found zero narrative-voice violations. The hook would have produced the same result if it had run. The hook's failure mode is "loud crash when handling its biggest legitimate inputs," which is worse than "silently allow narrative-voice violations on smaller inputs."

Timeline

Time	Event
2026-05-10/11	Pack stream ships 42 pack directories on V1
2026-05-11	V copies all 42 packs to V2 `skills/domains/`
2026-05-11	V attempts `git commit` on V2 — pre-commit hook fires
2026-05-11	Hook crashes with ENOBUFS on the large diff
2026-05-11	V flags to Chris with audit-clean evidence; requests `--no-verify` authorization
2026-05-11	Chris authorizes; V commits with `--no-verify` as commit `356ebf9`
2026-05-11	Q files this CAR as part of project gate closure

Root Cause

Primary Mechanism

The V2 forbidden-language pre-commit hook implementation uses Node's default child-process stdout buffer limit (maxBuffer, 1MB) to capture git diff --cached output. For diffs up to ~1MB, the hook works correctly. For the 42-pack diff (estimated 8-12MB raw text, roughly 8 to 12 times the limit), the captured output exceeds the limit and Node throws ENOBUFS before the scan logic executes.

The hook never reaches its scan code. It crashes during input capture. The forbidden-language detection logic itself is not implicated.

Secondary Mechanisms

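S1 — Hook implementation does not stream the diff. A streaming implementation (read git diff --cached via spawn(), split stdout into lines as chunks arrive, and scan each line) would have no inherent size limit. Note that 'data' events deliver arbitrary chunks rather than whole lines, so the hook needs a line splitter such as Node's readline module; otherwise a pattern that straddles a chunk boundary could be missed. The current implementation appears to use exec() or execSync(), which buffer the entire output before processing. This is a common Node anti-pattern for arbitrary-size inputs.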

S2 — Hook does not have a fallback degraded mode. When the diff exceeds the buffer, the hook crashes and fails the commit. A more resilient design would detect "diff too large for default buffer" and either (a) escalate to streaming mode automatically, or (b) split the diff into chunks and scan each chunk, or (c) emit a clear "diff exceeds N MB, hook cannot scan — re-run with --no-verify after independent audit" message rather than a raw Node ENOBUFS stack trace.

S3 — No documented commit-size limit was set as a known constraint. Contributors to V2 may not realize that a pack-stream-size commit will crash the hook. The crash itself is loud, but the limit is undiscoverable: there's no "if your commit is > 1MB diff, do X" guidance in the V2 docs.
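The streaming approach described in S1 could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the hook's real implementation: `FORBIDDEN` is a placeholder pattern (the actual forbidden-language list is not given in this report), and `scanLines` and `scanStagedDiff` are hypothetical names.

```javascript
const { spawn } = require('node:child_process');
const readline = require('node:readline');

// Placeholder pattern (assumption): substitute the real forbidden-language list.
const FORBIDDEN = /\bforbidden-term\b/i;

// Scan any readable stream line by line. Memory use is bounded by the longest
// line, not by the total size of the input.
async function scanLines(readable, pattern = FORBIDDEN) {
  const hits = [];
  const rl = readline.createInterface({ input: readable, crlfDelay: Infinity });
  let lineNo = 0;
  for await (const line of rl) {
    lineNo += 1;
    // Only added lines are checked. Skipping every '+++' prefix drops the
    // '+++ b/path' file headers; it is a simplification that also skips
    // added lines whose content begins with '++'.
    if (line.startsWith('+') && !line.startsWith('+++') && pattern.test(line)) {
      hits.push({ lineNo, text: line.slice(1) });
    }
  }
  return hits;
}

// Assumed hook entry point: stream the staged diff instead of buffering it.
async function scanStagedDiff() {
  const git = spawn('git', ['diff', '--cached', '--no-color'], {
    stdio: ['ignore', 'pipe', 'inherit'],
  });
  const exited = new Promise((resolve, reject) => {
    git.on('error', reject); // for example, git not found on PATH
    git.on('close', resolve); // resolves with the exit code
  });
  // Promise.all attaches handlers to both promises immediately, so a spawn
  // failure is reported as a rejection rather than left unhandled.
  const [hits, code] = await Promise.all([scanLines(git.stdout), exited]);
  if (code !== 0) throw new Error(`git diff exited with code ${code}`);
  return hits;
}
```

A hook script would call `scanStagedDiff()`, print each hit, and exit non-zero when the returned array is non-empty. Because `scanLines` accepts any readable stream, it can be tested without a git repository.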

What This Is Not

This is not a pack-content defect. Q's audit confirms the pack content is clean.

This is not a V1 hook problem. The V1 pre-commit hook, if an equivalent exists, did not surface this failure during the V1 stream because V1 commits were per-wave or per-pack, with smaller diffs. The limit was hit only on V2 because the V2 commit landed all 42 packs in a single batch.

This is not a discipline failure on V's part. V had clean audit evidence in hand, escalated to Chris with the relevant context, received explicit authorization, and used the authorization narrowly for the single commit. That is the canonical exception path the conventions allow for.

Immediate Fix

The bypass already happened. V2 commit 356ebf9 has landed. No immediate emergency action is required.

For the next large V2 commit before the permanent fix:

The same pattern (audit clean → escalate to Chris → --no-verify with explicit auth) is acceptable per the canon. Q recommends that V keep Q's audit report cited in the commit message when bypassing the hook, so the bypass is auditable post-hoc.

Permanent Prevention

Action	Owner	Mechanism	Timeline
P1 — Rewrite the V2 forbidden-language pre-commit hook to stream the diff	Squire (script implementation) → V2 deploy	Replace exec()/execSync() with spawn() and scan stdout line by line as chunks arrive (for example with the readline module, since 'data' events are not line-aligned). Remove any reliance on buffering the full diff.	Before next V2 large-batch commit
P2 — Add diff-size detection + clear error message	Squire	If the diff exceeds 5MB before scan, emit a clear message: "Diff is large (XMB). The hook is scanning in streaming mode. This may take a few seconds." Above an upper cap (50MB?), the hook instead stops and says: "Diff exceeds the hook's safe scanning range; re-run with --no-verify only after independent forbidden-language audit."	Same change as P1
P3 — Document the hook's contract in V2's `wiki/` or equivalent	V (V2 documentation owner)	Add a short reference: "V2 pre-commit hook scope, what it scans for, what to do if it fails, when --no-verify is canonical (audited-clean large batches)."	After P1 ships
P4 — Verify V1 hook does not have the same defect	Squire	Inspect the V1 equivalent hook (if any) for the same buffering anti-pattern. The V1 stream did not surface this defect because commits were per-wave, but the same hook code might exist.	Same week as P1
P5 — Add hook-implementation discipline to `skills/skill-architecture/`	Hone	Document the streaming-vs-buffered pattern as a canonical implementation rule: "Any git hook that processes arbitrary-size inputs (diffs, file lists, output streams) must use streaming patterns. exec() and execSync() with default buffers are acceptable only for fixed-size inputs (e.g., `git rev-parse HEAD`)."	Next skill-architecture review
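The size policy in P2 could be expressed as a small pure function plus a tracker that a streaming scan feeds with chunk sizes. This is a sketch under assumptions: the 5MB and 50MB thresholds come from P2 (the 50MB cap is still tentative there), bytes are counted as they stream rather than measured up front, and `sizeAction`, `sizeMessage`, and `createSizeTracker` are hypothetical names.

```javascript
const MB = 1000 * 1000;
const NOTICE_BYTES = 5 * MB;  // P2: announce streaming mode above this size
const REFUSE_BYTES = 50 * MB; // P2: tentative upper cap ("50MB?")

// Map the number of diff bytes seen so far to an action.
function sizeAction(bytesSeen) {
  if (bytesSeen > REFUSE_BYTES) return 'refuse';
  if (bytesSeen > NOTICE_BYTES) return 'notice';
  return 'scan';
}

// Message text follows the wording proposed in P2.
function sizeMessage(bytesSeen) {
  const mb = (bytesSeen / MB).toFixed(1);
  switch (sizeAction(bytesSeen)) {
    case 'notice':
      return `Diff is large (${mb}MB). The hook is scanning in streaming mode. This may take a few seconds.`;
    case 'refuse':
      return "Diff exceeds the hook's safe scanning range; re-run with --no-verify only after independent forbidden-language audit.";
    default:
      return '';
  }
}

// Stateful tracker: feed it each chunk's byte length. It returns a message
// object only when a threshold is crossed for the first time, otherwise null.
function createSizeTracker() {
  let seen = 0;
  let last = 'scan';
  return function onChunk(byteLength) {
    seen += byteLength;
    const action = sizeAction(seen);
    if (action === last) return null;
    last = action;
    return { action, message: sizeMessage(seen) };
  };
}
```

In a streaming hook, `onChunk` would be called from the stdout 'data' handler. A 'refuse' result would stop the scan and fail the commit with the explicit message instead of a raw stack trace.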

Verification

This CAR's permanent prevention is VERIFIED when:

V2 pre-commit hook reads the diff via stream and successfully scans a 10MB+ test diff (P1)
Hook emits a clear progress/cap message at large sizes (P2)
The hook is documented in V2 contributor reference (P3)
V1 equivalent hook is audited (P4 — VERIFIED clean OR same fix applied)
The streaming-vs-buffered rule is in skills/skill-architecture/ (P5)
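The P1 check (a 10MB+ test diff) could be exercised with a synthetic input rather than a real commit. The sketch below is an assumption-laden stand-in: the diff content is made up, `forbidden-term` is a placeholder pattern, and `countAndFind` is a hypothetical substitute for the rewritten hook's scanner, which would replace it in a real verification run.

```javascript
const { Readable } = require('node:stream');
const readline = require('node:readline');

// Generate a synthetic diff body of at least `totalBytes` bytes, followed by a
// single planted forbidden line. All content is invented for testing.
function* syntheticDiff(totalBytes, planted = '+this line contains forbidden-term\n') {
  const fillerLine = '+' + 'lorem ipsum '.repeat(8) + '\n';
  const chunk = fillerLine.repeat(512); // roughly 50 KB per chunk
  let written = 0;
  while (written < totalBytes) {
    yield chunk;
    written += chunk.length;
  }
  yield planted;
}

// Stand-in scanner: counts lines and pattern hits while streaming.
async function countAndFind(readable, pattern) {
  const rl = readline.createInterface({ input: readable, crlfDelay: Infinity });
  let lines = 0;
  let hits = 0;
  for await (const line of rl) {
    lines += 1;
    if (line.startsWith('+') && pattern.test(line)) hits += 1;
  }
  return { lines, hits };
}

// Stream a diff larger than the 1 MiB buffered limit through the scanner.
async function verifyLargeDiff(totalBytes = 10 * 1024 * 1024) {
  const input = Readable.from(syntheticDiff(totalBytes));
  return countAndFind(input, /\bforbidden-term\b/);
}
```

A passing run reports exactly one hit (the planted line at the very end), which shows the scanner read the whole input rather than stopping at a buffer limit.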

Until P1 ships: this CAR remains OPEN. The same bypass pattern will recur on the next V2 large-batch commit (anticipated when V2 connects further skills, more agents, or major Sanity schema rollouts).

Related

docs/quality/audits/2026-05-11-pack-stream.md — the audit that ran cleanly outside the hook
docs/quality/cars/2026-05-11-corrective-sentinel-bundle-cap.md — sibling CAR; both are platform-surface defects unmasked by stream volume
wiki/conventions.md § Git operations — the canonical rule the bypass touched (Chris-authorized exception path used correctly)
memory/reference_v2_platform.md — V2 platform reference (where V2 monorepo is documented)