New Capability: Image Generation, Editing, and Composition Pipeline

Date: April 5, 2026 (Saturday)
Origin: The "Meet the AI Team" LinkedIn series required not only portrait generation (documented Apr 3) but also image editing, card composition, CMS asset management, and asset recovery. Production of the 28-post series surfaced six distinct image capabilities: three proven, two extended, one disproven.
Impact: Complete image pipeline from generation through editing, composition, CMS upload, and recovery. 134 LinkedIn cards produced across 28 posts. 85 portraits uploaded to Sanity with zero failures. Gemini image editing proven as the correct tool for artifact removal on existing images.


What Was Proven

Six capabilities were exercised during production. Five are reliable. One (Sharp pixel patching for text removal) is not.

1. Batch AI Portrait Generation (Gemini)

Tool: gemini-3.1-flash-image-preview via @google/generative-ai SDK
What it does: Generates 1024x1024 images from text prompts.
Evidence: 82 agent portraits generated Apr 3, 3 leader portraits generated Apr 5.
Script: scripts/generate-leader-portraits.cjs
Cost: Approximately $0.067 per image at 1K (1024x1024) resolution via Nano Banana 2, consistent with the roughly $5.50 total reported for the portrait batch.

Key learning: Prompt vocabulary directly controls output aesthetic. "Photorealistic" pushes the model toward human faces. Terms like "cybernetic entity" or "digital consciousness" maintain the desired ambiguity between human and artificial. The prompt must match the desired aesthetic precisely — the model takes vocabulary literally.
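
One way to keep prompt vocabulary under control is to assemble prompts from a fixed list of style terms and to reject terms known to pull the output toward human faces. The sketch below is illustrative only: `STYLE_TERMS`, `AVOID_TERMS`, `findAvoidedTerms`, and `buildPortraitPrompt` are hypothetical names, not part of the production scripts, and the lists contain only the terms quoted above.

```javascript
// Hypothetical vocabulary lists; only terms quoted in this document are included.
const STYLE_TERMS = ['cybernetic entity', 'digital consciousness'];
const AVOID_TERMS = ['photorealistic'];

// Returns any avoided terms found in free-form text (case-insensitive).
function findAvoidedTerms(text) {
  const lower = text.toLowerCase();
  return AVOID_TERMS.filter((term) => lower.includes(term));
}

// Builds a portrait prompt from a subject description plus the approved style terms.
// Throws if the description contains a term that pushes the model toward human faces.
function buildPortraitPrompt(description) {
  const avoided = findAvoidedTerms(description);
  if (avoided.length > 0) {
    throw new Error(`Prompt contains avoided terms: ${avoided.join(', ')}`);
  }
  return `${description} Style: ${STYLE_TERMS.join(', ')}.`;
}
```

Calling `buildPortraitPrompt('A calm guardian figure.')` appends the approved style terms, while a description containing "photorealistic" fails fast before any API call is made.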

2. Gemini Image Editing (Image-to-Image)

Tool: Same gemini-3.1-flash-image-preview model.
What it does: Accepts an existing image as base64 inline data plus a text instruction, returns an edited version preserving the original style.
Evidence: Successfully removed "AEGIS" text overlay from the Sage portrait while preserving 100% of the original style, colors, composition, and detail.

Pattern:

const result = await model.generateContent([
  { inlineData: { mimeType: 'image/jpeg', data: base64ImageData } },
  'Remove the text "AEGIS" from this image. Keep everything else exactly the same.'
]);

Key learning: This is the correct approach for fixing artifacts on images you want to keep. Full regeneration changes style; editing preserves it. The model understands spatial context well enough to reconstruct the area behind removed text.
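
The pattern above stops at the API call. A minimal sketch of the full round trip (read file, send, extract the returned image, write to a new path) might look like the following. It assumes the response carries the edited image as an `inlineData` part under `response.candidates[].content.parts[]`; `extractImageBuffer` and `editImage` are hypothetical helper names, and `model` is whatever `getGenerativeModel` returned.

```javascript
const fs = require('node:fs');

// Pulls the first inline image out of a generateContent result.
// Assumed response shape: response.candidates[].content.parts[].inlineData.
function extractImageBuffer(result) {
  const candidates = result?.response?.candidates ?? [];
  for (const candidate of candidates) {
    for (const part of candidate.content?.parts ?? []) {
      if (part.inlineData?.data) {
        return Buffer.from(part.inlineData.data, 'base64');
      }
    }
  }
  return null; // No image part was found in the response.
}

// Sends an existing image plus an instruction and writes the edited image to outputPath.
// The input file is never overwritten, so a failed edit leaves the original intact.
async function editImage(model, inputPath, outputPath, instruction, mimeType = 'image/jpeg') {
  const base64ImageData = fs.readFileSync(inputPath).toString('base64');
  const result = await model.generateContent([
    { inlineData: { mimeType, data: base64ImageData } },
    instruction,
  ]);
  const edited = extractImageBuffer(result);
  if (!edited) {
    throw new Error('No image returned by the model; original left untouched.');
  }
  fs.writeFileSync(outputPath, edited);
  return outputPath;
}
```

Writing to a separate output path keeps the original available for comparison, which matters when the goal is to preserve style rather than regenerate.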

Limitation: Model availability is not uniform. gemini-2.0-flash-exp returned 404 for image editing. Only gemini-3.1-flash-image-preview was confirmed working for both generation and editing.
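
Because availability differs by model, a script can probe a list of candidates before starting a batch. The sketch below is one possible approach, not the production code: `findWorkingModel` and the probe prompt are hypothetical, and `ai` is assumed to be a `GoogleGenerativeAI` instance. Each probe is a live API call.

```javascript
// Tries each candidate model in order and returns the first one that answers.
// Failures (for example a 404 for an unsupported model) are collected, not thrown.
async function findWorkingModel(ai, candidateNames, probePrompt = 'Generate a plain blue square.') {
  const failures = [];
  for (const name of candidateNames) {
    try {
      const model = ai.getGenerativeModel({ model: name });
      await model.generateContent(probePrompt);
      return { name, model, failures };
    } catch (err) {
      failures.push({ name, message: err.message });
    }
  }
  const tried = failures.map((f) => f.name).join(', ');
  throw new Error(`No candidate model responded: ${tried}`);
}
```

The returned `failures` array records which models were rejected and why, which is useful to log alongside the batch output.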

3. Sharp SVG Card Composition

Tool: sharp npm package with SVG overlay.
What it does: Composites portrait images with text, color bars, backgrounds, and branding elements into distribution-ready cards.
Evidence: 134 cards generated across 28 posts (singles, carousels, title cards, CTA cards).
Script: scripts/compose-linkedin-cards.cjs

Dimensions:

  • Singles: 2400x1254 (1200x627 at 2x retina)
  • Carousels: 2160x2160 (1080x1080 at 2x retina)

Key learning: SVG overlay for text rendering produces clean, scalable results without dependencies on canvas, Puppeteer, or headless browsers. Text positioning, font sizing, color fills, and geometric shapes are all expressible in the SVG template. Sharp handles the composite in a single operation.
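
As a rough illustration of the approach, the sketch below builds an SVG overlay as a string and composites it with a portrait onto a solid background in one sharp pipeline. It is not the contents of compose-linkedin-cards.cjs: the function names, colors, font sizes, and layout offsets are all placeholder assumptions; only the 2400x1254 single-card size comes from the Dimensions list.

```javascript
// Escapes text for safe inclusion in SVG markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Builds a transparent SVG overlay: an accent bar plus name and role text.
// Layout values are placeholders, not the production template.
function buildOverlaySvg({ width, height, name, role, accent = '#4f46e5' }) {
  const textX = height + 80; // text starts to the right of the square portrait
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">
  <rect x="${height}" y="0" width="24" height="${height}" fill="${accent}"/>
  <text x="${textX}" y="${Math.round(height * 0.45)}" font-family="sans-serif" font-size="120" font-weight="700" fill="#ffffff">${escapeXml(name)}</text>
  <text x="${textX}" y="${Math.round(height * 0.6)}" font-family="sans-serif" font-size="64" fill="#cbd5e1">${escapeXml(role)}</text>
</svg>`;
}

// Composites portrait + overlay onto a solid background and writes a PNG.
async function composeSingleCard({ portraitPath, outputPath, name, role }) {
  const sharp = require('sharp'); // loaded lazily so the pure helpers work without it
  const width = 2400;
  const height = 1254;
  const portrait = await sharp(portraitPath).resize(height, height, { fit: 'cover' }).toBuffer();
  await sharp({ create: { width, height, channels: 4, background: { r: 15, g: 23, b: 42, alpha: 1 } } })
    .composite([
      { input: portrait, top: 0, left: 0 },
      { input: Buffer.from(buildOverlaySvg({ width, height, name, role })), top: 0, left: 0 },
    ])
    .png()
    .toFile(outputPath);
  return outputPath;
}
```

Keeping the SVG builder as a pure function means layout changes can be reviewed as string diffs and checked without rendering an image.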

4. Sanity Batch Asset Upload

Tool: @sanity/client assets.upload() + patch().
What it does: Uploads image buffers to Sanity CDN and links them to document fields via asset references.
Evidence: 85 portraits uploaded and linked to contributor avatar fields. Zero failures across the full batch.
Script: scripts/sanity/batch-upload-avatars.cjs

Pattern:

const asset = await client.assets.upload('image', imageBuffer, { filename });
await client.patch(documentId).set({
  avatar: { _type: 'image', asset: { _type: 'reference', _ref: asset._id } }
}).commit();

Key learning: The upload-then-patch pattern is reliable at batch scale. No rate limiting was needed for 85 sequential uploads. The assets.upload() method accepts Node.js Buffers directly — no need to write temporary files.
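
A sequential batch built on this pattern could be sketched as below. This is an assumption about structure, not the actual batch-upload-avatars.cjs: `uploadAvatars` and the shape of `items` are hypothetical, and `client` is assumed to be a configured @sanity/client instance with a write token. The function records each asset ID and URL so they can be written to the upload log.

```javascript
// Uploads each image and links it to the document's avatar field, one at a time.
// items: [{ documentId, filename, buffer }], where buffer is a Node.js Buffer.
// Returns the uploaded asset IDs and URLs plus any per-item failures.
async function uploadAvatars(client, items) {
  const uploaded = [];
  const failed = [];
  for (const { documentId, filename, buffer } of items) {
    try {
      const asset = await client.assets.upload('image', buffer, { filename });
      await client
        .patch(documentId)
        .set({ avatar: { _type: 'image', asset: { _type: 'reference', _ref: asset._id } } })
        .commit();
      uploaded.push({ documentId, assetId: asset._id, url: asset.url });
    } catch (err) {
      // Keep going so one bad file does not abort the rest of the batch.
      failed.push({ documentId, filename, message: err.message });
    }
  }
  return { uploaded, failed };
}
```

Persisting the `uploaded` array is what makes later recovery possible, since it captures the asset identifiers at upload time.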

5. Sanity CDN Asset Recovery

Tool: Direct HTTPS download from cdn.sanity.io.
What it does: Recovers previous versions of images from Sanity's CDN using deterministic URLs.
Evidence: Recovered the pre-AEGIS version of the Sage portrait from CDN using the asset hash captured in upload logs.

URL pattern: https://cdn.sanity.io/images/{projectId}/{dataset}/{hash}-{width}x{height}.{ext}

Key learning: Sanity CDN assets are immutable. Uploading a new image to the same field does not delete the previous asset, and the old URL keeps working as long as the asset itself is not explicitly deleted. Any image uploaded to Sanity is therefore recoverable if you have the asset hash or URL from logs, API responses, or document history.
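
If the upload log stores the asset ID rather than the URL, the URL can be derived from it. The sketch below assumes the ID follows the form image-{hash}-{width}x{height}-{ext}; `assetIdToCdnUrl` and `recoverAsset` are hypothetical helpers, and the download step relies on the global `fetch` available in Node 18 and later.

```javascript
// Converts a Sanity image asset ID (image-<hash>-<width>x<height>-<ext>)
// into a CDN URL matching the pattern shown above.
function assetIdToCdnUrl(assetId, projectId, dataset) {
  const match = /^image-([a-zA-Z0-9]+)-(\d+x\d+)-([a-z0-9]+)$/.exec(assetId);
  if (!match) {
    throw new Error(`Not a Sanity image asset ID: ${assetId}`);
  }
  const [, hash, dimensions, ext] = match;
  return `https://cdn.sanity.io/images/${projectId}/${dataset}/${hash}-${dimensions}.${ext}`;
}

// Downloads the asset bytes and writes them to disk.
async function recoverAsset(url, outputPath) {
  const fs = require('node:fs/promises');
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Download failed: HTTP ${response.status}`);
  }
  await fs.writeFile(outputPath, Buffer.from(await response.arrayBuffer()));
  return outputPath;
}
```

For example, `assetIdToCdnUrl('image-abc123-1024x1024-png', 'proj', 'production')` yields a URL ending in `abc123-1024x1024.png`; the project ID and dataset here are placeholders.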

6. Sharp Pixel Patching for Text Removal (FAILED)

Tool: sharp extract(), blur(), composite().
What it does: Attempts to remove text by extracting the region, blurring it, and compositing back.
Evidence: Three attempts to remove "AEGIS" text from the Sage portrait. All failed: ghosting, color mismatch, and imprecise coordinate targeting left visible artifacts.

Why it fails:

  • JPEG compression embeds text edge data into surrounding pixels. Blurring the text region does not reconstruct the underlying image.
  • Coordinate estimation for small text on complex backgrounds is imprecise. Off by a few pixels and the patch is visibly wrong.
  • Blur patches create an obvious soft spot on otherwise sharp images.

When Sharp IS appropriate: Simple geometric overlays, solid color fills, image resizing, cropping, format conversion. It is a composition and transformation tool, not an inpainting tool.


Decision Matrix

| Need | Use | Do Not Use |
| --- | --- | --- |
| Generate new image from scratch | Gemini generation (gemini-3.1-flash-image-preview) | -- |
| Fix artifact on existing image | Gemini image editing (send image + instruction) | Full regeneration (changes style) |
| Remove text from image | Gemini image editing | Sharp blur patching (leaves artifacts) |
| Composite image + text into card | Sharp SVG overlay (scripts/compose-linkedin-cards.cjs) | Gemini (unnecessary cost and latency) |
| Upload images to CMS | Sanity client batch (scripts/sanity/batch-upload-avatars.cjs) | Manual Studio upload |
| Recover previous image version | Sanity CDN direct download (deterministic URL) | -- |
| Resize, crop, format convert | Sharp | Gemini (wrong tool) |

Infrastructure

Scripts

| Script | Location | Purpose |
| --- | --- | --- |
| scripts/generate-leader-portraits.cjs | Monorepo | Gemini portrait generation (leaders) |
| scripts/generate-agent-portraits.ts | Monorepo | Gemini portrait generation (all agents, batch) |
| scripts/compose-linkedin-cards.cjs | Monorepo | Sharp SVG card composition |
| scripts/sanity/batch-upload-avatars.cjs | Monorepo | Sanity batch upload + field linking |

Dependencies

| Dependency | Status | Notes |
| --- | --- | --- |
| GEMINI_API_KEY | Confirmed | Monorepo root .env, paid tier required for image generation |
| @google/generative-ai SDK | Confirmed | Monorepo node_modules |
| sharp npm package | Confirmed | Monorepo node_modules |
| SANITY_API_TOKEN (write) | Confirmed | apps/website/.env (the token in root .env is read-only) |
| @sanity/client | Confirmed | apps/website/node_modules |

Output Locations

| Asset Type | Location |
| --- | --- |
| Agent portraits (82) | apps/website/public/images/generated/portraits/portrait-{slug}.png |
| Leader portraits (3) | apps/website/public/images/generated/portraits/portrait-{v,sage,pax}.png |
| LinkedIn cards | /mnt/d/Media/linkedin-series/ |
| Sanity CDN | cdn.sanity.io/images/{projectId}/{dataset}/{hash}-{width}x{height}.png |

Verification

# Confirm all portrait files exist
ls /mnt/d/Projects/value-first-operations/apps/website/public/images/generated/portraits/portrait-*.png | wc -l
# Expected: 85

# Confirm LinkedIn cards exist
ls /mnt/d/Media/linkedin-series/*.png 2>/dev/null | wc -l
# Expected: 134

# Confirm Sanity uploads (GROQ count of contributors with an avatar set)
node /mnt/d/Projects/value-first-operations/scripts/sanity/query.js custom \
  'count(*[_type == "contributor" && defined(avatar)])'
# Expected: 85 or more

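# Confirm the Gemini SDK loads and the model handle can be created
# (no API request is sent, so this does not prove the model is reachable)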
node -e "
const { GoogleGenerativeAI } = require('@google/generative-ai');
const ai = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = ai.getGenerativeModel({ model: 'gemini-3.1-flash-image-preview' });
console.log('Model loaded:', model.model);
"

Leader Applications

V (Operations)

Primary owner of the image pipeline. The card composition script (compose-linkedin-cards.cjs) and batch upload script (batch-upload-avatars.cjs) are V's infrastructure. The LinkedIn series production proved that image generation, editing, composition, and CMS upload can all execute without human intervention beyond the initial creative direction. Future applications: new agent onboarding (portrait + Sanity upload in one operation), episode cover art, event promotional graphics.

Sage (Customer)

The Sage portrait AEGIS text removal was the triggering event for proving Gemini image editing. Sage's Customer Org portraits required the most visual distinction during the series — 12 Concierge agents plus 11 intelligence specialists. The series captions position each agent in relationship context ("Sentinel watches for signals you might miss"). Visual identity strengthens how relationships perceive the team.

Pax (Finance)

Cost profile for the complete pipeline: approximately $5.50 for 85 portraits (Gemini generation), $0 for 134 card compositions (local Sharp), $0 for 85 Sanity uploads (within plan limits), $0 for image editing (included in Gemini API). Total infrastructure cost for the "Meet the AI Team" visual identity: under $6.00. At this cost, regeneration and iteration carry no financial friction.


What This Changes

Before this capability, image work required manual tools (Figma, Photoshop) or one-off prompts to external image generators with no programmatic control. The organization now has:

  1. Programmatic image generation: Any agent added to the roster gets a portrait automatically.
  2. Non-destructive image editing: Artifacts on existing images are fixable without regeneration.
  3. Automated composition: Card layouts are code, not design files. Changing dimensions, colors, or text is a script edit.
  4. Batch CMS integration: Images flow from generation to CDN without manual upload.
  5. Asset immutability: Every image uploaded to Sanity stays recoverable from the CDN unless its asset is explicitly deleted.

The pipeline is complete from creation through distribution.