92% of Prompt Library Offloadable to Cheaper Models

Insight

A full audit of the 298-prompt Veritas library revealed that only 23 prompts (8%) actually need Claude’s capabilities. The rest are generic enough to run on GPT-4o-mini or Gemini Flash (135 prompts) or GPT-4o/Gemini Pro (140 prompts) — saving 60-90% on inference costs for those workloads.

Why This Matters

Claude Opus costs $15 input / $75 output per million tokens. GPT-4o-mini costs $0.15 / $0.60, a 99% reduction. Even mid-tier offloading to GPT-4o at $2.50 / $10 saves 87%. For a library used across automation workflows, the cumulative savings are substantial.
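A quick way to sanity-check these percentages, using the prices listed above. The 50/50 input/output token split in the default arguments is an assumption; real workloads often skew toward input, which shifts the blended savings slightly:

```python
# (input, output) USD per 1M tokens, from the figures cited above.
PRICES = {
    "claude-opus": (15.00, 75.00),
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request mix on a given model."""
    inp, out = PRICES[model]
    return input_tokens / 1e6 * inp + output_tokens / 1e6 * out

def savings(from_model: str, to_model: str,
            input_tokens: int = 1_000_000,
            output_tokens: int = 1_000_000) -> float:
    """Fractional cost reduction from offloading the same mix."""
    before = cost(from_model, input_tokens, output_tokens)
    after = cost(to_model, input_tokens, output_tokens)
    return 1 - after / before
```

Under this even split, offloading Opus to GPT-4o-mini comes out above 99%, and Opus to GPT-4o lands in the mid-80s, consistent with the headline figures.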

What Stays on Claude

Only SO1-internal work: FORGE autonomous agents, n8n pipeline design, multi-agent orchestration, and prompts that need the SO1 product context (Rover, Sparki, Choco, Traceo). These are the “meaty” tasks worth the premium.

What Offloads

  • Content writing (blog posts, social media, campaign schedules) → GPT-4o-mini
  • Formulaic audits (55+ checklist-pattern scans) → GPT-4o-mini
  • Bootstrap scaffolding (15 boot-* tasks) → GPT-4o-mini
  • Pipeline phases (security, infrastructure, debt analysis) → GPT-4o
  • Specialist analysis roles (network, code, LLM validation) → GPT-4o
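The routing above can be sketched as a simple dispatch table. The category names here are illustrative assumptions, not identifiers from OPEN-ASSISTANTS.md; the fail-safe default is a deliberate choice to fail expensive rather than fail wrong:

```python
# Hypothetical category -> model routing mirroring the list above.
ROUTES = {
    "content-writing": "gpt-4o-mini",
    "formulaic-audit": "gpt-4o-mini",
    "bootstrap-scaffold": "gpt-4o-mini",
    "pipeline-phase": "gpt-4o",
    "specialist-analysis": "gpt-4o",
    "so1-internal": "claude-opus",  # the 8% that stays on Claude
}

def route(category: str) -> str:
    """Pick the target model for a prompt category.

    Unknown categories default to Claude: an unclassified prompt
    costs more to run there, but never loses capability.
    """
    return ROUTES.get(category, "claude-opus")
```

For example, `route("formulaic-audit")` returns `"gpt-4o-mini"`, while an unrecognized category falls through to `"claude-opus"`.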

Action

OPEN-ASSISTANTS.md in the veritas repo provides full traceability: every prompt is classified with a rationale, plus implementation snippets for the OpenAI Assistants API and the Gemini SDK. Ready to operationalize.