What happened

Clari’s 25 GitHub repositories were audited after merging 8 outstanding pull requests (136K lines of code and documentation). The goal: assess how far the platform is from automatically converting raw agent work artifacts (“TASKSETs”) into polished project documentation — the kind of structured technical records (architecture decisions, system design docs, operational runbooks) that a professional engineering org produces.

Where things stand

The platform is roughly 30% built. The document-processing pipeline (sift-service) exists and runs end-to-end, but its intelligence layer is mocked out — it uses pattern matching instead of actual language understanding, so it misses most of the valuable content in real documents. A parallel Go implementation (engine) has 25K lines of backend logic but none of it is connected to live API routes. The strongest asset is the specification layer: 60K+ lines of architecture documentation that precisely defines what Clari should do, how data flows between services, and what output formats look like. The specs are ahead of the code.
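
To illustrate why pattern matching misses most of the valuable content, here is a minimal sketch of label-based extraction. The pattern, function name, and sample text are hypothetical and are not taken from sift-service; they only show the general failure mode of matching on fixed labels.

```python
import re

# Hypothetical rule: only lines with an explicit "Decision:" label are captured.
DECISION_PATTERN = re.compile(r"^\s*Decision:\s*(.+)$", re.IGNORECASE | re.MULTILINE)


def extract_decisions(text: str) -> list[str]:
    """Return the text that follows each explicit 'Decision:' label."""
    return [match.strip() for match in DECISION_PATTERN.findall(text)]


# Two decisions are present, but only one carries the expected label.
sample = (
    "Decision: use Postgres for the job queue\n"
    "We went with Redis for caching after the load test.\n"
)

decisions = extract_decisions(sample)
print(decisions)  # the Redis decision, stated in plain prose, is not captured
```

The second sentence records a decision just as clearly as the first, but a fixed pattern cannot see it. Closing that gap is the job of the language-understanding layer that is currently mocked.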

Business impact

  • 6-8 focused sessions to reach a working prototype: feed in TASKSET files, get back structured Findings and Architecture Decision Records.
  • 10-14 sessions for the full product with Polar subscription tiers (free = basic extraction, paid = AI-powered synthesis), Git integration (auto-commit docs to the right repos), and cross-document analysis.
  • Once operational, Clari eliminates the manual step of converting agent session outputs into publishable documentation — a process currently done by hand after every build session.

Key takeaways

  1. Build on the Python pipeline, not the Go monolith. The Python sift-service ships faster, has MCP integration (agents can call it directly), and is the natural home for AI/LLM features. Go becomes the production gateway later.
  2. Polar tier gating from day one. The free tier gets rule-based extraction. Paid tiers get AI-powered semantic understanding and cross-document synthesis. This matches the consumer app architecture used elsewhere in the ecosystem.
  3. The self-documentation loop is the endgame. Agents produce TASKSETs. Clari transforms TASKSETs into project docs. Those docs feed back into agent context. This closes the knowledge cycle — no more documentation debt accumulating between sessions.
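
The tier gating in takeaway 2 could be sketched as a simple dispatch on subscription tier. Every name below (`Tier`, `rule_based_extract`, `ai_synthesize`, `process`) is an assumption made for illustration. How the tier is resolved from Polar is deliberately left out, and the paid path is a stub that makes no model call.

```python
from enum import Enum
from typing import Callable


class Tier(Enum):
    FREE = "free"
    PAID = "paid"


def rule_based_extract(text: str) -> dict:
    """Free tier stand-in: treat every non-empty line as a finding."""
    findings = [line.strip() for line in text.splitlines() if line.strip()]
    return {"mode": "rule-based", "findings": findings}


def ai_synthesize(text: str) -> dict:
    """Paid tier stand-in: a real version would call an LLM here."""
    base = rule_based_extract(text)
    return {"mode": "ai-synthesis", "findings": base["findings"], "summary": None}


# One table decides which pipeline a tier is allowed to run.
EXTRACTORS: dict[Tier, Callable[[str], dict]] = {
    Tier.FREE: rule_based_extract,
    Tier.PAID: ai_synthesize,
}


def process(text: str, tier: Tier) -> dict:
    """Run the extractor that the caller's tier is entitled to."""
    return EXTRACTORS[tier](text)


free_result = process("Chose Python pipeline\n\nDeferred Go gateway", Tier.FREE)
paid_result = process("Chose Python pipeline", Tier.PAID)
```

Keeping the gate in a single lookup table means a new tier or a changed entitlement touches one place rather than every call site.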

Action items

  • Fix 4 critical bugs in sift-service before any new features (broken package name, mocked LLM, empty outputs, rigid extraction patterns).
  • Start TASKSET F-1 (TASKSET-aware document parsing) as the first real feature build.
  • Resolve Go-vs-Python implementation divergence before it compounds further.

See also: Technical findings