Catalog

Areas of a software org we can audit

A working catalog. Each area lists concrete agent loops, harness/skills recipes, and platform-level work. Cross-cutting concerns live at the end.

Source: IDEAS.md · Scope: software development only

01 IDE & local developer workflow

Harness selection & house style

Project prompt files (CLAUDE.md, AGENTS.md, .cursorrules)

Plan-mode adoption as a team norm

Skills / commands library

Sub-agent recipes

02 Agent loops in the SDLC

Stages: Plan · Build · Review · Ship · Operate

Issue triage: label · dedupe · link
Ticket → spec: support investigation
Issue → PR: draft, repro, fix
Codemod / migration: large refactors
PR review: style · security · tests
CI / flake / coverage: quarantine · backfill
Release notes: user + internal
Canary watcher: promote / rollback
Incident: summary · postmortem
CVE / deps: bump · patch

Agent loops mapped to SDLC stages. Each is independently shippable as a pilot.

Issue / bug-tracker loops

Support ↔ engineering loops

PR / code review loops

CI / test loops

Dependency & security loops

Incident / on-call loops

Release / deploy loops

Multi-agent shape (when a loop fans out)

03 Agent surface UX (Defensive UX)

Engineering clients under-invest here because their users are other engineers: "they'll figure it out." They won't; they'll just stop using the agent. Fixing this is a 1–2 week deliverable that lifts adoption more than another month of prompt tuning. See → Patterns · Defensive UX.

Universal principles (from Yan, Microsoft HAI, Google PAIR, Apple HIG): set right expectations · enable efficient dismissal · provide attribution · anchor on familiarity · collect feedback in-flow.
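
As a rough illustration of how these principles might show up in an agent surface, the sketch below models a single suggestion as a small data structure. Every name in it (`AgentSuggestion`, `expectation_note`, `render`) is hypothetical and not taken from any of the cited guidelines; it is one possible shape, assuming a text-based surface such as a PR comment.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AgentSuggestion:
    """Hypothetical payload an agent surface renders; all field names are illustrative."""

    body: str
    # Set right expectations: say what the agent did and did not check.
    expectation_note: str
    # Provide attribution: files, tickets, or docs the suggestion drew on.
    sources: list[str]
    # Enable efficient dismissal: a single action, no confirmation step.
    dismissed: bool = False
    # Collect feedback in-flow: stored on the suggestion itself.
    feedback: list[str] = field(default_factory=list)

    def dismiss(self, reason: str | None = None) -> None:
        """Hide the suggestion and, if given, keep the reason as feedback."""
        self.dismissed = True
        if reason:
            self.feedback.append(f"dismissed: {reason}")

    def rate(self, verdict: str) -> None:
        """Record a quick verdict without leaving the surface."""
        self.feedback.append(verdict)


def render(s: AgentSuggestion) -> str:
    """Plain-text rendering shaped like a review comment (anchor on familiarity)."""
    if s.dismissed:
        return ""
    cites = ", ".join(s.sources) if s.sources else "no sources recorded"
    return f"{s.body}\n\nNote: {s.expectation_note}\nSources: {cites}"
```

Keeping the dismissal reason in the same record as the rating means both signals can feed the same feedback pipeline without a separate survey step.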

04 Production access for dev agents

Security note. The Supabase MCP / Cursor incident showed what happens when an MCP server holds credentials more privileged than the user the agent represents. See incident dossier.
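
The general principle behind that note can be sketched as a scope check: the agent should only exercise permissions that both the server credential and the represented user hold. The code below is a minimal sketch under that assumption. It does not reproduce the incident or any real MCP or Supabase API, and the `Principal`, `ServerCredential`, and scope names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """The human user the agent acts on behalf of (hypothetical model)."""

    name: str
    scopes: frozenset[str]


@dataclass(frozen=True)
class ServerCredential:
    """The credential a tool server holds (hypothetical model)."""

    label: str
    scopes: frozenset[str]


def effective_scopes(user: Principal, cred: ServerCredential) -> frozenset[str]:
    """Scopes the agent may exercise: the intersection, never more than the user has."""
    return user.scopes & cred.scopes


def excess_scopes(user: Principal, cred: ServerCredential) -> frozenset[str]:
    """Scopes the credential grants beyond the user; non-empty means over-privileged."""
    return cred.scopes - user.scopes


def authorize(user: Principal, cred: ServerCredential, required: str) -> bool:
    """Allow a tool call only if the required scope survives the intersection."""
    return required in effective_scopes(user, cred)


if __name__ == "__main__":
    dev = Principal("dev", frozenset({"db:read"}))
    service = ServerCredential(
        "service-role", frozenset({"db:read", "db:write", "db:admin"})
    )
    # An audit would flag the excess scopes; the runtime check would deny the write.
    print(sorted(excess_scopes(dev, service)))
    print(authorize(dev, service, "db:write"))
```

An audit can run `excess_scopes` across every configured server as a static check, while `authorize` shows where a runtime gate would sit if the credential cannot be narrowed.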

05 Code quality, refactors & migrations

06 Developer-facing chat assistant

07 Internal engineering knowledge

08 Cross-cutting · Platform, security, governance

Harness & provider strategy

Sandboxing & permissions

Guardrails (distinct from sandboxing & hooks)

Context engineering audit

Security & compliance

Evaluation & observability (the prerequisite, not the polish)

Feedback flywheel instrumentation

Cost hygiene

Enablement & curriculum

How we package engagements

Assessment (2–4 weeks): Three-phase map · eval gap · UX gap · risk
Pilot (4–8 weeks): Loops + evals + guardrails + UX + feedback
Scale & enable (ongoing): Platform · governance · curriculum

1 · Assessment (2–4 weeks) — deliverables

2 · Pilot (4–8 weeks) — ships with its supporting infra, not after

3 · Scale & enable (ongoing)