Swarm Decompose — split one instruction into N independent subtasks

The swarm decompose engine is the lead/decompose step of the agent swarm (#577, part of #419): it turns one free-text instruction into a validated, machine-readable plan of N independent, non-overlapping subtasks — the foundation every other swarm step builds on.

It implements the division of labour that the parallel-orchestration research (docs/research/parallel-agent-orchestration.md) named as the single “unsolved piece”. The semantic split (which subtasks exist and what each one owns) stays the lead agent’s judgement, exactly as the article frames it (“Let the lead decompose”). This engine is the deterministic safety and emission backbone around that judgement: it validates the plan, analyses cross-subtask file ownership for overlap, detects pinch points, computes a fan-out suitability verdict, and emits both machine- and human-readable forms.

It never spawns workers. The decompose step only produces the plan. The dispatch step (the cdfork branch model, or the in-session Workflow lane) consumes it.

When to use

Before fanning out a broad instruction across parallel agents, to get a clean, conflict-free division of labour you can trust.
To answer “should this even fan out?” — the verdict recommends a single agent when there are fewer than three genuinely independent pieces.
To produce a cdfork-compatible branch list from a fresh instruction (not just pre-existing approved issues, which cdfork --from-issues already covers).

Run it all: `/swarm`

The whole pipeline — decompose → approve → dispatch → gate → grade → integrate — is wrapped in one command (#582). Use it once the plan is approved; the sections below document each step it composes.

swarm run [--cap <n>] [--dry-run] [--finalize]   # or /swarm run …
swarm plan --plan plan.json [--cap <n>]
swarm summary

Composes, never reimplements. swarm run drives the existing primitives by their CLIs — approve.js verify → lane routing (#583) → cdfork fork and/or the in-session Workflow lane (fan-out + worktree isolation) → the #579 gate → the #580 grader → integrate.js. It adds only what the primitives lack.
Two lanes, routed by shape (#583). Heavyweight tickets fan out to cdfork; wide mechanical work runs in-session via the native Workflow tool. Both feed the same gate, grader, integrator, and summary — see Two dispatch lanes.
Approval-gated. It refuses to dispatch unless .swarm/approved-plan.json exists and its hash still matches — the plan cannot drift between approval and spawn.
Concurrency ceiling (--cap, default 4). Bounds simultaneously-running workers (cdfork itself is uncapped); excess subtasks queue and start as slots free.
Idempotent resume. Run state lives in .swarm/run.json; re-invoking is a safe resume — completed workers aren’t relaunched and the summary is recomputed.
One summary. swarm summary lists every branch as passed / failed / needs-attention (and merged once integrated).
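
As a rough illustration of how the concurrency ceiling and the idempotent resume could interact, the sketch below picks which queued workers to start next. The layout of .swarm/run.json is not documented here, so the `{ workers: { <id>: { status } } }` shape, the status names, and the `nextToLaunch` helper are all hypothetical.

```javascript
// Sketch only: the state shape and status names are assumptions, not the
// real .swarm/run.json format.
function nextToLaunch(state, cap = 4) {
  const workers = Object.entries(state.workers);
  const running = workers.filter(([, w]) => w.status === "running").length;
  const freeSlots = Math.max(0, cap - running);
  // Only queued workers are candidates, so completed workers are never
  // relaunched when the command is re-invoked.
  return workers
    .filter(([, w]) => w.status === "queued")
    .slice(0, freeSlots)
    .map(([id]) => id);
}

const state = {
  workers: {
    "api-routes": { status: "done" },
    "product-grid": { status: "running" },
    "docs": { status: "queued" },
    "search": { status: "queued" },
  },
};
console.log(nextToLaunch(state, 2));
```

With one worker already running and a cap of 2, only one queued worker is started; the other waits for a free slot.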

Preview the exact composition without spawning anything with swarm run --dry-run. Full reference: the /swarm command.

The three commands

node scripts/swarm/decompose.js scaffold "<instruction>"
node scripts/swarm/decompose.js analyze  <plan.json|->  [--format json|human|both]
node scripts/swarm/decompose.js branches <plan.json|->  [--include-serial]

1. `scaffold` — free-text instruction → lead brief

scaffold is the entry point for a raw instruction. It does not (and cannot, deterministically) invent the split — it emits the brief the lead agent fills to produce a schema-valid plan, keeping the judgement where it belongs while guaranteeing the output is machine-consumable.

node scripts/swarm/decompose.js scaffold "Add a product catalog to the storefront"

2. `analyze` — validate + annotate

analyze takes a candidate plan (authored against schemas/swarm-plan.schema.json) and returns it annotated with fan_out, overlaps, and pinch_points. --format selects the output: json (machine), human (readable), or both (JSON on stdout, human on stderr — the default).

node scripts/swarm/decompose.js analyze plan.json --format human

Exit codes let CI / the dispatch step gate on the result:

Exit	Meaning
`0`	Valid plan, no file-ownership overlaps — safe to fan out.
`3`	Valid plan, but overlaps remain — resolve before fan-out.
`1`	Invalid plan (schema or semantic error).
`2`	Usage error.
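
One way a dispatch script might gate on these exit codes is sketched below. The exit-code meanings come from the table above; the `action` labels and both helper names are hypothetical, and the script path assumes the command is run from the repository root.

```javascript
const { spawnSync } = require("node:child_process");

// Exit-code meanings follow the table above; the action labels are
// illustrative names, not part of the real CLI.
const ANALYZE_EXIT = {
  0: { ok: true, action: "fan-out" },
  3: { ok: false, action: "resolve-overlaps" },
  1: { ok: false, action: "fix-plan" },
  2: { ok: false, action: "fix-invocation" },
};

function interpretAnalyzeExit(code) {
  // Unlisted codes (or a null status from a failed spawn) are never "ok".
  return ANALYZE_EXIT[code] ?? { ok: false, action: "unknown" };
}

// Runs `analyze` and pairs its stdout with the interpreted exit code.
function analyzePlan(planPath) {
  const result = spawnSync(
    "node",
    ["scripts/swarm/decompose.js", "analyze", planPath, "--format", "json"],
    { encoding: "utf8" }
  );
  return { ...interpretAnalyzeExit(result.status), stdout: result.stdout };
}
```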

3. `branches` — emit a cdfork branch list

branches emits the same tab-separated <branch>\t<id>\t<title> contract that cdfork fork already consumes as the output of from-issues.sh. serial_only subtasks (pinch points) are excluded by default so they are never fanned out; pass --include-serial to override.

node scripts/swarm/decompose.js branches plan.json
# feat/api-routes-...  api-routes  Add the JSON:API product endpoints
# feat/product-grid-...  product-grid  Build the Next.js product grid page
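
For a consumer other than cdfork, the three-column contract can be parsed in a few lines. This is a minimal sketch: `parseBranchList` is a hypothetical helper, and treating any extra tab as part of the title is an assumption.

```javascript
// Parses the <branch>\t<id>\t<title> lines emitted by `branches`.
function parseBranchList(tsv) {
  return tsv
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => {
      const [branch, id, ...rest] = line.split("\t");
      // Assumption: a stray tab inside a title belongs to the title.
      return { branch, id, title: rest.join("\t") };
    });
}

const sample = "feat/example-branch\tapi-routes\tAdd the JSON:API product endpoints\n";
console.log(parseBranchList(sample));
```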

The plan format

A plan is one instruction plus a list of subtasks, each owning a disjoint set of globs. Authored fields are required; the engine computes the rest. Full contract: schemas/swarm-plan.schema.json (a sample lives at schemas/fixtures/swarm-plan.sample.json).

{
  "instruction": "Add a product catalog to the storefront",
  "subtasks": [
    {
      "id": "api-routes",
      "title": "Add the JSON:API product endpoints",
      "owned_globs": ["src/api/products/**"],
      "deliverable": "a Hono router exporting product list/detail handlers",
      "verification": "npm test -- products"
    }
  ]
}

Field	Required	Purpose
`id`	✓	kebab-case, unique; substitutes for an issue number in the TSV.
`title`	✓	Drives the derived `<kind>/<id>-<slug>` branch name.
`owned_globs`	✓	The files this subtask is the sole writer of.
`deliverable`	✓	The concrete artifact produced.
`verification`	✓	A command that proves it is done.
`branch`		Explicit branch name (else derived from the title).
`serial_only`		Force out of the fan-out (engine sets it for detected pinch points).
`depends_on`		Ids that must finish first; excludes this subtask from the independent count.
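
The required-field rules in the table can be approximated with a small pre-check before a plan reaches analyze. This is only a sketch: the authoritative contract is schemas/swarm-plan.schema.json, and the kebab-case regular expression and the `validateSubtasks` helper are assumptions made for illustration.

```javascript
const REQUIRED = ["id", "title", "owned_globs", "deliverable", "verification"];
// Assumed reading of "kebab-case": lowercase alphanumerics joined by hyphens.
const KEBAB = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Returns a list of human-readable problems; an empty list means the
// subtasks pass this (partial) check.
function validateSubtasks(subtasks) {
  const errors = [];
  const seen = new Set();
  subtasks.forEach((task, i) => {
    for (const field of REQUIRED) {
      if (task[field] === undefined) errors.push(`subtasks[${i}]: missing ${field}`);
    }
    if (typeof task.id === "string") {
      if (!KEBAB.test(task.id)) errors.push(`subtasks[${i}]: id "${task.id}" is not kebab-case`);
      if (seen.has(task.id)) errors.push(`subtasks[${i}]: duplicate id "${task.id}"`);
      seen.add(task.id);
    }
  });
  return errors;
}
```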

How the analysis works

Fan-out verdict (AC-02). A subtask is independent when it is not serial_only, has no depends_on, and is in no overlap. With ≥ 3 independent subtasks the verdict is fan-out; with fewer, the engine recommends a single-agent run, following the article’s rule of thumb.
Overlap analysis (AC-03). Every pair of subtasks is compared with a recursive **-aware glob matcher (plus a literal directory-ancestor rule). Any two touching the same path are flagged, never silently emitted — a non-empty overlaps list means the decomposition is not cleanly independent.
Pinch-point detection (AC-04). Subtasks touching a single-source-of-truth path — lockfiles, migrations, build/CI config, deploy artifacts — are detected and forced serial_only so they are kept out of the fan-out. The matcher is precise: a broad src/api/** is not falsely flagged just because it could one day contain a migrations/ dir. This reuses the same serial-only convention cdfork already understands.
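
The fan-out verdict rule can be sketched as a pure function over the subtasks. The threshold of three and the definition of independent come from the text above; the `single-agent` label, the shape of the return value, and passing overlaps as a set of ids are assumptions.

```javascript
// A subtask counts as independent when it is not serial_only, has no
// depends_on entries, and appears in no overlap.
function fanOutVerdict(subtasks, overlappingIds = new Set()) {
  const independent = subtasks.filter(
    (t) =>
      !t.serial_only &&
      !(Array.isArray(t.depends_on) && t.depends_on.length > 0) &&
      !overlappingIds.has(t.id)
  );
  return {
    independent: independent.map((t) => t.id),
    // Three or more independent pieces justify a fan-out.
    verdict: independent.length >= 3 ? "fan-out" : "single-agent",
  };
}
```

Note that a subtask caught in an overlap lowers the independent count, so resolving overlaps can flip the verdict.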

Approval checkpoint — before any spawn

Decomposition produces a plan; it does not authorise a fan-out. The article’s single most-emphasised human gate sits between decompose and dispatch: review the plan before anyone spawns, because a wrong dependency here becomes a tangled merge later. scripts/swarm/approve.js (#578) makes that gate mechanical — dispatch can only proceed against a recorded, hash-verified approval.

node scripts/swarm/approve.js review <plan.json|->  [--format json|human|both]
node scripts/swarm/approve.js record <plan.json|->  --decision approve|revise --approver <who> [--out <path>] [--note <text>]
node scripts/swarm/approve.js verify <approval.json|->
node scripts/swarm/approve.js gate   [--root <dir>] [--format json|human]

The protocol

review surfaces the article’s checklist on the analyzed plan: any two subtasks touching the same files (blocking); any deliverable that is not independently verifiable, meaning a placeholder verification or one command shared verbatim across subtasks (warning); and any suspect dependency ordering, meaning a depends_on cycle (blocking) or a dependency on a serial-only pinch point (warning). Exit codes: 0 approvable, 3 blocking, 1 invalid.
gate resolves how approval must be obtained from the project’s automation level and RDB setting. At Level 3 it is a mandatory stop — a human-referee step that must not be auto-approved; with RDB enabled the prompt routes through ask_remote, otherwise the terminal.
Present the review to the human and get an explicit approve / revise.
- Revise → record --decision revise logs the rejection and writes no approved plan. Return to decomposition and re-plan.
- Approve → record --decision approve writes a content-hashed approval artifact (schemas/swarm-approval.schema.json, default .swarm/approved-plan.json). record refuses to approve a plan with blocking findings — the gate is not advisory.
verify is the dispatch-side gate: it authorises a spawn only when the artifact’s decision is approve and the recorded plan_hash still matches its embedded plan. The plan therefore cannot drift between approval and spawn — the swarm runs against exactly what was approved.

The gate is mechanical, not advisory. Blocking findings can’t be approved; an unapproved or modified plan can’t pass verify; at Level 3 the checkpoint can’t be auto-skipped. The dispatch step (#582) and the integrator (#581) both operate against the verified artifact.
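
The drift protection can be pictured as a content hash over the embedded plan. The sketch below is not the real approve.js: the field names decision, plan_hash, and plan come from the text above, while SHA-256, the key-sorted serialisation, and the helper names are assumptions.

```javascript
const { createHash } = require("node:crypto");

// Stable serialisation: object keys are sorted so key order cannot change
// the hash. (Assumed canonical form, for illustration only.)
function canonical(value) {
  if (Array.isArray(value)) return `[${value.map((v) => canonical(v)).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const body = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonical(value[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

function hashPlan(plan) {
  return createHash("sha256").update(canonical(plan)).digest("hex");
}

// Authorises a spawn only for an approved artifact whose plan is unchanged.
function verifyApproval(artifact) {
  if (artifact.decision !== "approve") {
    return { authorised: false, reason: "decision is not approve" };
  }
  if (artifact.plan_hash !== hashPlan(artifact.plan)) {
    return { authorised: false, reason: "plan changed after approval" };
  }
  return { authorised: true };
}
```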

Two dispatch lanes — cdfork vs Workflow (#583)

A swarm has two dispatch lanes, and they are complementary, not competing. scripts/swarm/lane.js routes each subtask to the lane that fits its shape, with an explicit, explained verdict — surfaced in the /swarm dry-run plan:

node scripts/swarm/lane.js route --plan plan.json --format human

Lane	Shape	When	Mechanism
cdfork	N full warp-drive sessions, one per branch, in tmux windows on the machine where the code lives	heavyweight feature/fix tickets that each need a full dev loop	`cdfork fork …` — worktree isolation per session
Workflow	tens-to-hundreds of lightweight subagents in one session	wide, mechanical, independent fan-out (docs per module, an API migration across 30 files, a benchmark matrix)	the native Claude Code Workflow tool — `pipeline()` with `isolation: worktree` + the native concurrency cap

The router’s bias is conservative (AC-04): a subtask routes to the Workflow lane only when it looks mechanical (title verb: document / update / rename / format / migrate …) and narrowly scoped (no broad recursive globs). Heavyweight verbs, broad globs, depends_on, and serial_only pinch points all keep a subtask on cdfork — the safe full-dev-loop default. A too-small Workflow set is folded back to cdfork (the wide-fan-out win needs width), and the demotion is explained.
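
The conservative routing bias can be sketched as follows. The verb list is the one quoted above; what counts as a broad glob, the minimum Workflow width of 3, and the `routeLanes` helper are assumptions, not the behaviour of lane.js.

```javascript
const MECHANICAL_VERBS = ["document", "update", "rename", "format", "migrate"];
const MIN_WORKFLOW_WIDTH = 3; // hypothetical threshold

// Assumption: a glob is "broad" when ** appears in its first two segments.
function isBroadGlob(glob) {
  const index = glob.split("/").indexOf("**");
  return index !== -1 && index < 2;
}

function routeLanes(subtasks) {
  const routes = subtasks.map((t) => {
    const verb = t.title.trim().split(/\s+/)[0].toLowerCase();
    const mechanical = MECHANICAL_VERBS.includes(verb);
    const narrow = !t.owned_globs.some(isBroadGlob);
    const unconstrained =
      !t.serial_only && !(Array.isArray(t.depends_on) && t.depends_on.length > 0);
    // Anything that is not clearly mechanical and narrow stays on cdfork.
    const lane = mechanical && narrow && unconstrained ? "workflow" : "cdfork";
    return { id: t.id, lane };
  });
  const width = routes.filter((r) => r.lane === "workflow").length;
  if (width > 0 && width < MIN_WORKFLOW_WIDTH) {
    // A too-small Workflow set is folded back, with the reason recorded.
    for (const r of routes) {
      if (r.lane === "workflow") {
        r.lane = "cdfork";
        r.reason = "workflow set too small";
      }
    }
  }
  return routes;
}
```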

The Workflow lane runs a generated script. Because the Workflow tool is invoked by the orchestrator in-session (not via a CLI), scripts/swarm/workflow-lane.js emits the program to run:

node scripts/swarm/workflow-lane.js emit --plan plan.json   # prints the Workflow script

The emitted script is a pipeline() over the Workflow-lane workers. Each worker flows through implement (in an isolation: worktree subagent) → verify, where verify runs the same #579 gate + #580 grade/revise loop per worker (AC-03) and captures to .swarm/{gate,grade}-results, the same artifacts the integrator (#581) and summary (#582) read. Quality enforcement and the downstream merge are therefore lane-agnostic (AC-05): a branch folds in identically regardless of which lane produced it.

Selection is explicit, not magic. swarm run --dry-run prints the lane split (lanes: N cdfork, M workflow) and a lane-route step before any spawn, so you see exactly what runs where.

Result gate — before any result folds back in

Approval gates the plan; the result gate (#579) gates each worker’s output. When a worker finishes, its branch must clear deterministic non-negotiables before it is eligible for integration — the article’s “gate every result with a hook” step. scripts/swarm/result-gate.js enforces four checks:

tests — the project’s detected test command passes,
lint — the detected lint command passes (absent ⇒ skipped, not failed),
secrets — no secret patterns in the diff’s added lines (values are masked in output, never echoed),
scope — every changed file is within the subtask’s #577 owned_globs; an out-of-scope write fails the gate.

node scripts/swarm/result-gate.js check --plan plan.json --subtask <id> --base <ref> --cwd <worktree> --worker <id>
node scripts/swarm/result-gate.js summary [--results-dir <dir>] [--format json|human]

A failing check makes the worker ineligible and records the reasons (feeding the grader / revise loop, #580). Each verdict is captured to .swarm/gate-results/<worker>.json; summary rolls them up for the end-of-run report (#582).
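
The scope check is the easiest of the four to picture in code. The sketch below uses a small glob-to-RegExp translation that handles only `*`, `?`, and `**`; it is an illustrative stand-in, not the matcher result-gate.js actually uses, and both helper names are hypothetical.

```javascript
// Minimal glob translation: "*" and "?" stay within one path segment,
// "**" crosses directories, and "**/" may also match zero directories.
function globToRegExp(glob) {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const c = glob[i];
    if (c === "*" && glob[i + 1] === "*") {
      i++;
      if (glob[i + 1] === "/") {
        i++;
        source += "(?:.*/)?";
      } else {
        source += ".*";
      }
    } else if (c === "*") {
      source += "[^/]*";
    } else if (c === "?") {
      source += "[^/]";
    } else {
      // Escape regular-expression metacharacters in literal text.
      source += c.replace(/[.+^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${source}$`);
}

// Returns the changed files that fall outside every owned glob.
function outOfScope(changedFiles, ownedGlobs) {
  const matchers = ownedGlobs.map(globToRegExp);
  return changedFiles.filter((file) => !matchers.some((re) => re.test(file)));
}
```

A non-empty result from `outOfScope` corresponds to a failed scope check under this sketch.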

Two substrates

In-session (Workflow lane): the SubagentStop hook swarm-result-gate.sh runs the gate when a worker subagent finishes and emits {"decision":"block"} on an ineligible result, forcing a revise pass. It reads the worker’s owned globs + base ref from the environment (SWARM_OWNED_GLOBS or SWARM_PLAN+SWARM_SUBTASK, SWARM_BASE), fails open for non-swarm subagents, and respects stop_hook_active to avoid re-block loops. Register it under SubagentStop in ~/.claude/settings.json.
cdfork lane: run result-gate.js check --cwd <worktree> --base <main> per branch — the same engine, applied to each worktree before integration.

The gate is deterministic and per-worker. One bad agent cannot quietly poison the integrated result: an ineligible branch is recorded ineligible and held out of the merge until it is revised.
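
The in-session hook's decision logic, as described above, reduces to three early exits and one block. The real hook is the shell script swarm-result-gate.sh; this JavaScript sketch only mirrors the described behaviour, and the `{ eligible, reasons }` verdict shape and the `hookDecision` name are assumptions.

```javascript
// input: the hook's JSON input; env: the process environment;
// gateVerdict: an assumed { eligible, reasons } result from the gate.
function hookDecision(input, env, gateVerdict) {
  const isSwarmWorker = Boolean(
    env.SWARM_OWNED_GLOBS || (env.SWARM_PLAN && env.SWARM_SUBTASK)
  );
  if (!isSwarmWorker) return {}; // fail open for non-swarm subagents
  if (input.stop_hook_active) return {}; // avoid re-block loops
  if (gateVerdict.eligible) return {};
  // An ineligible result blocks the stop and forces a revise pass.
  return { decision: "block", reason: gateVerdict.reasons.join("; ") };
}
```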

Grading — score each branch, revise the weak ones

The gate is pass/fail; the grader (#580) is what turns a fast swarm into a good one. It scores each gate-passing branch against a rubric and sends sub-bar branches back to their worker to revise (the article’s “grade the results and send back the weak ones” step). scripts/swarm/grade.js does the scoring:

node scripts/swarm/grade.js check   [--rubric <file>] --branch <name> --cwd <worktree> \
                                    [--gate-result <file>] [--review-cmd <cmd> | --review-verdict <file>]
node scripts/swarm/grade.js rubric  [--rubric <file>]          # print the effective rubric
node scripts/swarm/grade.js summary [--results-dir <dir>] [--format json|human]

The default rubric covers three dimensions:

Criterion	Kind	Passes when
`change-tests`	deterministic	a change-specific test exists among the changed files and the suite passes (per the #579 gate’s tests check)
`conventions`	review	the change adheres to the project `CLAUDE.md` conventions
`no-placeholders`	deterministic	no `TODO`/`FIXME`/placeholder/stub marker on an added diff line

The rubric is configurable (AC-01): drop a .swarm/rubric.json (or pass --rubric <file>) — an array of { id, kind, description } criteria, each deterministic (scored from the diff + gate result) or review.
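
A deterministic criterion such as no-placeholders can be scored from the diff alone. The sketch below scans only added lines of a unified diff; the exact marker pattern grade.js uses is not given here, so the regular expression and the `placeholderHits` helper are assumptions.

```javascript
// Assumed marker pattern. A real rubric would likely be stricter about
// bare words such as "placeholder" and "stub".
const PLACEHOLDER = /\b(?:TODO|FIXME)\b|placeholder|stub/i;

// Returns the added lines (without the leading "+") that carry a marker.
function placeholderHits(diff) {
  return diff
    .split("\n")
    .filter((line) => line.startsWith("+") && !line.startsWith("+++"))
    .map((line) => line.slice(1))
    .filter((line) => PLACEHOLDER.test(line));
}
```

Removed and context lines are ignored, so a branch that deletes an old TODO is not penalised for it.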

Reusing the code-reviewer agent (AC-05). review-kind criteria are scored by the existing code-reviewer agent — never a parallel reviewer. grade.js stays deterministic by treating the reviewer as an injected verdict; scripts/swarm/grade-review.sh is the live hook that runs code-reviewer over the branch diff via claude -p and prints the verdict JSON grade.js ingests. Without a verdict (--review-cmd/--review-verdict), review criteria skip (neutral) — the gate’s hard checks still stand.

The auto-revise loop (AC-03) is the grade-until-rubric-pass template on the generic loop runner (#424) — not a bespoke loop:

node scripts/loop/run.js ~/.claude/templates/loops/grade-until-rubric-pass.json --cwd <worktree>

Each iteration the worker revises only the failed criteria, then grade.js re-grades. The loop’s max_iterations is the retry ceiling; its telemetry and budget_exceeded halt are the loop runner’s, reused as-is.

Branches that exhaust the ceiling are flagged needs-attention, not merged (AC-04). A branch still graded sub-bar after the last revise pass is held by the integrator (--grade-results) and shown as needs-attention in the run summary; only branches that pass grading fold in.
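
The shape of the revise loop and its retry ceiling can be sketched with injected grade and revise callbacks. This is not the generic loop runner: `gradeUntilPass`, the `{ failed: [...] }` grade result, and the synchronous callbacks are simplifying assumptions.

```javascript
// grade() returns { failed: [criterionId, ...] }; revise(failed) asks the
// worker to fix only those criteria. Both are injected for illustration.
function gradeUntilPass({ grade, revise, maxIterations }) {
  let result = grade();
  let iterations = 0;
  while (result.failed.length > 0 && iterations < maxIterations) {
    revise(result.failed);
    iterations++;
    result = grade();
  }
  // A branch still failing after the ceiling is held, never merged.
  const status = result.failed.length === 0 ? "passed" : "needs-attention";
  return { status, iterations };
}
```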

Integration — the lead merges, not the workers

A swarm produces N graded branches; someone has to fold them back together. The article’s rule: the lead is the integration layer — workers never merge their own work. scripts/swarm/integrate.js (#581) is that single integrator.

node scripts/swarm/integrate.js order     --plan plan.json [--results-dir <dir>] [--grade-results <dir>]
node scripts/swarm/integrate.js integrate --plan plan.json [--results-dir <dir>] [--grade-results <dir>] [--target <branch>] \
                                          [--test-cmd <cmd>] [--strategy merge-commit|squash] [--dry-run] [--finalize]

Collect the integrable branches — workers the #579 result gate marked eligible, minus any the #580 grader scored sub-bar (held needs-attention via --grade-results, not merged). Ineligible workers are reported and skipped; ungraded branches pass through on their gate verdict (no-op when grading wasn’t run).
Order them by the #577 depends_on graph (Kahn topological sort), never arbitrary order. A branch whose dependency did not pass is held; a cycle blocks integration outright.
Merge each branch, in order, into the integration branch (the one set by warp session start <branch>). After every merge the full test suite runs; a failure halts before the next branch. A merge conflict halts and is aborted cleanly — surfaced for human resolution, never auto-resolved.
Ship with --finalize (or run warp finalize yourself): the existing ship path rebases the integration branch onto main and merges/PRs it. cdfork output flows into the established mechanism, not a new one.

Workers never merge to the integration target. The integrator is the sole writer of the integration branch; worker sessions only push their own feature branches. One bad branch can’t contaminate the others: it is held, or integration halts on it, rather than being silently folded in.
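
The ordering step names Kahn's algorithm, which can be sketched over the depends_on graph as below. The `integrationOrder` helper and its error messages are hypothetical; only the algorithm and the rule that a cycle blocks integration come from the text above.

```javascript
// Kahn topological sort: repeatedly emit subtasks whose dependencies have
// all been emitted. Any leftover subtasks imply a cycle.
function integrationOrder(subtasks) {
  const indegree = new Map(subtasks.map((t) => [t.id, 0]));
  const dependents = new Map(subtasks.map((t) => [t.id, []]));
  for (const t of subtasks) {
    for (const dep of t.depends_on ?? []) {
      if (!indegree.has(dep)) throw new Error(`unknown dependency: ${dep}`);
      indegree.set(t.id, indegree.get(t.id) + 1);
      dependents.get(dep).push(t.id);
    }
  }
  const queue = subtasks.filter((t) => indegree.get(t.id) === 0).map((t) => t.id);
  const order = [];
  while (queue.length > 0) {
    const id = queue.shift();
    order.push(id);
    for (const next of dependents.get(id)) {
      indegree.set(next, indegree.get(next) - 1);
      if (indegree.get(next) === 0) queue.push(next);
    }
  }
  if (order.length !== subtasks.length) {
    throw new Error("dependency cycle: integration blocked");
  }
  return order;
}
```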