Outline
- Learning objectives: 9 states, lifecycle events, human gates, why done is computed
- Key concept: Brief is the atomic agent-executable work unit; done is computed, never claimed
- Diagram walkthrough: vertical lane showing all 9 states with events + gates
- The 9 states (inbox → outcome_attributed)
- The 6 key lifecycle events
- Human gates (🔑): only 3 require human approval by default
- Workflow narrative: one Brief's full journey from T+0 (signal) to T+45d (quarterly attribution)
- What this saves vs. the old way: time-per-Brief comparison table
- The bypass scenario: how policy overrides become observable signals
- Hands-on: 10-min staging walk-through (pm33_create_work_item, pm33_update_work_item, pm33_query_audit_log)
- Further reading: Brief spec, /brief skill, modules 3-4
Learning objectives
After this module you should be able to:
- Describe the 9 states a Brief moves through
- Identify which transitions are automatic vs. require human approval
- Name the 6 lifecycle events emitted during a Brief's life
- Explain why done is computed (not claimed)
Key concept
A Brief is the atomic unit of agent-executable work in PM33. It replaces the legacy "story" for anything an AI agent will execute. The Brief is a contract: machine-verifiable acceptance criteria, a TDD plan, an outcomeHook that defines how success will be measured, and explicit specialist + skill requirements.
The Brief's status moves through 9 states. Most transitions are automatic — driven by lifecycle events emitted by CI, the orchestrator, and the verification engine. A few transitions are human-gated — typically when an approval, policy override, or scope decision is required.
The crucial design rule: done is computed, not claimed. An agent cannot mark its own work done. The AutoDoneVerificationService observes signals (test pass, PR merge, deploy succeeded, outcome window opens) and flips the status itself. This eliminates the OUTCOMES-001 class of "agent said done but tests were mocked" failures.
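A minimal sketch of the "computed, not claimed" rule follows. The signal names and the function shape are assumptions made for illustration; the actual AutoDoneVerificationService interface is not shown in this module.

```typescript
// Hypothetical signal shape; the real service may observe different fields.
interface VerificationSignals {
  testsPassed: boolean;
  prMerged: boolean;
  deploySucceeded: boolean;
}

type ReviewOutcome = "in_review" | "done";

// The status is derived from observed signals only.
// Whatever the agent claims is accepted as input and deliberately ignored.
function computeStatus(
  signals: VerificationSignals,
  _agentClaim?: ReviewOutcome,
): ReviewOutcome {
  const allGreen =
    signals.testsPassed && signals.prMerged && signals.deploySucceeded;
  return allGreen ? "done" : "in_review";
}
```

With this shape, an agent that passes "done" as its claim while the PR is still unmerged gets "in_review" back, which is the behavior the design rule asks for.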
Diagram walkthrough
Open 02-brief-lifecycle.excalidraw.
The diagram shows a vertical lane with 9 states stacked top to bottom. To the right of each state, a "lifecycle event" label shows what triggers entry into that state. To the left, a "human gate" icon (🔑) appears where human approval is required.
The 9 states
- inbox: Raw idea, just filed. May live in docs/reference/TECHNICAL_DEBT.md (markdown inbox) or come from a customer signal. Not yet a tracked work item.
- promoted: Lifted from inbox into PM33 as a work item. Has a UUID. Indexed and searchable. Not yet refined.
- backlog: Refined enough to be a real Brief. Has machine-verifiable AC, TDD phases defined, specialist class assigned. Ready to be sprinted but not yet placed.
- planned: Placed in a sprint. Dependencies resolved. Capacity-aware. Visible in sprint planning UI.
- in_progress: Agent picked it up. Per-agent worktree spawned. RED test written.
- in_review: Agent finished, PR opened, independent reviewer + CI gates running.
- done: All verification gates passed. Code merged. Status auto-flipped by the verification engine.
- outcome_tracking: The outcomeHook window is open. PM33 is measuring whether the predicted metric movement actually happened. (This is the differentiator: most tools stop at "done.")
- outcome_attributed: The metric moved (or didn't). The AR(1) model updated. The Brief's contribution is recorded in the strategic objective's audit trail.
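The nine states form a single ordered lane, which can be modeled as a tuple of string literals. This is a sketch only; the `BRIEF_STATES` constant and `nextState` helper are hypothetical names, not part of the PM33 API.

```typescript
// The 9 states in lane order, top to bottom.
const BRIEF_STATES = [
  "inbox",
  "promoted",
  "backlog",
  "planned",
  "in_progress",
  "in_review",
  "done",
  "outcome_tracking",
  "outcome_attributed",
] as const;

type BriefState = (typeof BRIEF_STATES)[number];

// Returns the state that follows `state` in the lane, or null at the end.
function nextState(state: BriefState): BriefState | null {
  const index = BRIEF_STATES.indexOf(state);
  return BRIEF_STATES[index + 1] ?? null;
}
```

Deriving the `BriefState` type from the array keeps the list of states and the type in one place, so adding a state cannot leave the two out of sync.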
The 6 key lifecycle events
These are the structured signals the platform emits on transitions:
- inbox_filed → state becomes inbox
- promoted_from_inbox → state becomes promoted
- brief_authored → state becomes backlog
- sprint_assigned → state becomes planned
- agent_picked_up → state becomes in_progress
- pr_opened → state becomes in_review
- pr_merged + verification_passed → state becomes done
- outcome_window_opened → state becomes outcome_tracking
- outcome_measured → state becomes outcome_attributed
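The mapping above can be sketched as a small replay function that folds a sequence of events into a state. The function and constant names are hypothetical, and the sketch does not validate event ordering, which a real orchestrator presumably would. The one subtlety it does capture is that done needs both pr_merged and verification_passed.

```typescript
type BriefState =
  | "inbox" | "promoted" | "backlog" | "planned" | "in_progress"
  | "in_review" | "done" | "outcome_tracking" | "outcome_attributed";

type LifecycleEvent =
  | "inbox_filed" | "promoted_from_inbox" | "brief_authored"
  | "sprint_assigned" | "agent_picked_up" | "pr_opened"
  | "pr_merged" | "verification_passed"
  | "outcome_window_opened" | "outcome_measured";

type SingleEvent = Exclude<LifecycleEvent, "pr_merged" | "verification_passed">;

// Events that move the Brief on their own.
const SINGLE_EVENT_TARGET: Record<SingleEvent, BriefState> = {
  inbox_filed: "inbox",
  promoted_from_inbox: "promoted",
  brief_authored: "backlog",
  sprint_assigned: "planned",
  agent_picked_up: "in_progress",
  pr_opened: "in_review",
  outcome_window_opened: "outcome_tracking",
  outcome_measured: "outcome_attributed",
};

// Replays events in order and returns the resulting state (null if none).
function replay(events: LifecycleEvent[]): BriefState | null {
  let state: BriefState | null = null;
  let merged = false;
  let verified = false;
  for (const event of events) {
    if (event === "pr_merged") merged = true;
    else if (event === "verification_passed") verified = true;
    else state = SINGLE_EVENT_TARGET[event];
    // done is reached only once both signals have been observed.
    if (merged && verified && state === "in_review") state = "done";
  }
  return state;
}
```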
Each event carries a payload: { work_item_id, actor, timestamp, correlation_id, metadata }. The audit log records every one.
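As a sketch, the payload and an append-only audit log might be typed as follows. The five field names come from the payload above; the field types (for example, an ISO 8601 string for `timestamp`) and the `AuditLog` class are assumptions for illustration.

```typescript
// Field names are from the module; the types are assumed.
interface LifecycleEventPayload {
  work_item_id: string;
  actor: string;
  timestamp: string; // assumed ISO 8601
  correlation_id: string;
  metadata: Record<string, unknown>;
}

interface AuditRecord {
  event: string;
  payload: LifecycleEventPayload;
}

// Hypothetical in-memory stand-in for the audit log: append and query only.
class AuditLog {
  private readonly records: AuditRecord[] = [];

  record(event: string, payload: LifecycleEventPayload): void {
    this.records.push({ event, payload });
  }

  forWorkItem(workItemId: string): AuditRecord[] {
    return this.records.filter((r) => r.payload.work_item_id === workItemId);
  }
}
```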
Human gates (🔑)
Only 3 transitions require human approval by default (a workspace can customize this):
- promoted → backlog: a PM confirms the work is in-scope and the AC are right
- in_review → done: for changes touching auth, schema, or other gated zones (per CLAUDE.md policy)
- outcome_attributed → strategic_objective_updated: an exec or owner confirms the metric movement is meaningful (vs. noise)
Everything else is automatic.
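The default gate policy can be sketched as a lookup plus one conditional rule. The function name, the `touchesGatedZone` flag, and the way a workspace passes a custom gate set are all assumptions; the module does not show how PM33 stores this policy.

```typescript
// Default human-gated transitions, keyed as "from->to".
const DEFAULT_GATES: ReadonlySet<string> = new Set([
  "promoted->backlog",
  "in_review->done",
  "outcome_attributed->strategic_objective_updated",
]);

interface TransitionContext {
  // True when the change touches auth, schema, or another gated zone.
  touchesGatedZone: boolean;
}

function requiresHumanApproval(
  from: string,
  to: string,
  context: TransitionContext,
  gates: ReadonlySet<string> = DEFAULT_GATES, // a workspace could pass its own set
): boolean {
  const key = `${from}->${to}`;
  if (!gates.has(key)) return false;
  // in_review -> done is gated only for changes in gated zones.
  if (key === "in_review->done") return context.touchesGatedZone;
  return true;
}
```

Under this sketch, a performance-only change moves from in_review to done without a human, while the same transition for a schema change waits for approval.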
Workflow narrative
Let's follow one Brief from inbox to outcome.
T+0 (Friday, week 0). A customer support ticket comes in: "It takes 4 minutes to load the Sprint Planning page when I have >500 work items." The CS agent tags it voc:performance in Zendesk. The Zendesk → PM33 bridge fires inbox_filed. The Brief enters inbox state. Status: not yet a tracked work item, just a raw signal.
T+3d (Monday, week 1). Pam's morning sweep runs. It finds the inbox entry, scores it against active strategic objectives. The closest match: "Make PM33 usable for workspaces > 1000 work items" — a priority-2 objective. Pam emits promoted_from_inbox. The Brief gets a UUID, enters promoted state. Sarah (the PM owning that area) sees it in her queue.
T+3d, afternoon. Sarah reviews. She thinks "yes, this is real, and we have an idea why." She refines: changes the title to "Sprint Planning page p95 > 3s for workspaces with 500+ work items", adds machine-verifiable AC (p95 < 800ms after the fix, measured via the existing /metrics endpoint), assigns specialist = performance-engineer, llmTier = sonnet. She approves. Lifecycle event brief_authored fires. State → backlog.
T+5d (Wednesday). Sprint 49 planning. The capacity scheduler places the Brief in Sprint 49 (it fits within available capacity and has no upstream dependencies). Lifecycle event sprint_assigned fires. State → planned.
T+8d (Monday, week 2 — Sprint 49 day 1). A specialist agent claims the Brief. The harness coordinator spawns a per-agent git worktree at .claude/worktrees/agent-perf-sprint-page-rl47x9, runs agent-init.sh to activate the per-agent git index. The agent writes the RED test:
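it('Sprint Planning page p95 < 800ms with 500+ work items', async () => {
  // Seed a workspace large enough to reproduce the customer's report
  await seedWorkItems(500);
  // Take 50 page-load timing samples (in ms)
  const samples = await samplePageLoad(50);
  // The AC: 95th-percentile load time must be under 800ms
  expect(percentile(samples, 0.95)).toBeLessThan(800);
});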
Test fails (p95 = 4200ms), which is the expected RED result. The agent_picked_up event already fired when the agent claimed the Brief, so the state is in_progress. TDD discipline is now enforced: the agent cannot proceed until the test goes green.
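The RED test calls a `percentile` helper that this module does not define. One plausible implementation is the nearest-rank method, sketched below; whether PM33's test utilities use nearest-rank or an interpolating method is an assumption.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p * 100% of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile of an empty sample set is undefined");
  }
  if (p <= 0 || p > 1) {
    throw new RangeError("p must be in the range (0, 1]");
  }
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil(p * sorted.length); // 1-based rank
  return sorted[rank - 1]!;
}
```

With 50 samples and p = 0.95, the rank is 48, so the test passes only if at least 48 of the 50 page loads finish in under 800ms.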
T+8d, afternoon. Agent investigates. Finds an N+1 query in SprintPlanningQuery.tsx that fetches each work item's dependencies individually. Rewrites to use BacklogQueryService.getDataset(mode='full') which batches the deps query. Re-runs the test: p95 = 640ms. Test passes. Commits to the worktree branch.
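To make the N+1 pattern concrete, here is a self-contained sketch that counts queries against an in-memory store. `FakeDependencyStore` and both loader functions are hypothetical stand-ins; they do not reproduce SprintPlanningQuery.tsx or BacklogQueryService.

```typescript
interface Dependency {
  workItemId: string;
  dependsOn: string;
}

// In-memory stand-in for a database that counts round trips.
class FakeDependencyStore {
  queryCount = 0;
  private readonly rows: Dependency[];

  constructor(rows: Dependency[]) {
    this.rows = rows;
  }

  findByWorkItem(id: string): Dependency[] {
    this.queryCount += 1;
    return this.rows.filter((r) => r.workItemId === id);
  }

  findByWorkItems(ids: string[]): Dependency[] {
    this.queryCount += 1;
    const wanted = new Set(ids);
    return this.rows.filter((r) => wanted.has(r.workItemId));
  }
}

// N+1 shape: one query per work item.
function loadOneByOne(store: FakeDependencyStore, ids: string[]): Map<string, string[]> {
  const result = new Map<string, string[]>();
  for (const id of ids) {
    result.set(id, store.findByWorkItem(id).map((r) => r.dependsOn));
  }
  return result;
}

// Batched shape: one query for all work items, grouped in memory.
function loadBatched(store: FakeDependencyStore, ids: string[]): Map<string, string[]> {
  const result = new Map<string, string[]>();
  for (const id of ids) result.set(id, []);
  for (const row of store.findByWorkItems(ids)) {
    result.get(row.workItemId)?.push(row.dependsOn);
  }
  return result;
}
```

Both loaders return the same map, but for 500 work items the first issues 500 queries and the second issues one, which is the kind of difference that shows up in p95 latency.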
T+8d, evening. PR opens. Lifecycle event pr_opened fires. State → in_review. The independent code-reviewer agent fires. It checks: schema-drift gate (no schema changes), ESLint (clean), pre-existing test suite still green, security check (no auth changes). All pass. Reviewer posts gh pr review --approve.
T+9d (Tuesday, week 2). PR merges. Lifecycle events pr_merged + verification_passed fire. The Brief auto-flips to done. Sarah didn't have to lift a finger — the workspace policy didn't require human approval for performance-only changes.
T+9d, 1 hour later. Lifecycle event outcome_window_opened fires. State → outcome_tracking. The outcomeHook for this Brief says: "track sprint_planning_page_p95 for 7 days post-deploy, attribute to this Brief if delta > 50%."
T+16d (next Tuesday). The window closes. Realized data: p95 dropped from 4200ms → 580ms (86% delta, well above the 50% threshold). Lifecycle event outcome_measured fires. State → outcome_attributed. The strategic objective "Make PM33 usable for workspaces > 1000 work items" gets credit for this Brief (along with the 6 other Briefs that contributed during the measurement window).
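The attribution check in this step is a relative-delta comparison. The sketch below reproduces the arithmetic from the narrative; the `OutcomeHook` field names are assumptions, since the actual outcomeHook schema lives in the Brief spec.

```typescript
// Hypothetical shape for the hook described in the narrative.
interface OutcomeHook {
  metric: string;
  windowDays: number;
  minRelativeDelta: number; // 0.5 means "attribute if delta > 50%"
}

// Fractional improvement for a lower-is-better metric such as latency.
function relativeImprovement(baseline: number, realized: number): number {
  if (baseline <= 0) throw new RangeError("baseline must be positive");
  return (baseline - realized) / baseline;
}

function isAttributable(hook: OutcomeHook, baseline: number, realized: number): boolean {
  return relativeImprovement(baseline, realized) > hook.minRelativeDelta;
}

const hook: OutcomeHook = {
  metric: "sprint_planning_page_p95",
  windowDays: 7,
  minRelativeDelta: 0.5,
};

// (4200 - 580) / 4200 ≈ 0.862, which rounds to the 86% quoted above.
const delta = relativeImprovement(4200, 580);
const attributed = isAttributable(hook, 4200, 580);
```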
T+45d (a month and a half later). Sarah's quarterly review. Her CEO asks "did the > 1000 work items objective actually move the needle?" PM33 generates a one-pager: "Yes — across 9 contributing Briefs, p95 page-load latency for 1000+ workspaces dropped 78%. Top contributing Brief: this one. Audit trail and per-Brief metric attribution available."
That's the lifecycle. Notice what's NOT in this story:
- Sarah never wrote a "story" with a user persona
- Sarah never updated a Jira ticket manually
- The PR description didn't need to be hand-written
- The "did this matter?" question got answered by the system, not by a spreadsheet
What this saves vs. the old way
Estimated time spent on a typical sprint item, before and with PM33:
| Activity | Before PM33 | With PM33 | Saved |
|---|---|---|---|
| Triage + refinement | 30 min PM + 15 min eng | 5 min PM (Pam does the rest) | 40 min |
| Sprint placement | 20 min planning meeting | 0 min (auto-scheduled) | 20 min |
| Spec writing | 45 min eng + back-and-forth | 5 min Brief approve | 40 min |
| Implementation (the actual code) | unchanged | unchanged | 0 |
| PR review coordination | 20 min finding reviewer | 0 min (auto-dispatched) | 20 min |
| Post-merge attribution | rarely done | automatic | priceless |
Per-Brief savings: ~2 hours of PM + eng time (40 + 20 + 40 + 20 = 120 minutes), not counting the outcome attribution (which most teams just don't do today because it's too expensive to do manually).
At a team velocity of 30 Briefs/sprint, that's 60 hours/sprint reclaimed for actual work. At a loaded cost of $150/hour, that is ~$9,000/sprint of recovered productivity.
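The arithmetic behind these figures can be checked with a few lines of code. The minutes come from the "Saved" column of the table; the velocity and hourly rate are the ones quoted above, and the function name is only illustrative.

```typescript
interface SavingsRow {
  activity: string;
  savedMinutes: number;
}

// Quantified rows from the table (implementation and attribution excluded).
const savingsRows: SavingsRow[] = [
  { activity: "Triage + refinement", savedMinutes: 40 },
  { activity: "Sprint placement", savedMinutes: 20 },
  { activity: "Spec writing", savedMinutes: 40 },
  { activity: "PR review coordination", savedMinutes: 20 },
];

function sprintSavings(rows: SavingsRow[], briefsPerSprint: number, costPerHour: number) {
  const minutesPerBrief = rows.reduce((sum, row) => sum + row.savedMinutes, 0);
  const hoursPerBrief = minutesPerBrief / 60;
  const hoursPerSprint = hoursPerBrief * briefsPerSprint;
  return {
    hoursPerBrief,
    hoursPerSprint,
    dollarsPerSprint: hoursPerSprint * costPerHour,
  };
}

// 120 min -> 2 h per Brief; 2 h * 30 = 60 h; 60 h * $150 = $9,000.
const savings = sprintSavings(savingsRows, 30, 150);
```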
The bypass scenario
What happens when something breaks the policy? Say an emergency hotfix needs to ship in the middle of the night and there's no time for the review gate.
The agent (or human) can bypass: git commit --no-verify, PM33_ALLOW_SHARED_INDEX=true, etc. The bypass is logged. A bypass_event record attached to the Brief captures: who, when, what was bypassed, what was the stated justification. The PR comment shows it. The Brief's audit trail shows it. Reviewers see it. Compliance dashboards aggregate it.
Bypass tracking doesn't BLOCK (most bypasses are legitimate). It makes the bypass observable — so the team can decide later whether the policy needs to change, or whether the bypass was a one-off.
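As an illustration of "observable, not blocking", a bypass record and a simple aggregation might look like this. The field names mirror the who/when/what/why list above but are assumptions, as is the `countByBypassedCheck` helper; the real bypass_event schema is not shown in this module.

```typescript
// Hypothetical record shape: who, when, what was bypassed, and why.
interface BypassEvent {
  workItemId: string;
  actor: string;
  timestamp: string; // assumed ISO 8601
  bypassed: string; // e.g. "pre-commit hooks" or "review gate"
  justification: string;
}

// Aggregates bypasses per check, the kind of rollup a dashboard could show.
function countByBypassedCheck(events: BypassEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const event of events) {
    counts.set(event.bypassed, (counts.get(event.bypassed) ?? 0) + 1);
  }
  return counts;
}
```

A check that is bypassed far more often than the others is a candidate for the "does the policy need to change?" conversation.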
This is the same design philosophy as the rest of the platform: trust people, observe behavior, surface signals.
Hands-on (optional, 10 minutes)
Walk a Brief through its states yourself on staging:
# 1. Create a fake Brief
mcp__pm33-staging__pm33_create_work_item \
  type='story' \
  title='Curriculum example Brief (please delete)' \
  description='Walking through lifecycle for demo' \
  storyPoints=1
# 2. Update its status (manual transition)
mcp__pm33-staging__pm33_update_work_item \
  id='<the-id-returned>' \
  status='in_progress'
# 3. Query audit log to see the event
mcp__pm33-staging__pm33_query_audit_log \
  filters='{"work_item_id": "<the-id>"}'
# 4. Delete when done
mcp__pm33-staging__pm33_delete_work_item id='<the-id>'
You should see at least 2 audit events: work_item_created and status_changed.
Further reading
- docs/design/PM33_BRIEFS_SPEC.md: the full Brief schema
- /brief skill: interactive Brief authoring
- Module 3: Setting Up Your First Harness (where the agent execution actually happens)
- Module 4: Outcome Attribution (what happens in the outcome_tracking state)