Platform Overview — Builder Track (Module 1)

Outline

Learning objectives — what you'll be able to do after this module
Key concept — PM33 as a closed-loop platform; the loop is the product
Diagram walkthrough — slide 01 v2: title + tagline → execution loop → 3 pillars
Workflow narrative — "meet Sarah," a real PM walking the loop end-to-end
What's NOT in this diagram — pointers to modules 2-6 for deeper dives
The Harness Ecosystem — Anthropic-guidance integration: CLAUDE.md, hooks, skills, plugins, MCP, subagents, LSP + how PM33 implements each
Hands-on — 5-min staging exercise (pm33_query_backlog + pm33_score_alignment)
Further reading — pointers to spec docs, harness framework, module 2

Learning objectives

After this module you should be able to:

Explain the difference between "AI codegen" and "AI-driven strategic execution"
Name the 6 stages of the Pam Orchestrator workflow
Name the 8 stages of the Execution Loop
Identify which subsystem owns audit, capacity, scheduling, and outcome attribution

Key concept

PM33 is a closed-loop platform. The loop is the product. Everything else — the MCP tools, the Briefs, the harness, the agents — exists to keep the loop running and the data flowing.

This sounds abstract. Here's the concrete version: when a PM picks a feature to build, PM33 doesn't just track that feature. It tracks the strategic objective the feature was supposed to advance, dispatches the agent that builds it, validates the shipped code, measures whether the metric moved, and updates the next sprint's priorities based on what was learned. All in one system, with one audit trail.

That loop — strategy → code → outcome → recalibration — is what no other AI development tool offers today.

Diagram walkthrough

Open 01-pm33-strategic-execution-platform.excalidraw in excalidraw.com. (Filename historical; the slide is now titled "Closed-Loop AI Product Development" — see ../diagrams/01-narrative.md for the full layout guide.)

The slide has four sections, top to bottom:

1. Title + tagline (top)

Big title sets the frame. The purple-bordered hero box carries the central thesis:

Outcome Attribution is the Trick. The loop is the product. Everything else exists to keep it running.

2. The Execution Loop (middle — the dominant visual)

Eight circular nodes in a horizontal flow: Customer Signal → Strategic Alignment → Brief Authored → Sprint Planned → Agent Executes → Validate → Shipped → Outcome Tracked.

A thick red arrow loops back from node 8 to node 1, labeled "RECALIBRATE — Bayesian model updates priorities + forecast for the next Brief." This is the load-bearing piece. Without it, you have AI codegen. With it, you have a closed loop.

Color encoding:

Orange nodes = governance moments (input + validation)
Purple nodes = orchestration (the AI agent work)
Green node = code shipped
Blue node = state / data (outcome measurement)
Red arrow = the feedback loop

3. The 3 pillars (below the loop)

Three vertical cards explaining what makes the closed loop possible:

Pillar 1 — Governance & Trust (orange) — what the platform enforces at runtime:

Multi-tenant RLS (data isolation at the database row, not the app)
Append-only audit log (every state transition; 7-year default retention)
Role-based access control (hierarchical RBAC; PM33: 15 roles)
Schema parity (staging vs prod drift detection, ESLint-enforced)
Bypass tracking (policy overrides are observable, not silent)

This is product runtime governance — what enterprise buyers evaluate in security review. NOT the same as development-time configuration (CLAUDE.md, skills, hooks, plugins, MCP servers) which is covered in the "Harness Ecosystem" section below.

Pillar 2 — AI Agent Orchestration (purple) — how AI agents actually run the work:

Atomic Brief (machine-verifiable AC, TDD phases, outcomeHook)
Agent registry (PM33: MCP server, 80+ tools)
Lifecycle event bus (every state change is observable)
Capacity-aware scheduler
Coordinator + specialist pattern (per-agent worktrees prevent absorption)

Pillar 3 — Code + Trackers + State (green) — where the work lands; no rip-and-replace:

Code repositories (GitHub · GitLab · Bitbucket)
Work trackers (Jira · Linear · Asana · GitHub Issues — bi-directional sync)
Knowledge bases (Notion · Confluence · Coda)
Atomic Brief state machine
Observability surfaces (Slack · email · in-app · proactive drift alerts)

Most AI dev tools stop at "shipped." A closed-loop platform measures whether the shipped thing moved the metric, then updates next sprint's priorities accordingly. That's the difference between productivity tooling and a strategic execution platform.

The Execution Loop, node by node

The eight nodes in order, with their colors (the red arrow loops back from the last node to the first):

Customer Signal (VOC / Idea / Bug) — orange
Strategic Alignment — purple
Brief Authored — purple
Sprint Planned (capacity-aware) — purple
Agent Executes (per-agent index) — purple
Validate (independent review) — orange (notice the color shift — validation is a governance moment)
Shipped (PR merged) — green
Outcome Tracked (metric moved?) — blue

The red recalibration arrow from node 8 back to node 1 is the load-bearing piece. The label: recalibrate — AR(1) model updates priorities + forecast.
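One way to picture the loop is as an ordered list in which the step after the last node is the first node again. The sketch below is illustrative only: the type and function names are hypothetical and are not taken from PM33's code; only the stage names and colors come from the slide.

```typescript
// Hypothetical model of the eight-node Execution Loop from the slide.
type NodeColor = "orange" | "purple" | "green" | "blue";

interface LoopStage {
  readonly name: string;
  readonly color: NodeColor;
}

const EXECUTION_LOOP: readonly LoopStage[] = [
  { name: "Customer Signal", color: "orange" },
  { name: "Strategic Alignment", color: "purple" },
  { name: "Brief Authored", color: "purple" },
  { name: "Sprint Planned", color: "purple" },
  { name: "Agent Executes", color: "purple" },
  { name: "Validate", color: "orange" },
  { name: "Shipped", color: "green" },
  { name: "Outcome Tracked", color: "blue" },
];

// Returns the stage that follows `current`. The modulo wrap-around plays
// the role of the red recalibration arrow (node 8 back to node 1).
function nextStage(current: string): LoopStage {
  const index = EXECUTION_LOOP.findIndex((stage) => stage.name === current);
  if (index === -1) {
    throw new Error(`Unknown stage: ${current}`);
  }
  const next = EXECUTION_LOOP[(index + 1) % EXECUTION_LOOP.length];
  if (next === undefined) {
    throw new Error("Execution loop is empty");
  }
  return next;
}
```

Dropping the modulo turns the list into a pipeline that stops at "Outcome Tracked", which is the "AI codegen" case the slide contrasts against.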

Workflow narrative

Meet Sarah, a senior PM at Acme Corp. She's been at the company 18 months, owns a product area with 12 engineers, and her CEO just told her the new OKR is "reduce time-to-first-customer-value by 30% by Q4".

Monday morning, week 1. Sarah opens PM33 and sees a Strategic Objective waiting for her: Reduce TTFCV by 30%. She clicks it. PM33 has already done preliminary work — Pam has surfaced 7 existing epics that touch onboarding (the obvious lever) and 3 customer support tickets that map to TTFCV friction. Sarah confirms 4 of those epics as "in-scope for this objective" and dismisses 3. She also marks the support tickets as "valid signal, promote to ideas."

Behind the scenes (this is what the top panel of the diagram is doing):

Audit log records: objective_created, epic_linked (×4), idea_promoted (×3)
Pam reads CLAUDE.md + Brief Schema for the workspace conventions
Capacity scheduler notes: 12 engineers, current sprint utilization 83%, available capacity for the next sprint
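The audit entries above can be pictured as appends to a log that exposes no update or delete operation. This is a minimal sketch under that assumption; the AuditLog class and its methods are hypothetical, and only the event names come from the narrative.

```typescript
// Hypothetical append-only audit log: events can be added and read, never edited.
interface AuditEvent {
  readonly type: string;
  readonly recordedAt: Date;
}

class AuditLog {
  private readonly events: AuditEvent[] = [];

  // The only write operation; each stored event is frozen.
  append(type: string, recordedAt: Date = new Date()): void {
    this.events.push(Object.freeze({ type, recordedAt }));
  }

  // Tallies events per type, e.g. for a "what happened on Monday" view.
  countByType(): Map<string, number> {
    const counts = new Map<string, number>();
    for (const event of this.events) {
      counts.set(event.type, (counts.get(event.type) ?? 0) + 1);
    }
    return counts;
  }
}

// Sarah's Monday morning, as recorded in the narrative.
const mondayLog = new AuditLog();
mondayLog.append("objective_created");
for (let i = 0; i < 4; i++) mondayLog.append("epic_linked");
for (let i = 0; i < 3; i++) mondayLog.append("idea_promoted");
const mondayCounts = mondayLog.countByType();
```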

Tuesday morning. Sarah opens one of the promoted ideas: "Email verification step adds friction to onboarding." Pam has already drafted a Brief for it — atomic, machine-verifiable AC, TDD plan, an outcomeHook field that says "track conversion from signup → first-API-call within 24h, attributed to this Brief." Sarah edits the AC, approves it, hits "Schedule for next sprint."
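To make "machine-verifiable AC" concrete, here is a rough sketch of what such a Brief might look like as a data structure. Every field name except outcomeHook is an assumption, the acceptance criteria and the check command are invented for illustration, and the 14-day window is inferred from the "two weeks later" step in the narrative. Module 2 covers the actual Brief schema.

```typescript
// Hypothetical Brief shape; not PM33's real schema.
interface AcceptanceCriterion {
  description: string;
  // Command or test ID a machine can run. Empty means "human judgment only".
  check: string;
}

interface OutcomeHook {
  metric: string;
  observationWindowDays: number;
}

interface Brief {
  title: string;
  acceptanceCriteria: AcceptanceCriterion[];
  tddPhases: ("red" | "green" | "refactor")[];
  outcomeHook: OutcomeHook;
}

// Lists criteria that no machine can verify, so an author can tighten them.
function unverifiableCriteria(brief: Brief): string[] {
  return brief.acceptanceCriteria
    .filter((criterion) => criterion.check.trim() === "")
    .map((criterion) => criterion.description);
}

const emailVerificationBrief: Brief = {
  title: "Email verification step adds friction to onboarding",
  acceptanceCriteria: [
    // Illustrative criteria only; the check command is made up.
    { description: "Signup completes without a blocking verification step", check: "npm test -- signup-flow" },
    { description: "Onboarding feels faster", check: "" },
  ],
  tddPhases: ["red", "green", "refactor"],
  outcomeHook: {
    metric: "conversion from signup to first API call within 24h",
    observationWindowDays: 14,
  },
};

const needsTightening = unverifiableCriteria(emailVerificationBrief);
```

In this sketch, "Onboarding feels faster" is flagged because it has no check attached, which is the kind of criterion an editing pass like Sarah's would rewrite or remove.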

Behind the scenes:

Brief enters backlog state. Audit event: brief_authored
Capacity scheduler runs (the plan stage in the orchestrator). The Brief is placed in Sprint 47, owned by a specialist matching backend-architect/sonnet/[harness-discipline, auth-implementation-patterns].
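The capacity arithmetic behind that placement can be sketched in a few lines. This assumes, purely for illustration, that the 83% utilization figure applies to the sprint being planned and that Brief cost is measured in engineer-equivalents; the function names are hypothetical and the real scheduler is likely more involved.

```typescript
// Engineer-equivalents not yet committed, given head count and utilization (0 to 1).
function availableCapacity(engineers: number, utilization: number): number {
  if (utilization < 0 || utilization > 1) {
    throw new RangeError("utilization must be between 0 and 1");
  }
  return engineers * (1 - utilization);
}

// A Brief fits when its estimated cost does not exceed the remaining headroom.
function fitsInSprint(briefCost: number, engineers: number, utilization: number): boolean {
  return briefCost <= availableCapacity(engineers, utilization);
}

// Sarah's team: 12 engineers at 83% utilization leaves about 2.04 engineer-equivalents.
const headroom = availableCapacity(12, 0.83);
```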

Wednesday morning, week 2. Sprint 47 starts. The specialist agent picks up the Brief, spawns a per-agent git worktree (so it can't absorb other sessions' work — the structural ABSORPTION-002 fix described in the platform's git model), writes the RED test, then the GREEN implementation, refactors, and runs the delivery gates.

Wednesday afternoon. PR opens. Independent code-review agent runs against it. CI runs. Schema-drift gate passes (no shared/schema.ts changes that aren't also in migrations). Lifecycle event pr_opened fires. The Brief moves to in_review.

Wednesday evening. Sarah approves the PR (the only human-gate for this Brief, because the workspace policy requires human-approval for any auth-touching change). PR merges. Lifecycle event pr_merged fires. The Brief auto-flips to done.
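The state changes in this walkthrough (backlog, then in_review on pr_opened, then done on pr_merged) can be sketched as a small transition table. Only the three states and two events named in the narrative are modeled; the real Atomic Brief state machine presumably has more of both, and the names TRANSITIONS and applyEvent are hypothetical.

```typescript
// States and events taken from the narrative; anything else is out of scope here.
type BriefState = "backlog" | "in_review" | "done";
type LifecycleEvent = "pr_opened" | "pr_merged";

// For each state, the events it accepts and the state each one leads to.
const TRANSITIONS: Record<BriefState, Partial<Record<LifecycleEvent, BriefState>>> = {
  backlog: { pr_opened: "in_review" },
  in_review: { pr_merged: "done" },
  done: {},
};

// Applies one lifecycle event; illegal transitions fail loudly instead of being ignored.
function applyEvent(state: BriefState, event: LifecycleEvent): BriefState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`Illegal transition: ${event} while in ${state}`);
  }
  return next;
}

// Wednesday's sequence for Sarah's Brief.
const afterReview = applyEvent("backlog", "pr_opened");
const finalState = applyEvent(afterReview, "pr_merged");
```

Keeping the table explicit means an out-of-order event, such as pr_merged arriving for a Brief still in backlog, surfaces as an error rather than a silent state change.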

Two weeks later. The outcomeHook window closes. PM33 computes: did signup→first-API-call within 24h actually improve? It did, by 14%. The metric moved. The AR(1) forecast model updates: Briefs in this area DO move the TTFCV needle, with predicted impact ~12%/Brief and uncertainty narrowing from σ=4 to σ=2.7. The next sprint's planner now weighs onboarding-area Briefs higher.
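To see how one observation can pull σ from 4 down to 2.7, here is a worked sketch using a textbook conjugate normal update. This is a stand-in chosen for readability, not the AR(1) model the module names. The prior mean (9.6) and the observation noise (3.66) are illustrative values back-solved so that the numbers land near the ones in the narrative; they do not come from PM33.

```typescript
// A belief about per-Brief impact, in percentage points.
interface Belief {
  mean: number;
  sigma: number;
}

// Conjugate normal-normal update: precisions (1 / variance) add, and the
// posterior mean is the precision-weighted average of prior and observation.
function updateBelief(prior: Belief, observed: number, noiseSigma: number): Belief {
  const priorPrecision = 1 / (prior.sigma * prior.sigma);
  const observationPrecision = 1 / (noiseSigma * noiseSigma);
  const posteriorPrecision = priorPrecision + observationPrecision;
  return {
    mean: (prior.mean * priorPrecision + observed * observationPrecision) / posteriorPrecision,
    sigma: Math.sqrt(1 / posteriorPrecision),
  };
}

// Assumed prior (mean 9.6, sigma 4) and assumed noise (3.66); observed lift is 14.
const posterior = updateBelief({ mean: 9.6, sigma: 4 }, 14, 3.66);
// posterior.mean is roughly 12 and posterior.sigma is roughly 2.7
```

The point of the sketch is the direction of the change: a new measurement always adds precision, so σ can only shrink, while the mean moves toward the observed 14% without jumping all the way to it.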

Six months later. Sarah's CEO asks "where are we on the TTFCV objective?" PM33 generates a one-page report: 32% reduction achieved (vs. 30% target), attributed across 11 shipped Briefs, top 3 contributors listed with metric movement per Brief, audit trail available for any auditor. The CEO is satisfied. Sarah doesn't have to dig through 4 tools and build a spreadsheet to make the case.

That's the loop. That's what every other line of code in PM33 exists to enable.

What's NOT in this diagram (and lives in the deeper modules)

How the Brief gets atomically specified — Module 2
How the harness coordinates multiple specialist agents in parallel — Module 3
How outcome attribution actually closes the loop — Module 4
How governance + audit make this defensible to security review — Module 5
How to pitch this to a buyer — Module 6

The Harness Ecosystem (why configuration matters as much as the model)

Slide 01 shows product architecture. What it doesn't show — and what is equally load-bearing — is the development harness that surrounds the AI agents doing the work.

The key insight (from Anthropic's How Claude Code Works in Large Codebases):

The ecosystem built around the model — the harness — determines how Claude Code performs more than the model alone.

This is true for any AI coding agent, not just Claude. The model is one variable. The configuration around it — what the agent knows, what skills it can load, what tools it can call, what gets enforced at hook time — is the rest of the equation.

PM33's experience has confirmed this. With the same model, Brief schema, and coordinator pattern, what separates a smooth 20-hour harness run from a thrash-y one is the ecosystem around it. Below is what that ecosystem includes, and how PM33 (and any AI-development-first organization) should think about each piece.

The 7 components of the harness ecosystem

1. CLAUDE.md (foundational)

What it is: a markdown file the AI agent loads automatically at every session start. Contains essential pointers, critical gotchas, working conventions, and team standards.

Why it matters: without CLAUDE.md, every session re-discovers the same conventions. With it, the agent operates as if it'd been on the team for months.

PM33's implementation: /CLAUDE.md (~10,000 chars) covers mandatory standards (TDD, schema change policy, security invariants, validation rules) plus pointers to the right modules for deeper work. Sub-directory CLAUDE.md files at /client/, /server/, etc. layer in directory-specific context. A .claude/CLAUDE.md adds AI-development-specific conventions.

Gotcha (from Anthropic): "Keep lean and layered; avoid context bloat. Root file should contain only essential pointers and critical gotchas." PM33's team reviews CLAUDE.md every 2-3 months to prune guidance that model improvements have made redundant.

2. Hooks (self-improvement + enforcement)

What it is: shell commands that execute on specific lifecycle events — session start, pre-commit, before/after tool calls, on stop.

Why it matters: deterministic enforcement. Linting, formatting, schema-drift validation, secrets scanning — these can't be left to "the agent should remember." Hooks make them automatic.

PM33's implementation: pre-commit hooks enforce per-agent git index activation, tree-shrink detection (prevents the absorption class of bugs), schema-drift validation, and TypeScript / ESLint gates. A stop hook prompts session reflection on what should land in CLAUDE.md.

Gotcha: hooks that fire too aggressively become noise. PM33's hooks are scoped — schema-drift only fires when shared/schema.ts is touched; per-agent index check only fires for write operations.
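The scoping idea can be sketched as a gate that decides, from the list of changed files, whether it needs to run at all. This is not PM33's actual hook: the function name is hypothetical and the migrations/ directory is an assumed location. Only shared/schema.ts and the "schema changes must come with migrations" rule are from this module.

```typescript
const SCHEMA_FILE = "shared/schema.ts";
const MIGRATIONS_DIR = "migrations/"; // assumed path, for illustration only

type GateResult = "skipped" | "pass" | "fail";

// Scoped schema-drift gate: stays silent unless the schema file was touched,
// and then requires at least one migration file in the same change set.
function schemaDriftGate(changedFiles: readonly string[]): GateResult {
  const schemaTouched = changedFiles.some((file) => file === SCHEMA_FILE);
  if (!schemaTouched) {
    return "skipped";
  }
  const hasMigration = changedFiles.some((file) => file.startsWith(MIGRATIONS_DIR));
  return hasMigration ? "pass" : "fail";
}
```

Returning "skipped" rather than "pass" for unrelated commits keeps the hook's output meaningful: a pass then always means the schema really was checked.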

3. Skills (progressive disclosure)

What it is: domain-specific instruction packages the agent loads on-demand. Each skill has a description field; the agent decides which to load based on the task.

Why it matters: specialized expertise without bloating every session. Avoid the "every session loads everything" antipattern.

PM33's implementation: 30+ skills covering harness-coordinator, harness-discipline, harness-planner, harness-discovery, gauntlet-review (the 5 harness skills from module 3), plus brief (atomic spec authoring), pm33-mcp (conventions for PM33 MCP tool use), db-backup, simplify, and many more. Skills live in two directories: ~/.claude/skills/ (user-level) and .claude/skills/ (project-level).

Anthropic's guidance — "Load specialized expertise on-demand without bloating every session. Scope skills to specific directory paths" — informs how PM33 scopes skills (e.g., harness-discipline only loads in specialist sessions, not coordinator sessions).
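The scoping rule in that example (harness-discipline for specialists only) can be sketched as a filter over skill metadata. The roles field below is a hypothetical illustration of scoping, not a field defined by the skill format, and the role assignments for brief and pm33-mcp are assumptions.

```typescript
type SessionRole = "coordinator" | "specialist";

interface SkillManifest {
  name: string;
  description: string;
  roles: readonly SessionRole[]; // hypothetical scoping field
}

const SKILLS: readonly SkillManifest[] = [
  // Placeholder description; the module does not quote the real one.
  { name: "harness-discipline", description: "Specialist working discipline", roles: ["specialist"] },
  { name: "brief", description: "Atomic spec authoring", roles: ["coordinator", "specialist"] },
  { name: "pm33-mcp", description: "Conventions for PM33 MCP tool use", roles: ["coordinator", "specialist"] },
];

// Names of the skills a session of the given role is allowed to load.
function loadableSkills(role: SessionRole): string[] {
  return SKILLS.filter((skill) => skill.roles.some((r) => r === role)).map((skill) => skill.name);
}

const coordinatorSkills = loadableSkills("coordinator");
const specialistSkills = loadableSkills("specialist");
```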

4. Plugins (distribution)

What it is: bundled skills + hooks + MCP configurations distributed as installable packages. Lets a team ship a "this is how we do AI dev here" snapshot to new engineers on day one.

Why it matters: stops tribal knowledge fragmentation. Without plugins, "how we use Claude here" becomes folklore that drifts across team members.

PM33's implementation: .claude/plugins/ collects the canonical PM33 plugin suite. New engineers run a one-command install and get the full harness ecosystem.

Anthropic's recommendation — "deploy proven plugin suites on day one for optimal first experiences" — aligns with PM33's onboarding flow.

5. MCP Servers (external integration)

What it is: Model Context Protocol servers that expose tools (functions) the agent can call. Not just for code — for any external system: Jira, Notion, internal APIs, the product's own backend.

Why it matters: agents that can't reach external systems are limited to file edits + grep. MCP turns them into operators that can query, mutate, and orchestrate.

PM33's implementation: the PM33 MCP server (pm33-staging) exposes 80+ tools — pm33_create_work_item, pm33_query_backlog, pm33_score_alignment, etc. The pm33-mcp skill captures conventions for using these tools (anti-duplication queries, batch parallel patterns, MCP instability handling).

Critically, PM33 dogfoods itself via MCP — the same MCP tools shipped to customers are how PM33's team manages PM33 development.

6. Subagents (task specialization)

What it is: dispatching a fresh agent context for a specific bounded task. Returns focused findings to the parent.

Why it matters: keeps the parent agent's context window clean. The parent doesn't need to hold the full state of "exploring 50 files looking for a pattern." It dispatches a subagent, gets the answer, moves on.

PM33's implementation: the coordinator/specialist pattern from module 3 IS this. The coordinator dispatches specialists via the Task tool for each Brief. Each specialist gets its own context, own worktree, own scope. Specialists don't talk to each other; the coordinator is the only shared context.

Anthropic's guidance — "Use read-only subagents to map subsystems independently. Split exploration from editing workflows. Return focused findings to parent agent" — describes the pattern PM33 uses for discovery and gauntlet review.
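The isolation property described here, where each specialist gets its own context and only findings flow back, can be sketched with plain functions. This is a simplified stand-in: real dispatch goes through the Task tool, and the type names, the worktree path format, and dispatchAll itself are all hypothetical.

```typescript
interface Finding {
  briefId: string;
  summary: string;
}

// Everything a specialist is allowed to see and mutate.
interface SpecialistContext {
  briefId: string;
  worktree: string;
  notes: string[];
}

type Specialist = (context: SpecialistContext) => Finding;

// Coordinator side: build a fresh context per Brief, run the specialist,
// and keep only the returned finding. Contexts are never shared or reused.
function dispatchAll(briefIds: readonly string[], specialist: Specialist): Finding[] {
  return briefIds.map((briefId) => {
    const context: SpecialistContext = {
      briefId,
      worktree: `worktrees/${briefId}`, // illustrative path format
      notes: [],
    };
    return specialist(context);
  });
}

// A toy specialist that records one note and reports how many notes it saw.
const countingSpecialist: Specialist = (context) => {
  context.notes.push(`explored ${context.worktree}`);
  return { briefId: context.briefId, summary: `${context.notes.length} note(s)` };
};

const findings = dispatchAll(["brief-a", "brief-b"], countingSpecialist);
```

Because each context starts empty, every specialist reports exactly one note; a shared context would show the count growing across Briefs, which is the cross-contamination the pattern is meant to prevent.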

7. LSP (language server protocol)

What it is: a standardized server that provides symbol-level navigation (jump-to-definition, find-references) across a codebase.

Why it matters: in large or multi-language codebases, grep-by-string is fuzzy. LSP enables precise symbol resolution — "find every caller of THIS function" vs. "find every place that mentions a string that looks like the function name."

PM33's implementation: TypeScript LSP runs locally; agents use it for refactor work (renaming, finding usages). PM33's codebase is almost entirely TypeScript, so grep is usually sufficient; for multi-language teams, LSP becomes essential.

How PM33 thinks about the ecosystem (the meta-pattern)

The 7 components above aren't a checklist. They're an architecture. The way PM33 invests in each:

CLAUDE.md — high investment, frequent review (every 2-3 months)
Hooks — high investment for safety-critical enforcement (schema drift, absorption prevention)
Skills — heavy investment; the skills directory is treated as a load-bearing API
Plugins — moderate investment; emerges naturally from skills + hooks once stable
MCP — heavy investment; PM33's MCP server is shipped product, dogfooded internally
Subagents — pattern-level investment; encoded in coordinator/specialist skills
LSP — low investment for mono-language TypeScript; would be high for polyglot

The Anthropic post's organizational recommendation — assign a DRI ("agent manager") for the ecosystem — is what PM33 does: this is a hybrid PM/engineering function, not a side-of-desk responsibility.

What this means for your team

If you're adopting AI agents at scale, you'll discover the same thing PM33 did: the model is a fixed cost. The ecosystem is what compounds. Investing in CLAUDE.md, hooks, skills, and MCP early is what separates teams that use AI well from teams that use it as fast autocomplete.

PM33's product encodes this insight. The harness skills (module 3) are the operational layer. The Brief schema (module 2) is the work-item layer. The closed loop (module 4) is the outcome layer. The governance (module 5) is the runtime layer. And the harness ecosystem above is the development-time configuration layer.

All 5 layers compose. None of them can be skipped without paying a tax somewhere else.

Hands-on (optional, 5 minutes)

If you have a PM33 staging account:

# Pull up Sarah's view
mcp__pm33-staging__pm33_query_backlog filterTypes='["epic"]' limit=10
mcp__pm33-staging__pm33_score_alignment workItemId='<an-epic-id>'

The score_alignment call shows you what the "strategic alignment" stage of the orchestrator is actually doing. It returns the epic's alignment score (0-1), matched objectives, and a suggestedAction if scores are stale or absent.
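If you want to act on that result in a script, a typed wrapper helps. The sketch below assumes a response shape based on the description above; suggestedAction is named in this module, while score, matchedObjectives, and the helper function are hypothetical names, so check them against what the tool actually returns.

```typescript
// Assumed response shape for pm33_score_alignment; verify against real output.
interface AlignmentResult {
  score: number | null; // 0 to 1, or null when no score exists yet
  matchedObjectives: string[];
  suggestedAction?: string;
}

// Turns a result into a one-line summary suitable for a status message.
function summarizeAlignment(result: AlignmentResult): string {
  if (result.score === null) {
    const action = result.suggestedAction ?? "none provided";
    return `No alignment score yet; suggested action: ${action}`;
  }
  const count = result.matchedObjectives.length;
  return `Alignment ${result.score.toFixed(2)} across ${count} matched objective(s)`;
}

// Illustrative values, not real staging data.
const scoredSummary = summarizeAlignment({
  score: 0.82,
  matchedObjectives: ["Reduce TTFCV by 30%"],
});
const unscoredSummary = summarizeAlignment({
  score: null,
  matchedObjectives: [],
  suggestedAction: "rescore",
});
```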