Workflow Intelligence
The workflow engine processes multi-step LLM workflows as an AsyncGenerator, yielding WorkflowEvents that drive the SSE stream to the client. Each step can reference upstream outputs, inject external context, pause for human review, fan out into parallel sections, or generate competing alternatives.
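A rough sketch of that loop is shown below; the Step shape, the event payloads, and the runLlmStep parameter are illustrative assumptions rather than the engine's actual types.
```typescript
// Illustrative shape of the execution loop; types and helpers are assumed.
interface Step {
  label: string;
  stepType: "llm" | "review" | "fan-out" | "conditional";
  promptTemplate: string;
}

type WorkflowEvent =
  | { type: "step_start"; stepLabel: string }
  | { type: "step_output"; stepLabel: string; content: string }
  | { type: "review_required"; stepLabel: string }
  | { type: "done" }
  | { type: "error"; message: string };

async function* executeWorkflow(
  steps: Step[],
  ctx: ChainContext,                               // defined later in this doc
  runLlmStep: (prompt: string) => Promise<string>, // hypothetical LLM call
  yoloMode = false,
): AsyncGenerator<WorkflowEvent> {
  try {
    for (const step of steps) {
      yield { type: "step_start", stepLabel: step.label };

      const prompt = resolvePrompt(step.promptTemplate, ctx); // {{variable}} resolution
      const output = await runLlmStep(prompt);
      ctx.stepOutputs.set(step.label, output);
      yield { type: "step_output", stepLabel: step.label, content: output };

      // Review steps pause the stream until a human approves (unless yoloMode).
      if (step.stepType === "review" && !yoloMode) {
        yield { type: "review_required", stepLabel: step.label };
        return;
      }
    }
    yield { type: "done" };
  } catch (err) {
    yield { type: "error", message: String(err) };
  }
}
```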
Template Variable System
All step prompts are templates with {{variable}} placeholders resolved at execution time by resolvePrompt(template, ctx).
Complete Variable Reference
| Variable | Resolution | Fallback |
|---|---|---|
| `{{input}}` | `ctx.initialInput` as string | `""` |
| `{{input.fieldName}}` | `ctx.initialInput[fieldName]` | `""` |
| `{{steps.Label.content}}` | Digest if available, else raw output | `""` |
| `{{steps.Label.full}}` | Always raw uncompressed output | `""` |
| `{{steps.Label.digest}}` | Digest, or truncated fallback (3000 chars) | `""` |
| `{{steps.Label.notes}}` | User review notes from checkpoint | `""` |
| `{{steps.Label.sections}}` | Fan-out headings as numbered list | `""` |
| `{{steps.Label.section[N].content}}` | Nth sub-output content | `""` |
| `{{steps.Label.section[N].heading}}` | Nth sub-output heading | `""` |
| `{{consolidations}}` | `ctx.consolidationContent` (sanitized) | `[No consolidations linked...]` |
| `{{memory}}` | `ctx.memoryContent` (sanitized) | `[No memory insights linked...]` |
| `{{project.wisdom}}` | `ctx.projectWisdom` (sanitized) | `[No project linked...]` |
| `{{claudemd}}` | `ctx.claudeMdContent` (sanitized) | `[No CLAUDE.md found...]` |
| `{{fileTree}}` | `ctx.fileTreeContent` | `[No file tree available...]` |
| `{{docs}}` | `ctx.documentationContent` (sanitized) | `[No repository documentation...]` |
| `{{docs.ProjectName}}` | `ctx.documentationByProject.get(ProjectName)` | `[No documentation found...]` |
| `{{fanOut.section}}` | Set per-section during fan-out loop | `[No fan-out section available]` |
| `{{fanOut.heading}}` | Set per-section during fan-out loop | `[No fan-out heading available]` |
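As a usage illustration, a single step prompt can mix input fields, an upstream step's output, and injected context; the `Research` label and `goal` field below are hypothetical.
```typescript
// Hypothetical step prompt template; "Research" and "goal" are illustrative.
const promptTemplate = `
Review the plan for: {{input.goal}}

Upstream research (digest if one exists, otherwise raw output):
{{steps.Research.content}}

Repository conventions:
{{claudemd}}
`;
```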
Resolution Logic
The resolver processes each `{{token}}` match in order:
- Context injections (`consolidations`, `memory`, `project.wisdom`, `claudemd`, `docs`) — sanitized
- Fan-out context — transient, set per-section during fan-out execution
- Input resolution — `ctx.initialInput` or specific fields
- Step output resolution — digest vs. full vs. notes vs. sections
- Unresolved tokens — returned as-is
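A minimal sketch of that resolution order follows; the helper logic and fallback strings here are assumptions, not the actual resolvePrompt implementation, and the section-indexing variables are omitted for brevity.
```typescript
// Sketch of resolvePrompt's resolution order; an approximation, not the real code.
function resolvePrompt(template: string, ctx: ChainContext): string {
  return template.replace(/\{\{([^}]+)\}\}/g, (match, token: string) => {
    const key = token.trim();

    // 1. Context injections (already sanitized when loaded into ctx)
    if (key === "consolidations") return ctx.consolidationContent || "[No consolidations linked...]";
    if (key === "memory") return ctx.memoryContent || "[No memory insights linked...]";
    if (key === "claudemd") return ctx.claudeMdContent || "[No CLAUDE.md found...]";

    // 2. Fan-out context, set transiently per section
    if (key === "fanOut.section") return ctx.fanOutSection ?? "[No fan-out section available]";
    if (key === "fanOut.heading") return ctx.fanOutHeading ?? "[No fan-out heading available]";

    // 3. Input resolution: whole input or a specific field
    if (key === "input") return String(ctx.initialInput ?? "");
    if (key.startsWith("input.")) {
      const field = key.slice("input.".length);
      return String((ctx.initialInput as Record<string, unknown>)?.[field] ?? "");
    }

    // 4. Step output resolution: digest preferred over raw output
    const stepMatch = key.match(/^steps\.(.+)\.(content|full|digest|notes)$/);
    if (stepMatch) {
      const [, label, kind] = stepMatch;
      const raw = String(ctx.stepOutputs.get(label) ?? "");
      const digest = ctx.stepDigests.get(label);
      if (kind === "full") return raw;
      if (kind === "notes") return ctx.stepReviewNotes.get(label) ?? "";
      if (kind === "digest") return digest ?? raw.slice(0, 3000);
      return digest ?? raw; // "content": digest if available, else raw
    }

    // 5. Unresolved tokens are returned as-is
    return match;
  });
}
```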
Content Sanitization
All user-controlled content injected via context variables passes through `sanitizeContextContent()`:
- Template escape: `{{` is replaced with `\{\{` to prevent recursive variable resolution. Without this, content from GitHub files or memory entries could contain strings like `{{steps.Analysis.content}}` that the resolver would interpret.
- Suspicious pattern detection: Six regex patterns check for prompt override attempts (`ignore previous instructions`, `you are now`, `[SYSTEM]`, etc.). These trigger a `console.warn` log but do not block the content — observe-and-log, not reject.
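A minimal sketch of this behavior is below; only the first three patterns come from the examples above, the rest of the list and the function signature are assumptions.
```typescript
// Sketch of sanitizeContextContent; the pattern list beyond the documented
// examples and the function signature are assumptions.
const SUSPICIOUS_PATTERNS: RegExp[] = [
  /ignore previous instructions/i,
  /you are now/i,
  /\[SYSTEM\]/i,
  // ...remaining patterns omitted
];

function sanitizeContextContent(content: string, source = "context"): string {
  // Escape template delimiters so injected content cannot be re-resolved
  // as {{variables}} by the prompt resolver.
  const escaped = content.replace(/\{\{/g, "\\{\\{");

  // Observe-and-log: warn on prompt-override attempts but keep the content.
  for (const pattern of SUSPICIOUS_PATTERNS) {
    if (pattern.test(content)) {
      console.warn(`Suspicious pattern detected in ${source} content: ${pattern}`);
    }
  }
  return escaped;
}
```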
ChainContext Architecture
The ChainContext accumulates state across the entire workflow execution. Each loader populates a specific portion; downstream steps reference upstream data through it.
interface ChainContext {
workflowId: string;
tenantId: string;
userId: string;
initialInput: unknown;
stepOutputs: Map<string, unknown>; // keyed by step label
stepLabels: Map<string, string>;
stepDigests: Map<string, string>; // compressed versions
consolidationContent: string;
memoryContent: string;
documentationContent: string;
documentationByProject: Map<string, string>;
claudeMdContent: string;
fileTreeContent: string;
personaSystemPrompts: string[];
projectWisdom: string;
fanOutSection?: string; // transient, per-section
fanOutHeading?: string; // transient, per-section
stepSubOutputs: Map<string, SubOutput[]>; // fan-out results
stepReviewNotes: Map<string, string>; // checkpoint notes
}
Loaders run before step execution and populate the context:
- Consolidation content loaded from the linked consolidation
- Memory content loaded from linked insights via MemoryPicker selection
- Project wisdom assembled from consolidation + code patterns for the linked project
- Documentation loaded from linked repository files
- Persona system prompts concatenated from all assigned personas
Digest Compression
For steps producing outputs exceeding 2000 characters, an automatic digest is generated via Claude Haiku. The digest retains approximately 15–25% of the original output size.
compression_ratio ≈ 0.15 - 0.25
When a downstream step uses {{steps.Label.content}}, the resolver automatically returns the digest if available. To force the full output, use {{steps.Label.full}}.
The token savings compound across long workflows. For a 5-step workflow where each step produces 3,000 completion tokens (each digest ≈ 600 tokens at a 20% ratio):
- Without digests: 4,000 (system prompt) + 5 × 3,000 = 19,000 tokens
- With digests: 4,000 (system prompt) + 5 × 600 = 7,000 tokens
- Savings: 63.2%
Digests are generated as a separate async operation after the step completes. They're stored in WorkflowStep.digest and linked from the step output.
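A sketch of what that async digest call could look like with the Anthropic SDK follows; the prompt wording, length threshold handling, and exact model id are assumptions.
```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Sketch only: generate a digest for step outputs over 2,000 characters.
// Prompt wording and model id are assumptions, not the engine's actual call.
async function generateDigest(stepOutput: string): Promise<string | null> {
  if (stepOutput.length <= 2000) return null; // small outputs are used as-is

  const response = await anthropic.messages.create({
    model: "claude-3-5-haiku-latest",
    max_tokens: 1024,
    messages: [{
      role: "user",
      content:
        "Compress the following step output to roughly 15-25% of its length, " +
        "preserving decisions, data points, and open questions:\n\n" + stepOutput,
    }],
  });

  const block = response.content[0];
  return block.type === "text" ? block.text : null;
}
```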
Fan-Out Parallelism
Fan-out steps split a source step's output into sections and process each section independently via separate LLM calls.
Configuration on the step:
{
stepType: "fan-out",
sourceStepLabel: "Research", // which step to split
splitPattern: "###\\s+\\d+\\.", // regex to detect section boundaries
promptTemplate: "Analyze this section in depth:\n\n{{fanOut.section}}"
}
Sub-outputs are stored in stepSubOutputs keyed by step label. Downstream steps can reference individual sections via {{steps.FanOut.section[0].content}} or the full sections list via {{steps.FanOut.sections}}.
Sub-outputs appear as tabs in the workflow UI, letting you review each section's analysis independently.
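A simplified sketch of the fan-out loop is below; the result shape and the runLlmStep parameter are assumptions standing in for the engine's real call path and SubOutput type.
```typescript
// Sketch of the fan-out loop; result shape and LLM helper are assumed.
interface FanOutResult { heading: string; content: string; }

async function runFanOut(
  ctx: ChainContext,
  sourceOutput: string,
  splitPattern: string,    // e.g. "###\\s+\\d+\\."
  promptTemplate: string,  // e.g. "Analyze this section in depth:\n\n{{fanOut.section}}"
  runLlmStep: (prompt: string) => Promise<string>, // hypothetical LLM call
): Promise<FanOutResult[]> {
  // Split the source step's output into sections at the configured boundaries.
  const sections = sourceOutput
    .split(new RegExp(splitPattern))
    .map((s) => s.trim())
    .filter(Boolean);

  // Each section gets its own LLM call, resolved against a transient
  // per-section copy of the context so {{fanOut.*}} variables work.
  return Promise.all(
    sections.map(async (section) => {
      const heading = section.split("\n")[0];
      const sectionCtx: ChainContext = { ...ctx, fanOutSection: section, fanOutHeading: heading };
      const content = await runLlmStep(resolvePrompt(promptTemplate, sectionCtx));
      return { heading, content };
    }),
  );
}
```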
Step Types
| Type | Behavior |
|---|---|
| `llm` | Standard LLM call with template variable resolution |
| `review` | Pauses execution for human checkpoint (unless `yoloMode` is enabled) |
| `fan-out` | Splits source output into parallel LLM calls |
| `conditional` | Branches based on previous step output |
Review steps additionally run key point extraction and inject all workflow personas as team context (the only step type that does this).
Context Compression Engine
For workflows that accumulate large amounts of context across many steps, the Context Compression Engine (CCE) manages token budgets deterministically.
The 80K token budget is allocated across sources with a recency-weighted strategy:
RECENCY_WINDOW = 5 // number of recent steps whose full output is preserved
Steps outside the recency window are replaced with their digests. The CCE never calls an LLM for compression — it's a deterministic selection algorithm. This guarantees token overflow prevention without adding latency or cost.
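A deterministic selection pass along those lines might look like the following; the characters-per-token estimate and the return shape are assumptions, not the real CCE code.
```typescript
// Deterministic recency-window selection sketch; no LLM calls involved.
const RECENCY_WINDOW = 5;
const TOKEN_BUDGET = 80_000;

interface SelectedStep {
  label: string;
  text: string;       // full output for recent steps, digest otherwise
  usedDigest: boolean;
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4); // rough heuristic, an assumption
}

function selectStepContext(
  orderedLabels: string[],            // step labels in execution order
  outputs: Map<string, string>,
  digests: Map<string, string>,
): SelectedStep[] {
  const selected: SelectedStep[] = [];
  let used = 0;

  // Walk newest-to-oldest so recent steps get first claim on the budget.
  for (let i = orderedLabels.length - 1; i >= 0; i--) {
    const label = orderedLabels[i];
    const isRecent = orderedLabels.length - i <= RECENCY_WINDOW;
    const full = outputs.get(label) ?? "";
    const digest = digests.get(label);

    // Recent steps keep full output; older steps fall back to their digests.
    const text = isRecent ? full : digest ?? full;
    const cost = estimateTokens(text);
    if (used + cost > TOKEN_BUDGET) break; // never exceed the budget
    used += cost;
    selected.unshift({ label, text, usedDigest: !isRecent && digest !== undefined });
  }
  return selected;
}
```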
Event Types
The AsyncGenerator yields these event types over SSE:
| Event | When |
|---|---|
| `step_start` | Before a step's LLM call begins |
| `step_output` | After a step's LLM call completes |
| `step_digest` | After a digest is generated for a step |
| `context_report` | After `buildInjectionReport()` runs for a step |
| `review_required` | When a review step pauses for human input |
| `fan_out_section` | For each sub-output in a fan-out step |
| `done` | When the workflow completes |
| `error` | On unrecoverable failure |
The client uses an EventSource connection and the useSSE hook to receive and render these events in real time.
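A minimal browser-side consumer might look like this; the endpoint path and payload handling are assumptions, and the app itself goes through the useSSE hook rather than raw EventSource wiring.
```typescript
// Minimal SSE consumer sketch; endpoint path and payload shapes are assumed.
function subscribeToWorkflow(
  workflowId: string,
  onEvent: (type: string, data: unknown) => void,
): EventSource {
  const source = new EventSource(`/api/workflows/${workflowId}/stream`);

  const types = [
    "step_start", "step_output", "step_digest", "context_report",
    "review_required", "fan_out_section", "done", "error",
  ];

  for (const type of types) {
    source.addEventListener(type, (e) => {
      const data = (e as MessageEvent).data;
      onEvent(type, data ? JSON.parse(data) : null);
      // Close the stream once the workflow finishes or fails.
      if (type === "done" || type === "error") source.close();
    });
  }
  return source;
}
```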
