Workflow Intelligence
The workflow engine processes multi-step LLM workflows as an AsyncGenerator, yielding WorkflowEvents that drive the SSE stream to the client. Each step can reference upstream outputs, inject external context, pause for human review, fan out into parallel sections, or generate competing alternatives.
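A rough sketch of that loop is shown below; the Step shape, the event payloads, and the runLlmStep parameter are illustrative assumptions rather than the engine's actual types.
```typescript
// Illustrative shape of the execution loop; types and helpers are assumed.
interface Step {
  label: string;
  stepType: "llm" | "review" | "fan-out" | "conditional";
  promptTemplate: string;
}

type WorkflowEvent =
  | { type: "step_start"; stepLabel: string }
  | { type: "step_output"; stepLabel: string; content: string }
  | { type: "review_required"; stepLabel: string }
  | { type: "done" }
  | { type: "error"; message: string };

async function* executeWorkflow(
  steps: Step[],
  ctx: ChainContext,                               // defined later in this doc
  runLlmStep: (prompt: string) => Promise<string>, // hypothetical LLM call
  yoloMode = false,
): AsyncGenerator<WorkflowEvent> {
  try {
    for (const step of steps) {
      yield { type: "step_start", stepLabel: step.label };

      const prompt = resolvePrompt(step.promptTemplate, ctx); // {{variable}} resolution
      const output = await runLlmStep(prompt);
      ctx.stepOutputs.set(step.label, output);
      yield { type: "step_output", stepLabel: step.label, content: output };

      // Review steps pause the stream until a human approves (unless yoloMode).
      if (step.stepType === "review" && !yoloMode) {
        yield { type: "review_required", stepLabel: step.label };
        return;
      }
    }
    yield { type: "done" };
  } catch (err) {
    yield { type: "error", message: String(err) };
  }
}
```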
Template Variable System
All step prompts are templates with {{variable}} placeholders resolved at execution time by resolvePrompt(template, ctx).
Complete Variable Reference
| Variable | Resolution | Fallback |
|---|---|---|
| `{{input}}` | `ctx.initialInput` as string | `""` |
| `{{input.fieldName}}` | `ctx.initialInput[fieldName]` | `""` |
| `{{steps.Label.content}}` | Digest if available, else raw output | `""` |
| `{{steps.Label.full}}` | Always raw uncompressed output | `""` |
| `{{steps.Label.digest}}` | Digest, or truncated fallback (3000 chars) | `""` |
| `{{steps.Label.notes}}` | User review notes from checkpoint | `""` |
| `{{steps.Label.sections}}` | Fan-out headings as numbered list | `""` |
| `{{steps.Label.section[N].content}}` | Nth sub-output content | `""` |
| `{{steps.Label.section[N].heading}}` | Nth sub-output heading | `""` |
| `{{consolidations}}` | `ctx.consolidationContent` (sanitized) | `[No consolidations linked...]` |
| `{{memory}}` | `ctx.memoryContent` (sanitized) | `[No memory insights linked...]` |
| `{{project.wisdom}}` | `ctx.projectWisdom` (sanitized) | `[No project linked...]` |
| `{{claudemd}}` | `ctx.claudeMdContent` (sanitized) | `[No CLAUDE.md found...]` |
| `{{fileTree}}` | `ctx.fileTreeContent` | `[No file tree available...]` |
| `{{docs}}` | `ctx.documentationContent` (sanitized) | `[No repository documentation...]` |
| `{{docs.ProjectName}}` | `ctx.documentationByProject.get(ProjectName)` | `[No documentation found...]` |
| `{{fanOut.section}}` | Set per-section during fan-out loop | `[No fan-out section available]` |
| `{{fanOut.heading}}` | Set per-section during fan-out loop | `[No fan-out heading available]` |
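As a usage illustration, a single step prompt can mix input fields, an upstream step's output, and injected context; the `Research` label and `goal` field below are hypothetical.
```typescript
// Hypothetical step prompt template; "Research" and "goal" are illustrative.
const promptTemplate = `
Review the plan for: {{input.goal}}

Upstream research (digest if one exists, otherwise raw output):
{{steps.Research.content}}

Repository conventions:
{{claudemd}}
`;
```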
Resolution Logic
The resolver processes each `{{token}}` match in order:
- Context injections (`consolidations`, `memory`, `project.wisdom`, `claudemd`, `docs`) — sanitized
- Fan-out context — transient, set per-section during fan-out execution
- Input resolution — `ctx.initialInput` or specific fields
- Step output resolution — digest vs. full vs. notes vs. sections
- Unresolved tokens — returned as-is
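A minimal sketch of that resolution order follows; the helper logic and fallback strings here are assumptions, not the actual resolvePrompt implementation, and the section-indexing variables are omitted for brevity.
```typescript
// Sketch of resolvePrompt's resolution order; an approximation, not the real code.
function resolvePrompt(template: string, ctx: ChainContext): string {
  return template.replace(/\{\{([^}]+)\}\}/g, (match, token: string) => {
    const key = token.trim();

    // 1. Context injections (already sanitized when loaded into ctx)
    if (key === "consolidations") return ctx.consolidationContent || "[No consolidations linked...]";
    if (key === "memory") return ctx.memoryContent || "[No memory insights linked...]";
    if (key === "claudemd") return ctx.claudeMdContent || "[No CLAUDE.md found...]";

    // 2. Fan-out context, set transiently per section
    if (key === "fanOut.section") return ctx.fanOutSection ?? "[No fan-out section available]";
    if (key === "fanOut.heading") return ctx.fanOutHeading ?? "[No fan-out heading available]";

    // 3. Input resolution: whole input or a specific field
    if (key === "input") return String(ctx.initialInput ?? "");
    if (key.startsWith("input.")) {
      const field = key.slice("input.".length);
      return String((ctx.initialInput as Record<string, unknown>)?.[field] ?? "");
    }

    // 4. Step output resolution: digest preferred over raw output
    const stepMatch = key.match(/^steps\.(.+)\.(content|full|digest|notes)$/);
    if (stepMatch) {
      const [, label, kind] = stepMatch;
      const raw = String(ctx.stepOutputs.get(label) ?? "");
      const digest = ctx.stepDigests.get(label);
      if (kind === "full") return raw;
      if (kind === "notes") return ctx.stepReviewNotes.get(label) ?? "";
      if (kind === "digest") return digest ?? raw.slice(0, 3000);
      return digest ?? raw; // "content": digest if available, else raw
    }

    // 5. Unresolved tokens are returned as-is
    return match;
  });
}
```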
Content Sanitization
All user-controlled content injected via context variables passes through `sanitizeContextContent()`:
- Template escape: `{{` is replaced with `\{\{` to prevent recursive variable resolution. Without this, content from GitHub files or memory entries could contain strings like `{{steps.Analysis.content}}` that the resolver would interpret.
- Suspicious pattern detection: Six regex patterns check for prompt override attempts (`ignore previous instructions`, `you are now`, `[SYSTEM]`, etc.). These trigger a `console.warn` log but do not block the content — observe-and-log, not reject.
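A minimal sketch of this behavior is below; only the first three patterns come from the examples above, the rest of the list and the function signature are assumptions.
```typescript
// Sketch of sanitizeContextContent; the pattern list beyond the documented
// examples and the function signature are assumptions.
const SUSPICIOUS_PATTERNS: RegExp[] = [
  /ignore previous instructions/i,
  /you are now/i,
  /\[SYSTEM\]/i,
  // ...remaining patterns omitted
];

function sanitizeContextContent(content: string, source = "context"): string {
  // Escape template delimiters so injected content cannot be re-resolved
  // as {{variables}} by the prompt resolver.
  const escaped = content.replace(/\{\{/g, "\\{\\{");

  // Observe-and-log: warn on prompt-override attempts but keep the content.
  for (const pattern of SUSPICIOUS_PATTERNS) {
    if (pattern.test(content)) {
      console.warn(`Suspicious pattern detected in ${source} content: ${pattern}`);
    }
  }
  return escaped;
}
```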
ChainContext Architecture
The ChainContext accumulates state across the entire workflow execution. Each loader populates a specific portion; downstream steps reference upstream data through it.
interface ChainContext {
workflowId: string;
tenantId: string;
userId: string;
initialInput: unknown;
stepOutputs: Map<string, unknown>; // keyed by step label
stepLabels: Map<string, string>;
stepDigests: Map<string, string>; // compressed versions
consolidationContent: string;
memoryContent: string;
documentationContent: string;
documentationByProject: Map<string, string>;
claudeMdContent: string;
fileTreeContent: string;
personaSystemPrompts: string[];
projectWisdom: string;
fanOutSection?: string; // transient, per-section
fanOutHeading?: string; // transient, per-section
stepSubOutputs: Map<string, SubOutput[]>; // fan-out results
stepReviewNotes: Map<string, string>; // checkpoint notes
}
Loaders run before step execution and populate the context:
- Consolidation content loaded from the linked consolidation
- Memory content loaded from linked insights via MemoryPicker selection
- Project wisdom assembled from consolidation + code patterns for the linked project
- Documentation loaded from linked repository files
- Persona system prompts concatenated from all assigned personas
Digest Compression
For steps producing outputs exceeding 2000 characters, an automatic digest is generated via Claude Haiku. The digest retains approximately 15–25% of the original output size.
compression_ratio ≈ 0.15 - 0.25
When a downstream step uses {{steps.Label.content}}, the resolver automatically returns the digest if available. To force the full output, use {{steps.Label.full}}.
The token savings compound across long workflows. For a 5-step workflow where each step produces 3,000 completion tokens (each digest ≈ 600 tokens at a 20% ratio):
- Without digests: 4,000 (system prompt) + 5 × 3,000 = 19,000 tokens
- With digests: 4,000 (system prompt) + 5 × 600 = 7,000 tokens
- Savings: 63.2%
Digests are generated as a separate async operation after the step completes. They're stored in WorkflowStep.digest and linked from the step output.
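A sketch of what that async digest call could look like with the Anthropic SDK follows; the prompt wording, length threshold handling, and exact model id are assumptions.
```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Sketch only: generate a digest for step outputs over 2,000 characters.
// Prompt wording and model id are assumptions, not the engine's actual call.
async function generateDigest(stepOutput: string): Promise<string | null> {
  if (stepOutput.length <= 2000) return null; // small outputs are used as-is

  const response = await anthropic.messages.create({
    model: "claude-3-5-haiku-latest",
    max_tokens: 1024,
    messages: [{
      role: "user",
      content:
        "Compress the following step output to roughly 15-25% of its length, " +
        "preserving decisions, data points, and open questions:\n\n" + stepOutput,
    }],
  });

  const block = response.content[0];
  return block.type === "text" ? block.text : null;
}
```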
Fan-Out Parallelism
Fan-out steps split a source step's output into sections and process each section independently via separate LLM calls.
Configuration on the step:
{
stepType: "fan-out",
sourceStepLabel: "Research", // which step to split
splitPattern: "###\\s+\\d+\\.", // regex to detect section boundaries
promptTemplate: "Analyze this section in depth:\n\n{{fanOut.section}}"
}
Sub-outputs are stored in stepSubOutputs keyed by step label. Downstream steps can reference individual sections via {{steps.FanOut.section[0].content}} or the full sections list via {{steps.FanOut.sections}}.
Sub-outputs appear as tabs in the workflow UI, letting you review each section's analysis independently.
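A simplified sketch of the fan-out loop is below; the result shape and the runLlmStep parameter are assumptions standing in for the engine's real call path and SubOutput type.
```typescript
// Sketch of the fan-out loop; result shape and LLM helper are assumed.
interface FanOutResult { heading: string; content: string; }

async function runFanOut(
  ctx: ChainContext,
  sourceOutput: string,
  splitPattern: string,    // e.g. "###\\s+\\d+\\."
  promptTemplate: string,  // e.g. "Analyze this section in depth:\n\n{{fanOut.section}}"
  runLlmStep: (prompt: string) => Promise<string>, // hypothetical LLM call
): Promise<FanOutResult[]> {
  // Split the source step's output into sections at the configured boundaries.
  const sections = sourceOutput
    .split(new RegExp(splitPattern))
    .map((s) => s.trim())
    .filter(Boolean);

  // Each section gets its own LLM call, resolved against a transient
  // per-section copy of the context so {{fanOut.*}} variables work.
  return Promise.all(
    sections.map(async (section) => {
      const heading = section.split("\n")[0];
      const sectionCtx: ChainContext = { ...ctx, fanOutSection: section, fanOutHeading: heading };
      const content = await runLlmStep(resolvePrompt(promptTemplate, sectionCtx));
      return { heading, content };
    }),
  );
}
```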
Step Types
| Type | Behavior |
|---|---|
| `llm` | Standard LLM call with template variable resolution |
| `review` | Pauses execution for human checkpoint (unless `yoloMode` is enabled) |
| `fan-out` | Splits source output into parallel LLM calls |
| `conditional` | Branches based on previous step output |
Review steps additionally run key point extraction and inject all workflow personas as team context (the only step type that does this).
Context Compression Engine
For workflows that accumulate large amounts of context across many steps, the Context Compression Engine (CCE) manages token budgets deterministically.
The 80K token budget is allocated across sources with a recency-weighted strategy:
RECENCY_WINDOW = 5 // number of recent steps whose full output is preserved
Steps outside the recency window are replaced with their digests. The CCE never calls an LLM for compression — it's a deterministic selection algorithm. This guarantees token overflow prevention without adding latency or cost.
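A deterministic selection pass along those lines might look like the following; the characters-per-token estimate and the return shape are assumptions, not the real CCE code.
```typescript
// Deterministic recency-window selection sketch; no LLM calls involved.
const RECENCY_WINDOW = 5;
const TOKEN_BUDGET = 80_000;

interface SelectedStep {
  label: string;
  text: string;       // full output for recent steps, digest otherwise
  usedDigest: boolean;
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4); // rough heuristic, an assumption
}

function selectStepContext(
  orderedLabels: string[],            // step labels in execution order
  outputs: Map<string, string>,
  digests: Map<string, string>,
): SelectedStep[] {
  const selected: SelectedStep[] = [];
  let used = 0;

  // Walk newest-to-oldest so recent steps get first claim on the budget.
  for (let i = orderedLabels.length - 1; i >= 0; i--) {
    const label = orderedLabels[i];
    const isRecent = orderedLabels.length - i <= RECENCY_WINDOW;
    const full = outputs.get(label) ?? "";
    const digest = digests.get(label);

    // Recent steps keep full output; older steps fall back to their digests.
    const text = isRecent ? full : digest ?? full;
    const cost = estimateTokens(text);
    if (used + cost > TOKEN_BUDGET) break; // never exceed the budget
    used += cost;
    selected.unshift({ label, text, usedDigest: !isRecent && digest !== undefined });
  }
  return selected;
}
```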
Event Types
The AsyncGenerator yields these event types over SSE:
| Event | When |
|---|---|
| `step_start` | Before a step's LLM call begins |
| `step_output` | After a step's LLM call completes |
| `step_digest` | After a digest is generated for a step |
| `context_report` | After `buildInjectionReport()` runs for a step |
| `review_required` | When a review step pauses for human input |
| `fan_out_section` | For each sub-output in a fan-out step |
| `done` | When the workflow completes |
| `error` | On unrecoverable failure |
The client uses an EventSource connection and the useSSE hook to receive and render these events in real time.
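A minimal browser-side consumer might look like this; the endpoint path and payload handling are assumptions, and the app itself goes through the useSSE hook rather than raw EventSource wiring.
```typescript
// Minimal SSE consumer sketch; endpoint path and payload shapes are assumed.
function subscribeToWorkflow(
  workflowId: string,
  onEvent: (type: string, data: unknown) => void,
): EventSource {
  const source = new EventSource(`/api/workflows/${workflowId}/stream`);

  const types = [
    "step_start", "step_output", "step_digest", "context_report",
    "review_required", "fan_out_section", "done", "error",
  ];

  for (const type of types) {
    source.addEventListener(type, (e) => {
      const data = (e as MessageEvent).data;
      onEvent(type, data ? JSON.parse(data) : null);
      // Close the stream once the workflow finishes or fails.
      if (type === "done" || type === "error") source.close();
    });
  }
  return source;
}
```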
