Injection Diagnostics


19. Injection Diagnostics

The injection diagnostics system (src/server/services/injection-diagnostics.ts) provides runtime observability into what context sources are being injected into LLM prompts during workflow execution. It measures the size of each context source, detects unresolved template variables, sanitizes user-controlled content against prompt injection attacks, and produces a structured report that is persisted on each workflow step and streamed to the client as a context_report event.

19.1 System Overview

flowchart TD
    A["executeStep()"] -->|"resolvedPrompt + ChainContext"| B["buildInjectionReport()"]
    B --> C["measureContextSources()"]
    B --> D["detectUnresolvedVariables()"]
    C --> E["InjectionReport.sources"]
    D --> F["InjectionReport.unresolvedVariables"]
    E --> G["InjectionReport"]
    F --> G
    G -->|"stored on step"| H["workflowStep.checkpoint\n{ injectionReport }"]
    G -->|"yielded as event"| I["WorkflowEvent\ntype: 'context_report'"]
    J["resolvePrompt()"] -->|"calls per context var"| K["sanitizeContextContent()"]
    K -->|"escapes {{ }}\nand logs suspicious patterns"| L["sanitized content\ninjected into prompt"]

The diagnostics system is called inside executeStep() in the workflow engine. After the prompt template is fully resolved, buildInjectionReport() is called with the resolved prompt and the full ChainContext. The resulting report is:

  1. Attached to the step result's injectionReport field
  2. Persisted in the checkpoint JSON column of the workflow_steps table
  3. Emitted as a context_report workflow event (JSON-serialized in the content field)

19.2 Context Source Measurement

The measureContextSources() function inspects the ChainContext object and returns an array of ContextSource entries, one for each context injection point:

interface ContextSource {
  name: string;           // Source identifier
  charCount: number;      // Raw character count
  estimatedTokens: number; // charCount / 4 (ceiling)
  isEmpty: boolean;       // charCount === 0
  truncated: boolean;     // Whether content was truncated before measurement
}

Measured Sources

Source Name       ChainContext Field                  Template Variable
project.wisdom    ctx.projectWisdom                   {{project.wisdom}}
memory            ctx.memoryContent                   {{memory}}
consolidations    ctx.consolidationContent            {{consolidations}}
personas          ctx.personaSystemPrompts (joined)   System prompt injection
claudemd          ctx.claudeMdContent                 {{claudemd}}
fileTree          ctx.fileTreeContent                 {{fileTree}}
docs              ctx.documentationContent            {{docs}}

Token estimation uses the heuristic of chars / 4, which provides a rough but fast approximation without requiring a tokenizer. The truncated flag is set by the caller when content was cut (e.g., file trees capped at 500 entries, documentation capped at 50,000 chars).
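The measurement logic can be sketched as follows. This is a minimal illustration of the chars / 4 heuristic and the ContextSource shape described above, not the actual module source; measureSource() is a hypothetical helper name.

```typescript
interface ContextSource {
  name: string;            // Source identifier
  charCount: number;       // Raw character count
  estimatedTokens: number; // charCount / 4 (ceiling)
  isEmpty: boolean;        // charCount === 0
  truncated: boolean;      // Whether content was truncated before measurement
}

function estimateTokens(charCount: number): number {
  // Ceiling of chars / 4: a rough but fast approximation, no tokenizer needed.
  return Math.ceil(charCount / 4);
}

function measureSource(
  name: string,
  content: string | undefined,
  truncated = false,
): ContextSource {
  const charCount = content?.length ?? 0;
  return {
    name,
    charCount,
    estimatedTokens: estimateTokens(charCount),
    isEmpty: charCount === 0,
    truncated,
  };
}
```

For example, measureSource("memory", "some insight") yields charCount 12 and estimatedTokens 3, while an undefined source yields an entry with isEmpty set to true rather than being omitted from the report.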

19.3 Unresolved Variable Detection

After template resolution, some variables may resolve to placeholder strings rather than actual content. This happens when a context source is not configured (no linked project, no memory entries, no GitHub token, etc.). The detectUnresolvedVariables() function scans the fully resolved prompt for these placeholder patterns:

const PLACEHOLDER_PATTERNS = [
  /\[No .+? linked.*?\]/g,
  /\[No .+? available.*?\]/g,
  /\[No .+? found.*?\]/g,
  /\[No .+? configured.*?\]/g,
  /\[GitHub token not configured.*?\]/g,
  /\[Failed to fetch.*?\]/g,
];

These patterns match the fallback strings produced by resolvePrompt() in the workflow engine. For example:

Template Variable    Fallback When Missing
{{project.wisdom}}   [No project linked -- attach a project to enable wisdom injection]
{{memory}}           [No memory insights linked to this workflow]
{{consolidations}}   [No consolidations linked to this workflow]
{{claudemd}}         [No CLAUDE.md or README.md found in linked repositories]
{{fileTree}}         [No file tree available -- no repositories linked]
{{docs}}             [No repository documentation linked to this workflow]

The function returns a deduplicated array of matched placeholder strings, allowing the UI to display which context sources failed to resolve.
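A minimal sketch of the scan-and-deduplicate behavior, using the pattern list shown above (the real function in injection-diagnostics.ts may differ in detail):

```typescript
const PLACEHOLDER_PATTERNS = [
  /\[No .+? linked.*?\]/g,
  /\[No .+? available.*?\]/g,
  /\[No .+? found.*?\]/g,
  /\[No .+? configured.*?\]/g,
  /\[GitHub token not configured.*?\]/g,
  /\[Failed to fetch.*?\]/g,
];

function detectUnresolvedVariables(resolvedPrompt: string): string[] {
  const matches = new Set<string>(); // Set deduplicates repeated placeholders
  for (const pattern of PLACEHOLDER_PATTERNS) {
    pattern.lastIndex = 0; // reset state on the shared /g regexes
    for (const m of resolvedPrompt.matchAll(pattern)) {
      matches.add(m[0]);
    }
  }
  return [...matches];
}
```

A prompt containing the same fallback string twice thus reports it once, so the UI can list each failed context source a single time.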

19.4 Content Sanitization

The sanitizeContextContent() function is called from resolvePrompt() every time user-controlled content is injected into an LLM prompt. It provides two layers of defense:

Template Variable Escaping

All {{ sequences in injected content are escaped to \{\{, preventing recursive template resolution. Without this, content from GitHub files, consolidation summaries, or memory entries could contain strings like {{steps.Analysis.content}} that would be interpreted as template variables during prompt construction.

const sanitized = content.replace(/\{\{/g, "\\{\\{");

Prompt Injection Detection

Six regex patterns scan for common prompt override attempts:

const SUSPICIOUS_PATTERNS = [
  /ignore\s+(all\s+)?previous\s+instructions/i,
  /forget\s+(everything|all|your)\s+(above|previous)/i,
  /you\s+are\s+now\s+/i,
  /new\s+instructions?\s*:/i,
  /system\s*:\s*/i,
  /\[SYSTEM\]/i,
];

Important: Suspicious patterns trigger a console.warn log but do not block the content. The design philosophy is observe-and-log rather than reject, since false positives in legitimate technical content (e.g., a code review discussing prompt injection defenses) would be disruptive. The log message includes the matched pattern source for post-hoc investigation.
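Putting the two layers together, a sketch of the sanitizer might look like this. The pattern list is copied from the text; the function body and log format are assumptions, not the actual module source.

```typescript
const SUSPICIOUS_PATTERNS = [
  /ignore\s+(all\s+)?previous\s+instructions/i,
  /forget\s+(everything|all|your)\s+(above|previous)/i,
  /you\s+are\s+now\s+/i,
  /new\s+instructions?\s*:/i,
  /system\s*:\s*/i,
  /\[SYSTEM\]/i,
];

function sanitizeContextContent(content: string, sourceName: string): string {
  // Layer 1: escape {{ so injected content cannot re-enter template resolution.
  const sanitized = content.replace(/\{\{/g, "\\{\\{");

  // Layer 2: observe-and-log; suspicious content passes through unmodified.
  for (const pattern of SUSPICIOUS_PATTERNS) {
    if (pattern.test(content)) {
      console.warn(
        `[injection-diagnostics] suspicious pattern in ${sourceName}: ${pattern.source}`,
      );
    }
  }
  return sanitized;
}
```

Note that only the escaping changes the output; a matched suspicious pattern produces a warning with the pattern source but returns the content intact.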

19.5 Injection Report Structure

The full report produced by buildInjectionReport():

interface InjectionReport {
  stepId: string;              // Workflow step UUID
  stepLabel: string;           // Human-readable step label
  sources: ContextSource[];    // 7 measured context sources
  totalPromptChars: number;    // Length of fully resolved prompt
  totalEstimatedTokens: number; // totalPromptChars / 4
  unresolvedVariables: string[]; // Matched placeholder strings
  timestamp: string;           // ISO 8601 timestamp
}
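How the fields fit together can be sketched as below. The real buildInjectionReport() takes the ChainContext and derives the sources and unresolved variables itself; here they are passed in as parameters so the sketch stays self-contained.

```typescript
interface ContextSource {
  name: string;
  charCount: number;
  estimatedTokens: number;
  isEmpty: boolean;
  truncated: boolean;
}

interface InjectionReport {
  stepId: string;
  stepLabel: string;
  sources: ContextSource[];
  totalPromptChars: number;
  totalEstimatedTokens: number;
  unresolvedVariables: string[];
  timestamp: string;
}

function buildInjectionReport(
  stepId: string,
  stepLabel: string,
  resolvedPrompt: string,
  sources: ContextSource[],
  unresolvedVariables: string[],
): InjectionReport {
  return {
    stepId,
    stepLabel,
    sources,
    totalPromptChars: resolvedPrompt.length,
    // Same chars / 4 heuristic as per-source estimation, applied to the whole prompt.
    totalEstimatedTokens: Math.ceil(resolvedPrompt.length / 4),
    unresolvedVariables,
    timestamp: new Date().toISOString(),
  };
}
```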

Report Lifecycle

sequenceDiagram
    participant WE as Workflow Engine
    participant ID as Injection Diagnostics
    participant DB as PostgreSQL
    participant SSE as SSE Stream
    WE->>WE: resolvePrompt(template, ctx)
    Note over WE: sanitizeContextContent() called<br/>per context variable
    WE->>ID: buildInjectionReport(stepId, label, prompt, ctx)
    ID->>ID: measureContextSources(ctx)
    ID->>ID: detectUnresolvedVariables(prompt)
    ID-->>WE: InjectionReport
    WE->>DB: workflowStep.update({ checkpoint: { injectionReport } })
    WE->>SSE: yield { type: "context_report", content: JSON.stringify(report) }

The report is only emitted for standard LLM steps that complete successfully. Review steps, fan-out sub-steps, and alternative-generation steps do not currently produce injection reports: buildInjectionReport() is called inside executeStep(), but the report is persisted and emitted only in the main step execution path, not in the alternatives or fan-out loops.

19.6 Integration Points

The injection diagnostics module integrates with the workflow engine at three points:

  1. Import -- buildInjectionReport and sanitizeContextContent are imported in workflow-engine.ts
  2. Sanitization -- sanitizeContextContent() is called inside resolvePrompt() for every context variable that contains actual content (consolidations, memory, project.wisdom, claudemd, docs, docs.ProjectName). Variables that resolved to fallback strings are not sanitized.
  3. Report generation -- buildInjectionReport() is called after prompt resolution inside executeStep(), and the report is returned alongside the LLM result.
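On the consuming side, a client receiving the SSE stream can decode the report from the event's JSON-serialized content field. The { type, content } event shape follows the description above; parseContextReport() and the summary type are hypothetical helpers, not part of the codebase.

```typescript
interface WorkflowEvent {
  type: string;    // e.g. "context_report"
  content: string; // JSON-serialized InjectionReport
}

// Subset of InjectionReport fields a UI might care about.
interface InjectionReportSummary {
  stepLabel: string;
  totalEstimatedTokens: number;
  unresolvedVariables: string[];
}

function parseContextReport(event: WorkflowEvent): InjectionReportSummary | null {
  if (event.type !== "context_report") return null; // ignore other event types
  return JSON.parse(event.content) as InjectionReportSummary;
}
```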