19. Injection Diagnostics
The injection diagnostics system (src/server/services/injection-diagnostics.ts) provides runtime observability into what context sources are being injected into LLM prompts during workflow execution. It measures the size of each context source, detects unresolved template variables, sanitizes user-controlled content against prompt injection attacks, and produces a structured report that is persisted on each workflow step and streamed to the client as a context_report event.
19.1 System Overview
The diagnostics system is called inside executeStep() in the workflow engine. After the prompt template is fully resolved, buildInjectionReport() is called with the resolved prompt and the full ChainContext. The resulting report is:
- Attached to the step result's `injectionReport` field
- Persisted in the `checkpoint` JSON column of the `workflow_steps` table
- Emitted as a `context_report` workflow event (JSON-serialized in the `content` field)
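The persistence and streaming steps described above can be sketched as follows. The event shape is an assumption inferred from the fields named here (a type plus a JSON-serialized content field), not the engine's exact types:

```typescript
// Hypothetical minimal event shape; the real engine's event type likely
// carries more fields (step id, timestamps, etc.).
interface WorkflowEvent {
  type: "context_report";
  content: string; // JSON-serialized InjectionReport
}

// Wraps a finished report for streaming to the client.
function toContextReportEvent(report: unknown): WorkflowEvent {
  return { type: "context_report", content: JSON.stringify(report) };
}
```

On the client side, the report is recovered by parsing the event's `content` field with `JSON.parse`.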
19.2 Context Source Measurement
The measureContextSources() function inspects the ChainContext object and returns an array of ContextSource entries, one for each context injection point:
interface ContextSource {
  name: string;            // Source identifier
  charCount: number;       // Raw character count
  estimatedTokens: number; // charCount / 4 (ceiling)
  isEmpty: boolean;        // charCount === 0
  truncated: boolean;      // Whether content was truncated before measurement
}
Measured Sources
| Source Name | ChainContext Field | Template Variable |
|---|---|---|
| `project.wisdom` | `ctx.projectWisdom` | `{{project.wisdom}}` |
| `memory` | `ctx.memoryContent` | `{{memory}}` |
| `consolidations` | `ctx.consolidationContent` | `{{consolidations}}` |
| `personas` | `ctx.personaSystemPrompts` (joined) | System prompt injection |
| `claudemd` | `ctx.claudeMdContent` | `{{claudemd}}` |
| `fileTree` | `ctx.fileTreeContent` | `{{fileTree}}` |
| `docs` | `ctx.documentationContent` | `{{docs}}` |
Token estimation uses the heuristic of chars / 4, which provides a rough but fast approximation without requiring a tokenizer. The truncated flag is set by the caller when content was cut (e.g., file trees capped at 500 entries, documentation capped at 50,000 chars).
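A sketch of the per-source measurement under the chars/4 heuristic. The helper name and signature here are illustrative only; the real measureContextSources() walks the full ChainContext and returns all seven entries:

```typescript
interface ContextSource {
  name: string;
  charCount: number;
  estimatedTokens: number;
  isEmpty: boolean;
  truncated: boolean;
}

// Measures a single source. `truncated` is passed in by the caller,
// since truncation happens before measurement.
function measureSource(
  name: string,
  content: string | undefined,
  truncated = false,
): ContextSource {
  const charCount = content?.length ?? 0;
  return {
    name,
    charCount,
    estimatedTokens: Math.ceil(charCount / 4), // rough, tokenizer-free estimate
    isEmpty: charCount === 0,
    truncated,
  };
}
```

Note the ceiling: a 5-character source estimates to 2 tokens, never 0 for non-empty content.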
19.3 Unresolved Variable Detection
After template resolution, some variables may resolve to placeholder strings rather than actual content. This happens when a context source is not configured (no linked project, no memory entries, no GitHub token, etc.). The detectUnresolvedVariables() function scans the fully resolved prompt for these placeholder patterns:
const PLACEHOLDER_PATTERNS = [
  /\[No .+? linked.*?\]/g,
  /\[No .+? available.*?\]/g,
  /\[No .+? found.*?\]/g,
  /\[No .+? configured.*?\]/g,
  /\[GitHub token not configured.*?\]/g,
  /\[Failed to fetch.*?\]/g,
];
These patterns match the fallback strings produced by resolvePrompt() in the workflow engine. For example:
| Template Variable | Fallback When Missing |
|---|---|
| `{{project.wisdom}}` | [No project linked -- attach a project to enable wisdom injection] |
| `{{memory}}` | [No memory insights linked to this workflow] |
| `{{consolidations}}` | [No consolidations linked to this workflow] |
| `{{claudemd}}` | [No CLAUDE.md or README.md found in linked repositories] |
| `{{fileTree}}` | [No file tree available -- no repositories linked] |
| `{{docs}}` | [No repository documentation linked to this workflow] |
The function returns a deduplicated array of matched placeholder strings, allowing the UI to display which context sources failed to resolve.
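The scan-and-dedupe behavior can be sketched in a few lines, reusing the patterns listed above (the function body is an illustration of the described behavior, not the exact source):

```typescript
const PLACEHOLDER_PATTERNS = [
  /\[No .+? linked.*?\]/g,
  /\[No .+? available.*?\]/g,
  /\[No .+? found.*?\]/g,
  /\[No .+? configured.*?\]/g,
  /\[GitHub token not configured.*?\]/g,
  /\[Failed to fetch.*?\]/g,
];

// Scans the fully resolved prompt and returns each distinct
// placeholder string once, even if it appears multiple times.
function detectUnresolvedVariables(prompt: string): string[] {
  const matches = new Set<string>();
  for (const pattern of PLACEHOLDER_PATTERNS) {
    for (const m of prompt.match(pattern) ?? []) {
      matches.add(m);
    }
  }
  return Array.from(matches);
}
```

Because the patterns are lazy but unanchored, a placeholder repeated in several resolved variables still surfaces as a single entry, which keeps the UI list compact.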
19.4 Content Sanitization
The sanitizeContextContent() function is called from resolvePrompt() every time user-controlled content is injected into an LLM prompt. It provides two layers of defense:
Template Variable Escaping
All {{ sequences in injected content are escaped to \{\{, preventing recursive template resolution. Without this, content from GitHub files, consolidation summaries, or memory entries could contain strings like {{steps.Analysis.content}} that would be interpreted as template variables during prompt construction.
const sanitized = content.replace(/\{\{/g, "\\{\\{");
Prompt Injection Detection
Six regex patterns scan for common prompt override attempts:
const SUSPICIOUS_PATTERNS = [
  /ignore\s+(all\s+)?previous\s+instructions/i,
  /forget\s+(everything|all|your)\s+(above|previous)/i,
  /you\s+are\s+now\s+/i,
  /new\s+instructions?\s*:/i,
  /system\s*:\s*/i,
  /\[SYSTEM\]/i,
];
Important: Suspicious patterns trigger a console.warn log but do not block the content. The design philosophy is observe-and-log rather than reject, since false positives in legitimate technical content (e.g., a code review discussing prompt injection defenses) would be disruptive. The log message includes the matched pattern source for post-hoc investigation.
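Combining both layers, the observe-and-log flow can be sketched as below. The sourceName parameter and the exact log format are assumptions for illustration; the source document only specifies that the matched pattern is logged:

```typescript
const SUSPICIOUS_PATTERNS = [
  /ignore\s+(all\s+)?previous\s+instructions/i,
  /forget\s+(everything|all|your)\s+(above|previous)/i,
  /you\s+are\s+now\s+/i,
  /new\s+instructions?\s*:/i,
  /system\s*:\s*/i,
  /\[SYSTEM\]/i,
];

// Escapes template delimiters and logs (but never blocks) suspicious content.
function sanitizeContextContent(content: string, sourceName: string): string {
  for (const pattern of SUSPICIOUS_PATTERNS) {
    if (pattern.test(content)) {
      // Observe-and-log: record the matched pattern for later investigation.
      console.warn(
        `[injection-diagnostics] suspicious pattern ${pattern.source} in ${sourceName}`,
      );
    }
  }
  // Escape {{ so injected content cannot be re-resolved as a template variable.
  return content.replace(/\{\{/g, "\\{\\{");
}
```

Note that the return value is identical whether or not a suspicious pattern fired; only the escaping transformation is applied.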
19.5 Injection Report Structure
The full report produced by buildInjectionReport():
interface InjectionReport {
  stepId: string;                // Workflow step UUID
  stepLabel: string;             // Human-readable step label
  sources: ContextSource[];      // 7 measured context sources
  totalPromptChars: number;      // Length of fully resolved prompt
  totalEstimatedTokens: number;  // totalPromptChars / 4
  unresolvedVariables: string[]; // Matched placeholder strings
  timestamp: string;             // ISO 8601 timestamp
}
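Assembling the report is then mostly bookkeeping. A sketch under assumed inputs: the real buildInjectionReport() receives the ChainContext and derives sources and unresolvedVariables itself, so this flattened signature is an illustration only:

```typescript
interface ContextSource {
  name: string;
  charCount: number;
  estimatedTokens: number;
  isEmpty: boolean;
  truncated: boolean;
}

interface InjectionReport {
  stepId: string;
  stepLabel: string;
  sources: ContextSource[];
  totalPromptChars: number;
  totalEstimatedTokens: number;
  unresolvedVariables: string[];
  timestamp: string;
}

// Hypothetical assembly step: totals use the same chars/4 heuristic
// as the per-source measurement.
function buildInjectionReport(
  stepId: string,
  stepLabel: string,
  resolvedPrompt: string,
  sources: ContextSource[],
  unresolvedVariables: string[],
): InjectionReport {
  return {
    stepId,
    stepLabel,
    sources,
    totalPromptChars: resolvedPrompt.length,
    totalEstimatedTokens: Math.ceil(resolvedPrompt.length / 4),
    unresolvedVariables,
    timestamp: new Date().toISOString(),
  };
}
```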
Report Lifecycle
The report is only emitted for standard LLM steps that complete successfully. Review steps, fan-out sub-steps, and alternative-generation steps do not currently produce injection reports (the buildInjectionReport call is inside executeStep, but the report is only persisted and emitted in the main step execution path, not in the alternatives or fan-out loops).
19.6 Integration Points
The injection diagnostics module integrates with the workflow engine at three points:
- Import -- `buildInjectionReport` and `sanitizeContextContent` are imported in `workflow-engine.ts`
- Sanitization -- `sanitizeContextContent()` is called inside `resolvePrompt()` for every context variable that contains actual content (`consolidations`, `memory`, `project.wisdom`, `claudemd`, `docs`, `docs.ProjectName`). Variables that resolved to fallback strings are not sanitized.
- Report generation -- `buildInjectionReport()` is called after prompt resolution inside `executeStep()`, and the report is returned alongside the LLM result.
