7. Energy & Cost Models (Energie- und Kostenmodelle)
nyxcore tracks three derived metrics for every workflow step: energy consumption (Wh), monetary cost ($), and time saved (minutes). All formulas are pure functions in src/lib/workflow-metrics.ts.
7.1 Energy Computation (Energieberechnung)
Energy is computed per step based on prompt and completion token counts, with model-specific rates sourced from empirical data (arXiv:2505.09598, Epoch AI 2025, Google disclosure).
Formula
$$E_{step} = \frac{T_{prompt}}{10^6} \cdot r_{input} + \frac{T_{completion}}{10^6} \cdot r_{output}$$
where:
- $E_{step}$ = energy in Watt-hours (Wh)
- $T_{prompt}$ = prompt token count
- $T_{completion}$ = completion token count
- $r_{input}$ = energy rate for input tokens (Wh per million tokens)
- $r_{output}$ = energy rate for output tokens (Wh per million tokens)
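As a sketch, the formula above maps directly to TypeScript. Note that `computeStepEnergy` and `EnergyRate` are illustrative names, not necessarily the actual exports of src/lib/workflow-metrics.ts:

```typescript
interface EnergyRate {
  input: number;  // Wh per million prompt tokens
  output: number; // Wh per million completion tokens
}

// E_step = T_prompt/1e6 * r_input + T_completion/1e6 * r_output
function computeStepEnergy(
  promptTokens: number,
  completionTokens: number,
  rate: EnergyRate
): number {
  return (promptTokens / 1e6) * rate.input + (completionTokens / 1e6) * rate.output;
}

// Example: 1,500 prompt + 3,000 completion tokens at claude-sonnet rates
const wh = computeStepEnergy(1500, 3000, { input: 168, output: 840 });
// → 2.772 Wh (0.252 Wh input + 2.52 Wh output)
```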
Energy Rates Table
Rates are matched by model family prefix (case-insensitive startsWith matching):
| Model Family Prefix | $r_{input}$ (Wh/M) | $r_{output}$ (Wh/M) | Source |
|---|---|---|---|
| claude-sonnet | 168 | 840 | Epoch AI / Anthropic |
| claude-haiku | 40 | 200 | Epoch AI / Anthropic |
| claude-opus | 168 | 840 | Epoch AI / Anthropic |
| gpt-4o-mini | 15 | 75 | Epoch AI / OpenAI |
| gpt-4o | 120 | 600 | Epoch AI / OpenAI |
| gpt-4 | 120 | 600 | Epoch AI / OpenAI |
| gemini-flash | 48 | 240 | Google disclosure |
| gemini-pro | 100 | 500 | Google disclosure |
| kimi | 70 | 350 | Estimated |
| ollama | 80 | 400 | Estimated (local GPU) |
Fallback rates (Epoch AI blended average for frontier models):
$$r_{input}^{fallback} = 110 \text{ Wh/M}, \quad r_{output}^{fallback} = 540 \text{ Wh/M}$$
Rate Resolution Logic
```typescript
function getEnergyRate(model: string | null | undefined): { input: number; output: number } {
  if (!model) return { input: 110, output: 540 }; // fallback
  const lower = model.toLowerCase();
  for (const rate of ENERGY_RATES) {
    if (lower.startsWith(rate.prefix)) return rate; // first prefix match wins
  }
  return { input: 110, output: 540 }; // no match -> fallback
}
```
Note: gpt-4o-mini must appear before gpt-4o in the array because startsWith would match gpt-4o for a gpt-4o-mini model ID. The array order in the source code ensures correct precedence.
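A minimal self-contained check of that precedence rule. The ENERGY_RATES array here is a truncated stand-in for the real one, holding just the three GPT-4 family entries:

```typescript
// Truncated stand-in for the real ENERGY_RATES array: order matters.
const ENERGY_RATES = [
  { prefix: "gpt-4o-mini", input: 15, output: 75 },
  { prefix: "gpt-4o", input: 120, output: 600 },
  { prefix: "gpt-4", input: 120, output: 600 },
];

function getEnergyRate(model: string | null | undefined): { input: number; output: number } {
  if (!model) return { input: 110, output: 540 }; // fallback
  const lower = model.toLowerCase();
  for (const rate of ENERGY_RATES) {
    if (lower.startsWith(rate.prefix)) return { input: rate.input, output: rate.output };
  }
  return { input: 110, output: 540 }; // no match -> fallback
}

// "gpt-4o-mini-2024-07-18" matches the gpt-4o-mini entry first, not gpt-4o.
const r = getEnergyRate("gpt-4o-mini-2024-07-18");
// → { input: 15, output: 75 }
```

If the array were sorted the other way, the mini model would silently be billed at the full gpt-4o energy rate.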
Display Formatting
```typescript
function formatEnergy(wh: number | null): string {
  if (wh === null) return "—"; // no energy data recorded for this step
  if (wh < 0.01) return `${(wh * 1000).toFixed(1)} mWh`; // below 0.01 Wh, show milliwatt-hours
  return `${wh.toFixed(2)} Wh`;
}
```
7.2 Cost Estimation (Kostenschätzung)
Cost is computed using per-model pricing rates (USD per million tokens):
$$C_{step} = \frac{T_{prompt}}{10^6} \cdot p_{input} + \frac{T_{completion}}{10^6} \cdot p_{output}$$
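The cost formula has the same shape as the energy formula, only with USD-per-million-token prices. A sketch (`computeStepCost` is an illustrative name, not a documented export):

```typescript
// C_step = T_prompt/1e6 * p_input + T_completion/1e6 * p_output
function computeStepCost(
  promptTokens: number,
  completionTokens: number,
  inputUsdPerM: number,
  outputUsdPerM: number
): number {
  return (promptTokens / 1e6) * inputUsdPerM + (completionTokens / 1e6) * outputUsdPerM;
}

// claude-sonnet pricing: $3/M input, $15/M output
const usd = computeStepCost(1500, 3000, 3.0, 15.0);
// ≈ $0.0495
```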
Cost Rates Table
From src/server/services/llm/types.ts (COST_RATES):
| Model ID | $p_{input}$ (USD/M) | $p_{output}$ (USD/M) | Tier |
|---|---|---|---|
| claude-sonnet-4-20250514 | 3.00 | 15.00 | High |
| claude-haiku-3-5-20241022 | 0.80 | 4.00 | Low |
| gpt-4o | 2.50 | 10.00 | Medium |
| gpt-4o-mini | 0.15 | 0.60 | Low |
| gemini-2.0-flash | 0.10 | 0.40 | Low |
| kimi-k2-0711-preview | 0.00 | 0.00 | Free |
Models not in the table return cost = 0 (e.g., Ollama local models).
Pre-Execution Cost Estimation
estimateWorkflowCost() provides a pre-run estimate using average token assumptions:
```typescript
const AVG_TOKENS_PER_STEP = { prompt: 2000, completion: 2000 };
```
For multi-provider steps (compareProviders.length > 1), each provider's cost is estimated separately using its default model.
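A sketch of how such an estimate could work, assuming each step carries a list of provider pricing entries. The `StepConfig` shape and `estimateWorkflowCost` body here are illustrative, not the actual source:

```typescript
const AVG_TOKENS_PER_STEP = { prompt: 2000, completion: 2000 };

// Hypothetical shape for a step's provider configuration.
interface StepConfig {
  compareProviders: { inputUsdPerM: number; outputUsdPerM: number }[];
}

// Pre-run estimate: each provider in a multi-provider step is costed
// separately using the average token assumptions.
function estimateWorkflowCost(steps: StepConfig[]): number {
  let total = 0;
  for (const step of steps) {
    for (const p of step.compareProviders) {
      total +=
        (AVG_TOKENS_PER_STEP.prompt / 1e6) * p.inputUsdPerM +
        (AVG_TOKENS_PER_STEP.completion / 1e6) * p.outputUsdPerM;
    }
  }
  return total;
}

// One single-provider step at gpt-4o-mini pricing:
// 2000/1e6 * 0.15 + 2000/1e6 * 0.60 ≈ $0.0015
const est = estimateWorkflowCost([
  { compareProviders: [{ inputUsdPerM: 0.15, outputUsdPerM: 0.6 }] },
]);
```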
7.3 Time Saved (Eingesparte Zeit)
The time-saved metric estimates how long a human would take to produce equivalent written output.
Formula
$$t_{saved} = \frac{T_{completion} \cdot w_{ratio}}{v_{human} / 60}$$
where:
- $t_{saved}$ = time saved in minutes
- $T_{completion}$ = completion tokens
- $w_{ratio} = 0.75$ words per token (standard tokenizer ratio)
- $v_{human} = 300$ words per hour (analytical/technical writing benchmark, sourced from Publication Coach + academic benchmarks)
Simplified Form
$$t_{saved} = \frac{T_{completion} \cdot 0.75}{5} = 0.15 \cdot T_{completion}$$
So every 1,000 completion tokens represent approximately 150 minutes (2.5 hours) of human writing time saved.
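A sketch of the calculation (`computeTimeSavedMinutes` is an illustrative name):

```typescript
const WORDS_PER_TOKEN = 0.75;     // standard tokenizer ratio
const HUMAN_WORDS_PER_HOUR = 300; // analytical/technical writing benchmark

// t_saved (minutes) = T_completion * 0.75 / (300 / 60) = 0.15 * T_completion
function computeTimeSavedMinutes(completionTokens: number): number {
  return (completionTokens * WORDS_PER_TOKEN) / (HUMAN_WORDS_PER_HOUR / 60);
}

const mins = computeTimeSavedMinutes(1000);
// → 150 minutes for 1,000 completion tokens
```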
Display Formatting
```typescript
function formatTimeSaved(minutes: number | null): string {
  if (minutes === null) return "—"; // no data recorded for this step
  if (minutes >= 60) return `${(minutes / 60).toFixed(1)} hrs`;
  return `${minutes.toFixed(1)} min`;
}
```
7.4 Aggregate Metrics (Gesamtmetriken)
Individual step metrics roll up to workflow level via computeWorkflowAggregates(steps):
```typescript
interface WorkflowAggregateMetrics {
  totalCost: number;             // Sum of all step costs ($)
  totalTokens: number;           // Sum of all step total tokens
  totalDuration: number;         // Sum of all step durations (ms)
  totalEnergyWh: number;         // Sum of all step energy (Wh)
  totalTimeSavedMinutes: number; // Sum of all step time saved (min)
  completedSteps: number;        // Count of completed steps
  totalSteps: number;            // Total step count
}
```
The aggregation iterates through all steps, calling computeStepMetrics(step, totalSteps, index) for each, and sums across the relevant fields. Token usage is extracted from either step.output.tokenUsage or step.tokenUsage (the function checks both locations for backward compatibility).
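The roll-up itself is a plain summation. A simplified sketch, assuming per-step metrics have already been computed (the `StepMetrics` shape is hypothetical and omits the digest and token-usage-fallback details the real computeStepMetrics handles):

```typescript
// Hypothetical per-step metric shape.
interface StepMetrics {
  costUsd: number;
  totalTokens: number;
  durationMs: number;
  energyWh: number;
  timeSavedMinutes: number;
  completed: boolean;
}

// Sum each field across steps; counts come from filter/length.
function computeWorkflowAggregates(steps: StepMetrics[]) {
  return {
    totalCost: steps.reduce((s, m) => s + m.costUsd, 0),
    totalTokens: steps.reduce((s, m) => s + m.totalTokens, 0),
    totalDuration: steps.reduce((s, m) => s + m.durationMs, 0),
    totalEnergyWh: steps.reduce((s, m) => s + m.energyWh, 0),
    totalTimeSavedMinutes: steps.reduce((s, m) => s + m.timeSavedMinutes, 0),
    completedSteps: steps.filter((m) => m.completed).length,
    totalSteps: steps.length,
  };
}

const agg = computeWorkflowAggregates([
  { costUsd: 0.1, totalTokens: 1000, durationMs: 2000, energyWh: 1.5, timeSavedMinutes: 10, completed: true },
  { costUsd: 0.2, totalTokens: 3000, durationMs: 4000, energyWh: 2.5, timeSavedMinutes: 20, completed: false },
]);
// → agg.totalTokens === 4000, agg.completedSteps === 1, agg.totalSteps === 2
```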
Digest Token Savings Tracking
When a step has a digest, the per-step metrics include:
```typescript
if (step.digest) {
  const digestTokens = Math.round(tokenUsage.completion * DIGEST_COMPRESSION);
  tokensSaved = tokenUsage.completion - digestTokens;
  const downstreamSteps = Math.max(0, totalSteps - stepIndex - 1);
  tokensSavedDownstream = downstreamSteps > 0 ? tokensSaved * downstreamSteps : null;
}
```
Where DIGEST_COMPRESSION = 0.30 (the digest retains 30% of the original completion tokens).
7.5 Worked Example: 5-Step Mixed-Model Workflow
Consider a workflow with 5 steps using different models:
| Step | Model | $T_{prompt}$ | $T_{completion}$ | Has Digest? |
|---|---|---|---|---|
| 0: Analyze | claude-sonnet-4 | 1,500 | 3,000 | No (< 2000 chars) |
| 1: Research | claude-sonnet-4 | 4,000 | 8,000 | Yes |
| 2: Features | gpt-4o | 6,000 | 5,000 | Yes |
| 3: Review | claude-haiku-4.5 | 5,000 | 2,000 | No (review) |
| 4: Final | claude-sonnet-4 | 3,000 | 10,000 | Yes |
Energy Calculation
$$E_0 = \frac{1500}{10^6} \times 168 + \frac{3000}{10^6} \times 840 = 0.252 + 2.52 = 2.772 \text{ Wh}$$
$$E_1 = \frac{4000}{10^6} \times 168 + \frac{8000}{10^6} \times 840 = 0.672 + 6.72 = 7.392 \text{ Wh}$$
$$E_2 = \frac{6000}{10^6} \times 120 + \frac{5000}{10^6} \times 600 = 0.72 + 3.0 = 3.72 \text{ Wh}$$
$$E_3 = \frac{5000}{10^6} \times 40 + \frac{2000}{10^6} \times 200 = 0.20 + 0.40 = 0.60 \text{ Wh}$$
$$E_4 = \frac{3000}{10^6} \times 168 + \frac{10000}{10^6} \times 840 = 0.504 + 8.4 = 8.904 \text{ Wh}$$
$$E_{total} = 2.772 + 7.392 + 3.72 + 0.60 + 8.904 = \boxed{23.39 \text{ Wh}}$$
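The five energy figures above can be reproduced mechanically. This sketch just re-applies the per-step formula with the rates from the energy table (the tuple layout is purely illustrative):

```typescript
// [promptTokens, completionTokens, r_input, r_output] per step
const energySteps: [number, number, number, number][] = [
  [1500, 3000, 168, 840],  // step 0: claude-sonnet
  [4000, 8000, 168, 840],  // step 1: claude-sonnet
  [6000, 5000, 120, 600],  // step 2: gpt-4o
  [5000, 2000, 40, 200],   // step 3: claude-haiku
  [3000, 10000, 168, 840], // step 4: claude-sonnet
];

const totalWh = energySteps.reduce(
  (sum, [tp, tc, ri, ro]) => sum + (tp / 1e6) * ri + (tc / 1e6) * ro,
  0
);
// → 23.388 Wh (23.39 Wh rounded)
```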
Cost Calculation
$$C_0 = \frac{1500}{10^6} \times 3 + \frac{3000}{10^6} \times 15 = 0.0045 + 0.045 = \$0.0495$$
$$C_1 = \frac{4000}{10^6} \times 3 + \frac{8000}{10^6} \times 15 = 0.012 + 0.12 = \$0.132$$
$$C_2 = \frac{6000}{10^6} \times 2.5 + \frac{5000}{10^6} \times 10 = 0.015 + 0.05 = \$0.065$$
$$C_3 = \frac{5000}{10^6} \times 0.8 + \frac{2000}{10^6} \times 4 = 0.004 + 0.008 = \$0.012$$
$$C_4 = \frac{3000}{10^6} \times 3 + \frac{10000}{10^6} \times 15 = 0.009 + 0.15 = \$0.159$$
$$C_{total} = 0.0495 + 0.132 + 0.065 + 0.012 + 0.159 = \boxed{\$0.4175}$$
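The cost totals check out the same way, swapping in the per-model prices from the cost rates table (again, the tuple layout is illustrative only):

```typescript
// [promptTokens, completionTokens, p_input, p_output] per step
const costSteps: [number, number, number, number][] = [
  [1500, 3000, 3.0, 15.0],  // step 0: claude-sonnet-4
  [4000, 8000, 3.0, 15.0],  // step 1: claude-sonnet-4
  [6000, 5000, 2.5, 10.0],  // step 2: gpt-4o
  [5000, 2000, 0.8, 4.0],   // step 3: claude-haiku
  [3000, 10000, 3.0, 15.0], // step 4: claude-sonnet-4
];

const totalUsd = costSteps.reduce(
  (sum, [tp, tc, pi, po]) => sum + (tp / 1e6) * pi + (tc / 1e6) * po,
  0
);
// → $0.4175
```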
Time Saved
$$t_{total} = 0.15 \times (3000 + 8000 + 5000 + 2000 + 10000) = 0.15 \times 28000 = \boxed{4200 \text{ min} = 70.0 \text{ hrs}}$$
Token Savings from Digests
Step 1 (index 1, has digest, 8000 completion tokens):
$$\Delta_{tokens} = 8000 \times 0.70 = 5600$$ $$\Delta_{downstream} = 5600 \times (5 - 1 - 1) = 5600 \times 3 = 16{,}800$$
Step 2 (index 2, has digest, 5000 completion tokens):
$$\Delta_{tokens} = 5000 \times 0.70 = 3500$$ $$\Delta_{downstream} = 3500 \times (5 - 2 - 1) = 3500 \times 2 = 7{,}000$$
Step 4 (index 4, has digest, 10000 completion tokens):
$$\Delta_{tokens} = 10000 \times 0.70 = 7000$$ $$\Delta_{downstream} = 7000 \times (5 - 4 - 1) = 7000 \times 0 = 0 \text{ (last step, no downstream)}$$
Total downstream tokens saved: $16{,}800 + 7{,}000 + 0 = 23{,}800$ tokens
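The three digest calculations follow directly from the snippet in section 7.4. This sketch wraps that logic in a standalone helper (`downstreamTokensSaved` is an illustrative name) and sums the worked example:

```typescript
const DIGEST_COMPRESSION = 0.30; // digest retains 30% of completion tokens

// Downstream tokens saved by a digested step at `index`
// in a workflow of `totalSteps` steps.
function downstreamTokensSaved(completion: number, index: number, totalSteps: number): number {
  const digestTokens = Math.round(completion * DIGEST_COMPRESSION);
  const tokensSaved = completion - digestTokens;
  const downstreamSteps = Math.max(0, totalSteps - index - 1);
  return tokensSaved * downstreamSteps;
}

// Steps 1, 2, and 4 of the worked example:
const totalSaved =
  downstreamTokensSaved(8000, 1, 5) +  // 5,600 * 3 = 16,800
  downstreamTokensSaved(5000, 2, 5) +  // 3,500 * 2 = 7,000
  downstreamTokensSaved(10000, 4, 5);  // last step, 0 downstream
// → 23,800 tokens
```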
