Neural Constellation Board
Technical Documentation — Architecture, Algorithms, and Scientific Foundations
Version: 1.0 | Date: 2026-03-10 | Classification: Internal Technical Reference
1. Neural Constellation Board
The Neural Constellation Board is a three-dimensional interactive knowledge visualization system that projects high-dimensional workflow insight embeddings into navigable 3D space. It transforms abstract semantic relationships between learned patterns — extracted from AI-powered code review workflows — into a spatially coherent particle system with geometric clustering, pairing arcs, and real-time filtering.
1.1 System Architecture
Data Pipeline Summary: embedding retrieval from pgvector (1.3) → UMAP projection to 3D (1.2) → coordinate normalization (1.4) → proximity clustering (1.5) → visual encoding (1.6) → R3F rendering and interaction (1.7–1.8).
1.2 Dimensionality Reduction: UMAP Theory
The Constellation Board employs Uniform Manifold Approximation and Projection (UMAP) [1] for dimensionality reduction from the 1536-dimensional embedding space to 3D Euclidean coordinates. UMAP is chosen over alternatives (t-SNE, PCA) for its superior preservation of both local and global topological structure, its computational efficiency on moderate datasets, and its reproducible output when run with a fixed random seed.
1.2.1 Mathematical Foundation
UMAP constructs a weighted k-nearest-neighbor graph in high-dimensional space and optimizes a low-dimensional representation that preserves the topological structure of this graph.
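Concretely, following McInnes et al. [1], each directed edge in the k-NN graph receives a fuzzy membership strength

$$v_{j|i} = \exp\!\left(-\frac{\max\bigl(0,\; d(x_i, x_j) - \rho_i\bigr)}{\sigma_i}\right),$$

where $\rho_i$ is the distance from $x_i$ to its nearest neighbor and $\sigma_i$ is a per-point bandwidth; symmetrization gives $v_{ij} = v_{j|i} + v_{i|j} - v_{j|i}\, v_{i|j}$. In the low-dimensional space, similarity is modeled as $w_{ij} = \bigl(1 + a\,\lVert y_i - y_j \rVert^{2b}\bigr)^{-1}$, with $a, b$ fitted from minDist and spread, and the embedding is optimized by stochastic gradient descent on the fuzzy-set cross-entropy

$$C = \sum_{i \neq j} \left[ v_{ij} \log\frac{v_{ij}}{w_{ij}} + (1 - v_{ij}) \log\frac{1 - v_{ij}}{1 - w_{ij}} \right].$$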
1.2.2 Configuration Parameters
| Parameter | Value | Rationale |
|---|---|---|
| nComponents | 3 | Target dimensionality for 3D visualization |
| nNeighbors | $\min(15, \lfloor N/2 \rfloor)$ | Dynamic cap prevents over-smoothing on small datasets |
| minDist | 0.1 | Tight local packing preserves cluster compactness |
| spread | 1.0 | Standard inter-cluster repulsion |
The dynamic nNeighbors is critical: for datasets with $N < 30$ insights, using a fixed $k = 15$ would over-connect the graph, collapsing distinct semantic clusters. The $\lfloor N/2 \rfloor$ cap ensures the neighborhood size scales proportionally to dataset size.
Implementation (src/server/services/umap-projection.ts):

```ts
import { UMAP } from 'umap-js';

const umap = new UMAP({
  nComponents: 3,
  nNeighbors: Math.min(15, Math.floor(insights.length / 2)),
  minDist: 0.1,
  spread: 1.0,
});
const projected = umap.fit(embeddings); // number[][] → [x, y, z][]
```
1.2.3 Comparison with Alternative Methods
| Method | Local Structure | Global Structure | Speed | Determinism |
|---|---|---|---|---|
| UMAP | Excellent | Good | $O(n^{1.14})$ | High |
| t-SNE | Excellent | Poor | $O(n^2)$ | Low |
| PCA | Poor | Excellent | $O(nd^2)$ | Deterministic |
| Isomap | Good | Good | $O(n^3)$ | Deterministic |
1.3 Embedding Pipeline
Workflow insights are generated by the review step pipeline of nyxCore's workflow engine. Each insight carries a 1536-dimensional embedding vector produced by OpenAI's text-embedding-3-small model, stored via the pgvector extension.
Database Query (raw SQL via Prisma):

```sql
SELECT id, title, category, severity, "insightType",
       "projectId", "pairedInsightId", tags,
       insight_scope AS "insightScope",
       embedding::text
FROM workflow_insights
WHERE "tenantId" = $1::uuid
  AND embedding IS NOT NULL
  AND ($2::uuid IS NULL OR "projectId" = $2::uuid)
ORDER BY "createdAt" DESC
LIMIT 500
```
The embedding::text cast converts pgvector's binary format to a parseable string [0.123, -0.456, ...]. The 500-record limit bounds UMAP computation time while capturing the most recent insights.
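The cast column can then be parsed server-side before projection. A minimal sketch — the parseVector helper name is illustrative, not from the codebase:

```ts
// Parse pgvector's text representation ("[0.123,-0.456,...]") into a number array.
// The text form is valid JSON, so JSON.parse suffices; shape is validated afterwards.
function parseVector(text: string, expectedDim = 1536): number[] {
  const vec = JSON.parse(text) as number[];
  if (!Array.isArray(vec) || vec.length !== expectedDim || vec.some((v) => typeof v !== 'number')) {
    throw new Error(`Expected ${expectedDim}-dimensional embedding, got ${vec?.length}`);
  }
  return vec;
}
```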
1.4 Coordinate Normalization
Raw UMAP output coordinates are normalized to a $[-50, 50]^3$ cube centered at the origin to ensure consistent camera framing and prevent WebGL clipping.
For each axis $a \in \{x, y, z\}$, let $R^a = c_{\max}^a - c_{\min}^a$ denote the axis range. Then

$$\hat{c}_i^a = \left(\frac{c_i^a - c_{\min}^a}{R^a} - 0.5\right) \times 100$$
This maps the full range of each axis to $[-50, 50]$, preserving relative distances within the UMAP projection.
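The normalization above can be sketched as a pure function (a sketch under the assumption that points arrive as [x, y, z] triples; degenerate axes with zero range map to 0 to avoid division by zero):

```ts
// Normalize raw UMAP coordinates to the [-50, 50]^3 cube described above.
function normalizeToCube(points: number[][]): number[][] {
  const dims = 3;
  const mins = Array(dims).fill(Infinity);
  const maxs = Array(dims).fill(-Infinity);
  for (const p of points) {
    for (let a = 0; a < dims; a++) {
      mins[a] = Math.min(mins[a], p[a]);
      maxs[a] = Math.max(maxs[a], p[a]);
    }
  }
  return points.map((p) =>
    p.map((c, a) => {
      const range = maxs[a] - mins[a]; // R^a
      return range === 0 ? 0 : ((c - mins[a]) / range - 0.5) * 100;
    }),
  );
}
```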
1.5 Proximity Clustering
After normalization, a greedy proximity-based clustering algorithm groups spatially adjacent points into semantic clusters.
Distance Metric: Euclidean (L2 norm) in normalized scene space with threshold $\tau = 20$ units.
Centroid Computation: Arithmetic mean of member positions.
Label Assignment: Modal category (most frequent category among members).
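The three rules above combine into a short greedy pass; a sketch (the Point3/Cluster types and function names are illustrative, not the codebase's):

```ts
type Point3 = { pos: [number, number, number]; category: string };
type Cluster = { members: Point3[]; centroid: [number, number, number]; label: string };

// Euclidean (L2) distance in normalized scene space.
const dist = (a: [number, number, number], b: [number, number, number]) =>
  Math.hypot(a[0] - b[0], a[1] - b[1], a[2] - b[2]);

function clusterByProximity(points: Point3[], tau = 20): Cluster[] {
  const groups: { members: Point3[] }[] = [];
  for (const p of points) {
    // Greedy: join the first existing cluster whose centroid is within tau units.
    const home = groups.find((g) => dist(centroidOf(g.members), p.pos) <= tau);
    if (home) home.members.push(p);
    else groups.push({ members: [p] });
  }
  return groups.map((g) => ({
    members: g.members,
    centroid: centroidOf(g.members),      // arithmetic mean of member positions
    label: modalCategory(g.members),      // most frequent category among members
  }));
}

function centroidOf(ms: Point3[]): [number, number, number] {
  const s: [number, number, number] = [0, 0, 0];
  for (const m of ms) { s[0] += m.pos[0]; s[1] += m.pos[1]; s[2] += m.pos[2]; }
  return [s[0] / ms.length, s[1] / ms.length, s[2] / ms.length];
}

function modalCategory(ms: Point3[]): string {
  const counts = new Map<string, number>();
  for (const m of ms) counts.set(m.category, (counts.get(m.category) ?? 0) + 1);
  return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
}
```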
1.6 Visual Encoding System
The constellation encodes five semantic dimensions into visual properties:
Category → Color
| Category | Hex | Perceptual Intent |
|---|---|---|
| Security | #ff5c5c | Danger (warm red) |
| Architecture | #60a5fa | System (cool blue) |
| Performance | #fbbf24 | Optimization (amber) |
| UX | #4ade80 | User (green) |
| Code Quality | #c084fc | Quality (purple) |
| Testing | #22d3ee | Verification (cyan) |
| Operations | #fb923c | Execution (orange) |
| Strategy | #a78bfa | Planning (violet) |
Severity → Particle Size
| Severity | Multiplier | Effective Radius |
|---|---|---|
| Blocker | 3.0 | 2.40 units |
| High | 2.0 | 1.60 units |
| Medium | 1.0 | 0.80 units |
| Low | 0.7 | 0.56 units |
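The table implies a base radius of 0.8 units (Medium × 1.0), so the size encoding reduces to a single multiplication; a minimal sketch (the function and constant names are ours, not the codebase's):

```ts
// Severity → particle radius, per the table above: effective radius = 0.8 × multiplier.
const SEVERITY_SCALE: Record<string, number> = {
  blocker: 3.0,
  high: 2.0,
  medium: 1.0,
  low: 0.7,
};
const BASE_RADIUS = 0.8; // units; inferred from Medium's 0.80-unit effective radius

function particleRadius(severity: string): number {
  // Unknown severities fall back to the Medium multiplier.
  return BASE_RADIUS * (SEVERITY_SCALE[severity.toLowerCase()] ?? 1.0);
}
```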
Insight Type → Pairing Arcs
Pain-solution relationships are encoded as quadratic Bézier curves connecting paired insights. Arc height is proportional to chord length ($h = 0.3 \cdot d(P_0, P_2)$), ensuring visual legibility at all distances.
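Under this definition the arc's control point sits at the chord midpoint, lifted by $h$ along the scene's up axis (we assume +Y as "up"; function names are illustrative):

```ts
type Vec3 = [number, number, number];

// Quadratic Bézier: B(t) = (1-t)^2 P0 + 2(1-t)t P1 + t^2 P2.
function quadraticBezier(p0: Vec3, p1: Vec3, p2: Vec3, t: number): Vec3 {
  const u = 1 - t;
  return [0, 1, 2].map(
    (a) => u * u * p0[a] + 2 * u * t * p1[a] + t * t * p2[a],
  ) as Vec3;
}

// Control point for a pairing arc: chord midpoint lifted along +Y by h = 0.3 * chord length.
function arcControlPoint(p0: Vec3, p2: Vec3): Vec3 {
  const d = Math.hypot(p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2]);
  return [(p0[0] + p2[0]) / 2, (p0[1] + p2[1]) / 2 + 0.3 * d, (p0[2] + p2[2]) / 2];
}
```

Sampling quadraticBezier at evenly spaced t values yields the polyline rendered between paired insights.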
1.7 Rendering Pipeline: React Three Fiber
The rendering layer uses React Three Fiber (R3F) with Three.js for hardware-accelerated WebGL rendering.
InstancedMesh Optimization: All particles are batched into a single draw call with per-instance transform matrices and colors. At 500 instances, this requires only 38 KB GPU buffer memory.
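The 38 KB figure is consistent with per-instance storage of a 4 × 4 float32 transform matrix (64 B) plus an RGB float32 color (12 B) — an assumption about what the figure counts, but the arithmetic checks out:

```ts
// Per-instance GPU data for an instanced draw: 4x4 float32 matrix + RGB float32 color.
const MATRIX_BYTES = 16 * 4; // 64 B per instance transform
const COLOR_BYTES = 3 * 4;   // 12 B per instance color
const instances = 500;
const totalKB = (instances * (MATRIX_BYTES + COLOR_BYTES)) / 1000; // 38 KB
```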
Material System: meshPhysicalMaterial with clearcoat (1.0), ultra-low roughness (0.08), and high environment map intensity (2.5) creates a liquid-glass aesthetic for particles.
Post-Processing: Bloom post-processing adds glow around highlighted particles (threshold: 0.4, intensity: 1.2). Disabled on devices with navigator.hardwareConcurrency < 4 to maintain 60fps.
1.8 Interaction Model
- Hover: Pointer over particle brightens it (160%) with a continuous pulse animation
- Click: Selects particle, opens DetailPanel with full insight text
- Background click: Deselects
- Filter: Type/severity/category filters dim non-matching particles to 10% opacity
- OrbitControls: Auto-rotates at 0.25 rad/s when idle; user drag interrupts rotation
1.9 Performance Optimization
- Instance batching: Single draw call regardless of particle count
- Lazy UMAP: Projection runs once on the server, not per-client
- 500-record cap: Bounds UMAP computation to <200ms for typical datasets
- Frame-rate scaling: Bloom disabled on low-end hardware
- Backface culling: Enabled by default on sphere geometry
References
1. McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv:1802.03426.
2. Bertin, J. (1983). Semiology of Graphics. University of Wisconsin Press.
3. Ware, C. (2012). Information Visualization: Perception for Design (3rd ed.). Morgan Kaufmann.
