Neural Constellation Board

Technical Documentation — Architecture, Algorithms, and Scientific Foundations

Version: 1.0 · Date: 2026-03-10 · Classification: Internal Technical Reference


1. Neural Constellation Board

The Neural Constellation Board is a three-dimensional interactive knowledge visualization system that projects high-dimensional workflow insight embeddings into navigable 3D space. It transforms abstract semantic relationships between learned patterns — extracted from AI-powered code review workflows — into a spatially coherent particle system with geometric clustering, pairing arcs, and real-time filtering.

1.1 System Architecture

graph TB
    subgraph Database["PostgreSQL + pgvector"]
        WI[workflow_insights table<br/>embedding vector 1536]
    end
    subgraph Backend["Node.js Server"]
        TRPC["tRPC Procedure<br/>memory.constellation"]
        UMAP["UMAP Projection Service<br/>umap-projection.ts"]
        CLUSTER["Proximity Clustering<br/>buildClusters()"]
    end
    subgraph Frontend["React Client"]
        CV["ConstellationView<br/>Canvas + State"]
        PF["ParticleField<br/>InstancedMesh"]
        CF["ConstellationFilaments<br/>LineSegments"]
        PA["PairedArcs<br/>Bezier Curves"]
        HUD["ConstellationHUD<br/>2D Overlay"]
        DP["DetailPanel<br/>Selection Info"]
    end
    subgraph ThreeJS["Three.js / WebGL"]
        BLOOM["Bloom Post-Processing"]
        ENV["Environment Night Preset"]
        ORBIT["OrbitControls"]
    end
    WI -->|"Raw SQL + pgvector"| TRPC
    TRPC -->|"ConstellationInput[]"| UMAP
    UMAP -->|"3D coordinates"| CLUSTER
    CLUSTER -->|"ConstellationData"| CV
    CV --> PF
    CV --> CF
    CV --> PA
    CV --> HUD
    CV --> DP
    PF --> ThreeJS
    CF --> ThreeJS
    PA --> ThreeJS

Data Pipeline Summary:

flowchart LR
    A["1536-dim<br/>Embeddings"] -->|"UMAP fit()"| B["3D<br/>Coordinates"]
    B -->|"Min-Max<br/>Normalization"| C["Scene Space<br/>[-50, 50]³"]
    C -->|"Euclidean<br/>Distance < 20"| D["Clusters +<br/>Centroids"]
    D -->|"InstancedMesh<br/>+ LineSegments"| E["WebGL<br/>Render"]

1.2 Dimensionality Reduction: UMAP Theory

The Constellation Board employs Uniform Manifold Approximation and Projection (UMAP) [1] for dimensionality reduction from the 1536-dimensional embedding space to 3D Euclidean coordinates. UMAP is chosen over alternatives (t-SNE, PCA) because it preserves both local and global topological structure well, runs efficiently on moderate dataset sizes, and produces reproducible projections when seeded.

1.2.1 Mathematical Foundation

UMAP constructs a weighted k-nearest-neighbor graph in high-dimensional space and optimizes a low-dimensional representation that preserves the topological structure of this graph.
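
Concretely, the weight UMAP assigns to the edge between points $x_i$ and $x_j$ in this graph is, per [1]:

$$w_{ij} = \exp\left(-\frac{\max\bigl(0,\, d(x_i, x_j) - \rho_i\bigr)}{\sigma_i}\right)$$

where $d$ is the ambient distance, $\rho_i$ is the distance from $x_i$ to its nearest neighbor (guaranteeing local connectivity), and $\sigma_i$ is a per-point bandwidth calibrated so that the effective neighborhood size matches nNeighbors. The 3D layout is then optimized by stochastic gradient descent to approximate these fuzzy memberships.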

1.2.2 Configuration Parameters

| Parameter | Value | Rationale |
| --- | --- | --- |
| nComponents | 3 | Target dimensionality for 3D visualization |
| nNeighbors | $\min(15, \lfloor N/2 \rfloor)$ | Dynamic cap prevents over-smoothing on small datasets |
| minDist | 0.1 | Tight local packing preserves cluster compactness |
| spread | 1.0 | Standard inter-cluster repulsion |

The dynamic nNeighbors is critical: for datasets with $N < 30$ insights, using a fixed $k = 15$ would over-connect the graph, collapsing distinct semantic clusters. The $\lfloor N/2 \rfloor$ cap ensures the neighborhood size scales proportionally to dataset size.

Implementation (src/server/services/umap-projection.ts):

import { UMAP } from 'umap-js';  // assumed package; its API matches the config below

const umap = new UMAP({
  nComponents: 3,
  nNeighbors: Math.min(15, Math.floor(insights.length / 2)),
  minDist: 0.1,
  spread: 1.0,
});
const projected = umap.fit(embeddings);  // number[][] → [x, y, z][]

1.2.3 Comparison with Alternative Methods

| Method | Local Structure | Global Structure | Speed | Determinism |
| --- | --- | --- | --- | --- |
| UMAP | Excellent | Good | $O(n^{1.14})$ | High |
| t-SNE | Excellent | Poor | $O(n^2)$ | Low |
| PCA | Poor | Excellent | $O(nd^2)$ | Deterministic |
| Isomap | Good | Good | $O(n^3)$ | Deterministic |

1.3 Embedding Pipeline

Workflow insights are generated by the review step pipeline of nyxCore's workflow engine. Each insight carries a 1536-dimensional embedding vector produced by OpenAI's text-embedding-3-small model, stored via the pgvector extension.

Database Query (raw SQL via Prisma):

SELECT id, title, category, severity, "insightType",
       "projectId", "pairedInsightId", tags,
       insight_scope as "insightScope",
       embedding::text
FROM workflow_insights
WHERE "tenantId" = $1::uuid
  AND embedding IS NOT NULL
  AND ($2::uuid IS NULL OR "projectId" = $2::uuid)
ORDER BY "createdAt" DESC
LIMIT 500

The embedding::text cast converts pgvector's binary format to a parseable string [0.123, -0.456, ...]. The 500-record limit bounds UMAP computation time while capturing the most recent insights.
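
The text representation must be parsed back into a numeric array before projection. A minimal sketch (the helper name `parseVector` is illustrative, not from the codebase):

```typescript
// Parse pgvector's text form "[0.123,-0.456,...]" into number[].
// Hypothetical helper; the actual service may parse differently.
function parseVector(text: string): number[] {
  const inner = text.trim().replace(/^\[/, '').replace(/\]$/, '');
  if (inner.length === 0) return [];
  return inner.split(',').map((part) => {
    const n = Number(part);
    if (!Number.isFinite(n)) throw new Error(`Invalid vector component: ${part}`);
    return n;
  });
}
```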

1.4 Coordinate Normalization

Raw UMAP output coordinates are normalized to a $[-50, 50]^3$ cube centered at the origin to ensure consistent camera framing and prevent WebGL clipping.

For each axis $a \in \{x, y, z\}$:

$$\hat{c}_i^{\,a} = \left(\frac{c_i^a - c_{\min}^a}{R^a} - 0.5\right) \times 100, \qquad R^a = c_{\max}^a - c_{\min}^a$$

This maps the full range of each axis to $[-50, 50]$, preserving relative distances within the UMAP projection.
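
The mapping above can be sketched in TypeScript (the function name `normalizeToScene` is illustrative; a flat axis is guarded with a unit range as an assumption):

```typescript
type Vec3 = [number, number, number];

// Map raw UMAP output onto the [-50, 50]^3 scene cube, axis by axis.
function normalizeToScene(points: Vec3[]): Vec3[] {
  const axis = (a: number) => points.map((p) => p[a]);
  const min = [0, 1, 2].map((a) => Math.min(...axis(a)));
  const max = [0, 1, 2].map((a) => Math.max(...axis(a)));
  const scale = (v: number, a: number) => {
    const range = max[a] - min[a] || 1; // guard a degenerate (flat) axis
    return ((v - min[a]) / range - 0.5) * 100;
  };
  return points.map((p): Vec3 => [scale(p[0], 0), scale(p[1], 1), scale(p[2], 2)]);
}
```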

1.5 Proximity Clustering

After normalization, a greedy proximity-based clustering algorithm groups spatially adjacent points into semantic clusters.

Distance Metric: Euclidean (L2 norm) in normalized scene space with threshold $\tau = 20$ units.

Centroid Computation: Arithmetic mean of member positions.

Label Assignment: Modal category (most frequent category among members).
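
A greedy pass consistent with the three rules above might look as follows; the `Point`/`Cluster` shapes are assumptions, not the actual buildClusters() signature:

```typescript
type Vec3 = [number, number, number];
interface Point { id: string; category: string; pos: Vec3; }
interface Cluster { members: Point[]; centroid: Vec3; label: string; }

const THRESHOLD = 20; // cluster-joining distance in normalized scene units

const dist = (a: Vec3, b: Vec3) =>
  Math.hypot(a[0] - b[0], a[1] - b[1], a[2] - b[2]);

// Greedy proximity clustering: each point joins the first cluster whose
// centroid lies within THRESHOLD, otherwise it seeds a new cluster.
function buildClusters(points: Point[]): Cluster[] {
  const clusters: Cluster[] = [];
  for (const p of points) {
    const home = clusters.find((c) => dist(c.centroid, p.pos) < THRESHOLD);
    if (!home) {
      clusters.push({
        members: [p],
        centroid: [p.pos[0], p.pos[1], p.pos[2]],
        label: p.category,
      });
      continue;
    }
    home.members.push(p);
    // Centroid: arithmetic mean of member positions.
    const n = home.members.length;
    home.centroid = [0, 1, 2].map(
      (a) => home.members.reduce((s, m) => s + m.pos[a], 0) / n
    ) as Vec3;
    // Label: modal (most frequent) category among members.
    const counts = new Map<string, number>();
    for (const m of home.members) {
      counts.set(m.category, (counts.get(m.category) ?? 0) + 1);
    }
    home.label = [...counts.entries()].sort((x, y) => y[1] - x[1])[0][0];
  }
  return clusters;
}
```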

1.6 Visual Encoding System

The constellation encodes five semantic dimensions into visual properties:

Category → Color

| Category | Hex | Perceptual Intent |
| --- | --- | --- |
| Security | #ff5c5c | Danger (warm red) |
| Architecture | #60a5fa | System (cool blue) |
| Performance | #fbbf24 | Optimization (amber) |
| UX | #4ade80 | User (green) |
| Code Quality | #c084fc | Quality (purple) |
| Testing | #22d3ee | Verification (cyan) |
| Operations | #fb923c | Execution (orange) |
| Strategy | #a78bfa | Planning (violet) |

Severity → Particle Size

| Severity | Multiplier | Effective Radius |
| --- | --- | --- |
| Blocker | 3.0 | 2.40 units |
| High | 2.0 | 1.60 units |
| Medium | 1.0 | 0.80 units |
| Low | 0.7 | 0.56 units |
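
The effective radii follow from a base radius of 0.8 scene units times the severity multiplier; the base value is inferred from the table, and the constant and function names below are illustrative:

```typescript
const BASE_RADIUS = 0.8; // scene units, inferred from the table above

const SEVERITY_SCALE: Record<string, number> = {
  blocker: 3.0,
  high: 2.0,
  medium: 1.0,
  low: 0.7,
};

// Effective particle radius for a given severity; unknown
// severities fall back to the medium multiplier (an assumption).
const radiusFor = (severity: string): number =>
  BASE_RADIUS * (SEVERITY_SCALE[severity] ?? 1.0);
```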

Insight Type → Pairing Arcs

Pain-solution relationships are encoded as quadratic Bezier curves connecting paired insights. Arc height is proportional to chord length ($h = 0.3 \cdot d(P_0, P_2)$), ensuring visual legibility at all distances.
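
The arc geometry reduces to placing a control point at the chord midpoint, lifted by $h = 0.3 \cdot d(P_0, P_2)$; the lift direction (+Y) and helper names below are assumptions for illustration:

```typescript
type Vec3 = [number, number, number];

const chord = (p0: Vec3, p2: Vec3) =>
  Math.hypot(p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2]);

// Control point: chord midpoint lifted along +Y by h = 0.3 × chord length.
function arcControlPoint(p0: Vec3, p2: Vec3): Vec3 {
  const h = 0.3 * chord(p0, p2);
  return [(p0[0] + p2[0]) / 2, (p0[1] + p2[1]) / 2 + h, (p0[2] + p2[2]) / 2];
}

// Quadratic Bezier: B(t) = (1-t)^2 P0 + 2(1-t)t P1 + t^2 P2.
function quadraticBezier(p0: Vec3, p1: Vec3, p2: Vec3, t: number): Vec3 {
  const u = 1 - t;
  return [
    u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0],
    u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1],
    u * u * p0[2] + 2 * u * t * p1[2] + t * t * p2[2],
  ];
}
```

Sampling `quadraticBezier` at evenly spaced `t` values yields the polyline vertices fed to the renderer.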

1.7 Rendering Pipeline: React Three Fiber

The rendering layer uses React Three Fiber (R3F) with Three.js for hardware-accelerated WebGL rendering.

InstancedMesh Optimization: All particles are batched into a single draw call with per-instance transform matrices and colors. At 500 instances, this requires only 38 KB GPU buffer memory.

Material System: meshPhysicalMaterial with clearcoat (1.0), ultra-low roughness (0.08), and high environment map intensity (2.5) creates a liquid-glass aesthetic for particles.

Post-Processing: Bloom post-processing adds glow around highlighted particles (threshold: 0.4, intensity: 1.2). Disabled on devices with navigator.hardwareConcurrency < 4 to maintain 60fps.

1.8 Interaction Model

  • Hover: Pointer over particle brightens it (160%) with a continuous pulse animation
  • Click: Selects particle, opens DetailPanel with full insight text
  • Background click: Deselects
  • Filter: Type/severity/category filters dim non-matching particles to 10% opacity
  • OrbitControls: Auto-rotates at 0.25 rad/s when idle; user drag interrupts rotation
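
The dimming rule from the Filter bullet reduces to a per-particle opacity function; this is a sketch with assumed names and shapes, not the component's actual code:

```typescript
interface Filters { type?: string; severity?: string; category?: string; }
interface Insight { insightType: string; severity: string; category: string; }

// A particle passing every active filter renders fully opaque;
// any mismatch dims it to 10% opacity.
function particleOpacity(insight: Insight, filters: Filters): number {
  const matches =
    (!filters.type || insight.insightType === filters.type) &&
    (!filters.severity || insight.severity === filters.severity) &&
    (!filters.category || insight.category === filters.category);
  return matches ? 1.0 : 0.1;
}
```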

1.9 Performance Optimization

  • Instance batching: Single draw call regardless of particle count
  • Lazy UMAP: Projection runs once on the server, not per-client
  • 500-record cap: Bounds UMAP computation to <200ms for typical datasets
  • Frame-rate scaling: Bloom disabled on low-end hardware
  • Backface culling: Enabled by default on sphere geometry

References

  1. McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv:1802.03426.
  2. Bertin, J. (1983). Semiology of Graphics. University of Wisconsin Press.
  3. Ware, C. (2012). Information Visualization: Perception for Design (3rd ed.). Morgan Kaufmann.