Project Sync


Clicking "Sync" in a nyxCore project triggers a 9-phase pipeline. It doesn't just import files — it runs the entire intelligence extraction stack and computes new knowledge from what it finds.

The 9 Phases

Prepare → Scan → Import → Finalize → Code Analysis → Docs → Consolidation → Axiom → Embeddings

Phase 1 — Prepare: Clone the repository if it's new, or pull the latest commits on the active branch. Validates the GitHub token and resolves the target branch.

Phase 2 — Scan: Diff against the last sync to identify new, updated, and deleted files. Only changed files proceed to import — this keeps sync times proportional to change volume, not total repository size.
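The scan step can be pictured as a comparison of two path-to-hash indexes. This is a minimal sketch, not nyxCore's actual implementation: the `FileIndex` and `ScanResult` shapes and the `diffIndexes` helper are hypothetical names introduced for illustration, and the hashes could be anything stable, such as git blob SHAs.

```typescript
// Hypothetical shape: file path -> content hash recorded at sync time.
type FileIndex = Map<string, string>;

interface ScanResult {
  added: string[];
  updated: string[];
  deleted: string[];
}

// Compare the index from the last sync with the index of the current checkout.
function diffIndexes(previous: FileIndex, current: FileIndex): ScanResult {
  const result: ScanResult = { added: [], updated: [], deleted: [] };

  // Paths in the current checkout are either new, changed, or untouched.
  current.forEach((hash, path) => {
    const oldHash = previous.get(path);
    if (oldHash === undefined) {
      result.added.push(path);
    } else if (oldHash !== hash) {
      result.updated.push(path);
    }
  });

  // Paths that existed before but are gone now were deleted.
  previous.forEach((_hash, path) => {
    if (!current.has(path)) {
      result.deleted.push(path);
    }
  });

  return result;
}

const previousIndex: FileIndex = new Map([
  ["a.ts", "h1"],
  ["b.ts", "h2"],
  ["c.ts", "h3"],
]);
const currentIndex: FileIndex = new Map([
  ["a.ts", "h1"],
  ["b.ts", "h2-changed"],
  ["d.ts", "h4"],
]);

const scan = diffIndexes(previousIndex, currentIndex);
console.log(scan);
```

Only the paths in `added` and `updated` would move on to import. The unchanged `a.ts` is skipped, which is what keeps the cost tied to change volume.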

Phase 3 — Import: Write new and updated files to RepositoryFile records. Import .memory/letter_*.md files as MemoryEntry records, creating new ones and updating existing ones by sourceRef. Files are processed in batches of 50.
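As a rough illustration of the two mechanics described here, the sketch below splits changed files into batches of 50 and upserts memory entries keyed by `sourceRef`. The `MemoryEntry` shape, the in-memory `Map` standing in for the database, and both helper names are assumptions. Only the batch size and the `sourceRef` key come from the description above.

```typescript
const BATCH_SIZE = 50; // batch size stated in the Import phase description

// Split a list into consecutive batches of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("batch size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical record shape; only `sourceRef` is named in the documentation.
interface MemoryEntry {
  sourceRef: string; // illustrative value: ".memory/letter_001.md"
  content: string;
}

// Insert new entries and overwrite existing ones, matching on sourceRef.
function upsertBySourceRef(
  store: Map<string, MemoryEntry>,
  incoming: MemoryEntry[],
): { created: number; updated: number } {
  let created = 0;
  let updated = 0;
  for (const entry of incoming) {
    if (store.has(entry.sourceRef)) {
      updated += 1;
    } else {
      created += 1;
    }
    store.set(entry.sourceRef, entry);
  }
  return { created, updated };
}

const changedPaths = Array.from({ length: 120 }, (_, i) => `src/file_${i}.ts`);
const batchSizes = toBatches(changedPaths, BATCH_SIZE).map((b) => b.length);
console.log(batchSizes); // [50, 50, 20]

const memoryStore = new Map<string, MemoryEntry>([
  [".memory/letter_001.md", { sourceRef: ".memory/letter_001.md", content: "old" }],
]);
const memoryResult = upsertBySourceRef(memoryStore, [
  { sourceRef: ".memory/letter_001.md", content: "revised" },
  { sourceRef: ".memory/letter_002.md", content: "new" },
]);
console.log(memoryResult); // { created: 1, updated: 1 }
```

The `created` and `updated` counts correspond in spirit to the `memoryNew` and `memoryUpdated` fields reported in the sync statistics.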

Phase 4 — Finalize: Update repository stats (total files, last synced timestamp). Clean up records for deleted files.

Phase 5 — Code Analysis: Run LLM-powered pattern detection across the codebase. Detects architecture patterns, naming conventions, security practices, test coverage approaches, dependency usage, and anti-patterns. Results are stored as CodePattern records on the repository.

Phase 6 — Docs Generation: Auto-generate documentation from detected code patterns. Produces README drafts, API documentation, architecture overviews, and coding standards documents. Each is stored as a GeneratedDoc record.

Phase 7 — Consolidation: Extract cross-project patterns from all MemoryEntry records associated with the project. The consolidation pipeline runs content budgeting (max 500K chars total, 15K per entry) then LLM extraction to produce structured ConsolidationPattern records.
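Content budgeting can be sketched as follows. The two limits come from the description above, read here as 500,000 and 15,000 characters. The strategy is an assumption: this sketch truncates each entry to the per-entry cap and takes entries in order until the total budget is spent, whereas the real pipeline may rank or sample entries differently. `applyContentBudget` is a hypothetical name.

```typescript
const MAX_TOTAL_CHARS = 500000; // "max 500K chars total", read as 500,000
const MAX_ENTRY_CHARS = 15000; // "15K per entry", read as 15,000

// Assumed strategy: clip each entry, then fill the total budget in input order.
function applyContentBudget(
  entries: string[],
  maxTotal: number = MAX_TOTAL_CHARS,
  maxEntry: number = MAX_ENTRY_CHARS,
): string[] {
  const kept: string[] = [];
  let remaining = maxTotal;
  for (const entry of entries) {
    if (remaining <= 0) break; // total budget exhausted
    const clipped = entry.slice(0, Math.min(maxEntry, remaining));
    kept.push(clipped);
    remaining -= clipped.length;
  }
  return kept;
}

const budgeted = applyContentBudget(["a".repeat(20000), "b".repeat(5000)]);
console.log(budgeted.map((e) => e.length)); // [15000, 5000]
```

The budgeted entries are what would then be handed to the LLM extraction step.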

Phase 8 — Axiom RAG: Chunk and process generated documentation into the Axiom knowledge base. Each document is split into ~500-token chunks, embedded via text-embedding-3-small, and stored in pgvector with an HNSW index for hybrid search retrieval.
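The chunking step might look roughly like this. Note the loud assumption: whitespace-separated words stand in for tokens here, whereas a real pipeline would count tokens with the embedding model's tokenizer and might overlap chunks. `chunkByApproxTokens` is a hypothetical helper, and the embedding and storage steps are left out.

```typescript
// Assumption: one whitespace-separated word is treated as one token.
function chunkByApproxTokens(text: string, maxTokens: number = 500): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxTokens) {
    chunks.push(words.slice(i, i + maxTokens).join(" "));
  }
  return chunks;
}

// A synthetic 1,200-word document splits into chunks of 500, 500, and 200 words.
const sampleDoc = Array.from({ length: 1200 }, (_, i) => `word${i}`).join(" ");
const chunks = chunkByApproxTokens(sampleDoc);
console.log(chunks.length); // 3
```

Each resulting chunk would then be embedded and written to the vector store as its own row.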

Phase 9 — Embeddings: Generate vector embeddings for all new WorkflowInsight records created during this sync cycle. This makes new insights immediately available for semantic search and future memory injection.

Non-Fatal Design

Phases 5–9 are non-fatal. If one of them fails (for example, code analysis hitting a missing LLM key, an API error, or a malformed repository), the pipeline logs a warning and continues to the next phase. Files imported in phases 1–4 stay imported even if the intelligence layers fail.

try {
  executePhase();
} catch (error) {
  yield { type: 'warning', message: '[WARN] Phase N failed: ...' };
}
// continue to next phase

This means an intelligence failure never leaves your repository in a partial state: the core phases (1–4) still complete, and the sync reports exactly which intelligence phases failed.
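To make the snippet above concrete, here is a small runnable sketch of the same pattern. The `Phase` and `SyncEvent` types and the `runNonFatalPhases` name are hypothetical, and the phases are synchronous only to keep the example short. A real pipeline would most likely use async phases.

```typescript
// Hypothetical types: the real phase and event shapes are not documented here.
interface Phase {
  name: string;
  run: () => void; // throws on failure
}

type SyncEvent =
  | { type: "progress"; message: string }
  | { type: "warning"; message: string };

// Run each non-fatal phase in order. A failure becomes a warning event
// and the loop moves on to the next phase instead of aborting.
function* runNonFatalPhases(phases: Phase[]): Generator<SyncEvent> {
  for (const phase of phases) {
    try {
      phase.run();
      yield { type: "progress", message: `${phase.name} complete` };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      yield { type: "warning", message: `[WARN] ${phase.name} failed: ${reason}` };
    }
  }
}

const events = Array.from(
  runNonFatalPhases([
    {
      name: "Code Analysis",
      run: () => {
        throw new Error("missing LLM key");
      },
    },
    { name: "Docs", run: () => {} },
  ]),
);
console.log(events);
```

In this run, the first phase produces a warning and the second still executes, which is the behavior the non-fatal design is meant to guarantee.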

Sync Statistics

After each sync, SyncStats records the outcome of every phase:

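interface SyncStats {
  totalFiles: number;            // repository total after the sync
  filesNew: number;              // file changes found since the last sync
  filesUpdated: number;
  filesDeleted: number;
  memoryNew: number;             // MemoryEntry records from .memory/letter_*.md
  memoryUpdated: number;
  patternsFound: number;         // Phase 5: CodePattern records
  docsGenerated: number;         // Phase 6: GeneratedDoc records
  consolidationPatterns: number; // Phase 7: ConsolidationPattern records
  axiomDocsProcessed: number;    // Phase 8: documents processed into Axiom
  embeddingsGenerated: number;   // Phase 9: WorkflowInsight embeddings
}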

These stats are displayed in the sync progress view and stored for historical comparison.

Typical Duration

For a repository with 500 files and 50 memory entries:

| Phase | Typical Duration | Parallelism |
|---|---|---|
| Prepare | 2–5s | Sequential |
| Scan | 1–3s | Sequential |
| Import | 5–15s | Batch (50 files) |
| Finalize | <1s | Sequential |
| Code Analysis | 10–30s | Per-file |
| Docs | 5–15s | Sequential |
| Consolidation | 3–10s | Sequential |
| Axiom | 5–20s | Per-document |
| Embeddings | 2–8s | Batch |

Full sync: 33–107 seconds. Incremental sync (few changed files): 5–20 seconds.
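The full-sync range is simply the sum of the per-phase ranges in the table above. A short check, treating "<1s" as a 0–1 second range (an assumption):

```typescript
// [phase, min seconds, max seconds], copied from the duration table.
// "<1s" for Finalize is treated as 0 to 1 second.
const phaseDurations: Array<[string, number, number]> = [
  ["Prepare", 2, 5],
  ["Scan", 1, 3],
  ["Import", 5, 15],
  ["Finalize", 0, 1],
  ["Code Analysis", 10, 30],
  ["Docs", 5, 15],
  ["Consolidation", 3, 10],
  ["Axiom", 5, 20],
  ["Embeddings", 2, 8],
];

let minTotal = 0;
let maxTotal = 0;
for (const [, min, max] of phaseDurations) {
  minTotal += min;
  maxTotal += max;
}

console.log(`${minTotal}-${maxTotal} seconds`); // 33-107 seconds
```

The sum assumes the phases run one after another, even though some of them parallelize their own work internally.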

The Intelligence Flywheel

Each sync doesn't just import files — it compounds knowledge.

Sync
 → Code patterns (architecture, naming, security)
 → Auto-generated docs
 → Memory entries from session checkpoints
 → Consolidated cross-project patterns
 → RAG knowledge base chunks
 → Vector embeddings for semantic search
 → Smarter workflows ({{consolidations}}, {{project.wisdom}}, {{axiom}})
 → Better workflow insights
 → More embeddings
 → Back to smarter workflows

The flywheel effect means each workflow run produces insights, which get embedded, which make the next workflow smarter — all triggered by a sync.

Knowledge Growth Model

The knowledge base grows superlinearly with syncs. Each sync adds direct knowledge $K_d$ (files, memory entries) plus derived knowledge $K_i$ from analysis:

$$K_{total}(s) = \sum_{i=1}^{s} \left( K_d^{(i)} + K_i^{(i)} \right)$$

Where derived knowledge scales with the existing base:

$$K_i^{(s)} = \alpha \cdot K_d^{(s)} + \beta \cdot \log\left(1 + K_{total}(s-1)\right)$$

  • $\alpha$ ≈ 3–5x: each source file yields multiple patterns
  • $\beta$: cross-project patterns compound logarithmically
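The recurrence can be simulated directly. In this sketch the input values are purely illustrative, not measurements. It also assumes a natural logarithm and $K_{total}(0) = 0$, neither of which the model above states explicitly. `simulateKnowledge` is a hypothetical helper.

```typescript
// K_i(s) = alpha * K_d(s) + beta * log(1 + K_total(s - 1)), with K_total(0) = 0.
// Returns K_total after each sync.
function simulateKnowledge(
  directPerSync: number[],
  alpha: number,
  beta: number,
): number[] {
  const totals: number[] = [];
  let total = 0;
  for (const direct of directPerSync) {
    const derived = alpha * direct + beta * Math.log(1 + total);
    total += direct + derived;
    totals.push(total);
  }
  return totals;
}

// Illustrative inputs only: 100 direct items per sync, alpha = 4, beta = 10.
const knowledgeTotals = simulateKnowledge([100, 100, 100], 4, 10);
console.log(knowledgeTotals);
```

With a constant direct input, the first sync adds exactly $(1 + \alpha) K_d$ because $\log(1 + 0) = 0$. Every later sync adds slightly more than the one before as the log term grows, which is the source of the superlinear growth.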

In practice, after $n$ syncs across $p$ projects:

$$\text{Searchable Knowledge} = n \cdot \bar{f} + n \cdot \bar{m} + p \cdot C_{patterns} + D_{generated} + E_{vectors}$$

Where $\bar{f}$ = avg files per sync, $\bar{m}$ = avg memory entries, and the remaining terms represent consolidation patterns, generated docs, and embedded insights.

Branch Management

Each repository tracks:

  • activeBranch — the branch currently selected for sync
  • lastSyncedBranch — the branch used in the last completed sync

When these differ, the dashboard shows a "Resync required" warning. If you change the active branch without resyncing, your knowledge base still reflects the previously synced branch.

Use the BranchSelector to switch branches, then sync immediately to keep the knowledge base current.
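The warning condition boils down to a comparison of the two fields. A minimal sketch, in which the `RepositoryBranchState` type and the `needsResync` helper are hypothetical, and `null` is assumed to mean the repository has never been synced:

```typescript
// Hypothetical subset of the repository record; field names follow the list above.
interface RepositoryBranchState {
  activeBranch: string;
  lastSyncedBranch: string | null; // assumption: null before the first sync
}

// True when the knowledge base does not reflect the selected branch.
function needsResync(repo: RepositoryBranchState): boolean {
  return repo.activeBranch !== repo.lastSyncedBranch;
}

console.log(needsResync({ activeBranch: "develop", lastSyncedBranch: "main" })); // true
console.log(needsResync({ activeBranch: "main", lastSyncedBranch: "main" })); // false
```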

Manual Trigger vs. Webhook

You can sync manually from the project dashboard, or configure a GitHub webhook to trigger sync automatically on push events to the active branch.

A webhook-triggered sync runs the same 9-phase pipeline. Only the trigger source differs.
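On the receiving side, deciding whether a push should start a sync could look like the sketch below. GitHub's push payload carries the pushed ref as a string such as `refs/heads/main`. The `PushEvent` subset and the `shouldTriggerSync` helper are hypothetical, and signature verification of the webhook request is omitted.

```typescript
// Minimal subset of a GitHub push webhook payload.
interface PushEvent {
  ref: string; // e.g. "refs/heads/main" for a branch, "refs/tags/v1" for a tag
}

// Only pushes to the repository's active branch should start a sync.
function shouldTriggerSync(event: PushEvent, activeBranch: string): boolean {
  return event.ref === `refs/heads/${activeBranch}`;
}

console.log(shouldTriggerSync({ ref: "refs/heads/main" }, "main")); // true
console.log(shouldTriggerSync({ ref: "refs/heads/feature-x" }, "main")); // false
console.log(shouldTriggerSync({ ref: "refs/tags/main" }, "main")); // false
```

Comparing against the full `refs/heads/` prefix avoids treating a tag that shares the branch's name as a branch push.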