Project Sync
Clicking "Sync" in a nyxCore project triggers a 9-phase pipeline. It doesn't just import files — it runs the entire intelligence extraction stack and computes new knowledge from what it finds.
The 9 Phases
Prepare → Scan → Import → Finalize → Code Analysis → Docs → Consolidation → Axiom → Embeddings
Phase 1 — Prepare: Clone the repository if it's new, or pull the latest commits on the active branch. Validates the GitHub token and resolves the target branch.
Phase 2 — Scan: Diff against the last sync to identify new, updated, and deleted files. Only changed files proceed to import — this keeps sync times proportional to change volume, not total repository size.
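The scan step in Phase 2 can be pictured as a comparison between two snapshots of the repository. The sketch below is a minimal illustration, not nyxCore's implementation: it assumes each file is identified by its path and a content fingerprint (for example a hash), and the names `FileIndex`, `ScanResult`, and `diffFiles` are hypothetical.

```typescript
// Hypothetical sketch: classify files by comparing two snapshots.
// Keys are file paths; values are content fingerprints (assumed, e.g. a hash).
type FileIndex = Map<string, string>;

interface ScanResult {
  added: string[];
  updated: string[];
  deleted: string[];
}

function diffFiles(previous: FileIndex, current: FileIndex): ScanResult {
  const result: ScanResult = { added: [], updated: [], deleted: [] };

  // Anything in the current snapshot is either new, changed, or untouched.
  for (const [path, fingerprint] of current) {
    const before = previous.get(path);
    if (before === undefined) {
      result.added.push(path);
    } else if (before !== fingerprint) {
      result.updated.push(path);
    }
  }

  // Anything only in the previous snapshot has been deleted.
  for (const path of previous.keys()) {
    if (!current.has(path)) result.deleted.push(path);
  }
  return result;
}

const lastSync: FileIndex = new Map([["a.ts", "h1"], ["b.ts", "h2"], ["c.ts", "h3"]]);
const now: FileIndex = new Map([["a.ts", "h1"], ["b.ts", "h2-changed"], ["d.ts", "h4"]]);
const scan = diffFiles(lastSync, now);
// scan.added = ["d.ts"], scan.updated = ["b.ts"], scan.deleted = ["c.ts"]
```

Unchanged files (`a.ts` here) appear in none of the three lists, which is why the work done by later phases tracks the size of the change rather than the size of the repository.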
Phase 3 — Import: Write new and updated files to RepositoryFile records. Import .memory/letter_*.md files as MemoryEntry records, creating new ones and updating existing ones by sourceRef. Files are processed in batches of 50.
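Two ideas in Phase 3 are easy to show in isolation: splitting the changed files into batches of 50, and upserting memory entries keyed by `sourceRef`. The following is a rough sketch under assumptions: the in-memory `Map` stands in for whatever storage nyxCore actually uses, the `MemoryEntry` shape is reduced to two fields, and `toBatches` and `upsertMemory` are hypothetical helper names.

```typescript
const IMPORT_BATCH_SIZE = 50; // batch size stated in the Phase 3 description

// Hypothetical helper: split a list into fixed-size batches.
function toBatches<T>(items: T[], size: number = IMPORT_BATCH_SIZE): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("batch size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Simplified stand-in for a MemoryEntry record (real records carry more fields).
interface MemoryEntry {
  sourceRef: string;
  content: string;
}

// Hypothetical upsert: create entries with an unseen sourceRef, overwrite the rest.
function upsertMemory(
  store: Map<string, MemoryEntry>,
  incoming: MemoryEntry[]
): { memoryNew: number; memoryUpdated: number } {
  let memoryNew = 0;
  let memoryUpdated = 0;
  for (const entry of incoming) {
    if (store.has(entry.sourceRef)) memoryUpdated++;
    else memoryNew++;
    store.set(entry.sourceRef, entry);
  }
  return { memoryNew, memoryUpdated };
}

// 120 changed files become batches of 50, 50, and 20.
const paths = Array.from({ length: 120 }, (_, i) => `src/file_${i}.ts`);
const batchSizes = toBatches(paths).map((batch) => batch.length); // [50, 50, 20]
```

The two counters returned by `upsertMemory` correspond to the `memoryNew` and `memoryUpdated` fields reported in the sync statistics.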
Phase 4 — Finalize: Update repository stats (total files, last synced timestamp). Clean up records for deleted files.
Phase 5 — Code Analysis: Run LLM-powered pattern detection across the codebase. Detects architecture patterns, naming conventions, security practices, test coverage approaches, dependency usage, and anti-patterns. Results are stored as CodePattern records on the repository.
Phase 6 — Docs Generation: Auto-generate documentation from detected code patterns. Produces README drafts, API documentation, architecture overviews, and coding standards documents. Each is stored as a GeneratedDoc record.
Phase 7 — Consolidation: Extract cross-project patterns from all MemoryEntry records associated with the project. The consolidation pipeline runs content budgeting (max 500K chars total, 15K per entry) then LLM extraction to produce structured ConsolidationPattern records.
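The content budget in Phase 7 can be sketched as a two-level cap. The text states the limits (500K characters in total, 15K per entry) but not what happens to content that exceeds them, so the version below makes an assumption: oversized entries are truncated, and entries are taken in order until the total budget is spent. `applyContentBudget` is a hypothetical name.

```typescript
const MAX_TOTAL_CHARS = 500_000; // total budget from the Phase 7 description
const MAX_ENTRY_CHARS = 15_000;  // per-entry budget from the Phase 7 description

// Hypothetical sketch: clip each entry to the per-entry cap and stop
// once the total budget is used up. Truncation is an assumption here.
function applyContentBudget(
  entries: string[],
  maxTotal: number = MAX_TOTAL_CHARS,
  maxEntry: number = MAX_ENTRY_CHARS
): string[] {
  const kept: string[] = [];
  let remaining = maxTotal;
  for (const entry of entries) {
    if (remaining <= 0) break;
    const clipped = entry.slice(0, Math.min(maxEntry, remaining));
    kept.push(clipped);
    remaining -= clipped.length;
  }
  return kept;
}

// 40 entries of 20,000 characters each: 33 entries are clipped to 15,000
// characters, a 34th gets the last 5,000, and the rest are dropped.
const oversized = Array.from({ length: 40 }, () => "x".repeat(20_000));
const budgeted = applyContentBudget(oversized);
const budgetedTotal = budgeted.reduce((sum, entry) => sum + entry.length, 0);
// budgeted.length = 34, budgetedTotal = 500000
```

Capping before the LLM call keeps the extraction prompt bounded regardless of how many memory entries a project accumulates.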
Phase 8 — Axiom RAG: Chunk and process generated documentation into the Axiom knowledge base. Each document is split into ~500 token chunks, embedded via text-embedding-3-small, and stored with pgvector HNSW indexing for hybrid search retrieval.
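To illustrate the chunking step in Phase 8, here is a deliberately simple splitter. It approximates a token as a whitespace-separated word, which is an assumption made only to keep the example self-contained: a real tokenizer for `text-embedding-3-small` counts tokens differently, and nyxCore's actual chunk boundaries may also respect headings or overlap. `chunkByApproxTokens` is a hypothetical name.

```typescript
const TARGET_CHUNK_TOKENS = 500; // approximate chunk size from the Phase 8 description

// Hypothetical sketch: split text into chunks of at most maxTokens "tokens",
// where a token is approximated by a whitespace-separated word.
function chunkByApproxTokens(
  text: string,
  maxTokens: number = TARGET_CHUNK_TOKENS
): string[] {
  if (!Number.isInteger(maxTokens) || maxTokens <= 0) {
    throw new RangeError("maxTokens must be a positive integer");
  }
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += maxTokens) {
    chunks.push(words.slice(start, start + maxTokens).join(" "));
  }
  return chunks;
}

// A 1,200-word document becomes three chunks of 500, 500, and 200 words.
const sampleDoc = Array.from({ length: 1200 }, (_, i) => `word${i}`).join(" ");
const chunkWordCounts = chunkByApproxTokens(sampleDoc).map(
  (chunk) => chunk.split(" ").length
); // [500, 500, 200]
```

Each resulting chunk would then be embedded and stored as its own row, so retrieval can return the relevant passage rather than a whole document.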
Phase 9 — Embeddings: Generate vector embeddings for all new WorkflowInsight records created during this sync cycle. This makes new insights immediately available for semantic search and future memory injection.
Non-Fatal Design
Phases 5–9 are non-fatal. If code analysis fails (missing LLM key, API error, malformed repository), the pipeline logs a warning and continues to the next phase. Files imported during phases 1–4 stay imported even if the intelligence layers fail.
try { executePhase() }
catch (error) { yield { type: 'warning', message: '[WARN] Phase N failed: ...' } }
// continue to next phase
This means a failure in phases 5–9 never leaves the import itself in a partial state: the sync still runs to completion and reports exactly which intelligence phases failed.
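A slightly fuller version of the pattern is sketched below. It is an illustration under assumptions rather than the actual pipeline code: phases are modeled as synchronous functions for brevity (real phases almost certainly perform async I/O), and the `Phase` shape, the `fatal` flag, the `phase-done` event, and `runPhases` are hypothetical names.

```typescript
type SyncEvent =
  | { type: "phase-done"; name: string }
  | { type: "warning"; message: string };

interface Phase {
  name: string;
  fatal: boolean; // assumed: true for phases 1-4, false for phases 5-9
  run: () => void;
}

// Hypothetical runner: a failing non-fatal phase yields a warning and the
// loop moves on; a failing fatal phase aborts the whole sync.
function* runPhases(phases: Phase[]): Generator<SyncEvent> {
  for (const phase of phases) {
    try {
      phase.run();
    } catch (error) {
      if (phase.fatal) throw error;
      const reason = error instanceof Error ? error.message : String(error);
      yield { type: "warning", message: `[WARN] Phase ${phase.name} failed: ${reason}` };
      continue; // move on to the next phase
    }
    yield { type: "phase-done", name: phase.name };
  }
}

const demoPhases: Phase[] = [
  { name: "Import", fatal: true, run: () => {} },
  { name: "Code Analysis", fatal: false, run: () => { throw new Error("missing LLM key"); } },
  { name: "Docs", fatal: false, run: () => {} },
];
const events = Array.from(runPhases(demoPhases));
// events: phase-done (Import), warning (Code Analysis), phase-done (Docs)
```

Yielding warnings as events, instead of throwing, lets the caller stream progress to the UI and still see every later phase run.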
Sync Statistics
After each sync, SyncStats records what each phase produced:
interface SyncStats {
  totalFiles: number;
  filesNew: number;
  filesUpdated: number;
  filesDeleted: number;
  memoryNew: number;
  memoryUpdated: number;
  patternsFound: number;
  docsGenerated: number;
  consolidationPatterns: number;
  axiomDocsProcessed: number;
  embeddingsGenerated: number;
}
These stats are displayed in the sync progress view and stored for historical comparison.
Typical Duration
For a repository with 500 files and 50 memory entries:
| Phase | Typical Duration | Parallelism |
|---|---|---|
| Prepare | 2–5s | Sequential |
| Scan | 1–3s | Sequential |
| Import | 5–15s | Batch (50 files) |
| Finalize | <1s | Sequential |
| Code Analysis | 10–30s | Per-file |
| Docs | 5–15s | Sequential |
| Consolidation | 3–10s | Sequential |
| Axiom | 5–20s | Per-document |
| Embeddings | 2–8s | Batch |
Full sync: 33–107 seconds. Incremental sync (few changed files): 5–20 seconds.
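The full-sync range is simply the sum of the per-phase ranges in the table, assuming the phases run one after another. A quick check, treating "<1s" as 0 to 1 second (the variable names are illustrative only):

```typescript
// [phase, min seconds, max seconds], copied from the table above.
// "<1s" for Finalize is treated as the range 0-1.
const phaseDurations: Array<[string, number, number]> = [
  ["Prepare", 2, 5],
  ["Scan", 1, 3],
  ["Import", 5, 15],
  ["Finalize", 0, 1],
  ["Code Analysis", 10, 30],
  ["Docs", 5, 15],
  ["Consolidation", 3, 10],
  ["Axiom", 5, 20],
  ["Embeddings", 2, 8],
];

let totalMin = 0;
let totalMax = 0;
for (const [, min, max] of phaseDurations) {
  totalMin += min;
  totalMax += max;
}
// totalMin = 33, totalMax = 107
```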
The Intelligence Flywheel
Each sync doesn't just import files — it compounds knowledge.
Sync
→ Code patterns (architecture, naming, security)
→ Auto-generated docs
→ Memory entries from session checkpoints
→ Consolidated cross-project patterns
→ RAG knowledge base chunks
→ Vector embeddings for semantic search
→ Smarter workflows ({{consolidations}}, {{project.wisdom}}, {{axiom}})
→ Better workflow insights
→ More embeddings
→ Back to smarter workflows
The flywheel effect means each workflow run produces insights, which get embedded, which make the next workflow smarter — all triggered by a sync.
Knowledge Growth Model
The knowledge base grows superlinearly with the number of syncs $s$. Sync $j$ adds direct knowledge $K_d^{(j)}$ (files, memory entries) plus derived knowledge $K_i^{(j)}$ from analysis, where the subscript $i$ marks the derived term and is not an index:
$$K_{total}(s) = \sum_{j=1}^{s} \left( K_d^{(j)} + K_i^{(j)} \right)$$
Where derived knowledge scales with the existing base:
$$K_i^{(s)} = \alpha \cdot K_d^{(s)} + \beta \cdot \log\left(1 + K_{total}(s-1)\right)$$
- $\alpha$ ≈ 3–5x: each source file yields multiple patterns
- $\beta$: cross-project patterns compound logarithmically
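To get a feel for the recurrence, it can be iterated numerically. The sketch below is illustrative only: it assumes $K_{total}(0) = 0$ and a natural logarithm, and the value chosen for $\beta$ is made up, since the text gives no figure for it. `simulateKnowledge` is a hypothetical name.

```typescript
// Iterate the growth model:
//   derived(s) = alpha * direct(s) + beta * log(1 + total(s - 1))
//   total(s)   = total(s - 1) + direct(s) + derived(s)
// Assumes total(0) = 0 and the natural logarithm.
function simulateKnowledge(
  directPerSync: number[],
  alpha: number,
  beta: number
): number[] {
  const totals: number[] = [];
  let total = 0;
  for (const direct of directPerSync) {
    const derived = alpha * direct + beta * Math.log(1 + total);
    total += direct + derived;
    totals.push(total);
  }
  return totals;
}

// Three syncs of 100 direct items each, alpha = 4 (within the 3-5x range),
// beta = 10 (an arbitrary illustrative value).
const knowledgeTotals = simulateKnowledge([100, 100, 100], 4, 10);
// knowledgeTotals[0] = 500; each later sync adds more than the one before it.
```

With constant direct input, every sync contributes the same $(1 + \alpha) \cdot K_d$ plus a logarithmic term that keeps rising with the accumulated total, which is where the faster-than-linear growth comes from.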
In practice, after $n$ syncs across $p$ projects:
$$\text{Searchable Knowledge} = n \cdot \bar{f} + n \cdot \bar{m} + p \cdot C_{patterns} + D_{generated} + E_{vectors}$$
Where $\bar{f}$ = avg files per sync, $\bar{m}$ = avg memory entries, and the remaining terms represent consolidation patterns, generated docs, and embedded insights.
Branch Management
Each repository tracks:
- activeBranch: the branch currently selected for sync
- lastSyncedBranch: the branch used in the last completed sync
When these differ, the dashboard shows a "Resync required" warning. Changing the active branch without resyncing means your knowledge base reflects the old branch's state.
Use the BranchSelector to switch branches, then sync immediately to keep the knowledge base current.
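The "Resync required" condition reduces to a comparison of the two tracked fields. A minimal sketch, assuming both are plain branch names; the `RepositoryBranchState` interface and `needsResync` are hypothetical names:

```typescript
// Only the two fields named in this section; real repository records hold more.
interface RepositoryBranchState {
  activeBranch: string;
  lastSyncedBranch: string;
}

// True when the selected branch differs from the one last synced.
function needsResync(repo: RepositoryBranchState): boolean {
  return repo.activeBranch !== repo.lastSyncedBranch;
}

const afterSwitch = needsResync({ activeBranch: "develop", lastSyncedBranch: "main" }); // true
const afterSync = needsResync({ activeBranch: "develop", lastSyncedBranch: "develop" }); // false
```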
Manual Trigger vs. Webhook
You can sync manually from the project dashboard, or configure a GitHub webhook to trigger sync automatically on push events to the active branch.
A webhook-triggered sync runs the same 9-phase pipeline as a manual one; only the trigger source differs.
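Deciding whether an incoming webhook should start a sync can be sketched as a small filter. GitHub push payloads carry a `ref` field of the form `refs/heads/<branch>`, and the event name arrives in the `X-GitHub-Event` header; everything else here is an assumption, including the function name `shouldTriggerSync` and the idea that the check lives in a single function.

```typescript
// Only the payload field this check needs.
interface PushPayload {
  ref: string; // e.g. "refs/heads/main" for a push to the main branch
}

// Hypothetical filter: sync only on push events to the active branch.
function shouldTriggerSync(
  eventName: string, // value of the X-GitHub-Event header
  payload: PushPayload,
  activeBranch: string
): boolean {
  return eventName === "push" && payload.ref === `refs/heads/${activeBranch}`;
}

const onActive = shouldTriggerSync("push", { ref: "refs/heads/main" }, "main");          // true
const onOther = shouldTriggerSync("push", { ref: "refs/heads/feature/x" }, "main");      // false
const onTag = shouldTriggerSync("push", { ref: "refs/tags/main" }, "main");              // false
```

Comparing the full `refs/heads/...` string, instead of just the last path segment, avoids mistaking a tag or a similarly named branch for the active branch.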
