17. GitHub Connector Service
The GitHub connector (src/server/services/github-connector.ts) provides authenticated access to the GitHub REST API using the BYOK (Bring Your Own Key) pattern. It handles repository enumeration, memory file synchronization, file tree retrieval, and write operations (branch creation, file commits, pull requests). Every call is scoped to a tenant's encrypted GitHub PAT stored in the api_keys table.
17.1 Token Resolution (BYOK)
GitHub authentication follows the same BYOK pattern as LLM providers. The tenant's personal access token is stored encrypted (AES-256-GCM) in the api_keys table and decrypted per-request via the shared decrypt() utility from the crypto service.
```typescript
export async function resolveGitHubToken(tenantId: string): Promise<string> {
  const apiKey = await prisma.apiKey.findFirst({
    where: { tenantId, provider: "github" },
    orderBy: { updatedAt: "desc" },
  });
  if (!apiKey) {
    throw new Error(
      "No GitHub token configured. Add a GitHub personal access token in Admin > API Keys."
    );
  }
  return decrypt(apiKey.encryptedKey);
}
```
When multiple GitHub keys exist for a tenant, the most recently updated one is used (orderBy: { updatedAt: "desc" }).
Data Model: ApiKey
| Column | Type | Description |
|---|---|---|
| `id` | UUID | Primary key |
| `tenantId` | UUID | Tenant scope (RLS enforced) |
| `userId` | UUID | Key creator |
| `name` | String | User-defined label |
| `provider` | String | `"github"` for GitHub PATs |
| `encryptedKey` | Text | AES-256-GCM format: `v1:<iv>:<tag>:<ciphertext>` |
| `lastUsedAt` | DateTime? | Last access timestamp |
| `expiresAt` | DateTime? | Optional expiration |
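The `decrypt()` utility itself lives in the shared crypto service. A minimal sketch of a compatible encrypt/decrypt pair is shown below, assuming hex encoding for each segment of `v1:<iv>:<tag>:<ciphertext>` and a caller-supplied 32-byte key (both assumptions, not confirmed by the source):

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical sketch of the crypto service's encrypt()/decrypt() pair.
function encrypt(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit IV, the GCM-recommended size
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return `v1:${iv.toString("hex")}:${tag.toString("hex")}:${data.toString("hex")}`;
}

function decrypt(encryptedKey: string, key: Buffer): string {
  const [version, ivHex, tagHex, dataHex] = encryptedKey.split(":");
  if (version !== "v1") throw new Error(`Unsupported key format: ${version}`);
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(ivHex, "hex"));
  decipher.setAuthTag(Buffer.from(tagHex, "hex")); // GCM integrity check
  return Buffer.concat([
    decipher.update(Buffer.from(dataHex, "hex")),
    decipher.final(),
  ]).toString("utf8");
}
```

GCM authentication means a tampered ciphertext or wrong key fails loudly in `decipher.final()` rather than returning garbage.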
17.2 ghFetch Helper
All GitHub API calls go through a typed generic helper that handles URL construction, auth headers, and error handling:
```typescript
async function ghFetch<T>(
  token: string,
  endpoint: string,
  options?: { method?: string; body?: unknown }
): Promise<T>
```
Key behaviors:
- Accepts both relative paths (`/user/repos`) and absolute URLs (`https://api.github.com/...`)
- Sets `Authorization: Bearer`, `Accept: application/vnd.github+json`, and `X-GitHub-Api-Version: 2022-11-28`
- Adds `Content-Type: application/json` only when a body is present
- Throws on non-2xx responses with the status code and full response body text
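A sketch of how such a helper can be written against the global `fetch` API; the `GITHUB_API` constant and error-message wording are illustrative, not taken from the source:

```typescript
const GITHUB_API = "https://api.github.com";

// Minimal sketch of the ghFetch helper described above.
async function ghFetch<T>(
  token: string,
  endpoint: string,
  options?: { method?: string; body?: unknown }
): Promise<T> {
  // Relative paths are resolved against the API root; absolute URLs pass through
  const url = endpoint.startsWith("https://") ? endpoint : `${GITHUB_API}${endpoint}`;
  const headers: Record<string, string> = {
    Authorization: `Bearer ${token}`,
    Accept: "application/vnd.github+json",
    "X-GitHub-Api-Version": "2022-11-28",
  };
  if (options?.body !== undefined) headers["Content-Type"] = "application/json";
  const res = await fetch(url, {
    method: options?.method ?? "GET",
    headers,
    body: options?.body !== undefined ? JSON.stringify(options.body) : undefined,
  });
  if (!res.ok) {
    // Surface the status code and full body text, per the behavior above
    throw new Error(`GitHub API ${res.status}: ${await res.text()}`);
  }
  return (await res.json()) as T;
}
```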
17.3 Repository Enumeration (fetchRepos)
fetchRepos builds a comprehensive list of repositories accessible to the authenticated user. It combines two data sources in parallel for completeness, then deduplicates.
Pagination limits:
- User repos: up to 2 pages of 100 (200 repos max)
- Org repos: up to 3 pages per org (300 repos/org), capped at 10 organizations
- Organization enumeration: 1 page (100 orgs)
Deduplication: Repos are deduped by full_name. User repos take priority -- an org-fetched duplicate only enters the map if the key is not already present. The fetchUserOrgs call fails gracefully (returns []) if the PAT lacks read:org scope. Individual fetchOrgRepos calls also fail gracefully, returning whatever was fetched before the error.
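The deduplication step can be sketched as a pure function over the two result sets (the `Repo` shape is abbreviated here):

```typescript
interface Repo {
  full_name: string;
  private?: boolean;
}

// User repos are inserted first, so an org-fetched duplicate
// never overwrites a user-fetched entry.
function dedupeRepos(userRepos: Repo[], orgRepos: Repo[]): Repo[] {
  const byName = new Map<string, Repo>();
  for (const repo of userRepos) byName.set(repo.full_name, repo);
  for (const repo of orgRepos) {
    if (!byName.has(repo.full_name)) byName.set(repo.full_name, repo);
  }
  return [...byName.values()];
}
```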
17.4 Memory Path Operations
checkMemoryPath
Checks if a .memory directory (or custom path) exists in a repo and counts letter*.md files:
```typescript
export async function checkMemoryPath(
  token: string, owner: string, repo: string, path = ".memory"
): Promise<{ exists: boolean; fileCount: number }>
```
Files are matched with the regex /^letter.*\.md$/i. Returns { exists: false, fileCount: 0 } on any error (missing path, auth failure, 404).
fetchMemoryFiles
Fetches full content of all letter*.md files from the memory directory, sorted newest-first by filename (localeCompare descending). Each file is parsed into a GitHubMemoryEntry:
```typescript
interface GitHubMemoryEntry {
  title: string;      // From markdown heading, YAML frontmatter, or filename
  content: string;    // Raw markdown file content
  sourceRef: string;  // "github://owner/repo/.memory/letter_YYYYMMDD_XXXX.md"
  tags: string[];     // From YAML frontmatter or default ["memory", "imported"]
  createdAt: Date;    // From filename date pattern or current date
}
```
Content is fetched via download_url from the Contents API response, with the Bearer token passed in the request headers. Files without a download_url are silently skipped.
listMemoryFiles
Metadata-only variant for selection UIs. Returns { name, path, sha, sourceRef } without fetching file content. Uses the same filtering regex and descending sort as fetchMemoryFiles.
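The filter-and-sort step shared by fetchMemoryFiles and listMemoryFiles can be sketched as a pure function over the filenames returned by the Contents API:

```typescript
const LETTER_FILE = /^letter.*\.md$/i; // matching regex from the source

// Keeps only letter*.md files and sorts newest-first; the sortable
// letter_YYYYMMDD_XXXX naming makes descending localeCompare equal
// to reverse-chronological order.
function selectLetterFiles(names: string[]): string[] {
  return names
    .filter((name) => LETTER_FILE.test(name))
    .sort((a, b) => b.localeCompare(a));
}
```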
17.5 File Content & Tree Operations
fetchFileContent
Fetches a single file's raw content. Two-step process: first resolves the download_url from the Contents API endpoint, then fetches the raw text with the auth token. Throws if download_url is null or the download request fails.
fetchRepoTree
Uses the Git Trees API (/git/trees/{branch}?recursive=1) to get a flat list of all file paths in a repository. Only blob entries (files, not directories) are returned.
Branch fallback: If the branch parameter is "main" and the API call fails, the function automatically retries with "master". Any other branch value throws on failure.
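The fallback logic can be sketched with the Trees API call factored out as an injected dependency (`fetchTree` here is a hypothetical stand-in, not the service's actual helper):

```typescript
// Sketch of the main -> master branch fallback described above.
async function fetchTreeWithFallback(
  fetchTree: (branch: string) => Promise<string[]>,
  branch: string = "main"
): Promise<string[]> {
  try {
    return await fetchTree(branch);
  } catch (err) {
    // Only the default "main" retries against the legacy default branch
    if (branch === "main") return fetchTree("master");
    throw err;
  }
}
```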
IGNORE_PATTERNS -- the following paths and files are filtered out:

| Pattern | Matches |
|---|---|
| `^node_modules/` | Node.js dependencies |
| `^\.git/` | Git internals |
| `^\.next/` | Next.js build output |
| `^dist/`, `^build/` | Compiled output |
| `^coverage/` | Test coverage |
| `^\.cache/`, `^\.turbo/` | Build caches |
| `^vendor/` | Vendored dependencies |
| `^__pycache__/` | Python bytecode |
| `\.lock$` | Generic lockfiles |
| `^package-lock\.json$` | npm lockfile |
| `^yarn\.lock$` | Yarn lockfile |
| `^pnpm-lock\.yaml$` | pnpm lockfile |
The workflow engine further caps tree output at FILE_TREE_MAX_ENTRIES = 500 entries when injecting into prompts via {{fileTree}}.
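Combining the ignore list with the entry cap, the filtering step looks roughly like this (the function name `filterTreePaths` is illustrative):

```typescript
// The ignore list as regular expressions, mirroring the table above.
const IGNORE_PATTERNS: RegExp[] = [
  /^node_modules\//, /^\.git\//, /^\.next\//, /^dist\//, /^build\//,
  /^coverage\//, /^\.cache\//, /^\.turbo\//, /^vendor\//, /^__pycache__\//,
  /\.lock$/, /^package-lock\.json$/, /^yarn\.lock$/, /^pnpm-lock\.yaml$/,
];

const FILE_TREE_MAX_ENTRIES = 500; // workflow engine cap for {{fileTree}}

function filterTreePaths(paths: string[], max = FILE_TREE_MAX_ENTRIES): string[] {
  return paths
    .filter((p) => !IGNORE_PATTERNS.some((re) => re.test(p)))
    .slice(0, max);
}
```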
17.6 Write Operations
Three functions support creating branches, committing files, and opening PRs. These are used by the auto-fix and refactor pipelines to push generated fixes back to GitHub.
createBranch
Resolves the SHA of fromRef (default "main") via /git/ref/heads/{fromRef}, then creates a new ref at that SHA via POST to /git/refs.
createOrUpdateFile
Creates or updates a file in a repository. Content is Base64-encoded via Buffer.from(content).toString("base64"). If sha is provided, the existing file is updated (required by the GitHub API for updates); otherwise a new file is created.
createPullRequest
Creates a pull request and returns { number, html_url }. Parameters: title, body, head branch, and base branch (default "main").
getFileSha
Retrieves the SHA of an existing file at a specific path and branch. Returns null if the file does not exist, making it safe to call before createOrUpdateFile to determine create-vs-update behavior.
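A hypothetical orchestration of that create-vs-update flow, with the two service functions passed in as callbacks (the `upsertFile` wrapper is illustrative, not part of the source):

```typescript
// Looks up the existing SHA first, then passes it through only when
// the file already exists -- the GitHub API requires sha for updates.
async function upsertFile(
  getFileSha: (path: string) => Promise<string | null>,
  createOrUpdateFile: (path: string, content: string, sha?: string) => Promise<void>,
  path: string,
  content: string
): Promise<"created" | "updated"> {
  const sha = await getFileSha(path); // null means the file does not exist yet
  await createOrUpdateFile(path, content, sha ?? undefined);
  return sha === null ? "created" : "updated";
}
```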
17.7 Database Sync (syncToDatabase)
syncToDatabase persists fetched GitHubMemoryEntry records to the memory_entries table. It uses sourceRef as the deduplication key scoped to tenantId + userId, so the same letter file is never imported twice for the same user.
```typescript
export async function syncToDatabase(
  tenantId: string, userId: string, projectId: string,
  entries: GitHubMemoryEntry[]
): Promise<{ created: number; skipped: number }>
```
Each entry is created as a MemoryEntry with:
- source: "github"
- sourceRef: "github://owner/repo/.memory/letter_YYYYMMDD_XXXX.md"
- metadata: { projectId } linking back to the nyxCore project
The return value reports how many entries were created vs. skipped (already existing).
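The dedup-and-count logic can be sketched as a pure planning step, with `existingRefs` standing in for a lookup of sourceRef values already stored for this tenantId + userId (the `planSync` helper is illustrative):

```typescript
interface SyncEntry {
  sourceRef: string;
  title: string;
}

// Splits incoming entries into new vs. already-imported, keyed on sourceRef.
function planSync(entries: SyncEntry[], existingRefs: Set<string>) {
  const toCreate = entries.filter((e) => !existingRefs.has(e.sourceRef));
  return {
    toCreate,
    created: toCreate.length,
    skipped: entries.length - toCreate.length,
  };
}
```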
17.8 Memory File Parsing Helpers
Three private helpers extract structured metadata from raw letter files:
| Helper | Strategy | Fallback |
|---|---|---|
| `extractTitle(content, filename)` | 1. First `#` heading in markdown, 2. YAML frontmatter `title:` field | Filename with `.md` stripped and `_` replaced by spaces |
| `extractTags(content)` | YAML frontmatter `tags: [...]` array, comma-separated | `["memory", "imported"]` |
| `extractDateFromFilename(filename)` | Regex match on `letter_YYYYMMDD` pattern | `new Date()` (current timestamp) |
The tag extraction strips surrounding quotes (single or double) from each tag value. The date extraction specifically matches the letter_YYYYMMDD prefix -- the trailing _XXXX sequence number is ignored for date purposes.
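Illustrative reconstructions of two of these helpers, following the strategies and fallbacks above; the exact regexes in the service may differ:

```typescript
function extractTitle(content: string, filename: string): string {
  const heading = content.match(/^#\s+(.+)$/m);
  if (heading) return heading[1].trim();
  const frontmatter = content.match(/^title:\s*(.+)$/m);
  // Strip surrounding single or double quotes from the frontmatter value
  if (frontmatter) return frontmatter[1].trim().replace(/^["']|["']$/g, "");
  // Fallback: filename with .md stripped and underscores as spaces
  return filename.replace(/\.md$/i, "").replace(/_/g, " ");
}

function extractDateFromFilename(filename: string): Date {
  // Matches the letter_YYYYMMDD prefix; the trailing _XXXX is ignored
  const m = filename.match(/^letter_(\d{4})(\d{2})(\d{2})/i);
  if (!m) return new Date(); // fallback: current timestamp
  return new Date(Date.UTC(Number(m[1]), Number(m[2]) - 1, Number(m[3])));
}
```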
