RAGed

In-Cluster Indexing

Run the indexer as a Kubernetes Job inside the cluster, close to the API and Postgres, so you never have to upload repository contents from your laptop.

How It Works

sequenceDiagram
    participant H as Helm
    participant J as Indexer Job
    participant G as GitHub
    participant A as RAG API
    participant P as Postgres

    H->>J: Create Job (helm install/upgrade)
    J->>G: git clone --depth 1
    J->>J: Scan and read files
    loop Batches
        J->>A: POST /ingest (in-cluster HTTP)
        A->>P: upsert chunks & vectors
    end
    J-->>H: Job complete
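
The flow in the diagram can be approximated with a few shell functions. This is a rough sketch, not the indexer's actual implementation: the function names, the `Authorization: Bearer` header, and the idea of posting a pre-built JSON payload file are all assumptions. Only `git clone --depth 1` and `POST /ingest` come from the diagram itself.

```shell
#!/usr/bin/env bash
# Illustrative sketch of the diagram's steps; NOT the real indexer code.
set -euo pipefail

# Shallow-clone one branch, as in "git clone --depth 1".
clone_repo() {
  local repo_url="$1" branch="$2" dest="$3"
  git clone --depth 1 --branch "$branch" "$repo_url" "$dest"
}

# List regular files under a checkout, skipping Git metadata.
scan_files() {
  local root="$1"
  find "$root" -type f ! -path '*/.git/*' | sort
}

# Read one path per line on stdin; print batches of at most $1 paths,
# one batch per line, with paths separated by tabs.
make_batches() {
  awk -v n="$1" '
    { if ((NR - 1) % n == 0) { if (NR > 1) printf "\n" } else printf "\t"
      printf "%s", $0 }
    END { if (NR > 0) printf "\n" }'
}

# POST one prepared JSON payload to the API's /ingest endpoint.
# ASSUMPTION: bearer-token auth and the payload schema are hypothetical.
post_batch() {
  local api_url="$1" token="$2" payload_file="$3"
  curl -fsS -X POST "$api_url/ingest" \
    -H "Authorization: Bearer $token" \
    -H "Content-Type: application/json" \
    --data-binary "@$payload_file"
}
```

For example, `scan_files ./repo | make_batches 50` would print one line per batch of up to 50 paths, which a loop could then turn into payloads for `post_batch`.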

Enable

helm upgrade --install rag ./chart -n rag --create-namespace \
  --set api.auth.enabled=true \
  --set api.auth.token=REPLACE_ME \
  --set indexer.enabled=true \
  --set indexer.repoUrl=https://github.com/<org>/<repo>.git \
  --set indexer.repoId=my-repo \
  --set indexer.branch=main
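
The same settings can live in a values file instead of a chain of `--set` flags, which tends to be easier to review and reuse. The sketch below writes one; the filename `values-indexer.yaml` is an arbitrary choice, and the keys simply mirror the `--set` paths above.

```shell
# Write the --set flags from the command above as nested YAML.
# The filename is arbitrary; replace the placeholders before use.
cat > values-indexer.yaml <<'EOF'
api:
  auth:
    enabled: true
    token: REPLACE_ME
indexer:
  enabled: true
  repoUrl: "https://github.com/<org>/<repo>.git"
  repoId: my-repo
  branch: main
EOF
```

Passing the file with `helm upgrade --install rag ./chart -n rag --create-namespace -f values-indexer.yaml` should then be equivalent to the flag-based command. Since the file holds the API token, it is probably best kept out of version control.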

Note: The chart now defaults to official GHCR images (ghcr.io/mfittko/raged-api and ghcr.io/mfittko/raged). Override with --set api.image.repository and --set indexer.image.repository if using custom builds.

Indexer Values

| Value | Default | Description |
| --- | --- | --- |
| indexer.enabled | false | Create the indexer Job |
| indexer.image.repository | ghcr.io/mfittko/raged | Indexer image |
| indexer.image.tag | latest | Indexer image tag |
| indexer.repoUrl | "" | Git repository URL to index |
| indexer.repoId | "" | Stable identifier for the repo |
| indexer.branch | "" | Branch to clone |
| indexer.collection | docs | Target collection name |
| indexer.token | "" | API auth token |
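
To compare these defaults against the chart you actually have checked out, the chart's values can be printed with `helm show values`. The helper below is a small sketch that narrows the output to the `indexer` block; the function name and the awk filter are illustrative, and it assumes the chart lives at `./chart` as in the install command and defines `indexer` as a top-level key.

```shell
# Print only the top-level "indexer:" block of a chart's default values.
# ASSUMPTION: the chart path defaults to ./chart; the awk filter is a
# convenience for this sketch, not a Helm feature.
show_indexer_defaults() {
  local chart="${1:-./chart}"
  helm show values "$chart" | awk '
    /^indexer:/        { printing = 1; print; next }
    /^[^[:space:]#]/   { printing = 0 }
    printing           { print }'
}
```

Running `show_indexer_defaults ./chart` prints everything from `indexer:` up to the next top-level key.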

Benefits

Private Repositories

For private repos, the indexer Job needs Git SSH credentials. This means storing an SSH key in a Kubernetes Secret and mounting it into the Job's pod.

Status: Planned for a future iteration. For now, use public repos or run the CLI locally with SSH access.

Monitoring the Job

# Check Job status
kubectl get jobs -n rag

# View indexer logs (the -l selector must match the labels on the indexer pods)
kubectl logs -n rag -l app=rageder --tail 100
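
In scripts or CI it can be handy to block until the Job finishes instead of polling by hand. The helper below is a minimal sketch built on `kubectl wait`; the function name, the `rag` namespace default, and the 600-second timeout are assumptions, and the Job name is not given in this document, so it has to be looked up with `kubectl get jobs -n rag` first.

```shell
# Block until the indexer Job completes or the timeout expires.
# ASSUMPTION: the Job name is passed in by the caller; the namespace and
# timeout defaults are illustrative.
wait_for_indexer() {
  local job_name="$1" namespace="${2:-rag}" timeout="${3:-600s}"
  kubectl wait --for=condition=complete "job/$job_name" \
    -n "$namespace" --timeout="$timeout"
}
```

Note that `condition=complete` is only reached on success; if the Job fails, the command keeps waiting until the timeout expires and then exits non-zero, so the logs are still the place to look for the cause.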