raged is a multi-component system for RAG with enrichment and knowledge graph capabilities.
```mermaid
graph TD
CLI[raged CLI] -->|"POST /ingest"| API[RAG API<br/>:8080]
CLI -->|"POST /query"| API
CLI -->|"POST /enrichment/enqueue"| API
CLI -->|"GET /graph/entity/:name"| API
API -->|"POST /api/embeddings"| OL[Ollama<br/>:11434]
API -->|"upsert / search"| PG[Postgres + pgvector<br/>:5432]
WK[Enrichment Worker] -->|"SKIP LOCKED dequeue"| PG
WK -->|"read/update payload"| PG
WK -->|"NLP + LLM extraction"| OL
WK -->|"upsert entities"| PG
subgraph Storage
PG
OL
end
style API fill:#e1f5fe
style PG fill:#f3e5f5
style OL fill:#e8f5e9
style WK fill:#e0f2f1
```
Stateless HTTP service exposing core endpoints:
Ingestion & Query:
- `POST /ingest` — Receives text items or URLs (code, docs, PDFs, images, web pages, etc.); optionally fetches URL content server-side with SSRF protection, runs tier-1 extraction, chunks, embeds via Ollama, upserts vectors into Postgres, and optionally enqueues enrichment
- `POST /query` — Classifies query intent via a multi-strategy router (semantic, metadata, graph, hybrid), then executes the selected path: embed the query text via Ollama + pgvector similarity search, metadata-only filter scan, graph traversal, or hybrid blended retrieval. Returns a unified response with results, routing, and optional graph fields
- `POST /query/download-first` — Runs the query and returns the first match as a downloadable binary (from raw_data or a blob store key)
- `POST /query/fulltext-first` — Runs the query and returns concatenated chunk text for the first matching document
- `GET /collections` — Returns collection-level document/chunk/enrichment counts

Enrichment:
- `GET /enrichment/status/:baseId` — Get enrichment status for a document
- `GET /enrichment/stats` — System-wide enrichment statistics
- `POST /enrichment/enqueue` — Manually trigger enrichment for existing chunks
- `POST /enrichment/clear` — Clear pending/processing/dead enrichment tasks (optional text filter)

Knowledge Graph:
- `GET /graph/entity/:name` — Look up entity details and connections in Postgres

Health:
- `GET /healthz` — Always unauthenticated, returns `{ ok: true }`

Stores embedding vectors with metadata in Postgres tables using the pgvector extension. Each collection holds vectors of a fixed dimension (768 for nomic-embed-text) with cosine-distance search support.
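pgvector's `<=>` operator returns cosine distance (1 - cosine similarity), so a search is typically `ORDER BY embedding <=> $1 LIMIT k`. A quick TypeScript sketch of the math behind that operator:

```typescript
// Cosine distance as computed by pgvector's `<=>` operator:
// distance = 1 - (a . b) / (|a| * |b|)
// 0 = identical direction, 1 = orthogonal, 2 = opposite.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```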
Metadata per chunk:
- `text` — the original chunk text
- `source` — source URL or path
- `chunkIndex` — position of the chunk within the original document
- `enrichmentStatus` — `none`, `pending`, `processing`, `enriched`, or `failed`
- `tier1`, `tier2`, `tier3` — metadata from extraction tiers (when enriched)
- `repoId`, `repoUrl`, `path`, `lang`, `bytes` — indexing metadata (present when ingested via the CLI)

Runs the nomic-embed-text model locally for embeddings, and LLM models (llama3, llava) for tier-3 extraction. The API calls Ollama's /api/embeddings endpoint for each text chunk, producing 768-dimensional vectors.
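A minimal sketch of one embedding call, following Ollama's documented `/api/embeddings` request shape (`{ model, prompt }` in, `{ embedding }` out); the helper names here are illustrative, not the API's actual code:

```typescript
const OLLAMA_URL = "http://localhost:11434";

// Build the request for one chunk (per Ollama's /api/embeddings contract).
function buildEmbeddingRequest(chunk: string) {
  return {
    url: `${OLLAMA_URL}/api/embeddings`,
    body: { model: "nomic-embed-text", prompt: chunk },
  };
}

// Embed a single chunk; nomic-embed-text returns a 768-dimensional vector.
async function embedChunk(chunk: string): Promise<number[]> {
  const { url, body } = buildEmbeddingRequest(chunk);
  const res = await fetch(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  const data = (await res.json()) as { embedding: number[] };
  return data.embedding;
}
```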
Holds enrichment tasks using a Postgres table with SKIP LOCKED for concurrent processing:
- `task_queue` table (`queue = 'enrichment'`) — tasks with status tracking
- `FOR UPDATE SKIP LOCKED` to claim tasks without contention

Stores entities and relationships extracted from documents in Postgres tables. Supports graph traversal for hybrid vector+graph retrieval.
Database schema:

- `entities` table — columns: name, type, description
- `relationships` table — columns: source, target, relationship type
Relationship types: Configurable based on extraction (e.g., uses, relates_to, mentions)
Async background service that:
- Claims tasks from `task_queue` via `FOR UPDATE SKIP LOCKED`
- Runs tier-2 (spaCy NER, keywords, language) and tier-3 (LLM) extraction
- Updates chunks with tier2/tier3 metadata and upserts entities/relationships

When a `url` field is provided (and `text` is omitted), the API performs server-side content fetching:
```mermaid
sequenceDiagram
participant Client
participant API as RAG API
participant SSRF as SSRF Guard
participant Web as External Web
participant Extract as Content Extractor
participant Embed as Ollama
participant PG as Postgres
Client->>API: POST /ingest {url}
API->>SSRF: Validate URL (no private IPs, DNS rebind check)
SSRF-->>API: ✓ Safe
API->>Web: Fetch URL content
Web-->>API: HTML/PDF/JSON/text
API->>Extract: Extract text (Readability/pdf-parse/passthrough)
Extract-->>API: Extracted text + metadata
API->>API: Chunk text
API->>Embed: Embed chunks
Embed-->>API: Vectors
API->>PG: Upsert with fetch metadata
PG-->>API: Success
API-->>Client: {ok: true, upserted: N}
```
SSRF Protection:

- Rejects URLs that resolve to private or reserved IP ranges, with a DNS-rebind check (see the sequence above)
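A minimal sketch of the private-range check, assuming IPv4 dotted-quad input; a real guard must also resolve DNS (and re-check the resolved address to defeat rebinding) and handle IPv6:

```typescript
// Returns true when an address must not be fetched server-side.
// Invalid input is treated as unsafe (fail closed).
function isPrivateIPv4(ip: string): boolean {
  const parts = ip.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => Number.isNaN(p) || p < 0 || p > 255)) {
    return true;
  }
  const [a, b] = parts;
  return (
    a === 10 ||                           // 10.0.0.0/8
    a === 127 ||                          // loopback
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12
    (a === 192 && b === 168) ||           // 192.168.0.0/16
    (a === 169 && b === 254) ||           // link-local / cloud metadata
    a === 0
  );
}
```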
Supported Content Types:
- `text/html` — Readability article extraction (jsdom + @mozilla/readability)
- `application/pdf` — pdf-parse text extraction with page metadata
- `text/plain`, `text/markdown` — passthrough
- `application/json` — pretty-printed JSON as text

Error Handling:
Partial success model — successfully fetched items are ingested, failures are returned in errors array with per-URL status and reason.
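A hypothetical response shape for that partial-success model; only the `errors` array is stated above, so the remaining field names are assumptions based on the `{ok, upserted}` reply shown in the sequence diagram:

```typescript
// Assumed shape of a partial-success /ingest response.
interface IngestResponse {
  ok: boolean;
  upserted: number;
  errors: { url: string; status?: number; reason: string }[];
}

// Example: 12 items ingested, one URL failed with a per-URL reason.
const example: IngestResponse = {
  ok: true,
  upserted: 12,
  errors: [{ url: "https://example.com/blocked", status: 403, reason: "fetch failed" }],
};
```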
Command-line tool with five commands:
- `index` — Clone a Git repo and index files
- `query` — Search for similar chunks
- `ingest` — Ingest arbitrary files (PDFs, images, text) or URLs with the `--url` flag
- `enrich` — Trigger/monitor enrichment tasks
- `graph` — Query knowledge graph entities

```mermaid
sequenceDiagram
participant U as User / Agent
participant C as CLI
participant A as RAG API
participant O as Ollama
participant P as Postgres
U->>C: raged index --repo <url>
C->>C: git clone (shallow)
C->>C: Scan files, filter, read text
loop Batch of 50 files
C->>A: POST /ingest { items, enrich: true }
A->>A: Detect doc type, tier-1 extraction
A->>A: chunkText() per item
loop Per chunk
A->>O: POST /api/embeddings
O-->>A: 768d vector
end
A->>P: upsert(chunks with enrichmentStatus: pending)
P-->>A: OK
A->>P: INSERT enrichment task
A-->>C: { ok, upserted }
end
C-->>U: Done. repoId=<id>
```
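The `chunkText()` step in the diagram is not specified in detail; the following is a sketch assuming fixed-size character windows with overlap (the size and overlap values are assumptions, not the real defaults):

```typescript
// Split text into overlapping fixed-size windows. Overlap keeps context
// that straddles a boundary retrievable from both neighboring chunks.
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  if (size <= overlap) throw new Error("size must exceed overlap");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```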
```mermaid
sequenceDiagram
participant W as Enrichment Worker
participant P as Postgres
participant O as Ollama
W->>P: SELECT FOR UPDATE SKIP LOCKED
P-->>W: Task { baseId, collection, totalChunks }
W->>P: Get chunks by baseId
P-->>W: Chunk records
W->>W: Tier-2: spaCy NER, keywords, lang
W->>O: Tier-3: LLM extraction
O-->>W: Summary, entities
W->>P: Update chunks with tier2/tier3
W->>P: Upsert entities + relationships
P-->>W: OK
```
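The claim step in the diagram above can be done in one atomic statement. This sketch assumes `id` and `status` columns on `task_queue` (only the table name and `queue = 'enrichment'` are stated in the schema above):

```typescript
// One-statement claim: the subquery locks a single pending row, skipping
// rows already locked by other workers, then the UPDATE marks it processing.
const claimTaskSql = `
  UPDATE task_queue SET status = 'processing'
  WHERE id = (
    SELECT id FROM task_queue
    WHERE queue = 'enrichment' AND status = 'pending'
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
  )
  RETURNING *;
`;
```

Because each worker's subquery skips locked rows, concurrent workers never block on or double-claim the same task.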
```mermaid
sequenceDiagram
participant U as User / Agent
participant C as CLI
participant A as RAG API
participant O as Ollama
participant P as Postgres
U->>C: raged index --repo <url>
C->>C: git clone (shallow)
C->>C: Scan files, filter, read text
loop Batch of 50 files
C->>A: POST /ingest { items }
A->>A: chunkText() per item
loop Per chunk
A->>O: POST /api/embeddings
O-->>A: 768d vector
end
A->>P: upsert(chunks)
P-->>A: OK
A-->>C: { ok, upserted }
end
C-->>U: Done. repoId=<id>
```
```mermaid
sequenceDiagram
participant U as User / Agent
participant C as CLI
participant A as RAG API
participant O as Ollama
participant P as Postgres
U->>C: raged query --q "auth flow"
C->>A: POST /query { query, topK }
A->>O: POST /api/embeddings { prompt }
O-->>A: 768d vector
A->>P: search(vector, limit, filter) using pgvector
P-->>A: Ranked results
A-->>C: { results: [...], routing: { strategy: "semantic", ... } }
C-->>U: Display results
```
The router classifies each query into one of four strategies before execution.
```mermaid
sequenceDiagram
participant C as Client
participant R as Router
participant E as Rule Engine
participant L as LLM (optional)
C->>R: POST /query { query, filter?, strategy? }
alt Explicit override
R-->>C: strategy = explicit
else Rule engine
R->>E: Evaluate rules (keyword, filter-only, graph pattern)
E-->>R: strategy + confidence
R-->>C: strategy = rule
else LLM fallback
R->>L: Classify intent
L-->>R: strategy
R-->>C: strategy = llm
else Default
R-->>C: strategy = semantic (default)
end
Note over R,C: routing.strategy: semantic | metadata | graph | hybrid
```
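An illustrative rule pass for the engine in the diagram; the actual rule set is not documented here, so the keyword and filter heuristics below are assumptions:

```typescript
type Strategy = "semantic" | "metadata" | "graph" | "hybrid";

// Hypothetical rule engine: filter-only requests go to metadata,
// relationship-style phrasing to graph, text + filter to hybrid,
// everything else to the semantic default.
function classify(query: string, filter?: Record<string, unknown>): Strategy {
  const q = query.toLowerCase();
  if (!query.trim() && filter) return "metadata";
  if (/\b(related to|connected to|depends on)\b/.test(q)) return "graph";
  if (filter && Object.keys(filter).length > 0) return "hybrid";
  return "semantic";
}
```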
Strategies:
1. semantic — embed the query via Ollama and run pgvector cosine-similarity search (the default)
2. metadata — metadata-only filter scan, no embedding required
3. graph — entity extraction plus traversal of the relationships table
4. hybrid — blended vector + metadata/graph retrieval

The graph strategy adds entity expansion on top of semantic search:

```mermaid
sequenceDiagram
participant U as User / Agent
participant C as CLI
participant A as RAG API
participant O as Ollama
participant P as Postgres
U->>C: raged query --q "auth flow" --strategy graph
C->>A: POST /query { query, topK, strategy: "graph" }
A->>O: POST /api/embeddings { prompt }
O-->>A: 768d vector
A->>P: search(vector, limit, filter) using pgvector
P-->>A: Ranked results with tier2/tier3 metadata
A->>A: Extract entities from results
A->>P: expandEntities(entities, depth=2) via relationships table
P-->>A: Expanded entity graph
A-->>C: { results: [...], graph: {...}, routing: { strategy: "graph", ... } }
C-->>U: Display results + related entities
```
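The `expandEntities(entities, depth=2)` step walks the relationships table in Postgres; this in-memory sketch only illustrates the depth-limited traversal (the `Rel` shape mirrors the source/target/type columns described earlier):

```typescript
interface Rel { source: string; target: string; type: string }

// Breadth-first expansion: starting from seed entities, follow
// relationships in both directions up to `depth` hops.
function expandEntities(seeds: string[], rels: Rel[], depth = 2): Set<string> {
  const seen = new Set(seeds);
  let frontier = seeds;
  for (let d = 0; d < depth; d++) {
    const next: string[] = [];
    for (const r of rels) {
      if (frontier.includes(r.source) && !seen.has(r.target)) {
        seen.add(r.target);
        next.push(r.target);
      }
      if (frontier.includes(r.target) && !seen.has(r.source)) {
        seen.add(r.source);
        next.push(r.source);
      }
    }
    frontier = next;
  }
  return seen;
}
```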
```mermaid
flowchart LR
R[Request] --> H{Has<br/>Authorization<br/>header?}
H -->|No| R401[401 Unauthorized]
H -->|Yes| P{Bearer prefix?}
P -->|No| R401
P -->|Yes| T{Timing-safe<br/>compare with<br/>RAGED_API_TOKEN}
T -->|Mismatch| R401
T -->|Match| OK[Request proceeds]
HZ[GET /healthz] --> OK
style R401 fill:#ffcdd2
style OK fill:#c8e6c9
```
- Auth is disabled when `RAGED_API_TOKEN` is empty
- `GET /healthz` always bypasses auth
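A Node sketch of the timing-safe comparison from the flow above, using `crypto.timingSafeEqual`; the function name is illustrative:

```typescript
import { timingSafeEqual } from "node:crypto";

// Compare the presented bearer token against the expected one without
// leaking, via response timing, how many leading characters matched.
function tokenMatches(presented: string, expected: string): boolean {
  const a = Buffer.from(presented);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on unequal lengths; unequal lengths can't match.
  return a.length === b.length && timingSafeEqual(a, b);
}
```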