RAGed

Vision

raged is a multi-agent memory hub: a shared retrieval-augmented generation (RAG) layer with enrichment and knowledge graph capabilities.

Why

AI agents work best with relevant context, but stuffing entire knowledge bases into a model’s context window is wasteful and expensive. raged keeps the heavy retrieval work outside the model loop: ingest once, query many times, return only what’s relevant.

Vector search alone finds semantically similar content. But real knowledge has structure: docs reference code, emails discuss designs, repos depend on libraries. The knowledge graph captures these relationships, enabling retrieval that follows connections, not just similarity. Combining the two answers queries that similarity search alone cannot:

Query type	Vector search only	Vector search + graph
“Find code about auth”	Semantic match	Same semantic match
“What docs reference this function?”	Can’t	Follow edges
“Show the email thread behind this design”	Can’t	Traverse relationships
“What depends on this library?”	Can’t	Dependency graph
“Find auth code AND everything connected to it”	Partial	Hybrid: similarity + graph neighbors
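
The last row of the table (similarity plus graph neighbors) can be illustrated with a toy, in-memory sketch. Nothing here is raged's actual implementation: the three-dimensional vectors, the chunk names, and the `hybrid_search` function are all made up for illustration, and a real deployment would use model embeddings and the Postgres-backed graph.

```python
import math

# Hypothetical chunk embeddings (hand-written 3-d vectors, not real model output).
EMBEDDINGS = {
    "auth.py": [0.9, 0.1, 0.0],
    "auth-design.md": [0.2, 0.9, 0.1],
    "billing.py": [0.0, 0.2, 0.9],
    "login-thread.eml": [0.1, 0.3, 0.8],
}

# Hypothetical undirected edges, e.g. "design doc references code".
EDGES = [
    ("auth.py", "auth-design.md"),
    ("auth-design.md", "login-thread.eml"),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def neighbors(node):
    """All nodes that share an edge with `node`."""
    found = set()
    for src, dst in EDGES:
        if src == node:
            found.add(dst)
        elif dst == node:
            found.add(src)
    return found


def hybrid_search(query_vec, top_k=1, hops=1):
    """Return (similarity hits, extra chunks reached by walking the graph)."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda chunk: cosine(query_vec, EMBEDDINGS[chunk]),
        reverse=True,
    )
    seeds = ranked[:top_k]
    seen = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        # Expand one hop outward, skipping anything already collected.
        frontier = {n for node in frontier for n in neighbors(node)} - seen
        seen |= frontier
    return seeds, sorted(seen - set(seeds))
```

With a query vector close to `auth.py`, one hop also pulls in the design doc, and two hops reach the email thread that similarity alone would rank low.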

Architecture Overview

graph TD
    A1[AI Agent 1<br/>Claude Code] -->|query| CLI[raged CLI]
    A2[AI Agent 2<br/>OpenClaw] -->|query| CLI
    A3[AI Agent N] -->|HTTP| API
    CLI -->|HTTP| API[RAG API<br/>Fastify]
    API -->|embed| OL[Ollama<br/>nomic-embed-text]
    API -->|similarity search| PG[Postgres + pgvector]
    API -->|entity traversal| PG
    API -->|enqueue task| PG
    WK[Enrichment Worker] -->|extract entities| PG
    WK -->|process tasks| PG
    WK -->|update chunks| PG
    CLI -->|ingest| API

    style API fill:#e1f5fe
    style PG fill:#f3e5f5
    style OL fill:#e8f5e9
    style WK fill:#e0f2f1

Roadmap

v0.5 — MVP (completed)

What exists:

HTTP API: /ingest and /query endpoints — content-agnostic, accepts any text
CLI indexer: bulk-index Git repos (clone, chunk, ingest via API)
Bearer token authentication
Docker Compose for local development
Helm chart for Kubernetes deployment
In-cluster indexing Job
Agent integrations: Claude Code skill, OpenClaw AgentSkill
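
As a rough sketch of how an agent without the CLI might talk to the API, the snippet below builds (but does not send) authenticated requests for the two endpoints. The base URL, the use of POST, and the JSON field names `text` and `query` are assumptions made for illustration; check the API reference for the real request schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed local address, not taken from the docs


def build_request(path, payload, token=None):
    """Build a JSON POST request, adding a Bearer token when one is given."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Hypothetical payload shapes: the field names are placeholders.
ingest_request = build_request("/ingest", {"text": "def login(): ..."}, token="dev-token")
query_request = build_request("/query", {"query": "auth"}, token="dev-token")

# Sending is left out so the sketch runs without a server:
# with urllib.request.urlopen(query_request) as response:
#     print(json.load(response))
```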

v1.0 — Enrichment & Knowledge Graph ✅ (completed)

Metadata extraction pipeline:

✅ Tiered extraction: tier-1 (sync, heuristic/AST/EXIF) → tier-2 (async, spaCy NLP) → tier-3 (async, LLM)
✅ 8 document types: code, Slack, email, meeting notes, images, PDFs, articles, text
✅ Auto-detection: Document type inference from file extension and content
✅ Pluggable LLM adapter: Ollama (local), Anthropic, OpenAI with smart model routing
✅ Async enrichment worker in Python with Postgres task queue
✅ Retry logic: Exponential backoff with dead-letter queue for failed tasks
✅ Status tracking: Per-document enrichment status via /enrichment/status/:baseId
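
The retry behavior listed above can be sketched in a few lines. This is an in-memory illustration of exponential backoff with a dead-letter list, not the worker's actual code: the `Task` and `RetryQueue` names, the three-attempt limit, and the one-second base delay are all assumptions, and the real queue lives in Postgres.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    attempts: int = 0


@dataclass
class RetryQueue:
    max_attempts: int = 3      # assumed limit
    base_delay: float = 1.0    # assumed base delay in seconds
    pending: list = field(default_factory=list)      # (task, delay) pairs awaiting retry
    dead_letter: list = field(default_factory=list)  # tasks that exhausted their retries

    def backoff(self, attempts):
        """Delay doubles with each failed attempt: base, 2x base, 4x base, ..."""
        return self.base_delay * 2 ** (attempts - 1)

    def record_failure(self, task):
        """Reschedule a failed task, or dead-letter it once attempts run out."""
        task.attempts += 1
        if task.attempts >= self.max_attempts:
            self.dead_letter.append(task)
            return None
        delay = self.backoff(task.attempts)
        self.pending.append((task, delay))
        return delay
```

A task that keeps failing is rescheduled with growing delays and then parked in `dead_letter`, where it can be inspected instead of retried forever.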

Knowledge graph:

✅ Postgres graph storage: Entity and relationship storage with indexed lookups
✅ Entity extraction: NER via spaCy (tier-2) and LLM-based extraction (tier-3)
✅ Relationship extraction: Automatic discovery of entity relationships
✅ Hybrid retrieval: Vector search + graph expansion via graphExpand parameter
✅ Graph queries: Direct entity lookup via /graph/entity/:name
✅ Document linking: Track which documents mention which entities
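
To make the storage model concrete, here is a minimal sketch of entity, relationship, and document-mention tables with indexed lookups. It uses Python's built-in `sqlite3` as a stand-in so it runs anywhere; raged itself stores the graph in Postgres, and the table names, column names, and sample data below are hypothetical rather than the project's real schema.

```python
import sqlite3

# Hypothetical schema: three tables plus indexes on the lookup columns.
SCHEMA = """
CREATE TABLE entities (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    type TEXT
);
CREATE TABLE relationships (
    source_id INTEGER NOT NULL REFERENCES entities(id),
    target_id INTEGER NOT NULL REFERENCES entities(id),
    kind      TEXT NOT NULL
);
CREATE TABLE document_mentions (
    document_id TEXT NOT NULL,
    entity_id   INTEGER NOT NULL REFERENCES entities(id)
);
CREATE INDEX idx_relationships_source ON relationships(source_id);
CREATE INDEX idx_mentions_entity ON document_mentions(entity_id);
"""


def make_graph():
    """Create an empty in-memory graph store."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_entity(conn, name, entity_type):
    cur = conn.execute(
        "INSERT INTO entities (name, type) VALUES (?, ?)", (name, entity_type)
    )
    return cur.lastrowid


def relate(conn, source_id, target_id, kind):
    conn.execute("INSERT INTO relationships VALUES (?, ?, ?)", (source_id, target_id, kind))


def mention(conn, document_id, entity_id):
    conn.execute("INSERT INTO document_mentions VALUES (?, ?)", (document_id, entity_id))


def connections(conn, name):
    """Outgoing relationships of the named entity as (target name, kind) pairs."""
    rows = conn.execute(
        """
        SELECT t.name, r.kind
        FROM entities s
        JOIN relationships r ON r.source_id = s.id
        JOIN entities t ON t.id = r.target_id
        WHERE s.name = ?
        ORDER BY t.name
        """,
        (name,),
    )
    return rows.fetchall()


def documents_mentioning(conn, name):
    """IDs of documents that mention the named entity."""
    rows = conn.execute(
        """
        SELECT m.document_id
        FROM document_mentions m
        JOIN entities e ON e.id = m.entity_id
        WHERE e.name = ?
        ORDER BY m.document_id
        """,
        (name,),
    )
    return [row[0] for row in rows]
```

Graph expansion during retrieval then amounts to taking the entities mentioned by the top vector hits and following `relationships` and `document_mentions` outward.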

API endpoints:

✅ GET /enrichment/status/:baseId — Check enrichment status for a document
✅ GET /enrichment/stats — System-wide enrichment statistics
✅ POST /enrichment/enqueue — Manually trigger enrichment tasks
✅ GET /graph/entity/:name — Query entity details and connections
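
Two of these routes take a path parameter (`:baseId`, `:name`), and entity names or document IDs may contain spaces or slashes. A small client-side helper, sketched below, percent-encodes the parameter before building the URL. The base URL and the helper names are assumptions; only the route paths come from the list above.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:8080"  # assumed local address


def enrichment_status_url(base_id):
    """URL for GET /enrichment/status/:baseId with the ID percent-encoded."""
    return f"{BASE_URL}/enrichment/status/{quote(base_id, safe='')}"


def graph_entity_url(name):
    """URL for GET /graph/entity/:name with the name percent-encoded."""
    return f"{BASE_URL}/graph/entity/{quote(name, safe='')}"
```

Passing `safe=''` makes `quote` encode `/` as well, so an ID such as `repo/docs/readme.md` stays a single path segment.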

CLI enhancements:

✅ raged ingest — Ingest arbitrary files (PDFs, images, Slack exports)
✅ raged enrich — Trigger and monitor enrichment with --force and --stats-only flags
✅ raged graph — Query knowledge graph entities
✅ --no-enrich / --doc-type flags on ingest commands (enrichment is on by default when enabled server-side)

Infrastructure:

✅ Docker Compose profiles: --profile enrichment for full stack (Postgres, worker)
✅ Helm chart: all enrichment resources gated on enrichment.enabled
✅ Backwards-compatible: existing API/CLI behavior unchanged when enrichment disabled
✅ Environment-driven configuration: ENRICHMENT_ENABLED, POSTGRES_URL
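
Environment-driven configuration might be read along these lines. Only the variable names `ENRICHMENT_ENABLED` and `POSTGRES_URL` come from the list above; the accepted truthy strings, the `Settings` class, and the rule that enabling enrichment without a Postgres URL is an error are assumptions for the sake of the sketch.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Assumed set of strings that count as "enabled".
TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    enrichment_enabled: bool
    postgres_url: Optional[str]


def load_settings(env=None):
    """Read settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    enabled = env.get("ENRICHMENT_ENABLED", "").strip().lower() in TRUE_VALUES
    url = env.get("POSTGRES_URL") or None
    # Assumption: enrichment cannot run without a database to hold its task queue.
    if enabled and not url:
        raise ValueError("POSTGRES_URL must be set when ENRICHMENT_ENABLED is on")
    return Settings(enrichment_enabled=enabled, postgres_url=url)
```

Defaulting to disabled when the variable is absent matches the backwards-compatibility item: with nothing set, behavior is unchanged.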

v2.0 — Production Hardening + Multi-Agent Hub (planned)

Production hardening:

Testing: Comprehensive unit and integration test coverage
Input validation: JSON Schema on all API routes (in progress)
Multiple embedding providers: Adapter pattern — swap Ollama for OpenAI, Cohere, or local alternatives
Alternative storage backends: Postgres today; Pinecone, Weaviate, and Qdrant support planned via adapters
Rate limiting and request throttling
Structured logging and health checks (beyond /healthz)
API versioning (/v1/ingest, /v1/query)
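
The adapter pattern planned for embedding providers could look roughly like this. Since the feature is not built yet, everything below is hypothetical: the `EmbeddingProvider` interface, the registry, and the `FakeHashProvider` (a deterministic stand-in that is not a real embedding model and exists only so the sketch runs offline).

```python
import hashlib
from typing import List, Protocol, Sequence


class EmbeddingProvider(Protocol):
    """Hypothetical interface every embedding adapter would satisfy."""

    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        ...


class FakeHashProvider:
    """Stand-in adapter: derives pseudo-vectors from a SHA-256 digest."""

    def __init__(self, dims=4):
        self.dims = dims

    def embed(self, texts):
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Scale the first `dims` bytes into the range [0, 1].
            vectors.append([byte / 255 for byte in digest[: self.dims]])
        return vectors


# Registry mapping a config name to an adapter class; real adapters
# (Ollama, OpenAI, Cohere) would be registered here in the same way.
PROVIDERS = {"fake-hash": FakeHashProvider}


def make_provider(name, **options) -> EmbeddingProvider:
    """Look up an adapter by name and construct it with the given options."""
    if name not in PROVIDERS:
        raise ValueError(f"unknown embedding provider: {name}")
    return PROVIDERS[name](**options)
```

Callers depend only on `embed`, so swapping providers becomes a configuration change rather than a code change.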

Multi-agent hub:

Multi-tenancy: Isolated collections per team/project with scoped tokens
Agent authentication: Per-agent API keys with fine-grained permissions
Cross-collection search: Federated queries across multiple collections
Real-time sync: Webhook-triggered re-indexing when content sources change
Agent collaboration: Shared memory spaces where multiple agents contribute and query
Observability: Distributed tracing, query analytics, embedding cache hit rates
SDK/client libraries: TypeScript, Python, Go clients (beyond CLI)
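
Scoped tokens and per-agent permissions are still on the roadmap, so the following is only one possible shape for the check. The `TokenScope` structure, the token strings, and the action names `query` and `ingest` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenScope:
    agent: str
    collections: frozenset  # collections this token may touch
    actions: frozenset      # e.g. {"query"} or {"query", "ingest"}


# Hypothetical token table; a real system would store hashed tokens.
TOKENS = {
    "tok-reader": TokenScope("agent-1", frozenset({"docs"}), frozenset({"query"})),
    "tok-writer": TokenScope(
        "agent-2", frozenset({"docs", "code"}), frozenset({"query", "ingest"})
    ),
}


def is_allowed(token, collection, action):
    """True only if the token exists and covers both the collection and the action."""
    scope = TOKENS.get(token)
    if scope is None:
        return False
    return collection in scope.collections and action in scope.actions
```

Denying by default (unknown token, collection, or action) keeps tenants isolated even when a new collection is added before its tokens are configured.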

Principles

Stateless API, stateful storage. The API process holds no state. Scale it horizontally.
Local-first. Docker Compose must always work. Cloud deployment is optional.
Agent-agnostic. Not tied to any single agent. Claude Code, OpenClaw, or any agent that can call HTTP or shell out to a CLI can use raged.
Content-agnostic. Not just for code. Any text content — docs, articles, emails, transcripts, PDFs, images — is a first-class citizen.
Minimal dependencies. Every dependency must justify its existence.
Security by default. Auth is optional locally, mandatory in production.
