REST API Reference

All endpoints are served by ingest serve (default: http://localhost:8081).

Authentication: when INGEST_API_KEYS is set, all requests require Authorization: Bearer <key>. Health, docs, and metrics endpoints are exempt.

Rate limiting: all endpoints are rate-limited. Defaults: ingestion 10/min, search 60/min, others 120/min. Returns 429 Too Many Requests when exceeded. Configure via INGEST_RATE_LIMIT_* env vars.
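Clients that hit a limit can back off and retry. A minimal client-side sketch of that pattern (the `send` callable is a stand-in for a real HTTP request and the backoff constants are illustrative, not part of the API):

```python
import time

# Retry on 429 with exponential backoff. `send` is an injected callable
# returning an HTTP status code -- an assumption for illustration; swap in
# your HTTP client of choice.
def send_with_backoff(send, retries=5, base_delay=0.0):
    for attempt in range(retries):
        status = send()
        if status != 429:
            return status
        time.sleep(base_delay * (2 ** attempt))  # 1x, 2x, 4x, ...
    return 429  # still rate-limited after all retries

statuses = iter([429, 429, 200])  # simulated server responses
result = send_with_backoff(lambda: next(statuses))
```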

Upload limits: file uploads are capped at 500 MB by default (INGEST_MAX_UPLOAD_BYTES). Returns 413 when exceeded.

Document IDs: all {doc_id} path parameters are validated to prevent path traversal. Only lowercase alphanumeric characters, hyphens, and underscores are allowed.

Access control: when INGEST_ACCESS_CONTROL_ENABLED=true, search endpoints filter results by the X-Access-Tags header (comma-separated tags). Documents tagged during ingestion are only returned to users with matching tags. Untagged documents are public.
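The visibility rule above reduces to a set intersection. A sketch under the stated assumptions (untagged means public; "matching" is taken here as at least one shared tag, which may differ from the server's exact semantics):

```python
# A document is visible when it carries no access tags (public) or shares
# at least one tag with the caller's X-Access-Tags set.
def visible(doc_tags: set, user_tags: set) -> bool:
    return not doc_tags or bool(doc_tags & user_tags)
```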

User identity: resolved from the X-User-ID header (if set by an auth proxy), falling back to the bearer token prefix, then to anonymous. Used for audit logging.

Interactive docs: FastAPI auto-generates OpenAPI docs at /docs (Swagger UI), /redoc, and /openapi.json.


Ingestion

Method Path Description
POST /ingest Ingest a file upload
POST /ingest/url Ingest from a URL
POST /ingest/batch Ingest multiple files
POST /ingest/stream Ingest with SSE progress streaming
POST /ingest/async Submit file for background ingestion (returns 202)

POST /ingest

Upload a document for ingestion. Supports 25+ file formats (PDF, DOCX, HTML, Markdown, audio, video, spreadsheets, etc.). Runs the full 6-stage pipeline: parse → structure → chunk → enrich → embed → store.

  • Content-Type: multipart/form-data
  • Form fields:
Field Type Default Description
file file (required) Document file to ingest
skip_enrichment bool false Skip LLM enrichment (faster, but no summaries/concepts/questions)
profile string "auto" Extraction profile: auto, fast, accurate, or economy
access_tags string "" Comma-separated access control tags
weight float 0.0 Corpus search weight boost (0.0 = neutral)
pinned bool false Pin document to always appear in corpus search results
tags string "" Comma-separated metadata tags for filtering
  • Returns: IngestResponse
{
  "doc_id": "my-document",
  "title": "My Document Title",
  "total_pages": 42,
  "total_chunks": 156,
  "l0_tokens": 1200,
  "enriched": true
}

POST /ingest/url

Ingest a document from a URL. Downloads the resource, detects content type, and runs the full pipeline.

  • Content-Type: application/json
  • Body:
Field Type Default Description
url string (required) URL of the document to ingest
skip_enrichment bool false Skip LLM enrichment
profile string "auto" Extraction profile
  • Returns: IngestResponse

POST /ingest/batch

Ingest multiple files in a single request. Each file is processed independently with error isolation — one failure does not abort the batch.

  • Content-Type: multipart/form-data
  • Form fields:
Field Type Default Description
files file[] (required) One or more document files
skip_enrichment bool false Skip LLM enrichment for all files
  • Returns: BatchIngestResponse
{
  "total": 3,
  "succeeded": 2,
  "failed": 1,
  "elapsed_seconds": 45.2,
  "results": [
    { "filename": "doc1.pdf", "doc_id": "doc1", "error": null, "elapsed_seconds": 12.1 },
    { "filename": "doc2.pdf", "doc_id": "doc2", "error": null, "elapsed_seconds": 15.3 },
    { "filename": "bad.pdf", "doc_id": null, "error": "Parsing failed: corrupted PDF", "elapsed_seconds": 0.8 }
  ]
}

POST /ingest/stream

Ingest a document with real-time Server-Sent Events progress streaming.

  • Content-Type: multipart/form-data
  • Form fields:
Field Type Default Description
file file (required) Document file to ingest
skip_enrichment bool false Skip LLM enrichment
  • Returns: text/event-stream with JSON progress events
data: {"stage": "parsing", "step": 1, "total": 6}
data: {"stage": "structuring", "step": 2, "total": 6}
data: {"stage": "chunking", "step": 3, "total": 6}
data: {"stage": "enriching", "step": 4, "total": 6}
data: {"stage": "embedding", "step": 5, "total": 6}
data: {"stage": "storing", "step": 6, "total": 6}
data: {"status": "complete", "doc_id": "my-document"}

On error: data: {"status": "error", "error": "..."}
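A minimal consumer for the event lines above (a sketch, not a full SSE client: it assumes each event fits on one data: line, which matches the examples shown; production code should use a proper SSE library):

```python
import json

# Parse "data: {...}" lines from an SSE stream into a list of dicts.
def parse_sse(lines):
    events = []
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            events.append(json.loads(line[len("data:"):]))
    return events

stream = [
    'data: {"stage": "parsing", "step": 1, "total": 6}',
    'data: {"stage": "storing", "step": 6, "total": 6}',
    'data: {"status": "complete", "doc_id": "my-document"}',
]
events = parse_sse(stream)
```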


Background Jobs

Method Path Description
POST /ingest/async Submit file for background ingestion
GET /jobs/{job_id} Get job status and progress
GET /jobs List all ingestion jobs

POST /ingest/async

Submit a document for background ingestion. Returns immediately with a job ID.

  • Content-Type: multipart/form-data
  • Form fields:
Field Type Default Description
file file (required) Document file to ingest
skip_enrichment bool false Skip LLM enrichment
  • Returns: 202 Accepted
{
  "job_id": "a1b2c3d4",
  "status": "pending"
}

GET /jobs/{job_id}

Poll for job status and progress.

  • Returns: JobDetailResponse
{
  "job_id": "a1b2c3d4",
  "status": "running",
  "doc_id": null,
  "error": null,
  "created_at": 1711500000.0,
  "started_at": 1711500001.0,
  "finished_at": null,
  "progress_stage": "enriching",
  "progress_step": 12,
  "progress_total": 45
}
  • Status values: pending, running, complete, failed
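A polling loop over the status values above can be sketched as follows. `fetch_status` is an injected callable (an assumption for illustration) that returns the JobDetailResponse body as a dict; in practice it would wrap a GET /jobs/{job_id} call:

```python
import time

# Poll until the job reaches a terminal state (complete or failed).
def wait_for_job(fetch_status, interval=0.0, max_polls=100):
    for _ in range(max_polls):
        job = fetch_status()
        if job["status"] in ("complete", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError("job did not finish within max_polls")

# Simulated sequence of responses a real poller might see.
responses = iter([
    {"status": "pending"},
    {"status": "running"},
    {"status": "complete", "doc_id": "my-document"},
])
result = wait_for_job(lambda: next(responses))
```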

GET /jobs

List all background ingestion jobs (pending, running, complete, and failed).

  • Returns: JobDetailResponse[]

Documents

Method Path Description
GET /documents List all ingested documents
GET /documents/{doc_id} Get document overview (L0 + metadata)
PATCH /documents/{doc_id}/meta Update mutable metadata
DELETE /documents/{doc_id} Delete a document and its indexes
GET /documents/{doc_id}/chunks Get full ChunkSet (L0-L3)
GET /documents/{doc_id}/structure Get document structure tree
GET /documents/{doc_id}/economics Get token economics
GET /documents/{doc_id}/citations Get extracted citations
GET /documents/{doc_id}/knowledge-graph Get knowledge graph triples and entities
GET /documents/{doc_id}/versions List archived versions
GET /documents/{doc_id}/export Export document to portable format
POST /documents/{doc_id}/enrich Re-enrich without re-parsing

GET /documents

List all ingested documents with pagination and optional tag filtering.

  • Query params:
Param Type Default Description
offset int 0 Pagination offset (0-based)
limit int 100 Max documents to return (1-1000)
tags string "" Comma-separated metadata tags to filter by
  • Returns: DocumentSummary[]
[
  {
    "doc_id": "my-document",
    "title": "My Document Title",
    "total_pages": 42,
    "ingestion_date": "2025-01-15T10:30:00"
  }
]

GET /documents/{doc_id}

Get document overview including L0 content, chapter list, and token count.

  • Returns: DocumentOverview
{
  "doc_id": "my-document",
  "title": "My Document Title",
  "total_pages": 42,
  "ingestion_date": "2025-01-15T10:30:00",
  "l0_content": "My Document Title\nAuthor: Jane Doe\nDomain: Machine Learning\n...",
  "l0_tokens": 1200,
  "total_chunks": 156,
  "chapters": ["Introduction", "Methods", "Results", "Discussion"]
}

PATCH /documents/{doc_id}/meta

Update mutable document metadata without re-ingesting. Only included fields are updated; omitted fields are left unchanged.

  • Content-Type: application/json
  • Body:
Field Type Description
weight float | null Corpus search weight boost (0.0 = neutral)
pinned bool | null Pin document in corpus search results
tags string[] | null Replace metadata tags (pass [] to clear)
  • Returns:
{
  "doc_id": "my-document",
  "weight": 1.5,
  "pinned": true,
  "tags": ["research", "ml"]
}

DELETE /documents/{doc_id}

Delete a document and all its indexes, chunks, and embeddings. This is irreversible.

  • Returns:
{ "deleted": "my-document" }

GET /documents/{doc_id}/chunks

Get the full ChunkSet (L0 through L3) for a document, including all enrichment data (summaries, concepts, hypothetical questions, knowledge graph triples).

  • Returns: ChunkSet — the complete hierarchical chunk tree

GET /documents/{doc_id}/structure

Get the hierarchical document structure tree (headings, sections, page ranges).

  • Returns: DocumentStructure

GET /documents/{doc_id}/economics

Get token economics: compares cost of placing the full document in an LLM context window vs. using hierarchical retrieval (L0 overview + top-k passages).

  • Returns: TokenEconomics
{
  "full_document_tokens": 85000,
  "l0_overview_tokens": 1200,
  "avg_query_tokens": 2400,
  "savings_percent": 97.2
}
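The savings figure in the example follows directly from the token counts, assuming savings_percent is the fraction of full-document tokens avoided per query (a plausible reading of the fields above, not a confirmed formula):

```python
# Values from the example response above.
full_document_tokens = 85_000
avg_query_tokens = 2_400

# Assumed formula: share of the full-context cost that retrieval avoids.
savings_percent = round((1 - avg_query_tokens / full_document_tokens) * 100, 1)
```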

GET /documents/{doc_id}/citations

Get bibliographic citations extracted during parsing.

  • Returns:
{
  "doc_id": "my-document",
  "citations": [ ... ]
}

GET /documents/{doc_id}/knowledge-graph

Get knowledge graph triples and entities extracted during enrichment. Requires KG extraction to have been enabled during ingestion. Entities are sorted by occurrence count.

  • Returns:
{
  "doc_id": "my-document",
  "total_triples": 45,
  "total_entities": 23,
  "triples": [
    { "subject": "GPT-4", "predicate": "outperforms", "object": "GPT-3.5", "subject_type": "Model", "object_type": "Model" }
  ],
  "entities": [
    { "name": "GPT-4", "type": "Model", "count": 12 }
  ]
}

GET /documents/{doc_id}/versions

List all archived versions of a document. When a document is re-ingested, the previous version is archived.

  • Returns:
{
  "doc_id": "my-document",
  "versions": [ ... ]
}

GET /documents/{doc_id}/export

Export a document's chunks to a portable format. Returns a file download.

  • Query params:
Param Type Description
format string Required. One of: jsonl, parquet, llamaindex, langchain, complete
  • Formats:

    • jsonl — one chunk per line (NDJSON)
    • parquet — columnar format for data tools
    • llamaindex — LlamaIndex-compatible nodes JSON
    • langchain — LangChain-compatible documents JSON
    • complete — full ChunkSet JSON with all metadata
  • Returns: file download with appropriate content type
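The jsonl format above is plain NDJSON. A sketch of what producing (or re-reading) it looks like, with illustrative field names rather than the exact export schema:

```python
import json

# One chunk object per line -- field names here are assumptions.
chunks = [
    {"chunk_id": "chunk-1", "level": 3, "content": "First passage."},
    {"chunk_id": "chunk-2", "level": 3, "content": "Second passage."},
]
ndjson = "\n".join(json.dumps(c) for c in chunks)

# Reading it back is line-by-line json.loads.
restored = [json.loads(line) for line in ndjson.splitlines()]
```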

POST /documents/{doc_id}/enrich

Re-enrich a document without re-parsing. Re-runs LLM enrichment (stage 4) and embedding (stage 5) on existing chunks. Useful after changing LLM provider or enrichment settings.

  • Returns: IngestResponse

Search

Method Path Description
POST /documents/{doc_id}/search Hybrid search within a document
POST /search Search across all documents

POST /documents/{doc_id}/search

Hybrid search within a single document. Fuses vector search (ChromaDB, weight 0.4), BM25 (weight 0.6), and a concept index using Reciprocal Rank Fusion.

  • Content-Type: application/json
  • Body:
Field Type Default Description
query string (required) Natural language search query
n_results int 5 Number of results to return (1-20)
hyde bool false Use HyDE query expansion — generates a hypothetical answer passage via LLM for better vector matching
auto_merge bool false Auto-merge L3 results into parent L2/L1 when 3+ passages from the same section match
kg_boost bool false Boost results using knowledge graph entity relationships
  • Headers: X-Access-Tags (optional, comma-separated) for access control filtering
  • Returns: SearchResponse with ranked results including scores, match sources, content/summaries, page numbers, and timestamps (for audio/video)
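The auto_merge rule described above can be sketched as a post-processing step over ranked results. This is illustrative only; field names like parent_id are assumptions, and the server's merge logic may differ in detail:

```python
from collections import Counter

# Collapse L3 passages into one parent-level result when 3+ of them
# share the same parent section.
def auto_merge(results, threshold=3):
    counts = Counter(r["parent_id"] for r in results)
    merged, seen = [], set()
    for r in results:
        pid = r["parent_id"]
        if counts[pid] >= threshold:
            if pid not in seen:
                seen.add(pid)
                merged.append({"id": pid, "level": "parent",
                               "merged_from": counts[pid]})
        else:
            merged.append(r)
    return merged

hits = [
    {"id": "c1", "parent_id": "sec-2"},
    {"id": "c2", "parent_id": "sec-2"},
    {"id": "c3", "parent_id": "sec-2"},
    {"id": "c4", "parent_id": "sec-7"},
]
merged = auto_merge(hits)
```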

POST /search

Search across all ingested documents. Merges per-document hybrid search results using cross-document Reciprocal Rank Fusion.

  • Content-Type: application/json
  • Body:
Field Type Default Description
query string (required) Natural language search query
n_results int 10 Number of results to return (1-50)
doc_ids string[] | null null Restrict search to these document IDs (null = all)
tags string[] | null null Only search documents with these metadata tags
hyde bool false Use HyDE query expansion
auto_merge bool false Auto-merge L3 results into parent L2/L1
kg_boost bool false Boost results using knowledge graph relationships
  • Headers: X-Access-Tags (optional) for access control filtering
  • Returns: SearchResponse with cross-document RRF fusion
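Reciprocal Rank Fusion, as used for the cross-document merge above, can be sketched in a few lines. This is textbook RRF with the conventional k=60 constant; the server's exact constant and any per-source weighting are not specified here:

```python
# Each ranking is an ordered list of result IDs; RRF scores each ID by
# summing 1/(k + rank) across all rankings it appears in.
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" ranks high in both lists, so it wins the fused ordering.
fused = rrf([["a", "b", "c"], ["b", "c", "a"]])
```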

Evaluation

Method Path Description
POST /documents/{doc_id}/eval Metric-based retrieval evaluation
POST /documents/{doc_id}/eval/judge LLM-as-judge retrieval evaluation

POST /documents/{doc_id}/eval

Evaluate retrieval quality using synthetic queries generated during enrichment. Returns standard IR metrics.

  • Query params:
Param Type Default Description
k int 5 Number of results to retrieve per query (1-50)
  • Returns:
{
  "doc_id": "my-document",
  "total_queries": 30,
  "k": 5,
  "hit_rate": 0.93,
  "mrr": 0.87,
  "mean_precision_at_k": 0.76,
  "mean_recall_at_k": 0.82
}
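The hit_rate and mrr fields above are standard IR metrics. A sketch of how they are computed over a set of eval queries, assuming one relevant chunk per query (the synthetic-query setup implies one expected source chunk; the real evaluator may differ):

```python
# Each run is (ranked retrieved IDs, the one relevant ID).
def hit_rate_and_mrr(runs):
    hits, rr = 0, 0.0
    for retrieved, relevant in runs:
        if relevant in retrieved:
            hits += 1
            rr += 1.0 / (retrieved.index(relevant) + 1)  # reciprocal rank
    n = len(runs)
    return hits / n, rr / n

runs = [
    (["c1", "c2", "c3"], "c1"),  # hit at rank 1 -> RR = 1.0
    (["c4", "c5", "c6"], "c5"),  # hit at rank 2 -> RR = 0.5
    (["c7", "c8", "c9"], "x"),   # miss          -> RR = 0.0
]
hit_rate, mrr = hit_rate_and_mrr(runs)
```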

POST /documents/{doc_id}/eval/judge

Evaluate retrieval quality using LLM-as-judge scoring. More accurate than metric-based eval but slower and requires LLM API credits.

  • Query params:
Param Type Default Description
k int 5 Number of results to retrieve per query (1-50)
  • Returns: Faithfulness, relevance, and completeness scores (0-1 scale) assessed by the configured LLM provider.

Webhooks

Method Path Description
POST /webhooks/cv Receive CognitiveVault entry updates

POST /webhooks/cv

Receives entry.updated events from CognitiveVault. When a CV entry with an ingest: sourcePath is edited, this endpoint updates the local chunk content and triggers selective re-enrichment.

  • Requires: INGEST_WEBHOOK_SECRET env var for HMAC-SHA256 signature verification
  • Headers: X-Webhook-Signature — HMAC-SHA256 hex digest of the request body
  • Body: JSON envelope with event and payload fields
{
  "event": "entry.updated",
  "payload": {
    "entryId": "...",
    "sourcePath": "ingest:my-document/chunk-abc123",
    "content": "Updated chunk content...",
    "checksum": "sha256hex..."
  }
}
  • Returns:
{
  "status": "ok",
  "doc_id": "my-document",
  "chunk_id": "chunk-abc123",
  "re_enriched": true
}
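The signature scheme above is a plain HMAC-SHA256 over the raw request body. A verification sketch (the secret value is a stand-in for INGEST_WEBHOOK_SECRET; always use a constant-time comparison to avoid timing attacks):

```python
import hashlib
import hmac

# Compare the X-Webhook-Signature header against the expected digest.
def verify_signature(secret: bytes, body: bytes, signature: str) -> bool:
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

secret = b"example-secret"  # stand-in for INGEST_WEBHOOK_SECRET
body = b'{"event": "entry.updated"}'
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
```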

Health & Observability

Method Path Description
GET /health Liveness check (fast, no deps)
GET /health/ready Readiness check (verifies all dependencies)
GET /metrics Prometheus metrics

GET /health

Fast liveness probe. Returns immediately without checking dependencies.

{ "status": "ok", "documents_count": 12 }

GET /health/ready

Deep readiness check. Verifies data directory is writable, disk space above threshold, embedding model loads, and LLM API is configured.

  • Returns:
{
  "status": "ready",
  "checks": {
    "data_dir": { "status": "ok", "path": "/app/data" },
    "disk_space": { "status": "ok", "free_mb": 4500 },
    "embedding_model": { "status": "ok", "model": "intfloat/e5-large-v2" },
    "llm_api": { "status": "ok", "provider": "anthropic" },
    "documents": { "status": "ok", "count": 12 }
  }
}
  • Overall status is "ready" when all checks pass, "degraded" when any check fails.
  • Individual check statuses: ok, warning, or error.

GET /metrics

Prometheus-format metrics endpoint. Key metrics:

Metric Type Description
ingestible_ingest_duration_seconds histogram Ingestion latency by stage
ingestible_active_ingestions gauge Currently running ingestion jobs
ingestible_llm_calls_total counter LLM API calls by provider and status
ingestible_search_duration_seconds histogram Search query latency

Error Responses

All endpoints return errors as JSON:

{
  "detail": "Human-readable error message"
}
Status Meaning
400 Bad request (invalid doc ID, unknown format, malformed JSON)
401 Missing or invalid API key / webhook signature
404 Document or job not found
413 File too large (exceeds INGEST_MAX_UPLOAD_BYTES)
429 Rate limit exceeded
500 Internal server error