Documents in.
Knowledge out.

Turn documents into token-efficient, searchable knowledge stores for AI. Hybrid search across vector, BM25, and concept indexes — each query costs ~1,000 tokens instead of 90,000.

513-page book

92,598 → ~1,317 tokens per query (-99%)

55-page paper

4,975 → ~585 tokens per query (-88%)
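The percentages are straight arithmetic on the measured token counts; a quick sanity check:

```python
# Measured token counts from the benchmarks above
full_doc = {"book": 92_598, "paper": 4_975}    # tokens to load the whole document
per_query = {"book": 1_317, "paper": 585}      # tokens returned by one hybrid-search query

for name in full_doc:
    saving = (1 - per_query[name] / full_doc[name]) * 100
    print(f"{name}: {saving:.0f}% fewer tokens per query")
```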

Everything you need for RAG

From parsing to search, Ingestible handles the full pipeline so you can focus on building your AI application.

25+ formats

PDF, DOCX, HTML, EPUB, PPTX, XLSX, CSV, Markdown, audio, video, images, email, ZIP archives, and more.

4-level hierarchy

Document overview, chapters, sections, passages. No mid-paragraph splits. Tables and code stay atomic.

Hybrid search

Vector + BM25 + concept index fused with RRF. Optional cross-encoder reranking, HyDE, and knowledge graph retrieval.
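Reciprocal Rank Fusion merges ranked lists by summing 1/(k + rank) for each list a chunk appears in, so a chunk that ranks well everywhere beats one that tops a single index. A minimal sketch of the technique (not Ingestible's actual code; the chunk IDs are made up):

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector = ["c3", "c1", "c7"]
bm25 = ["c1", "c3", "c9"]
concept = ["c1", "c8"]
fused = rrf_fuse([vector, bm25, concept])
# "c1" ranks first: it places highly in all three lists
```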

LLM enrichment

Summaries, concepts, hypothetical questions, entities, and knowledge graph triples. Bottom-up from passages to document.

Production ready

Rate limiting, Prometheus metrics, auth middleware, background ingestion with checkpoint/resume, Docker-ready.

MCP + API + CLI

Three entry points. REST API with FastAPI, CLI for automation, MCP server for AI agent integration.

25+ formats supported

Drop in documents, media, archives, or markup. Ingestible handles parsing, cleaning, and structure extraction automatically.

Documents

PDF · DOCX · PPTX · XLSX · CSV · EPUB

Web & Markup

HTML · Markdown · RST · AsciiDoc · XML · JSON/JSONL

Media

MP3 · WAV · FLAC · MP4 · MKV · AVI · PNG · JPG · TIFF

Other

Email (EML/MSG) · ZIP Archives · Plain Text

How it works

A six-stage pipeline transforms raw documents into queryable knowledge stores with hybrid search.

01

Parse

Format-specific extraction to clean markdown. PDF uses IBM Docling for deep layout analysis.

02

Structure

Builds hierarchy tree from TOC, heading patterns, or page range heuristics.
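Heading-based tree building can be pictured as a stack walk over heading levels. A simplified sketch under that assumption (not the library's actual parser):

```python
import re

def build_tree(markdown):
    """Nest sections by heading depth (# = chapter, ## = section, ...)."""
    root = {"title": "document", "level": 0, "children": []}
    stack = [root]
    for line in markdown.splitlines():
        m = re.match(r"^(#{1,6})\s+(.*)", line)
        if not m:
            continue  # body text belongs to the current section
        level, title = len(m.group(1)), m.group(2)
        while stack[-1]["level"] >= level:
            stack.pop()  # close sections at the same or deeper level
        node = {"title": title, "level": level, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root

tree = build_tree("# Intro\n## Scope\n## Terms\n# Methods")
```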

03

Chunk

4-level split (L0-L3). Tables and code blocks stay atomic. ~10% overlap between chunks.
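The ~10% overlap means each chunk repeats the tail of its predecessor so context survives the cut. A sketch of that splitting logic under assumed parameters (chunk size and overlap fraction are illustrative, not Ingestible's defaults):

```python
def split_with_overlap(tokens, chunk_size=500, overlap_frac=0.10):
    """Split a token list into fixed-size chunks where each chunk
    starts overlap_frac * chunk_size tokens before the previous one ends."""
    step = int(chunk_size * (1 - overlap_frac))
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last chunk already covers the tail
    return chunks

chunks = split_with_overlap(list(range(1000)))
# 3 chunks; each consecutive pair shares 50 tokens (10% of 500)
```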

04

Enrich

Bottom-up LLM pass generates summaries, concepts, hypothetical questions, and entities.
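"Bottom-up" here is a post-order traversal: leaves are summarized first, and each parent's summary is built from its children's. A sketch of the idea, with `llm` standing in for any chat-completion call (the function and tree shape are illustrative, not Ingestible's API):

```python
def enrich_bottom_up(node, llm):
    """Summarize leaf passages first, then compose parent summaries
    from child summaries, up to the document root."""
    if not node["children"]:
        node["summary"] = llm(f"Summarize this passage:\n{node['text']}")
    else:
        child_summaries = [enrich_bottom_up(child, llm) for child in node["children"]]
        node["summary"] = llm("Combine into one summary:\n" + "\n".join(child_summaries))
    return node["summary"]
```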

05

Embed & Index

E5-large-v2 vectors + BM25 sparse index + concept-to-chunk mapping. CUDA/MPS/CPU auto-detected.

06

Store

JSON file hierarchy with versioning, content-hash dedup, and checkpoint/resume.
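Content-hash dedup boils down to hashing normalized text and skipping ingestion when the digest has been seen before. A minimal sketch of that pattern (SHA-256 and whitespace normalization are assumptions, not necessarily what Ingestible uses):

```python
import hashlib

def content_hash(text):
    """Stable digest of whitespace-normalized text, used as a dedup key."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

seen = {}

def ingest_once(doc_id, text):
    """Return the existing doc_id if identical content was already ingested."""
    digest = content_hash(text)
    if digest in seen:
        return seen[digest]  # duplicate content: reuse the stored document
    seen[digest] = doc_id
    return doc_id
```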

How Ingestible Compares

A purpose-built ingestion pipeline vs. general-purpose frameworks.

| Feature | Ingestible | LangChain | LlamaIndex | Unstructured |
|---|---|---|---|---|
| Hierarchical chunking (L0–L3) | Yes | Flat only | Flat only | Flat only |
| Built-in hybrid search | Vector + BM25 + Concept | Requires separate setup | Requires separate setup | No |
| Token-efficient retrieval | 88–99% reduction | Depends on setup | Depends on setup | No |
| LLM enrichment | Summaries, concepts, questions, KG | Manual | Manual | No |
| 25+ format parsers | Yes | Via integrations | Via integrations | Yes |
| Cross-document search | Yes | Manual | No | No |
| MCP server | Yes | No | No | No |
| REST API | Yes | No | No | Via hosted |
| Production features | Rate limiting, auth, metrics, Docker | Framework | Framework | Via hosted |
| Knowledge graph | Yes | Manual | Via KG index | No |

Start in 30 seconds

Install from PyPI or pull the Docker image.

pip
# Base install (~50MB)
pip install ingestible

# With local embeddings (no API keys)
pip install ingestible[local]

# Ingest a document
ingest add /path/to/doc.pdf -v

# Search
ingest search <doc_id> "your query"
Docker
# Pull and run
docker run -d \
  -p 8081:8081 \
  -v ingestible-data:/app/data \
  ghcr.io/simplyliz/ingestible:latest

# API + Web UI at localhost:8081
# Health check: /health/ready
# Metrics: /metrics
Python 3.11–3.13 · ChromaDB / pgvector / Qdrant · Anthropic / OpenAI / Gemini / Ollama · MCP / REST API / CLI

See it in action

Three ways to integrate — pick what fits your stack.

Python API
from ingestible import Ingestible

ing = Ingestible()

# Ingest a document
doc = ing.ingest("/path/to/report.pdf")
print(f"Ingested: {doc.title} ({doc.total_chunks} chunks)")

# Search
results = ing.search(doc.doc_id, "quarterly revenue")
for r in results:
    print(f"[{r.score:.2f}] {r.content[:100]}...")
REST API
# Ingest a file
curl -X POST http://localhost:8081/ingest \
  -F "file=@report.pdf"

# Search
curl http://localhost:8081/documents/report/search \
  -H "Content-Type: application/json" \
  -d '{"query": "quarterly revenue", "top_k": 5}'

# Response
# {"results": [{"chunk_id": "l3_042", "score": 0.94,
#   "content": "Q3 revenue grew 23% YoY to..."}]}
MCP (AI Agents)
// Claude/GPT can search via MCP
{
  "tool": "ingestible_search",
  "input": {
    "doc_id": "report",
    "query": "quarterly revenue"
  }
}

// Agent gets structured results
// with chunk hierarchy context

Ready to stop burning tokens?

Ingestible is source-available under the PolyForm Small Business license. Free for small teams, affordable for everyone else.