Documents in.
Knowledge out.
Turn documents into token-efficient, searchable knowledge stores for AI. Hybrid search across vector, BM25, and concept indexes — each query costs ~1,000 tokens instead of 90,000.
Everything you need for RAG
From parsing to search, Ingestible handles the full pipeline so you can focus on building your AI application.
25+ formats
PDF, DOCX, HTML, EPUB, PPTX, XLSX, CSV, Markdown, audio, video, images, email, ZIP archives, and more.
4-level hierarchy
Document overview, chapters, sections, passages. No mid-paragraph splits. Tables and code stay atomic.
Hybrid search
Vector + BM25 + concept index fused with RRF. Optional cross-encoder reranking, HyDE, and knowledge graph retrieval.
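The RRF step named above is standard reciprocal rank fusion. As an illustration only (this is the textbook algorithm with the conventional k = 60 constant, not Ingestible's actual code), fusing three ranked lists looks like:

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists with Reciprocal Rank Fusion.

    rankings: list of ranked chunk-id lists, best first.
    score(d) = sum over lists of 1 / (k + rank_of_d_in_list).
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical hits from the three indexes
vector_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c3", "c9"]
concept_hits = ["c1", "c7"]
fused = rrf_fuse([vector_hits, bm25_hits, concept_hits])
# "c1" wins: it ranks near the top of all three lists
```

RRF needs no score normalization across the three indexes, which is why it is a common choice for fusing vector similarities with BM25 scores.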
LLM enrichment
Summaries, concepts, hypothetical questions, entities, and knowledge graph triples. Bottom-up from passages to document.
Production ready
Rate limiting, Prometheus metrics, auth middleware, background ingestion with checkpoint/resume, Docker-ready.
MCP + API + CLI
Three entry points. REST API with FastAPI, CLI for automation, MCP server for AI agent integration.
25+ formats supported
Drop in documents, media, archives, or markup. Ingestible handles parsing, cleaning, and structure extraction automatically.
Documents
Web & Markup
Media
Other
How it works
A six-stage pipeline transforms raw documents into queryable knowledge stores with hybrid search.
Parse
Format-specific extraction to clean markdown. PDF uses IBM Docling for deep layout analysis.
Structure
Builds hierarchy tree from TOC, heading patterns, or page range heuristics.
Chunk
4-level split (L0-L3). Tables and code blocks stay atomic. ~10% overlap between chunks.
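The ~10% overlap rule can be shown with a toy token splitter (a hypothetical helper for illustration, not the library's API): each chunk starts 90% of a chunk-length after the previous one, so neighbors share the remaining 10%.

```python
def split_with_overlap(tokens, chunk_size=200, overlap_ratio=0.10):
    """Split a token list into fixed-size chunks with ~10% overlap."""
    step = int(chunk_size * (1 - overlap_ratio))  # advance 90% per chunk
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

tokens = list(range(1000))
chunks = split_with_overlap(tokens)
# each neighboring pair shares 20 tokens (10% of 200)
```

The overlap keeps context that straddles a chunk boundary retrievable from either side; the atomic treatment of tables and code blocks means they would bypass a splitter like this entirely.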
Enrich
Bottom-up LLM pass generates summaries, concepts, hypothetical questions, and entities.
Embed & Index
E5-large-v2 vectors + BM25 sparse index + concept-to-chunk mapping. CUDA/MPS/CPU auto-detected.
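The CUDA/MPS/CPU auto-detection is the usual PyTorch probe; a sketch assuming PyTorch is the embedding backend (the function name is illustrative, and it falls back to CPU if torch is absent):

```python
def pick_device():
    """Return the best available device string: CUDA > MPS > CPU."""
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            return "mps"
    except ImportError:
        pass  # no torch installed: embed on CPU
    return "cpu"
```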
Store
JSON file hierarchy with versioning, content-hash dedup, and checkpoint/resume.
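Content-hash dedup in a JSON-file store reduces to "hash the content, skip the write if that file already exists." A minimal sketch (class, method, and file-naming scheme are hypothetical, not Ingestible's actual layout):

```python
import hashlib
import json
import tempfile
from pathlib import Path


def content_hash(text):
    """Stable hash of the content; identical text yields an identical id."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


class JsonStore:
    """Toy JSON-file store keyed by content hash."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, text):
        h = content_hash(text)
        path = self.root / f"{h}.json"
        if path.exists():  # dedup: this exact content is already stored
            return h, False
        path.write_text(json.dumps({"hash": h, "content": text}))
        return h, True


store = JsonStore(tempfile.mkdtemp())
h1, created1 = store.put("same passage")
h2, created2 = store.put("same passage")
# second put is a no-op: identical content, identical hash
```

Re-ingesting an unchanged document therefore costs no extra storage, and checkpoint/resume can trust the hash to tell finished work from pending work.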
How Ingestible Compares
A purpose-built ingestion pipeline vs. general-purpose frameworks.
| Feature | Ingestible | LangChain | LlamaIndex | Unstructured |
|---|---|---|---|---|
| Hierarchical chunking (L0–L3) | ✓ | Flat only | Flat only | Flat only |
| Built-in hybrid search | Vector + BM25 + Concept | Requires separate setup | Requires separate setup | — |
| Token-efficient retrieval | 88–99% reduction | Depends on setup | Depends on setup | — |
| LLM enrichment | Summaries, concepts, questions, KG | Manual | Manual | — |
| 25+ format parsers | ✓ | Via integrations | Via integrations | ✓ |
| Cross-document search | ✓ | Manual | — | — |
| MCP server | ✓ | — | — | — |
| REST API | ✓ | — | — | Via hosted |
| Production features | Rate limiting, auth, metrics, Docker | Framework | Framework | Via hosted |
| Knowledge graph | ✓ | Manual | Via KG index | — |
Start in 30 seconds
Install from PyPI or pull the Docker image.

```shell
# Base install (~50MB)
pip install ingestible

# With local embeddings (no API keys)
pip install "ingestible[local]"

# Ingest a document
ingest add /path/to/doc.pdf -v

# Search
ingest search <doc_id> "your query"
```

```shell
# Pull and run
docker run -d \
  -p 8081:8081 \
  -v ingestible-data:/app/data \
  ghcr.io/simplyliz/ingestible:latest

# API + Web UI at localhost:8081
# Health check: /health/ready
# Metrics: /metrics
```

See it in action
Three ways to integrate — pick what fits your stack.

```python
from ingestible import Ingestible

ing = Ingestible()

# Ingest a document
doc = ing.ingest("/path/to/report.pdf")
print(f"Ingested: {doc.title} ({doc.total_chunks} chunks)")

# Search
results = ing.search(doc.doc_id, "quarterly revenue")
for r in results:
    print(f"[{r.score:.2f}] {r.content[:100]}...")
```

```shell
# Ingest a file
curl -X POST http://localhost:8081/ingest \
  -F "file=@report.pdf"

# Search
curl http://localhost:8081/documents/report/search \
  -d '{"query": "quarterly revenue", "top_k": 5}'

# Response
# {"results": [{"chunk_id": "l3_042", "score": 0.94,
#   "content": "Q3 revenue grew 23% YoY to..."}]}
```

```jsonc
// Claude/GPT can search via MCP
{
  "tool": "ingestible_search",
  "input": {
    "doc_id": "report",
    "query": "quarterly revenue"
  }
}
// Agent gets structured results
// with chunk hierarchy context
```

Ready to stop burning tokens?
Ingestible is open-source under the PolyForm Small Business license. Free for small teams, affordable for everyone else.