Documents in.
Knowledge out.
Turn documents into token-efficient, searchable knowledge stores for AI. Hybrid search across vector, BM25, and concept indexes — each query costs ~1,000 tokens instead of 90,000.
Everything you need for RAG
From parsing to search, Ingestible handles the full pipeline so you can focus on building your AI application.
25+ formats
PDF, DOCX, HTML, EPUB, PPTX, XLSX, CSV, Markdown, audio, video, images, email, ZIP archives, and more.
4-level hierarchy
Document overview, chapters, sections, passages. No mid-paragraph splits. Tables and code stay atomic.
Hybrid search
Vector + BM25 + concept index fused with RRF. Optional cross-encoder reranking, HyDE, and knowledge graph retrieval.
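The RRF step named above is standard reciprocal rank fusion. As an illustration only (this is the textbook algorithm with the conventional k = 60 constant, not Ingestible's actual code), fusing three ranked lists looks like:

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists with Reciprocal Rank Fusion.

    rankings: list of ranked chunk-id lists, best first.
    score(d) = sum over lists of 1 / (k + rank_of_d_in_list).
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical hits from the three indexes
vector_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c3", "c9"]
concept_hits = ["c1", "c7"]
fused = rrf_fuse([vector_hits, bm25_hits, concept_hits])
# "c1" wins: it ranks near the top of all three lists
```

RRF needs no score normalization across the three indexes, which is why it is a common choice for fusing vector similarities with BM25 scores.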
LLM enrichment
Summaries, concepts, hypothetical questions, entities, and knowledge graph triples. Bottom-up from passages to document.
Production ready
Rate limiting, Prometheus metrics, auth middleware, background ingestion with checkpoint/resume, Docker-ready.
MCP + API + CLI
Three entry points. REST API with FastAPI, CLI for automation, MCP server for AI agent integration.
25+ formats supported
Drop in documents, media, archives, or markup. Ingestible handles parsing, cleaning, and structure extraction automatically.
Documents
Web & Markup
Media
Other
How it works
A six-stage pipeline transforms raw documents into queryable knowledge stores with hybrid search.
Parse
Format-specific extraction to clean markdown. PDF uses IBM Docling for deep layout analysis.
Structure
Builds hierarchy tree from TOC, heading patterns, or page range heuristics.
Chunk
4-level split (L0-L3). Tables and code blocks stay atomic. ~10% overlap between chunks.
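The ~10% overlap rule can be shown with a toy token splitter (a hypothetical helper for illustration, not the library's API): each chunk starts 90% of a chunk-length after the previous one, so neighbors share the remaining 10%.

```python
def split_with_overlap(tokens, chunk_size=200, overlap_ratio=0.10):
    """Split a token list into fixed-size chunks with ~10% overlap."""
    step = int(chunk_size * (1 - overlap_ratio))  # advance 90% per chunk
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

tokens = list(range(1000))
chunks = split_with_overlap(tokens)
# each neighboring pair shares 20 tokens (10% of 200)
```

The overlap keeps context that straddles a chunk boundary retrievable from either side; the atomic treatment of tables and code blocks means they would bypass a splitter like this entirely.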
Enrich
Bottom-up LLM pass generates summaries, concepts, hypothetical questions, and entities.
Embed & Index
E5-large-v2 vectors + BM25 sparse index + concept-to-chunk mapping. CUDA/MPS/CPU auto-detected.
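The CUDA/MPS/CPU auto-detection is the usual PyTorch probe; a sketch assuming PyTorch is the embedding backend (the function name is illustrative, and it falls back to CPU if torch is absent):

```python
def pick_device():
    """Return the best available device string: CUDA > MPS > CPU."""
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            return "mps"
    except ImportError:
        pass  # no torch installed: embed on CPU
    return "cpu"
```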
Store
JSON file hierarchy with versioning, content-hash dedup, and checkpoint/resume.
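Content-hash dedup in a JSON-file store reduces to "hash the content, skip the write if that file already exists." A minimal sketch (class, method, and file-naming scheme are hypothetical, not Ingestible's actual layout):

```python
import hashlib
import json
import tempfile
from pathlib import Path


def content_hash(text):
    """Stable hash of the content; identical text yields an identical id."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


class JsonStore:
    """Toy JSON-file store keyed by content hash."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, text):
        h = content_hash(text)
        path = self.root / f"{h}.json"
        if path.exists():  # dedup: this exact content is already stored
            return h, False
        path.write_text(json.dumps({"hash": h, "content": text}))
        return h, True


store = JsonStore(tempfile.mkdtemp())
h1, created1 = store.put("same passage")
h2, created2 = store.put("same passage")
# second put is a no-op: identical content, identical hash
```

Re-ingesting an unchanged document therefore costs no extra storage, and checkpoint/resume can trust the hash to tell finished work from pending work.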
How Ingestible Compares
A purpose-built ingestion pipeline vs. general-purpose frameworks.
| Feature | Ingestible | LangChain | LlamaIndex | Unstructured |
|---|---|---|---|---|
| Hierarchical chunking (L0–L3) | ✓ | Flat only | Flat only | Flat only |
| Built-in hybrid search | Vector + BM25 + Concept | Requires separate setup | Requires separate setup | — |
| Token-efficient retrieval | 88–99% reduction | Depends on setup | Depends on setup | — |
| LLM enrichment | Summaries, concepts, questions, KG | Manual | Manual | — |
| 25+ format parsers | ✓ | Via integrations | Via integrations | ✓ |
| Cross-document search | ✓ | Manual | — | — |
| MCP server | ✓ | — | — | — |
| REST API | ✓ | — | — | Via hosted |
| Production features | Rate limiting, auth, metrics, Docker | Framework | Framework | Via hosted |
| Knowledge graph | ✓ | Manual | Via KG index | — |
Start in 30 seconds
Install from PyPI or pull the Docker image.

```shell
# Base install (~50MB)
pip install ingestible

# With local embeddings (no API keys)
pip install "ingestible[local]"

# Ingest a document
ingest add /path/to/doc.pdf -v

# Search
ingest search <doc_id> "your query"
```

```shell
# Pull and run
docker run -d \
  -p 8081:8081 \
  -v ingestible-data:/app/data \
  ghcr.io/simplyliz/ingestible:latest

# API + Web UI at localhost:8081
# Health check: /health/ready
# Metrics: /metrics
```

See it in action
Three ways to integrate — pick what fits your stack.

```python
from ingestible import Ingestible

ing = Ingestible()

# Ingest a document
doc = ing.ingest("/path/to/report.pdf")
print(f"Ingested: {doc.title} ({doc.total_chunks} chunks)")

# Search
results = ing.search(doc.doc_id, "quarterly revenue")
for r in results:
    print(f"[{r.score:.2f}] {r.content[:100]}...")
```

```shell
# Ingest a file
curl -X POST http://localhost:8081/ingest \
  -F "file=@report.pdf"

# Search
curl http://localhost:8081/documents/report/search \
  -d '{"query": "quarterly revenue", "top_k": 5}'

# Response
# {"results": [{"chunk_id": "l3_042", "score": 0.94,
#   "content": "Q3 revenue grew 23% YoY to..."}]}
```

```jsonc
// Claude/GPT can search via MCP
{
  "tool": "ingestible_search",
  "input": {
    "doc_id": "report",
    "query": "quarterly revenue"
  }
}
// Agent gets structured results
// with chunk hierarchy context
```

Ready to stop burning tokens?
Ingestible is open-source under the PolyForm Small Business license. Free for small teams, affordable for everyone else.