Security-first architecture

Security & Privacy

Your documents contain your most sensitive knowledge. Data governance isn’t an afterthought in Ingestible — it’s a first-class design constraint. Deploy on your infrastructure, control every data flow, meet any compliance requirement.

Your data, your infrastructure

Deploy Ingestible wherever your security policy demands — on-premise, private cloud, or VPC. No vendor trust required.

# Deploy with Docker
$ docker pull ghcr.io/simplyliz/ingestible:latest
$ docker compose up -d
# Or with Kubernetes
$ helm install ingestible ./chart
All data stays within your network boundary.
No telemetry. No phone-home. No external dependencies.

Self-hosted deployment

Run on your own servers via Docker, Kubernetes, or bare metal. One command to deploy, and it's the same codebase as the cloud version.

Documents never leave

All parsing, chunking, embedding, and storage happens inside your perimeter. Zero data exfiltration surface.

Air-gapped support

No external API calls required. Use local embedding models and skip LLM enrichment for fully isolated operation.

One codebase

The self-hosted version is the full product — not a stripped-down or feature-gated fork. Same repo, same releases.

Data governance

Full control over where your data lives, how it moves, and who can access it.

Data residency

Deploy in any region or jurisdiction. Meet GDPR, LGPD, PIPEDA, CCPA, and sector-specific residency requirements by running Ingestible where your data already lives.

No vendor lock-in

Standard JSON output. Pluggable vector backends — ChromaDB, pgvector, or Qdrant. Switch storage, switch providers, or go fully offline at any time.
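As an illustrative sketch of what that portability can look like (the function and config keys here are hypothetical, not Ingestible's actual API), the storage backend is selected by name rather than hard-coded, so switching providers is a one-line config change:

```python
# Hypothetical sketch of a pluggable vector-store factory. Backend names
# mirror the supported options; everything else is illustrative.

def make_vector_store(config: dict) -> dict:
    backend = config.get("backend", "chromadb")
    if backend == "chromadb":
        return {"kind": "chromadb", "path": config.get("path", "./chroma")}
    if backend == "pgvector":
        return {"kind": "pgvector", "dsn": config["dsn"]}
    if backend == "qdrant":
        return {"kind": "qdrant", "url": config["url"]}
    raise ValueError(f"unknown backend: {backend}")

# Moving from ChromaDB to pgvector touches only the config, not the pipeline:
store = make_vector_store(
    {"backend": "pgvector", "dsn": "postgresql://localhost/ingestible"}
)
```

Because the output is standard JSON, the same documents can be re-indexed into a different backend at any time.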

Encryption

TLS for all data in transit. Configurable encryption at rest through your infrastructure’s native encryption (EBS, LUKS, BitLocker) or database-level encryption.

Access control

API key authentication, per-key rate limiting, role-based access, and full audit logging. Every document operation is traceable to a principal.
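A minimal sketch of how those pieces fit together — key lookup, a per-key rate window, and an append-only audit trail. The key format, roles, and limits below are assumptions for illustration, not Ingestible's implementation:

```python
# Illustrative only: API-key auth with per-key rate limiting and audit
# logging. Names and limits are assumptions, not Ingestible's code.
import time
from collections import defaultdict

KEYS = {"ik_live_abc123": {"role": "reader", "rate_limit": 2}}  # req/sec
_window = defaultdict(list)
audit_log = []  # every decision is recorded with its principal

def authorize(api_key: str, action: str) -> bool:
    key = KEYS.get(api_key)
    now = time.time()
    if key is None:
        audit_log.append((now, api_key, action, "denied:unknown-key"))
        return False
    recent = [t for t in _window[api_key] if now - t < 1.0]
    if len(recent) >= key["rate_limit"]:
        audit_log.append((now, api_key, action, "denied:rate-limited"))
        return False
    recent.append(now)
    _window[api_key] = recent
    audit_log.append((now, api_key, action, "allowed"))
    return True
```

The point of the audit trail: every allow and every deny is attributable to a specific key, so "traceable to a principal" holds even for rejected requests.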

Security practices

Built in the open with defense-in-depth. Trust but verify — the code is right there.

Open source

Every line of the pipeline is inspectable. No proprietary black boxes processing your documents.

Automated security scanning

CI runs Bandit (SAST), dependency vulnerability scanning, and import-level security checks on every commit.

No telemetry

Zero analytics, tracking, or phone-home behavior. The self-hosted binary makes no outbound connections unless you configure LLM enrichment.

Enterprise compliance

SOC 2 and HIPAA compliance documentation available for Enterprise SLA customers. Security review and custom DPA on request.

LLM data handling

Full transparency on what leaves your infrastructure during the enrichment stage — and how to prevent it entirely.

Minimal data exposure

When enrichment calls external LLM APIs (Anthropic, OpenAI), only individual chunk text is sent — never full documents, metadata, or file names.
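A sketch of the shape of such a request (field names here are illustrative, not Ingestible's wire format): only the chunk's text crosses the network boundary, while file names and metadata stay behind.

```python
# Illustrative only: shows the *shape* of an outbound enrichment payload.
chunk = {
    "text": "Quarterly revenue grew 12% year over year.",
    "source_file": "board-minutes-2024.pdf",        # never sent
    "metadata": {"author": "finance", "page": 3},   # never sent
}

def build_enrichment_payload(chunk: dict) -> dict:
    # Only the chunk text is included; nothing else leaves the perimeter.
    return {"input": chunk["text"]}

payload = build_enrichment_payload(chunk)
```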

Self-hosted LLM option

Point the enrichment stage at your own model endpoint (vLLM, Ollama, TGI). Documents never leave your network.
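In practice this usually means pointing the provider at an OpenAI-compatible endpoint, which vLLM and Ollama both expose. The config keys below are a hypothetical sketch, not Ingestible's actual schema — the base URL and model name are whatever your internal deployment uses:

```python
# Hypothetical configuration sketch: enrichment against a local
# OpenAI-compatible endpoint. Keys are illustrative, not the real schema.
enrichment_config = {
    "provider": "openai-compatible",
    "base_url": "http://llm.internal:8000/v1",  # your vLLM/Ollama endpoint
    "model": "llama-3.1-8b-instruct",
    "api_key": "unused-local",  # local servers typically ignore the key
}
```

With the endpoint inside your network, the enrichment stage behaves identically — summaries, concepts, hypothetical questions — but no chunk text ever leaves your perimeter.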

BYOK — bring your own keys

Your API keys, your accounts, your usage tracking. On self-hosted deployments, Ingestible never proxies requests through our servers.

Skip enrichment entirely

For maximum isolation, disable the LLM enrichment stage. You still get parsing, hierarchical chunking, embedding, and hybrid search — no external calls at all.
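A sketch of what a fully isolated configuration might look like — all stage names and keys below are assumptions for illustration, not Ingestible's actual schema:

```python
# Illustrative sketch: a fully local pipeline with LLM enrichment
# disabled. Keys are assumptions, not Ingestible's actual config schema.
pipeline_config = {
    "parse": True,
    "chunking": {"strategy": "hierarchical"},
    "embedding": {"model": "intfloat/e5-large-v2", "local": True},
    "enrichment": {"enabled": False},  # no external API calls
    "search": {"hybrid": True},        # vector + BM25, all local
}

def external_call_stages(cfg: dict) -> int:
    # Enrichment is the only stage that can reach outside the network;
    # with it disabled, the count of externally-calling stages is zero.
    return 1 if cfg.get("enrichment", {}).get("enabled") else 0
```

Everything in the "Always local" column below runs unchanged under this configuration.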

Data flow: what stays local vs. what can leave

Always local
  • Document parsing & cleaning
  • Structure analysis & hierarchy
  • Chunking (all strategies)
  • Embedding (E5-large-v2)
  • Index building (vector + BM25)
  • Search & retrieval
  • Storage (JSON files / vector DB)
Optional external calls
  • LLM enrichment (summaries, concepts)
  • Hypothetical question generation
  • Knowledge graph triple extraction
  • HyDE query expansion
  • LLM-as-judge evaluation

All optional. Disable enrichment or use a self-hosted LLM to eliminate external calls entirely.

Ready to deploy on your infrastructure?

Get started with a single Docker command, or contact our team for enterprise deployment planning, custom SLAs, and compliance documentation.