What hardware is required to run LATCH?
LATCH requires an NVIDIA GPU with 80GB VRAM, with H100 and A100 GPUs being recommended and benchmarked. It runs as a Docker container on Linux, providing a self-hosted solution for document intelligence.
How does LATCH differ from RAG and KV cache?
LATCH compiles entire document sets into persistent model-level memory, unlike RAG which chunks and retrieves per query, or KV caches which are session-bound. This eliminates chunking artifacts, enables full cross-document reasoning, and allows for persistent, portable memory files.
What document formats does LATCH support for compilation?
LATCH accepts a wide range of document formats including PDF, DOCX, XLSX, PPTX, TXT, MD, HTML, CSV, JSON, and XML. This allows for comprehensive compilation of diverse enterprise document sets into LLM memory.
What is the difference between a .latch and a .latchdoc file?
A .latch file contains only the compiled model-level memory, without source text, making it privacy-first and shareable. A .latchdoc file includes everything in a .latch file plus embedded raw text for full-text search and automatic quality fallback, serving as the recommended default.
Is LATCH offered as a hosted service?
No, LATCH is self-hosted by default, running as a Docker container on your own infrastructure. This ensures your documents remain within your environment. A managed hosted option is planned for future development.