SepLLM
SepLLM is an AI frameworks and infrastructure tool that accelerates large language models by compressing segments of the input into separator tokens. It provides an easy-to-use, natively sparse attention baseline that significantly reduces KV cache usage and can process sequences of up to 4 million tokens.
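To make the idea of a separator-based KV cache more concrete, here is a minimal sketch in plain Python. It is not SepLLM's API: the function name `kept_positions`, its parameters, and the rule it applies (keep a few initial tokens, every separator token, and a short window of recent tokens) are assumptions chosen for illustration.

```python
def kept_positions(tokens, separators, n_initial=1, n_recent=2):
    """Return the indices of tokens whose KV entries would be retained.

    Hypothetical helper for illustration only, not part of SepLLM.
    Assumed rule: keep the first `n_initial` tokens, every separator
    token, and the last `n_recent` tokens; drop everything else.
    """
    n = len(tokens)
    keep = set(range(min(n_initial, n)))          # initial tokens
    keep.update(range(max(0, n - n_recent), n))   # recent window
    keep.update(i for i, tok in enumerate(tokens) if tok in separators)
    return sorted(keep)


tokens = ["The", "cat", "sat", ",", "then", "it", "slept", ".",
          "Later", "it", "woke"]
kept = kept_positions(tokens, separators={",", ".", ";", "!", "?"})
print(kept, f"{len(kept)}/{len(tokens)} KV entries kept")
```

In this toy example only 5 of the 11 positions are retained. The intuition is that each separator stands in for the segment it closes, so the cache grows with the number of segments rather than with the number of tokens.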