AQLM
AQLM is an open-source PyTorch implementation for extreme compression of large language models. It uses additive quantization to reduce LLM size and improve inference speed, and supports several LLaMA, Mistral, and Mixtral models.
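To make the idea of additive quantization concrete, here is a minimal pure-Python sketch. It is not AQLM's code or API: the codebooks are hand-written toy values, the function names (`nearest`, `encode`, `decode`) are hypothetical, and the greedy residual search is a simplification chosen for brevity. The only point it illustrates is that a vector is stored as a few small codebook indices and reconstructed as the sum of the selected codewords.

```python
def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def encode(vec, codebooks):
    """Greedily pick one codeword per codebook, each one fitting the remaining residual."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def decode(codes, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy codebooks (assumed values for illustration): a coarse one and a finer one.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [0.5, 0.5]],
]

codes = encode([1.5, 0.0], CODEBOOKS)
restored = decode(codes, CODEBOOKS)
print(codes, restored)
```

Each codebook here has four entries, so each index fits in 2 bits and the two-element vector is stored in 4 bits plus the shared codebooks. That is where the size reduction comes from in this toy setting.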