llm-awq
llm-awq is an AI Frameworks & Infra tool that implements activation-aware weight quantization (AWQ) to compress and accelerate LLMs. It supports efficient, accurate low-bit weight quantization (INT3/INT4) for a range of LLMs, including instruction-tuned and multi-modal models.
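To make "activation-aware" low-bit quantization more concrete, here is a minimal pure-Python sketch of the general idea. It is not llm-awq's API. The function names (`quantize_group`, `quantize_row`), the fixed exponent `alpha=0.5`, and the sample numbers are all assumptions chosen for illustration. Each weight is scaled by a factor derived from the typical activation magnitude of its input channel, the scaled group is rounded to 4-bit integer codes, and the scale is folded back out afterwards.

```python
def quantize_group(weights, n_bits=4):
    """Asymmetric uniform quantization of one weight group to n_bits codes."""
    q_max = 2 ** n_bits - 1
    w_min, w_max = min(weights), max(weights)
    step = (w_max - w_min) / q_max
    if step == 0:
        step = 1.0  # degenerate group where all weights are equal
    zero = round(-w_min / step)  # integer code that represents 0.0
    codes = [min(q_max, max(0, round(w / step) + zero)) for w in weights]
    dequant = [(c - zero) * step for c in codes]
    return codes, dequant


def quantize_row(weights, act_magnitudes, n_bits=4, alpha=0.5):
    """Scale weights by activation magnitude ** alpha, then quantize.

    alpha is fixed here as an assumption; a real implementation would
    typically tune it per layer on calibration data.
    """
    scales = [max(a, 1e-8) ** alpha for a in act_magnitudes]
    scaled = [w * s for w, s in zip(weights, scales)]
    codes, dequant = quantize_group(scaled, n_bits)
    # Fold the per-channel scale back out to approximate the original weights.
    restored = [d / s for d, s in zip(dequant, scales)]
    return codes, restored


if __name__ == "__main__":
    weights = [0.10, -0.42, 0.33, 0.05]      # hypothetical weights
    act_magnitudes = [4.0, 1.0, 1.0, 0.25]   # hypothetical mean |activation|
    codes, restored = quantize_row(weights, act_magnitudes)
    print("4-bit codes:", codes)
    print("restored weights:", restored)
```

Channels with larger activations get a larger scale, so their weights are reconstructed with a smaller absolute error than channels with small activations. That is the trade-off this kind of scheme is meant to exploit.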
At a glance