OmniQuant
OmniQuant is a quantization technique for large language models (LLMs). It optimizes models to reduce memory footprint and accelerate inference.
At a glance
Pricing: not specified
Free tier: not specified
API: not specified
Skill level: Technical