Vllm-Omni
Visit Toolvllm-omni is a Coding & Development tool that provides a framework for efficient inference with omni-modality models. It supports text, image, video, and audio data processing, extending vLLM's capabilities.
At a glance
Trending
vllm-omni is a Coding & Development tool that provides a framework for efficient inference with omni-modality models. It supports text, image, video, and audio data processing, extending vLLM's capabilities.
Trending
About
vllm-omni is a framework designed for efficient model inference and serving of omni-modality models, building upon the foundation of vLLM. It expands support beyond text-based autoregressive generation to include text, image, video, and audio data processing. The framework also accommodates non-autoregressive architectures like Diffusion Transformers (DiT) and other parallel generation models, enabling heterogeneous outputs. Key features include state-of-the-art autoregressive support through efficient KV cache management, pipelined stage execution for high throughput, and fully disaggregated architecture with dynamic resource allocation. It offers flexibility with heterogeneous pipeline abstraction, seamless integration with Hugging Face models, and support for various parallelism techniques for distributed inference. vllm-omni also provides streaming outputs and an OpenAI-compatible API server.
Capabilities
Pricing & Plans
Open Source
Free
FAQs
Trending
Also listed in