ppl.nn
ppl.nn is an AI Frameworks & Infra tool that provides a high-performance deep-learning inference engine. It runs a range of ONNX models efficiently and offers enhanced compatibility with OpenMMLab.
About
PPLNN, short for "Primitive Library for Neural Network," is a high-performance deep-learning inference engine. It runs a range of ONNX models and offers enhanced compatibility with OpenMMLab. Key features include a new LLM Engine with Flash Attention, Group-query Attention, and Dynamic Batching, alongside Tensor Parallelism and Graph Optimization. It also supports an INT8 groupwise KV Cache and INT8 per-token, per-channel quantization, which improve performance while limiting accuracy loss. The documentation covers building from source, integrating the APIs, and developing new engines and operations for X86, CUDA, RISC-V, and ARM platforms. The project is open source and welcomes contributions.
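To make the "groupwise" INT8 idea mentioned above more concrete, here is a minimal pure-Python sketch of symmetric groupwise quantization in general. It is not taken from ppl.nn and does not use its API. The function names `quantize_groupwise` and `dequantize_groupwise`, the group size, and the symmetric [-127, 127] range are all assumptions chosen for illustration.

```python
# Illustrative sketch only: a generic symmetric INT8 groupwise quantizer.
# The function names and the [-127, 127] range are assumptions for this
# example and do not reflect ppl.nn's actual kernels or API.

def quantize_groupwise(values, group_size):
    """Quantize floats to INT8 codes, storing one scale per group."""
    if group_size <= 0 or len(values) % group_size != 0:
        raise ValueError("len(values) must be a positive multiple of group_size")
    codes, scales = [], []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Map the largest magnitude in the group to 127; use 1.0 for an
        # all-zero group to avoid dividing by zero.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer code and clamp to the INT8 range.
        codes.extend(max(-127, min(127, round(v / scale))) for v in group)
    return codes, scales


def dequantize_groupwise(codes, scales, group_size):
    """Reconstruct approximate floats from INT8 codes and per-group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


if __name__ == "__main__":
    # Two groups with very different magnitudes, each getting its own scale.
    data = [0.5, -1.0, 0.25, 0.0, 100.0, -50.0, 25.0, 10.0]
    q, s = quantize_groupwise(data, group_size=4)
    print(q, s)
    print(dequantize_groupwise(q, s, group_size=4))
```

Because each group carries its own scale, small values in one group are not swamped by large values in another. This is the usual reason a groupwise scheme can stay closer to the original values than a single tensor-wide scale.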
Pricing & Plans
Open Source
Free