ppl.nn

ppl.nn is a high-performance deep-learning inference engine, listed here under AI Frameworks & Infra. It runs ONNX models and offers enhanced support for models from OpenMMLab.

At a glance

Pricing: Open Source
Free tier: Yes
API: Yes
Skill level: Technical

About

What is ppl.nn?

PPLNN, short for "Primitive Library for Neural Network," is a high-performance deep-learning inference engine. It runs ONNX models and offers enhanced compatibility with OpenMMLab. Key features include a new LLM Engine with Flash Attention, Group-query Attention, and Dynamic Batching, alongside Tensor Parallelism and Graph Optimization. The LLM Engine also supports an INT8 groupwise KV cache and INT8 per-token, per-channel quantization to reduce memory use and improve throughput. The project's documentation covers building from source, integrating the APIs, and developing new engines and operators for x86, CUDA, RISC-V, and ARM platforms. ppl.nn is open source and accepts contributions.
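To make the Group-query Attention feature above more concrete, here is a minimal sketch of the general idea: several query heads share one key/value head, which shrinks the KV cache. This is not ppl.nn code. The function names `kv_head_for_query_head` and `kv_cache_elements` are hypothetical, and the contiguous head grouping is an assumption based on the common formulation of the technique.

```python
def kv_head_for_query_head(q_head: int, num_q_heads: int, num_kv_heads: int) -> int:
    """Return the index of the KV head shared by query head `q_head`.

    Assumes query heads are split into contiguous, equally sized groups
    (hypothetical layout for illustration; not taken from ppl.nn).
    """
    if num_kv_heads <= 0 or num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a positive multiple of num_kv_heads")
    if not 0 <= q_head < num_q_heads:
        raise IndexError("q_head out of range")
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size


def kv_cache_elements(num_kv_heads: int, head_dim: int, seq_len: int, num_layers: int) -> int:
    """Count cached elements: keys plus values for every layer, head, and token."""
    return 2 * num_kv_heads * head_dim * seq_len * num_layers


if __name__ == "__main__":
    # 8 query heads sharing 2 KV heads: heads 0-3 use KV head 0, heads 4-7 use KV head 1.
    print([kv_head_for_query_head(h, 8, 2) for h in range(8)])
    # Compare cache size with one KV head per query head versus 2 shared KV heads.
    print(kv_cache_elements(8, 64, 1024, 1), kv_cache_elements(2, 64, 1024, 1))
```

With fewer KV heads than query heads, the cache shrinks in proportion to the ratio between the two counts, which is why the technique is useful for long sequences.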

Best used for

Ideal for developers who need to accelerate deep-learning inference, optimize ONNX model execution, and build high-performance AI applications. It is especially useful for those working with large language models who need features such as Flash Attention and INT8 quantization.

Common actions

optimize AI inference

run ONNX models

develop neural networks

accelerate deep learning

Capabilities

Key features

High-performance deep-learning inference
Supports various ONNX models
New LLM Engine
Flash Attention
Dynamic Batching
Tensor Parallelism
Graph Optimization

Target Audience

Developers

Integrations

ONNX Runtime, ONNX, OpenVINO, oneDNN, TensorRT, OpenMMLab

Pricing & Plans

Open Source

Free

FAQs

What kind of models does ppl.nn support for inference?

ppl.nn runs models in the ONNX format. It also offers enhanced support for models developed within the OpenMMLab ecosystem.

What advanced features does ppl.nn offer for Large Language Models (LLMs)?

For LLMs, ppl.nn includes a new LLM Engine with Flash Attention, Group-query Attention, and Dynamic Batching. It also supports Tensor Parallelism, Graph Optimization, and an INT8 groupwise KV cache.
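As a rough illustration of what "groupwise" INT8 quantization means in general, the sketch below stores one scale per fixed-size group of values instead of one scale for a whole tensor. It is not ppl.nn's implementation. The function names, the symmetric scheme, and the [-127, 127] range are assumptions chosen for simplicity.

```python
from typing import List, Tuple


def quantize_groupwise(values: List[float], group_size: int) -> Tuple[List[int], List[float]]:
    """Symmetric INT8 quantization with one scale per group of `group_size` values.

    Hypothetical helper for illustration; not part of any ppl.nn API.
    """
    if group_size <= 0 or len(values) % group_size != 0:
        raise ValueError("len(values) must be a positive multiple of group_size")
    quantized: List[int] = []
    scales: List[float] = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Map the largest magnitude in the group to 127; use 1.0 for an all-zero group.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        scales.append(scale)
        for v in group:
            q = round(v / scale)
            quantized.append(max(-127, min(127, q)))  # clamp to the INT8 range used here
    return quantized, scales


def dequantize_groupwise(quantized: List[int], scales: List[float], group_size: int) -> List[float]:
    """Recover approximate floats by multiplying each value by its group's scale."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]


if __name__ == "__main__":
    data = [0.5, -1.0, 0.25, 0.0, 100.0, -50.0, 25.0, 0.0]
    q, s = quantize_groupwise(data, group_size=4)
    print(q, s)
    print(dequantize_groupwise(q, s, group_size=4))
```

Because each group has its own scale, the small values in the first group are not crushed by the large values in the second, which is the usual motivation for grouping over a single per-tensor scale.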

Which hardware platforms are supported by ppl.nn?

ppl.nn supports x86, CUDA, RISC-V, and ARM platforms, so the same engine can be deployed on CPUs and NVIDIA GPUs.
