OpenCompass
OpenCompass is an LLM evaluation platform that supports a wide range of models and over 100 datasets. It helps assess the quality and effectiveness of large language models.
About
OpenCompass is an LLM evaluation platform for benchmarking large language models. It supports a range of models, including Llama3, Mistral, InternLM2, GPT-4, and Claude, and is compatible with over 100 datasets. The platform provides evaluation algorithms and an interface for assessing the quality and effectiveness of NLP models. Key features include support for inference acceleration backends such as LMDeploy and vLLM, flexible evaluation mechanisms such as CascadeEvaluator, and tools for LLM-as-judge and mathematical reasoning assessments. Users can install OpenCompass via pip or from source, and can prepare datasets either offline or through automatic downloads from OpenCompass storage or ModelScope.
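To make the idea of a cascaded evaluation concrete, here is a minimal sketch. It assumes that "cascade" means running a cheap rule-based check first and sending only the failures to a judge. It does not use the OpenCompass API: the names rule_based_match, toy_judge, and cascade_evaluate are hypothetical, and toy_judge is a simple stand-in for a real LLM-as-judge call.

```python
from typing import Callable, Dict


def rule_based_match(prediction: str, reference: str) -> bool:
    """Cheap first stage: exact match after trimming whitespace and case."""
    return prediction.strip().lower() == reference.strip().lower()


def toy_judge(prediction: str, reference: str) -> bool:
    """Hypothetical stand-in for an LLM judge.

    Accepts the prediction if the reference appears as a whole
    whitespace-separated token inside it.
    """
    return reference.strip().lower() in prediction.lower().split()


def cascade_evaluate(
    prediction: str,
    reference: str,
    judge: Callable[[str, str], bool],
) -> Dict[str, object]:
    """Run the rule-based check first; fall back to the judge only on failure."""
    if rule_based_match(prediction, reference):
        return {"correct": True, "stage": "rule"}
    # The (assumed more expensive) judge is consulted only for rule failures.
    return {"correct": judge(prediction, reference), "stage": "judge"}


if __name__ == "__main__":
    for pred in ["42", "The answer is 42", "41"]:
        print(pred, "->", cascade_evaluate(pred, "42", toy_judge))
```

In this sketch the judge is called only for predictions the rule stage rejects, which keeps judge calls (and their cost) to a minimum. How OpenCompass itself configures CascadeEvaluator should be checked against its own documentation.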
Pricing & Plans
Open Source
Free