IndoNLU
IndoNLU is an open-source natural language processing benchmark for the Indonesian language. It provides 12 downstream tasks, pre-trained IndoBERT models, and starter code for researchers and developers.
About
IndoNLU is a collection of Natural Language Understanding (NLU) resources for Bahasa Indonesia. Its 12 downstream tasks together form a benchmark for evaluating Indonesian language processing models. The project provides code to reproduce its results, along with pre-trained IndoBERT and IndoBERT-lite models trained on Indo4B, a corpus of roughly 4 billion words (over 20 GB of text). Developed through a collaboration between universities and industry partners, IndoNLU also offers access to the Indo4B dataset and to several FastText models. It is aimed at researchers and developers working on Indonesian NLP.
Pricing & Plans
Open Source: Free