Shypd.ai

Research & Education

Browsing page 57 of AI tools for Academic Research in Research & Education. Sorted by confidence score — our independent quality rating.

gemma

60%

Gemma is a family of open-weight large language models (LLMs) developed by Google DeepMind, built on research and technology from the Gemini models. This repository implements the gemma PyPI package, a JAX library for using and fine-tuning Gemma models. It supports multi-turn, multi-modal conversations and several Gemma versions. The library runs on CPU, GPU, and TPU; for GPU use, the recommended memory is 8GB+ for the 2B checkpoint and 24GB+ for the 7B checkpoint. Documentation, Colabs, and tutorials cover sampling, multi-modal fine-tuning, and LoRA.

Paper2Poster

60%

Paper2Poster is an open-source multi-agent system designed to automate the generation of academic posters from scientific papers. It takes a paper in PDF format and produces an editable poster in PPTX. The tool supports both local deployment via vLLM and API-based access (e.g., GPT-4o), offering flexibility in model choice for text and visual generation. Key features include automatic logo support for conferences and institutions, YAML-based style customization, and parallel content generation for faster processing. It also provides a Gradio demo and Docker support for streamlined deployment, making it accessible for researchers to efficiently create high-quality posters.

free-llm-api-resources

60%

free-llm-api-resources is a list of services that provide free access or trial credits for API-based Large Language Model (LLM) usage. It is aimed at developers, researchers, and students who want to experiment with LLMs without an upfront cost. The list covers providers such as OpenRouter, Google AI Studio, NVIDIA NIM, Mistral, and HuggingFace, and specifies their free tiers, usage limits, and available models. It also includes providers offering trial credits, such as Fireworks, Baseten, and AI21. Only legitimate services are listed; services that reverse-engineer existing chatbots are explicitly excluded.

GNN-Communication-Networks

60%

GNN-Communication-Networks is a repository that collects literature on graph-based deep learning for communication networks. It is aimed at researchers and academics working at the intersection of Graph Neural Networks (GNNs) and communication networks. The repository compiles academic papers, including surveys, journal articles, and conference proceedings, on topics such as federated learning for network attack detection, routing optimization, resource allocation, and intelligent modeling in network management. It is regularly updated and also lists related tools and competitions.

GNNPapers

60%

GNNPapers is a comprehensive, open-source repository dedicated to curating essential papers on graph neural networks (GNNs). It serves as an invaluable resource for researchers, academics, and students seeking to explore the latest advancements and foundational works in the field. The collection is meticulously organized by topic, covering various aspects such as GNN models (basic, graph types, pooling methods), analysis, efficiency, and explainability. Additionally, it categorizes papers by diverse applications, including physics, chemistry, biology, knowledge graphs, recommender systems, computer vision, natural language processing, and more. This structured approach allows users to efficiently navigate and discover relevant literature, making it an indispensable tool for staying current with GNN research.

pointnet

60%

PointNet is a novel deep learning architecture specifically designed for processing point clouds, which are an important type of geometric data structure. Unlike traditional methods that convert point clouds into regular 3D voxel grids or image collections, PointNet directly consumes unordered point sets, respecting their permutation invariance. This approach makes it highly efficient and effective for a range of applications, including object classification, part segmentation, and scene semantic parsing in 3D. Developed by researchers at Stanford University, PointNet is available as an open-source project on GitHub, providing code and data for training classification and part segmentation networks. It has also served as a foundational work for subsequent advancements like PointNet++.
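The permutation invariance mentioned above can be illustrated with a small sketch. It assumes the common reading of the idea: a shared per-point function followed by a symmetric pooling step (here, an element-wise max). The function names, layer sizes, and random weights are hypothetical and are not taken from the PointNet repository.

```python
import random

def point_features(point, weights):
    # Shared per-point map (hypothetical): one linear layer followed by ReLU.
    return [max(0.0, sum(w * x for w, x in zip(row, point))) for row in weights]

def global_feature(points, weights):
    # Symmetric aggregation: element-wise max over all per-point features.
    feats = [point_features(p, weights) for p in points]
    return [max(column) for column in zip(*feats)]

rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]   # 3 -> 4 features
points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(16)]   # 16 xyz points

shuffled = points[:]
rng.shuffle(shuffled)

original = global_feature(points, weights)
permuted = global_feature(shuffled, weights)
print(original == permuted)
```

Because max pooling ignores the order of its inputs, reordering the points leaves the pooled feature unchanged, which is why no voxel grid or canonical ordering is needed.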

practical-nlp-code

60%

practical-nlp-code is the official GitHub repository for the code accompanying the 'Practical Natural Language Processing' book published by O'Reilly Media. This repository serves as a comprehensive resource for individuals looking to build real-world NLP systems, providing practical code examples and notebooks. It covers various NLP topics across its chapters, including NLP pipelines, text representation, text classification, information extraction, and applications in areas like chatbots, social media, e-commerce, retail, healthcare, finance, and law. The repository is actively maintained, with ongoing development to update notebooks for newer environments like Ubuntu 23 and future migration to TensorFlow 2.x, making it a valuable learning and development tool for those interested in natural language processing.

graph-adversarial-learning-literature

60%

graph-adversarial-learning-literature is an open-source curated list of academic papers on adversarial attacks and defenses for graph-structured data. It is intended for researchers and machine learning engineers interested in the robustness and security of graph neural networks. Papers are sorted by upload date in descending order, giving a chronological view of the field. The repository also includes quick links to attack and defense papers sorted by year, and supports searching for papers by conference name, task name, model name, or method name. It complements a survey on the topic, with citation information provided for both the arXiv and TKDE versions of the survey.

Graph-neural-networks

60%

Graph-neural-networks is a comprehensive GitHub repository dedicated to exploring and implementing graph neural networks (GNNs). It serves as a valuable resource for understanding GNNs from theoretical foundations to practical applications using TensorFlow. The repository highlights the utility of GNNs in modeling relationships and interactions within complex systems, particularly in molecular applications, network analysis, and physics modeling. It includes various papers and tutorials covering topics such as Geometric Deep Learning, Graph Convolution Networks (GCN), Attention mechanisms in GNNs, Message Passing Neural Networks (MPNN), Graph Autoencoders, and their diverse applications.

Artificial Intelligence International Institute (AIII)

60%

The Artificial Intelligence International Institute (AIII) is a Singapore-based AI think tank dedicated to promoting sustainable artificial intelligence for humanity. Established in 2017, AIII focuses on three core pillars: technology, commercialization, and governance, aiming to balance economic development, autonomy, governance, and ethics in AI evolution. The institute conducts research on autonomous enterprise transformation, fusionovation for venture creation, AI governance and risk management, and next-generation AI. AIII actively collaborates with various organizations and hosts events to foster interdisciplinary collaboration and responsible AI development.

history-llms

60%

history-llms is an information hub for a project focused on training the largest possible historical Large Language Models (LLMs). These models are designed to be fully time-locked, meaning they only access information up to a specific knowledge-cutoff date, such as 1913, 1929, or 1946. This approach allows researchers to explore historical discourse patterns without the hindsight contamination present in modern LLMs. The project aims to provide tools for exploring massive textual corpora and complements traditional archival research, serving as a window into past perspectives on various topics. The models are intended for scientific applications, enabling research in humanities, social sciences, and computer science, with a commitment to minimizing interference with the normative judgments acquired during pretraining.

self-refine

60%

Self-Refine is a research codebase that lets Large Language Models (LLMs) critique and improve their own output. The model generates feedback on its initial output, uses that feedback to refine the output, and repeats the process iteratively to improve quality and accuracy. The repository provides examples and setups for several tasks, including acronym generation, dialogue response generation, code readability improvement, CommonGen, GSM-8k, and Yelp. It uses 'prompt-lib' for querying LLMs and defines separate prompt types for initialization, feedback generation, and iteration.
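The generate, feedback, refine cycle described above can be sketched as a plain loop. This is a minimal illustration, not the repository's code: `self_refine`, the three callables, and the toy task are all hypothetical stand-ins for real LLM calls, and the stop signal (feedback of `None`) is an assumption.

```python
def self_refine(task, generate, give_feedback, refine, max_iters=4):
    # Initial attempt, then alternate feedback and refinement.
    output = generate(task)
    history = [output]
    for _ in range(max_iters):
        feedback = give_feedback(task, output)
        if feedback is None:  # assumed stop signal: nothing left to fix
            break
        output = refine(task, output, feedback)
        history.append(output)
    return output, history

# Toy stand-ins for the three prompt types (hypothetical, no LLM involved).
def toy_generate(task):
    return "sort list"

def toy_feedback(task, output):
    missing = [w for w in task["required_words"] if w not in output.split()]
    return f"missing: {missing[0]}" if missing else None

def toy_refine(task, output, feedback):
    return output + " " + feedback.split(": ")[1]

task = {"required_words": ["sort", "list", "ascending", "stable"]}
final, history = self_refine(task, toy_generate, toy_feedback, toy_refine)
print(final)
```

In a real setup each callable would wrap a prompt to the same model, and `max_iters` bounds the cost when the feedback never signals completion.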

self-critical.pytorch

60%

self-critical.pytorch provides a comprehensive codebase for image captioning research, offering an unofficial PyTorch implementation for Self-critical Sequence Training. Key features include support for bottom-up features, test-time ensemble, and multi-GPU training, with DistributedDataParallel now supported via pytorch-lightning. The codebase also integrates Transformer captioning models and offers a simple demo via a Colab notebook. Researchers can train networks on datasets like COCO and Flickr30k, with options for scheduled sampling and evaluation using metrics like BLEU, METEOR, and CIDEr. Pretrained models are available, and the tool facilitates generating image captions and evaluating them on various splits.
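For readers unfamiliar with Self-critical Sequence Training, the core idea can be sketched in a few lines, assuming the usual formulation: the reward of a sampled caption (for example CIDEr) is compared against the reward of the model's own greedy caption, and the difference weights the sampled caption's log-probability. The function name and numbers below are hypothetical and do not mirror this repository's code.

```python
def scst_loss(sample_logprobs, sample_reward, greedy_reward):
    # Advantage: how much better the sampled caption scored than the
    # greedy (test-time) caption, which acts as the baseline.
    advantage = sample_reward - greedy_reward
    # REINFORCE-style loss: minimising it raises the log-probability of
    # samples that beat the baseline and lowers it for those that do not.
    return -advantage * sum(sample_logprobs)

# Hypothetical per-token log-probabilities and metric scores.
loss_better = scst_loss([-0.5, -1.0, -0.5], sample_reward=0.9, greedy_reward=0.6)
loss_worse = scst_loss([-0.5, -1.0, -0.5], sample_reward=0.4, greedy_reward=0.6)
print(loss_better, loss_worse)
```

Using the greedy caption as the baseline means no separate value network is needed, which is one reason the method is popular for captioning.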

learning-papers

60%

learning-papers is a curated collection of landmark papers in machine learning, designed to highlight important techniques and foundational research. The repository categorizes papers by topic, such as Deep Learning, Ensemble Methods, Optimization, and Natural Language Processing, making it easier to navigate significant contributions. Each entry often includes the paper's title, authors, publication year, and links to the paper itself, sometimes with alternative free versions or associated code. It also provides icons to indicate paywalled papers, freely available versions, associated code, precursor papers, iterations, blog posts, websites, videos, or slides, offering a comprehensive resource for understanding the evolution of machine learning concepts.

TransformerLens

60%

TransformerLens is an open-source Python library designed for the mechanistic interpretability of GPT-2 style language models. Maintained by Bryce Meyer and created by Neel Nanda, this tool enables users to load over 50 different open-source language models and expose their internal activations. Researchers can cache any internal activation and add functions to edit, remove, or replace these activations during model execution. The library supports in-depth analysis to reverse engineer the algorithms models learn from their weights, making it a crucial resource for understanding how large language models function internally. It also includes experimental support for Mamba / SSM architectures, providing bridge adapters for Mamba-1 and Mamba-2.

vecmap

60%

vecmap is an open-source framework designed to learn cross-lingual word embedding mappings. It enables users to build cross-lingual word embeddings from monolingual embeddings, with or without parallel data, using various methods including supervised, semi-supervised, identical, and fully unsupervised approaches. The framework also includes comprehensive evaluation tools for tasks such as word translation induction, word similarity/relatedness, and word analogy. It supports CUDA for faster processing on NVIDIA GPUs and is suitable for researchers and developers working on multilingual natural language processing tasks, particularly those focused on unsupervised machine translation.

LLMSys-PaperList

60%

LLMSys-PaperList is a comprehensive, curated list of academic resources focused on Large Language Model (LLM) systems. This open-source repository provides a valuable collection of papers, articles, tutorials, slides, and projects, covering various aspects of LLM systems. Users can explore topics such as LLM training (pre-training, post-training, fault tolerance), serving (LLM serving, agent systems, multi-modal serving), system efficiency optimization, and LLM frameworks. The list also includes sections on industrial LLM technical reports, ML conferences, ML systems survey papers, LLM benchmarks, and related ML readings. It serves as an essential resource for researchers and practitioners looking to keep abreast of the rapidly evolving LLM research landscape.

VideoCrafter

60%

VideoCrafter is an open-source video generation and editing toolbox developed by AILab-CVC, designed to overcome data limitations for high-quality video diffusion models. It features both Text2Video and Image2Video capabilities, allowing users to generate video content from text prompts or existing images. The tool has seen significant improvements with VideoCrafter2, offering better motion and concept combination even with limited data. It provides various checkpoints for different resolutions and models, including VideoCrafter1 and VideoCrafter2, available on Hugging Face. Researchers and developers can set up the environment via Anaconda and perform inference for text-to-video or image-to-video generation, or run a local Gradio demo. Technical reports and citations are provided for those interested in the underlying research.

ViTDet

60%

ViTDet offers an unofficial PyTorch implementation for object detection, leveraging plain Vision Transformer backbones. Based on the ECCV'22 paper "Exploring Plain Vision Transformer Backbones for Object Detection," this tool provides researchers and developers with a robust framework to experiment with advanced object detection models. It includes pre-trained weights and logs for various ViT-Base and ViTAE-Base models on MS COCO, supporting both detection and segmentation tasks. The implementation is designed for PyTorch and integrates with mmcv, timm, and einops, making it suitable for those working with modern deep learning architectures in computer vision.

vits2

60%

VITS2 is an unofficial implementation of a single-stage text-to-speech model designed to enhance the naturalness, efficiency, and quality of speech synthesis. It addresses limitations of previous models by proposing improved structures and training mechanisms, significantly reducing dependence on phoneme conversion for a fully end-to-end approach. The tool supports both single and multi-speaker TTS using datasets like LJ Speech and VCTK, or custom datasets. It provides installation instructions, environment setup with Conda, and examples for training and inference. VITS2 is a work in progress, with ongoing development to support features like speaker conditioning, high-resolution mel-spectrograms, and various architectural improvements.

vits

60%

VITS (Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech) is an advanced open-source project designed to generate highly natural-sounding audio from text. Unlike traditional two-stage TTS systems, VITS offers single-stage training and parallel sampling, improving efficiency without compromising quality. It incorporates variational inference augmented with normalizing flows and an adversarial training process to enhance generative modeling. A key differentiator is its stochastic duration predictor, which allows for synthesizing speech with diverse rhythms and pitches, reflecting the natural one-to-many relationship between text input and spoken output. This enables the creation of varied speech styles from the same text, making it suitable for a wide range of applications requiring expressive voice generation.

llm-hallucination-survey

60%

llm-hallucination-survey is an open-source repository offering a comprehensive reading list and survey paper focused on the critical issue of hallucination in large language models (LLMs). It categorizes hallucinations into input-conflicting, context-conflicting, and fact-conflicting types, providing extensive academic references for each. The resource is invaluable for researchers and academics seeking to understand the evaluation, explanation, and mitigation strategies for LLM hallucinations. It highlights how these issues undermine LLM reliability in real-world applications and serves as a central hub for cutting-edge research in this domain.

long-form-factuality

60%

long-form-factuality is an open-source project from Google DeepMind for benchmarking the factuality of large language models (LLMs) in long-form responses. The repository provides the official code for the paper "Long-form factuality in large language models." Key components include LongFact, a set of 2,280 fact-seeking prompts designed to elicit long-form responses, and the Search-Augmented Factuality Evaluator (SAFE), an automatic system for evaluating model responses. It also features F1@K, an extension of the F1 score to long-form settings, and an experimentation pipeline for benchmarking models from providers such as OpenAI and Anthropic using LongFact and SAFE. It is aimed at researchers and developers working on the factual accuracy of LLMs.
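As a rough illustration of how F1@K extends F1, the sketch below assumes the definition we understand the paper to use: precision is the share of supported facts in a response, and recall is the number of supported facts relative to a chosen target K, capped at 1. The function name and the example counts are hypothetical; the repository's own implementation may differ in detail.

```python
def f1_at_k(supported, not_supported, k):
    # No supported facts means both precision and recall are zero.
    if supported == 0:
        return 0.0
    precision = supported / (supported + not_supported)
    # Recall saturates once the response contains K supported facts.
    recall = min(supported / k, 1.0)
    return 2 * precision * recall / (precision + recall)

# Hypothetical response: 32 supported facts, 8 unsupported, target K = 64.
score = f1_at_k(supported=32, not_supported=8, k=64)
print(round(score, 4))
```

The parameter K encodes how many supported facts a reader wants, so the metric stops rewarding extra length once a response already contains K of them.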

DeepSearch Labs

60%

DeepSearch Labs is an AI-powered intelligence platform designed to help users find answers, quantify trends, and identify growth and risk impacts within their data. The platform supports various data formats, including Excel, PowerPoint, video, audio, JSON, and HTML, enabling comprehensive cross-database research. It assists users in generating insights and creating reports in multiple formats, allowing them to focus on strategic tasks. By fusing structured and unstructured data, DeepSearch Labs aims to transform complex information into actionable insights, providing recommendations for 'winners and losers' based on its analysis.