Shypd.ai

Research & Education

Browsing page 288 of AI tools for Research & Education, sorted by confidence score (our independent quality rating).

GenerativeImage2Text

59%

GenerativeImage2Text (GIT) is a repository from Microsoft that provides code examples and pre-trained models for generating text from images. It leverages a Generative Image-to-text Transformer for various vision and language tasks. Users can perform image captioning, where the model describes the content of an image, or visual question answering, where the model answers questions about an image. The tool supports inference on single images, multiple frames (for video analysis), and TSV files containing collections of images. It offers different model sizes (base and large) and fine-tuned versions for specific datasets like COCO, VQAv2, and TextCaps, allowing for tailored performance across diverse applications.

Get Worksheet

59%

Get Worksheet is an AI-powered platform designed to assist teachers in creating a wide range of educational materials. It offers a free worksheet generator that can produce customized worksheets for subjects like Math, English, Science, and Social Studies, catering to grades K-12. Beyond worksheets, the tool provides generators for comprehensive lesson plans, professional rubrics, IEP goals, matching quizzes, and report card comments, significantly reducing planning and grading time. All generated content is automatically saved to a user's library for easy access, editing, and re-downloading as print-ready PDFs. The platform emphasizes speed, customization, and comprehensive subject coverage, making it a valuable resource for educators.

gpt_paper_assistant

59%

gpt_paper_assistant is an open-source, GPT-4 based tool designed to help researchers stay updated with the latest papers on ArXiv. It functions as a personalized daily scanner, identifying papers relevant to specified topics and authors. The tool leverages GPT-4 for evaluating paper relevance and novelty, and can filter papers based on author matches and semantic scholar IDs. It runs automatically via GitHub Actions, publishing daily summaries to a static GitHub Pages website or posting directly to Slack. Users can customize topics, authors, and filtering thresholds, making it a highly adaptable solution for academic research.

holodeck

59%

Holodeck is a high-fidelity simulator designed for reinforcement learning and robotics research, leveraging the power of Unreal Engine 4. It offers a robust platform with over seven rich worlds and numerous scenarios for training AI agents. The simulator supports both Linux and Windows operating systems, allowing for easy extension and modification of training scenarios. A key feature is its ability to train and control multiple agents simultaneously, providing a flexible environment for complex research. It boasts a simple, OpenAI Gym-like Python interface for ease of use and high performance, capable of simulation speeds up to 2x real-time. Holodeck can run headless or with visual feedback, catering to different research needs.

GraphWaveletNeuralNetwork

59%

GraphWaveletNeuralNetwork is an open-source PyTorch implementation of the "Graph Wavelet Neural Network" (GWNN) as presented at ICLR 2019. This novel graph convolutional neural network addresses limitations of previous spectral graph CNN methods by utilizing graph wavelet transform, which avoids computationally expensive matrix eigendecomposition. The graph wavelets are sparse and localized, enhancing efficiency and interpretability for graph convolution tasks. The tool is designed for researchers and machine learning engineers working with graph-based semi-supervised classification, demonstrating superior performance on benchmark datasets like Cora, Citeseer, and Pubmed. It includes command-line arguments for easy configuration of training parameters and model options.

ImageCaptioning.pytorch

59%

ImageCaptioning.pytorch is an open-source codebase for image captioning research. It supports self-critical training, a reinforcement-learning technique that optimizes the captioning model directly against an evaluation metric such as CIDEr instead of word-level likelihood. Researchers can use bottom-up (region-level) image features and multi-GPU training, including DistributedDataParallel with pytorch-lightning. The codebase also supports Transformer captioning models, so it can serve as a framework for experimenting with newer architectures. It includes functionality for evaluating models on datasets such as COCO and Flickr30k, generating captions for raw images, and decoding with beam search. It ships with detailed instructions for installation, data preparation, and training, and is aimed at academics and developers in computer vision and natural language processing.

kornia

59%

Kornia is a differentiable computer vision library built on PyTorch, designed for spatial AI applications. It offers a comprehensive suite of differentiable image processing and geometric vision algorithms, allowing users to leverage powerful batch transformations, auto-differentiation, and GPU acceleration. Key features include a wide range of image processing operators like filters, transformations, and enhancements, as well as advanced augmentation pipelines for training AI models. Kornia also provides access to pre-trained AI models for tasks such as face detection, feature matching, segmentation, and classification. The library is expanding its focus towards end-to-end vision models, with a particular emphasis on integrating state-of-the-art Vision Language Models (VLM) and Vision Language Agents (VLA). It supports multi-framework usage, including TensorFlow, JAX, and NumPy, making it a versatile tool for developers and researchers in the AI and computer vision fields.

image_captioning

59%

image_captioning is an open-source TensorFlow implementation of a neural image caption generation system, based on the "Show, Attend and Tell" paper. This tool takes an image as input and outputs a descriptive sentence. It leverages a convolutional neural network (CNN) to extract visual features from the image, which are then decoded into a sentence by an LSTM recurrent neural network (RNN). A soft attention mechanism is integrated to enhance the quality and relevance of the generated captions. The project supports end-to-end training of both CNN and RNN components, allowing for fine-tuning with datasets like COCO train2014. Users can evaluate models, generate captions for new images, and monitor training progress with TensorBoard.
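The soft attention step mentioned above can be illustrated with a small, dependency-free sketch. This is not the repository's TensorFlow code: the function names, the toy feature vectors, and the use of a plain dot product to score each image region are assumptions made for brevity (the "Show, Attend and Tell" paper scores regions with a small learned network instead).

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(region_features, decoder_state):
    """Return (weights, context) for one decoding step.

    region_features: list of CNN feature vectors, one per image region.
    decoder_state:   current hidden state of the caption decoder.
    Scoring by dot product is an assumption of this sketch.
    """
    scores = [
        sum(f * h for f, h in zip(feature, decoder_state))
        for feature in region_features
    ]
    weights = softmax(scores)
    dim = len(region_features[0])
    # The context vector is the attention-weighted average of region features.
    context = [
        sum(w * feature[i] for w, feature in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = soft_attention(regions, state)
```

In this toy input the first and third regions align with the decoder state, so they receive equal weights that are larger than the weight of the second region, and the context vector leans toward them.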

Klu

59%

Klu is a meeting automation platform for teams. It automates meeting workflows and integrates with existing tools to streamline meeting management. Its stated aim is effortless note-taking: by connecting to the tools a team already uses, Klu seeks to centralize meeting-related tasks and information so that participants can concentrate on discussion and decisions.

LLMRec

59%

LLMRec is a novel framework implemented in PyTorch, designed to significantly improve recommendation systems through the application of three distinct LLM-based graph augmentation strategies. These strategies include reinforcing user-item interactive edges, enhancing item node attributes, and conducting user node profiling, all from a natural language perspective. The tool leverages content within online platforms like Netflix and MovieLens to augment interaction graphs. It provides code, original data, and augmented data, making it a valuable resource for researchers and data scientists working on recommendation systems. LLMRec also offers multi-modal datasets, including textual and visual data, and supports LLM-augmented textual data and embeddings for comprehensive research.

NeuralPDE.jl

59%

NeuralPDE.jl is an open-source solver package for Scientific Machine Learning (SciML) that uses Physics-Informed Neural Networks (PINNs) to solve several types of differential equations, including ordinary, stochastic, and partial differential equations (ODEs, SDEs, PDEs). By building on neural stochastic differential equations, it handles these problems with greater generality than classical methods. Key features include automated construction of physics-informed loss functions from a high-level symbolic interface, compatibility with machine learning libraries such as Flux.jl and Lux.jl for GPU-powered layers, and integration with NeuralOperators.jl for combining deep neural operators with physics-informed loss functions. The package also supports quadrature training strategies, adaptive loss functions, and neural adapters to accelerate training, which makes it suitable for complex scientific simulations and data fitting.

OpenFace

59%

OpenFace is a state-of-the-art, open-source toolkit designed for comprehensive facial behavior analysis. It enables real-time facial landmark detection, accurate head pose estimation, robust facial action unit recognition, and precise eye-gaze estimation. Developed by Tadas Baltrušaitis in collaboration with CMU MultiComp Lab, OpenFace is intended for computer vision and machine learning researchers, as well as the affective computing community. The tool stands out for its ability to run efficiently from a simple webcam without requiring specialized hardware, making advanced facial analysis accessible. It provides source code for both running and training models, ensuring flexibility and extensibility for research and application development.

pearai-master

59%

PearAI aims to be a curated inventory of leading AI tools in one centralized location. It provides a unified interface that lets users switch between different AI solutions without searching for alternatives. The project combines several repositories: `pearai-app` for VSCode integration and editor functionality, `pearai-submodule` for AI chat features, `pear-landing-page` for the website, and `pearai-documentation` for user guides. It also includes `pearai-server`, an optional convenience that lets users avoid supplying their own API keys. PearAI is built with TypeScript/Electron.js for the app, Next.js/React for the landing page, and a Python FastAPI server for the backend, with Supabase for authentication and database management.

parameter_efficient_instruction_tuning

59%

parameter_efficient_instruction_tuning is an open-source repository dedicated to the systematic comparison of various parameter-efficient fine-tuning (PEFT) methods for instruction tuning tasks. The project utilizes the SuperNI dataset as its primary benchmark for training and evaluation. Implementations of PEFT methods are adapted from well-known libraries such as adapter-transformers and peft. The repository includes bash scripts for running experiments, optimized for the hfai HPC platform, supporting features like experiment configuration, checkpoint management, and training state validation. It also addresses platform-specific considerations like PyTorch and CUDA compatibility, making it a valuable resource for researchers and developers working on efficient large language model fine-tuning.

Point-BERT

59%

Point-BERT is a PyTorch implementation of a pre-training paradigm for 3D point cloud Transformers, presented at CVPR 2022. Inspired by BERT, it uses a Masked Point Modeling (MPM) task in which point clouds are divided into local patches and a discrete Variational AutoEncoder (dVAE) tokenizes these patches. The pre-training objective is to recover the original point tokens at masked locations, supervised by the dVAE's output. The pre-trained Transformers are then applied to downstream tasks such as classification on ModelNet40 and ScanObjectNN, few-shot learning, and part segmentation on ShapeNetPart. It is aimed at researchers and engineers working on 3D point cloud analysis.

rome

59%

ROME (Rank-One Model Editing) is an open-source tool designed for researchers and developers to precisely locate and modify factual associations within large language models, specifically GPT-2 XL and GPT-J. This GPU-only implementation allows for targeted editing of model knowledge without extensive retraining. It provides functionalities for causal tracing to understand model behavior and a straightforward API for specifying rewrite requests. The repository includes evaluation suites for benchmarking editing methods against CounterFact, making it a valuable resource for advancing research in model interpretability and editability. Users can also integrate new editing methods for comparative analysis.
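The idea of a rank-one edit can be shown on a tiny weight matrix. The sketch below is a simplified illustration and not the repository's API: the function names are hypothetical, and it assumes an identity key covariance, whereas ROME itself weights the update by an estimated covariance of keys and derives the key and value vectors from the language model.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def rank_one_edit(W, key, value):
    """Return W' = W + (value - W key) key^T / (key^T key).

    After the edit, W' maps `key` exactly to `value`, while any vector
    orthogonal to `key` is mapped as before. Assumes identity key
    covariance (a simplification made for this sketch).
    """
    residual = [v - p for v, p in zip(value, matvec(W, key))]
    norm_sq = sum(k * k for k in key)
    return [
        [w_ij + r_i * k_j / norm_sq for w_ij, k_j in zip(row, key)]
        for row, r_i in zip(W, residual)
    ]


# Toy example: start from the identity and rewrite what `key` maps to.
W = [[1.0, 0.0], [0.0, 1.0]]
key = [1.0, 1.0]
new_value = [3.0, -1.0]
W_edited = rank_one_edit(W, key, new_value)

edited_output = matvec(W_edited, key)             # now equals new_value
untouched_output = matvec(W_edited, [1.0, -1.0])  # orthogonal to key
```

The update adds a single outer product to the matrix, which is why the association stored for one key can change without retraining and without disturbing directions orthogonal to that key.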

SEAL

59%

SEAL (learning from Subgraphs, Embeddings, and Attributes for Link prediction) is a novel framework designed for link prediction. It systematically transforms the link prediction task into a subgraph classification problem. For each target link, SEAL extracts its h-hop enclosing subgraph and constructs a node information matrix, which can include structural node labels, latent embeddings, and explicit attributes. This data is then fed into a graph neural network (GNN) to classify the existence of the link, allowing the model to learn from both graph structure features and latent/explicit node features simultaneously. The framework is implemented in both MATLAB and Python, with a PyTorch Geometric version available for testing on OGB, Planetoid, and custom datasets. Notably, SEAL can achieve strong performance even without node embeddings or attributes, leveraging purely graph structures, and can function as an inductive link prediction model.
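The subgraph extraction step described above can be sketched with a breadth-first search. This is a minimal illustration and not the SEAL codebase: the function names and the adjacency-dictionary format are hypothetical, and dropping the target link from the extracted subgraph (so a classifier cannot read the label off the input) is a choice made in this sketch.

```python
from collections import deque


def nodes_within_hops(adjacency, source, h):
    """Return the set of nodes at most h hops away from `source`."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == h:
            continue  # do not expand beyond h hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


def enclosing_subgraph(adjacency, x, y, h):
    """Return (nodes, edges) of the h-hop enclosing subgraph of link (x, y).

    Nodes are those within h hops of x or of y; edges are the edges of the
    original graph among those nodes, excluding the target link itself.
    """
    nodes = nodes_within_hops(adjacency, x, h) | nodes_within_hops(adjacency, y, h)
    edges = set()
    for u in nodes:
        for v in adjacency.get(u, ()):
            if v in nodes and u < v and {u, v} != {x, y}:
                edges.add((u, v))
    return nodes, edges


# Toy example: a path graph 0-1-2-3-4-5, target link (2, 3), h = 1.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
nodes, edges = enclosing_subgraph(graph, 2, 3, 1)
```

For this path graph the 1-hop enclosing subgraph of link (2, 3) contains nodes 1 through 4 and the edges (1, 2) and (3, 4); in a SEAL-style pipeline, such a subgraph, together with a node information matrix, would be the input to the GNN classifier.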

Spleen 3D Segmentation With MONAI

59%

Spleen 3D Segmentation With MONAI is an AI-powered application hosted on Hugging Face Spaces, designed for medical image analysis. This tool allows users to upload a 3D medical image containing a spleen, and it will process the image to generate a segmented output. The segmentation highlights the spleen, making it easier for medical professionals to analyze its structure and identify potential issues. Built with MONAI, a PyTorch-based framework for deep learning in healthcare imaging, this tool demonstrates the application of AI in assisting diagnostics and research within the medical domain. While the current live website indicates a runtime error, the intended functionality is to provide a clear, segmented view of the spleen from complex 3D medical scans.

segmentation_models.pytorch

59%

segmentation_models.pytorch is an open-source Python library for semantic image segmentation with PyTorch. It provides a high-level API that lets users create neural networks with minimal code and supports 12 encoder-decoder model architectures, including Unet, Unet++, Segformer, and DPT. The library offers more than 800 pretrained convolutional and transformer-based encoders, including timm support, which helps training converge faster and more stably. It also includes popular metrics and losses for training routines, such as Dice and Jaccard, and is compatible with ONNX export and with torch script, trace, and compile. These features make it useful to both researchers and practitioners in computer vision.
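For readers unfamiliar with the Dice and Jaccard measures named above, the following dependency-free sketch computes both for a pair of binary masks. It is not the library's implementation: the function names, the flat-list mask format, and the small `eps` smoothing term are assumptions of this example.

```python
def overlap_counts(pred, target):
    """Count true positives and the sizes of the two binary masks."""
    true_positive = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    return true_positive, sum(pred), sum(target)


def dice_score(pred, target, eps=1e-7):
    """Dice = 2 * |A and B| / (|A| + |B|); eps avoids division by zero."""
    tp, n_pred, n_target = overlap_counts(pred, target)
    return (2 * tp + eps) / (n_pred + n_target + eps)


def jaccard_score(pred, target, eps=1e-7):
    """Jaccard (IoU) = |A and B| / |A or B|; eps avoids division by zero."""
    tp, n_pred, n_target = overlap_counts(pred, target)
    union = n_pred + n_target - tp
    return (tp + eps) / (union + eps)


# Toy example: flattened binary masks of six pixels.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
dice = dice_score(pred, target)
jaccard = jaccard_score(pred, target)
```

Here two pixels overlap out of three predicted and three true foreground pixels, so Dice is about 0.667 and Jaccard is about 0.5. The two measures are monotonically related (Jaccard = Dice / (2 - Dice)), so they rank predictions identically while penalizing errors on different scales.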

Self-Driving Delivery Agent

59%

Self-Driving Delivery Agent, also known as DriVLMe, is an open-source project providing the official implementation of the IROS 2024 paper "Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences". It is intended for researchers and developers working on autonomous driving systems, particularly those interested in combining large language models (LLMs) with real-world driving experience. The repository covers setting up a conda environment, preparing LLaVA weights, and training or fine-tuning models on datasets such as BDD-X and SDN. It includes scripts for pretraining, fine-tuning, and evaluating autonomous driving agents.

Focoos AI

59%

Focoos AI offers efficient computer vision models designed to reduce costs, automate hardware integration, and perform well across a range of devices. The platform lets ML engineers train, deploy, and iterate on models in both cloud and edge environments. According to the vendor, its models deliver up to 10x faster inference and are 4x lighter in compute and memory than mainstream alternatives. Focoos AI provides pre-trained, production-ready models that can be deployed immediately and fine-tuned. It features an all-in-one platform for managing, comparing, monitoring, and deploying models, alongside an open-source library for community collaboration and local use. The tool emphasizes security, control, and sustainability, and targets applications in manufacturing, smart cities, and autonomous systems.

TensorLayer

59%

TensorLayer is a powerful, open-source deep learning and reinforcement learning library built for scientists and engineers. It offers an extensive collection of customizable neural layers, enabling rapid development of advanced AI models. Inspired by PyTorch, TensorLayer provides transparent and flexible APIs, making it easier to build and train complex AI models compared to other TensorFlow wrappers. It supports multiple backends including TensorFlow, PyTorch, MindSpore, PaddlePaddle, OneFlow, and Jittor, allowing deployment on various hardware like Nvidia-GPU and Huawei-Ascend. The library is recognized for its simplicity, flexibility, and high performance, with comprehensive documentation and a large community.

tiefvision

59%

tiefvision is an integrated end-to-end image-based search engine powered by deep learning. It offers comprehensive functionalities including image classification, image location (based on OverFeat), and image similarity (based on Deep Ranking). The system is built using Torch for its deep learning modules and the Play Framework (Scala version) for its tooling modules. It currently supports Linux operating systems with CUDA-enabled GPUs, indicating a focus on performance-intensive image processing tasks. Beyond its core deep learning capabilities, tiefvision also provides a suite of web tools designed to streamline dataset generation and enhance productivity, such as visual database editors and automated dataset generation for training and testing.

Top2Vec

59%

Top2Vec is an open-source Python library designed for advanced topic modeling and semantic search. It automatically detects topics within text data and generates jointly embedded topic, document, and word vectors. The library offers a 'classic' version for general topic modeling and a newer 'contextual' version that leverages contextual token embeddings to identify multiple topics per document and even detect topic segments within documents. This contextual approach provides a more nuanced understanding of complex texts. Key features include automatic topic number detection, hierarchical topic generation, keyword-based topic search, and document search by topic or keywords. Top2Vec eliminates the need for stop word lists, stemming, or lemmatization, and works effectively on short texts. It also supports various embedding models like Doc2Vec, Universal Sentence Encoder, and BERT Sentence Transformer for flexible deployment.