Shypd.ai

Research & Education

Browsing page 140 of AI tools for Academic Research in Research & Education. Sorted by confidence score — our independent quality rating.

Online-3D-BPP-DRL

55%

Online-3D-BPP-DRL is an open-source project that provides the implementation of the paper "Online 3D Bin Packing with Constrained Deep Reinforcement Learning." This tool is designed for researchers and developers interested in optimizing 3D bin packing problems using AI. It allows users to train new models on randomly generated sequences or test existing models with various data sets. The repository includes code for user-study applications, multi-bin algorithms, and MCTS for comparison, offering a comprehensive environment for experimentation and development in this domain. Users can adjust network architectures and parameters to suit their specific needs, making it a flexible platform for advanced AI research in logistics and optimization.

Online-3D-BPP-PCT

55%

Online-3D-BPP-PCT is an open-source implementation of a method for online 3D bin packing. It applies deep reinforcement learning (DRL) to a hierarchical packing configuration tree (PCT) to make solutions to the online 3D Bin Packing Problem (BPP) more practical. With this approach the DRL model can handle practical constraints and still performs well in continuous solution spaces. Key features include arbitrary container and item sizes, support for continuous online 3D-BPP, algorithms for approximating stability, and improved performance under complex constraints. It also offers more suitable heuristic baselines for further work in the area, and more stable training.

PyGCL

55%

PyGCL is a PyTorch-based open-source library specifically designed for Graph Contrastive Learning (GCL). It provides a comprehensive framework for researchers and developers to implement and experiment with various GCL algorithms. The library features modularized GCL components, including graph augmentation techniques like Edge Adding, Feature Masking, and Node Dropping, as well as different contrasting architectures and modes (single-branch, dual-branch, bootstrapped, within-embedding). PyGCL also implements a variety of contrastive objectives such as InfoNCE, JSD, and Barlow Twins, alongside negative sampling strategies. It supports standardized evaluation with evaluators like Logistic Regression and SVM, and offers utilities for managing experiments, making it a valuable tool for advancing graph representation learning.
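
To make the InfoNCE objective mentioned above concrete, here is a minimal sketch in plain Python. It does not use PyGCL's API; the function names `cosine` and `info_nce`, the temperature value `tau=0.5`, and the toy embeddings are all assumptions chosen for illustration. The sketch treats row i of one view and row i of the other view as the positive pair, and every other row of the second view as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, tau=0.5):
    """Mean InfoNCE loss; row i of view_a and row i of view_b form the positive pair.

    Hypothetical helper for illustration only, not part of PyGCL.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        # Temperature-scaled similarities between the anchor and every row of view_b.
        logits = [cosine(anchor, other) / tau for other in view_b]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the positive among all candidates.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: two nodes, two views.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss is lower when each node's two views agree than when the views are mismatched, which is the behavior a contrastive objective is meant to reward.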

PMRF

55%

PMRF (Posterior-Mean Rectified Flow) is an open-source implementation of a novel photo-realistic image restoration algorithm, presented at ICLR 2025. It provably approximates the optimal estimator that minimizes the Mean Squared Error (MSE) under a perfect perceptual quality constraint. The tool provides capabilities for blind face image restoration and controlled experiments, offering model checkpoints and test datasets for evaluation. It supports various architectures, including HDiT and UNet, and includes installation instructions for setting up a conda environment. PMRF is ideal for researchers and developers focused on advancing image restoration techniques.

Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions

55%

Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions is an open-source project offering solutions to the exercises in the second edition of the seminal book "Reinforcement Learning: An Introduction" by Richard S. Sutton and Andrew G. Barto. This resource is particularly useful for self-learners and students who lack official solution manuals or proper learning environments. The project covers mathematical proofs and some challenging coding problems, with contributions from various collaborators. It aims to provide a comprehensive guide to the theoretical backbone of reinforcement learning. The project acknowledges that its solutions may contain errors and encourages community contributions of corrections and new solutions.

street_gaussians

55%

Street Gaussians is an open-source project presented at ECCV 2024, focusing on modeling dynamic urban scenes using Gaussian Splatting. This tool provides a framework for researchers and developers to reconstruct complex, moving urban environments from video data. It includes data preparation functionality, such as converting the Waymo Open Dataset, generating LiDAR depth, and creating sky masks. Users can configure parameters based on 3D Gaussian Splatting, train models, render scenes, and visualize results. The project offers scripts for training and rendering on example and experimental Waymo scenes, making it a valuable resource for advancing research in dynamic 3D scene reconstruction.

SEAM

55%

SEAM (Self-supervised Equivariant Attention Mechanism) is an open-source implementation designed for weakly supervised semantic segmentation. This tool addresses the challenge of generating accurate object masks from image-level supervision, a common limitation of class activation map (CAM) solutions. SEAM introduces a self-supervised approach that enforces consistency regularization on the CAMs predicted for differently transformed versions of an image, narrowing the gap between full and weak supervision. Additionally, it incorporates a pixel correlation module (PCM) to refine predictions by leveraging context appearance information and similar neighbors. Extensive experiments on the PASCAL VOC 2012 dataset demonstrate SEAM's superior performance compared to state-of-the-art methods using the same level of supervision, making it a valuable resource for AI researchers and computer vision engineers.
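
The consistency idea described above can be sketched as follows: a CAM computed on a transformed image should match the transformed CAM of the original image. This is a toy illustration in plain Python, not SEAM's code. The horizontal flip, the mean absolute difference, and the two stand-in "CAM" functions (`equivariant_cam`, `biased_cam`) are assumptions made for the example.

```python
def hflip(grid):
    """Horizontally flip a 2D grid stored as a list of rows."""
    return [row[::-1] for row in grid]


def consistency_loss(cam_fn, image, transform=hflip):
    """Mean absolute difference between T(cam(x)) and cam(T(x)).

    Hypothetical helper: the loss is zero when cam_fn is equivariant to transform.
    """
    cam_then_transform = transform(cam_fn(image))
    transform_then_cam = cam_fn(transform(image))
    diffs = [
        abs(a - b)
        for row_a, row_b in zip(cam_then_transform, transform_then_cam)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


def equivariant_cam(image):
    """Stand-in CAM that scales each pixel, so it commutes with flipping."""
    return [[2.0 * px for px in row] for row in image]


def biased_cam(image):
    """Stand-in CAM that weights pixels by column index, so flipping changes it."""
    return [[px * (col + 1) for col, px in enumerate(row)] for row in image]


image = [[0.0, 1.0], [2.0, 3.0]]
loss_equivariant = consistency_loss(equivariant_cam, image)
loss_biased = consistency_loss(biased_cam, image)
print(loss_equivariant, loss_biased)
```

Used as a training signal, a term like this penalizes a network whose activation maps change inconsistently under the transformation, without needing pixel-level labels.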

wespeaker

55%

wespeaker is a comprehensive, open-source toolkit primarily focused on speaker embedding learning, with applications in speaker verification, recognition, and diarization. It supports both online feature extraction and the loading of pre-extracted features in Kaldi format. The toolkit offers command-line and Python programming interfaces for tasks like embedding extraction, similarity computation, and diarization. It boasts continuous development with recent updates including support for various models like w2v-bert2, Xi-vector, SimAM_ResNet, and Whisper-PMFA, as well as advanced features like quality-aware score calibration and MNN inference engine integration. wespeaker also provides detailed recipes for popular datasets like VoxCeleb, CnCeleb, and NIST SRE16, making it a robust solution for researchers and developers in the speech technology domain.

SAM3D Body with Rerun

55%

SAM3D Body with Rerun is an AI tool designed for 3D body reconstruction, providing capabilities to visualize and analyze human bodies in three dimensions. This tool is particularly valuable for researchers and developers involved in AI model testing, offering a platform to interact with 3D body data. Hosted on Hugging Face, it aims to facilitate advancements in areas requiring detailed human body analysis. While the current live website indicates a runtime error, suggesting it's not fully operational, its intended purpose is to serve as a resource for those working with 3D human body models.

SEED-Bench Leaderboard

55%

SEED-Bench Leaderboard is a platform designed for evaluating and comparing the performance of various AI models. Users can submit their model evaluation results in JSON format, providing details such as the model name, type, size, and the evaluation method used. The platform then analyzes and displays the model's performance on a public leaderboard. This tool serves as a centralized hub for researchers and developers to track advancements and benchmark their models against others in the AI field. While the current live website indicates a build error, the intended functionality is to facilitate transparent and comparable evaluation of AI models.

SAM3 VLM-FO1

55%

SAM3 VLM-FO1 is an AI tool designed for complex text label detection and object identification within images. Users can upload an image and provide natural language descriptions of the objects they wish to identify. The tool, leveraging SAM3 with VLM-FO1, then processes this input to highlight and label the specified objects directly on the image. This functionality makes it particularly useful for computer vision tasks and AI research, offering a practical application for detailed image annotation and understanding based on textual queries. It simplifies the process of identifying and categorizing visual elements through intuitive natural language interaction.

Awesome-Autonomous-Driving

55%

Awesome-Autonomous-Driving is a comprehensive GitHub repository maintained by the Autonomous Driving Heart team, serving as a central hub for resources related to the autonomous driving industry. It meticulously organizes surveys, research papers, educational courses, and community discussions, covering the entire technology stack of autonomous driving. The repository provides in-depth learning paths for various sub-domains, including perception (BEV, multimodal, occupancy, radar-vision fusion), localization and mapping (online HD maps, SLAM), multi-sensor calibration, NeRF, visual language models, world models, planning and control, trajectory prediction, and AI model deployment. Additionally, it offers insights into industry-specific technical solutions and facilitates career opportunities through internal referral channels with numerous autonomous driving companies. This platform is designed to foster learning and collaboration among algorithm engineers and researchers.

Awesome-Vision-Mamba-Models

55%

Awesome-Vision-Mamba-Models is an open-source GitHub repository dedicated to the rapidly evolving field of visual Mamba models. It functions as a comprehensive resource, offering a survey of existing models and exploring new outlooks and advancements in the domain. The repository is actively maintained and updated with the latest research papers and developments, making it an invaluable hub for researchers, academics, and practitioners working with or interested in visual Mamba. Its structure allows for easy navigation through various models and related information, fostering knowledge sharing and collaboration within the AI community.

Awesome-VLA4AD

55%

Awesome-VLA4AD is a comprehensive and continuously updated repository dedicated to Vision–Language–Action models for Autonomous Driving (VLA4AD). It serves as the companion resource to a survey paper, offering a curated collection of research papers, datasets, and tools in the field. The repository categorizes VLA4AD advancements into stages, from explanatory perception modules to end-to-end reasoning and control architectures. It details various models, their key features, and links to their respective papers and codebases. Additionally, it lists relevant datasets and benchmarks, making it an invaluable resource for researchers, academics, and engineers working on autonomous driving systems.

Gaussian-SLAM

55%

Gaussian-SLAM is an open-source project available on GitHub, designed for photo-realistic dense Simultaneous Localization and Mapping (SLAM). It leverages Gaussian splatting to achieve high-quality 3D reconstruction, offering a robust solution for researchers and engineers in computer vision and robotics. The tool supports various datasets including Replica, TUM_RGBD, ScanNet, and ScanNet++, and provides scripts for easy setup and data downloading. Users can configure and run SLAM experiments, reproduce results, and even generate fly-through videos based on reconstructed scenes. It has been tested on RTX 3090 and RTX A6000 GPUs.

ObjectDetectionImbalance

55%

ObjectDetectionImbalance is a comprehensive repository dedicated to cataloging research papers that address imbalance problems within the field of object detection. This resource is meticulously maintained to provide an up-to-date list of relevant studies, organized according to a specific taxonomy outlined in a key review paper. Researchers and practitioners in computer vision can leverage this repository to quickly find papers on topics such as class imbalance (foreground-background, foreground-foreground), scale imbalance (object/box-level, feature-level), spatial imbalance (regression loss, IoU distribution, object location), and objective imbalance. It serves as a valuable reference for understanding and tackling common challenges in object detection research.

Transfer Learning Time Series

55%

Transfer Learning Time Series is an AI tool hosted on Hugging Face Spaces, designed for exploring and experimenting with transfer learning in the context of time series analysis. This platform allows users to apply knowledge gained from one time series dataset to another, which can be particularly useful for improving model performance on new or limited datasets. While the current live website indicates a runtime error, the tool's intent is to provide a space for researchers and practitioners to test and develop advanced time series forecasting and analysis methods using state-of-the-art AI techniques. It aims to facilitate the understanding and application of transfer learning principles in real-world time series challenges.

nerfies.github.io

55%

Nerfies is an open-source project that hosts the source code for the Nerfies website, which is dedicated to Deformable Neural Radiance Fields. This repository serves as a valuable resource for researchers and developers working with neural radiance fields, particularly those interested in creating dynamic and deformable 3D scenes from 2D images. The project is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, encouraging collaboration and further development within the AI community. It provides the foundational code for understanding and implementing Nerfies, making it an essential reference for advancing research in computer vision and graphics.

pgmpy

55%

pgmpy is an open-source Python library designed for causal and probabilistic reasoning through graphical models. It offers comprehensive implementations of data structures for various models including DAGs, PDAGs, MAGs, PAGs, Bayesian Networks, Dynamic Bayesian Networks, and Structural Equation Models. The toolkit includes algorithms for key tasks such as causal discovery, causal identification, causal and probabilistic inference, model validation, parameter estimation, and simulations. Its modular and extensible API ensures compatibility with scikit-learn, allowing direct use, integration into sklearn pipelines, or building higher-level tools. pgmpy supports both discrete and linear Gaussian data, as well as mixture data with arbitrary relationships.

Grounding Dino Inference

55%

Grounding Dino Inference is an AI tool hosted on Hugging Face Spaces, designed for advanced object detection and image analysis. Users can upload an image and then provide text descriptions of the objects they wish to identify. The application uses the Grounding DINO model to locate and highlight the specified objects within the uploaded image. This tool is particularly useful for researchers and developers working in computer vision, offering a straightforward interface to perform complex inference tasks. It provides a practical demonstration of the Grounding DINO model's ability to identify diverse objects from natural language input.

VisIT Bench Leaderboard

55%

VisIT Bench Leaderboard is a platform designed to benchmark and compare the performance of various AI models. Hosted on Hugging Face Spaces, it provides a centralized location for researchers and engineers to assess the state-of-the-art in different AI tasks. Users can access the latest evaluation results and contribute their own model predictions by running the provided auto-evaluation code and submitting their outputs via email. This collaborative approach fosters transparency and accelerates progress within the AI community by offering clear performance metrics and facilitating direct comparisons between different methodologies and models.

ZeroEval Leaderboard

55%

ZeroEval Leaderboard is an AI tool developed by AllenAI, available as a Hugging Face Space, designed for evaluating and comparing the performance of various AI models. This application embeds ZeroEval, allowing users to integrate and utilize its evaluation tools directly on their websites without requiring any input. It serves as a centralized platform for researchers and developers to assess and benchmark AI model capabilities, fostering transparency and progress in the AI community. The tool is freely accessible and operates as a web application.

Z3D E621 Convnext Space

55%

Z3D E621 Convnext Space is a Hugging Face Space designed to analyze images and provide relevant tags. Users can either upload an image or capture one directly through the application. The tool then processes the image using a ConvNeXt model and returns a list of tags, each accompanied by a confidence score. This functionality is particularly useful for organizing image libraries, enhancing searchability, or understanding the content of an image through automated tagging. It offers a straightforward interface for quick image analysis.

Market Price Simulator

55%

Market Price Simulator is a browser-based trading sandbox designed for exploring financial market dynamics. Users can create multiple traders, place buy and sell orders, and observe how trades are automatically matched and prices evolve in real time. This simulator provides a visible order book and a history of trades, making it an ideal platform for understanding price formation, supply and demand, and order volume without financial risk. It's a valuable resource for students, researchers, and anyone interested in the mechanics of financial markets.
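
The automatic matching described above can be illustrated with a small limit order book. This is a sketch under stated assumptions, not the simulator's actual logic: the listing does not say which matching rule the simulator uses, so the example assumes price-time priority and executes each trade at the resting order's price. The `OrderBook` class, its method names, and the trader names are hypothetical.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy limit order book with price-time priority (an assumed matching rule)."""

    def __init__(self):
        self._bids = []  # heap keyed on negated price, so the highest bid is on top
        self._asks = []  # heap keyed on price, so the lowest ask is on top
        self._arrival = count()  # tie-breaker: earlier orders match first
        self.trades = []  # history of (buyer, seller, price, quantity)

    def submit(self, trader, side, price, qty):
        """Match an incoming limit order, then rest any unfilled remainder."""
        if side not in ("buy", "sell"):
            raise ValueError("side must be 'buy' or 'sell'")
        own, opposite = (self._bids, self._asks) if side == "buy" else (self._asks, self._bids)
        while qty > 0 and opposite:
            resting = opposite[0][2]
            if side == "buy":
                crosses = price >= resting["price"]
            else:
                crosses = price <= resting["price"]
            if not crosses:
                break
            fill = min(qty, resting["qty"])
            if side == "buy":
                buyer, seller = trader, resting["trader"]
            else:
                buyer, seller = resting["trader"], trader
            # Assumption: the trade prints at the resting order's price.
            self.trades.append((buyer, seller, resting["price"], fill))
            qty -= fill
            resting["qty"] -= fill
            if resting["qty"] == 0:
                heapq.heappop(opposite)
        if qty > 0:
            key = -price if side == "buy" else price
            order = {"trader": trader, "price": price, "qty": qty}
            heapq.heappush(own, (key, next(self._arrival), order))

    def best(self, side):
        """Return (price, quantity) of the best resting order on a side, or None."""
        heap = self._bids if side == "buy" else self._asks
        if not heap:
            return None
        order = heap[0][2]
        return order["price"], order["qty"]


book = OrderBook()
book.submit("alice", "sell", 101, 5)
book.submit("bob", "sell", 100, 5)
book.submit("carol", "buy", 101, 7)
print(book.trades)
print(book.best("sell"))
```

In this run the buy order first takes the cheaper ask and then part of the next one, so the last traded price moves from 100 to 101. That is the kind of price formation the simulator lets users watch in its order book and trade history.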