Research & Education
IntrotoAI Mental Health Project
The IntrotoAI Mental Health Project is an educational AI tool developed by aiEDUcurriculum and hosted on Hugging Face Spaces. It aims to raise awareness about mental health by demonstrating how AI models can process user input to provide personalized recommendations. Users can enter their data, and the application will utilize an AI model to predict the best outcome from a provided dataset. This project serves as a practical example for understanding AI's application in sensitive domains like mental health, offering insights into data-driven predictions and model interpretation. It is designed to be accessible and free to use, making AI education in mental health more approachable.
CVPR-2019-Paper-Statistics
CVPR-2019-Paper-Statistics is an open-source project offering detailed statistics and visualizations for papers accepted at the CVPR 2019 conference. Inspired by ICLR2019-OpenReviewData, this tool analyzes the acceptance rate trends from 2015 to 2019, highlighting the significant increase in paper submissions and the corresponding decrease in acceptance rates. It also provides insights into the most frequent keywords in accepted papers, such as 'Image', 'detection', '3d', 'object', 'video', 'segmentation', 'adversarial', 'recognition', and 'visual'. The project includes Jupyter Notebook code for analysis and visualization, supporting both CSV and website data formats, and requires Python 3.5 with libraries like selenium, wordcloud, and matplotlib.
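The keyword and acceptance-rate analysis described above can be approximated with the Python standard library alone. The sketch below is illustrative and is not the project's notebook code: the sample titles, the stopword list, and the function names keyword_counts and acceptance_rate are all hypothetical.

```python
from collections import Counter
import re

# Hypothetical sample titles; the real project reads titles from CSV or website data.
titles = [
    "Deep Image Segmentation for Video",
    "3D Object Detection from Point Clouds",
    "Adversarial Image Recognition",
]

# Small illustrative stopword list (an assumption, not taken from the project).
STOPWORDS = {"for", "from", "of", "the", "a", "and", "in", "with"}


def keyword_counts(titles):
    """Count lower-cased words across all titles, skipping stopwords."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z0-9]+", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


def acceptance_rate(accepted, submitted):
    """Return accepted / submitted as a fraction."""
    if submitted <= 0:
        raise ValueError("submitted must be positive")
    return accepted / submitted


counts = keyword_counts(titles)
print(counts.most_common(3))
```

A Counter like this could feed a word cloud or bar chart directly, which is roughly the kind of visualization the description mentions.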
deep-reinforcement-learning-papers
deep-reinforcement-learning-papers is a comprehensive, open-source GitHub repository dedicated to cataloging papers and resources related to deep reinforcement learning. The collection is organized into categories such as Deep Value Function, Deep Policy, Deep Actor-Critic, Deep Model, and Application to Non-RL Tasks, making it easier for users to navigate specific areas of interest. It also includes sections for talks, slides, and other miscellaneous resources. The project is actively maintained with a stated goal to continuously add more papers and improve classification methods, welcoming contributions from the community. This resource is ideal for anyone looking to explore the foundational and cutting-edge research in deep reinforcement learning.
Deep-Reinforcement-Stock-Trading
Deep-Reinforcement-Stock-Trading is a lightweight, open-source framework for applying deep reinforcement learning algorithms to stock trading and portfolio management. It offers a modular and scalable environment for researchers and developers to explore AI strategies in finance. It includes features for training and evaluating DDPG and DQN agents, with built-in metrics and visualizations. The framework supports single stock types and basic actions like buy, hold, and sell, with plans to integrate more sophisticated algorithms, complex state representations, and high-quality data sources for backtesting. It's ideal for those looking to experiment with AI in financial markets.
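To make the buy, hold, and sell action space concrete, here is a minimal single-stock environment sketch. It is not the framework's own API: the class name SingleStockEnv, the action encoding, the one-share trade size, and the reward definition (change in portfolio value) are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical action encoding; the real framework may number actions differently.
BUY, HOLD, SELL = 0, 1, 2


@dataclass
class SingleStockEnv:
    prices: list          # one price per time step
    cash: float = 1000.0  # assumed starting balance
    shares: int = 0
    t: int = 0            # index of the current time step

    def portfolio_value(self):
        """Cash plus the market value of held shares at the current price."""
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action):
        """Trade at most one share at the current price, then advance one step.

        Returns (reward, done), where reward is the change in portfolio value.
        """
        price = self.prices[self.t]
        before = self.portfolio_value()
        if action == BUY and self.cash >= price:
            self.cash -= price
            self.shares += 1
        elif action == SELL and self.shares > 0:
            self.cash += price
            self.shares -= 1
        # HOLD, or an infeasible BUY/SELL, leaves the position unchanged.
        self.t += 1
        reward = self.portfolio_value() - before
        done = self.t == len(self.prices) - 1
        return reward, done
```

A DQN or DDPG agent would call step in a loop, choosing each action from the current state; the agent itself is out of scope for this sketch.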
Video-XL
Video-XL is an open-source project offering a family of efficient vision-language models (VLMs) designed for understanding extremely long videos, capable of processing hour-scale content. The project includes models like Video-XL2 and Video-XL-Pro, which have achieved state-of-the-art results on various long video understanding benchmarks. Video-XL-Pro, for instance, can process up to 10,000 frames on an 80 GB GPU with only 3 billion parameters. The project provides models, training, and evaluation code, making it a valuable resource for researchers and developers working with extensive video data. It builds upon existing codebases like LongVA and LMMs-Eval for its development and evaluation processes.
Embodied_AI_Paper_List
Embodied_AI_Paper_List is an open-source repository maintained by HCPLab at SYSU and Pengcheng Laboratory, offering a comprehensive collection of papers and resources focused on Embodied AI. This resource is designed to serve as a foundational reference for researchers and practitioners, bridging the gap between cyberspace and the physical world through intelligent systems. The repository covers key areas such as embodied perception, interaction, agent development, and sim-to-real adaptation, including state-of-the-art methods, essential paradigms, and comprehensive datasets. It also explores the role of Multi-modal Large Models (MLMs) and World Models (WMs) in facilitating interactions for embodied agents, highlighting their significance in both digital and physical environments. The list is regularly updated with the latest advancements and includes a survey paper accepted by IEEE/ASME Transactions on Mechatronics.
Moonshot Math
Moonshot Math is a formal reasoning model available as a Hugging Face Space, designed to assist users in solving complex mathematical problems. It functions by taking a user-provided math problem or formal statement and generating a detailed, step-by-step solution in Lean 4 code. This capability makes it a valuable resource for individuals seeking to understand or verify mathematical proofs. The tool leverages advanced AI to reason and prove theorems, offering a unique approach to mathematical problem-solving and exploration. Its focus on formal proofs in Lean 4 distinguishes it as a specialized tool for those involved in advanced mathematics or formal verification.
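For readers unfamiliar with the target format, the following is a toy example of what a formal statement and its proof look like in Lean 4. It was written by hand for illustration and is not output from Moonshot Math; the theorem name add_comm_example is hypothetical, and only the core library lemma Nat.add_comm is used.

```lean
-- A formal statement with a term-mode proof: addition on natural numbers commutes.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- The same statement proved with a tactic block instead of a direct term.
example (a b : Nat) : a + b = b + a := by
  rw [Nat.add_comm]
```

Because Lean checks every step mechanically, a proof in this form is either accepted by the compiler or rejected, which is what makes it useful for verification.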
fpn.pytorch
fpn.pytorch offers a pure PyTorch implementation of the Feature Pyramid Network (FPN) for object detection, building upon a Faster R-CNN implementation. This project stands out for its complete conversion of all NumPy implementations to PyTorch, ensuring a consistent and efficient environment. A key feature is its support for training with batch sizes greater than one, achieved by revising all relevant layers including dataloader, RPN, and ROI-pooling. It also leverages a multiple GPU wrapper (nn.DataParallel) for flexible scaling across one or more GPUs. The implementation integrates three pooling methods (ROI pooling, ROI align, and ROI crop), all adapted for multi-image batch training. Benchmarking has been conducted on datasets like PASCAL VOC and COCO, demonstrating its performance.
gaussian_splatting_notes
Gaussian Splatting Notes is a free, open-source educational resource offering a comprehensive breakdown of the mathematical formulae behind Gaussian Splatting. This guide, presented as a text version of an explanatory stream, delves into the intricacies of the rasterization process, specifically covering the forward and backward passes. It aims to provide as many details as possible, highlighting core algorithmic concepts and referencing original code snippets to aid understanding. The resource also includes important insights marked with '💡' and clarifies complex topics like 3D covariance reparametrization and 2D Gaussian projection, making it an invaluable aid for those studying this advanced 3D rendering technique.
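For orientation, the two topics named above are usually written as follows in the original 3D Gaussian Splatting paper. The notation here follows that paper and may differ from the symbols used in the notes: R is a rotation matrix, S a diagonal scaling matrix, W the world-to-camera (viewing) transformation, and J the Jacobian of the affine approximation of the projective transformation.

```latex
% 3D covariance reparametrization: guarantees a positive semi-definite matrix
\Sigma = R \, S \, S^{\top} R^{\top}

% 2D projection of the Gaussian's covariance into image space
\Sigma' = J \, W \, \Sigma \, W^{\top} J^{\top}
```

Optimizing R (as a quaternion) and S separately, rather than the entries of the covariance directly, keeps the covariance valid throughout gradient descent.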
Object Detection Web
Object Detection Web is a free, web-based AI tool hosted on Hugging Face Spaces, developed by Xenova. It provides a straightforward way to perform object detection on images. Users can easily upload their own images or select from example images to see the application identify and label various objects present. This tool is particularly useful for individuals interested in learning about object detection technology, exploring its capabilities, or for simple task automation where identifying objects in images is required. Its accessible web interface makes it suitable for educational purposes and fun exploration without requiring any technical setup.
hate-speech-and-offensive-language
The hate-speech-and-offensive-language repository is an open-source project associated with the paper "Automated Hate Speech Detection and the Problem of Offensive Language" from ICWSM 2017. It offers a valuable dataset, lexicons, and Python 2.7 code for researchers and developers interested in analyzing and detecting hate speech and offensive language in online content, particularly from Twitter. The repository also includes a classifier script and instructions for running it on new data. While the project is no longer actively maintained, it serves as a foundational resource for understanding and addressing the complexities of offensive language detection, with a focus on the nuances of racial bias in such datasets.
Noteey
Noteey is a visual note-taking application designed for deep thinking and knowledge management, offering an infinite canvas to learn, brainstorm, and transform ideas into insights. It supports a wide array of content, including text, images, sticky notes, weblinks, PDFs, mind maps, videos, and sketches, all unified in one space. Key features include a comprehensive highlight system for breaking down documents and videos, timestamped video and audio notes, and drawing tools for creating diagrams. Noteey operates offline-first, storing data locally on your device for security and speed, and allows for local backups and sharing of projects. It also offers AI tools like YouTube and PDF summarizers.
IsaacGymEnvs
IsaacGymEnvs is a collection of reinforcement learning environments specifically designed for the NVIDIA Isaac Gym platform. These environments are optimized for high-performance GPU-based physics simulation, as detailed in the NeurIPS 2021 Datasets and Benchmarks paper. The repository offers an easy-to-use API for creating vectorized environments, supporting various tasks like Ant locomotion, Cartpole, and AllegroHand manipulation. It includes features such as headless training, checkpoint loading, multi-GPU training, population-based training, and integration with Weights & Biases for experiment tracking. The framework also incorporates domain randomization to enhance sim-to-real transfer of trained policies, making it a powerful tool for advanced robot learning research and development.
Image-Adaptive-YOLO
Image-Adaptive-YOLO is an open-source implementation of an object detection model engineered to perform robustly in adverse weather conditions. Based on the research paper "Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions" (AAAI 2022), this tool incorporates image-adaptive filtering techniques to enhance detection accuracy in scenarios like fog, darkness, or other challenging visual environments. The project provides code for installation, dataset preparation (including PASCAL VOC, RTTS, ExDark, and custom foggy/dark datasets), and both training and evaluation scripts. It is built on Python and TensorFlow, making it accessible for researchers and developers working on computer vision tasks in difficult conditions.
Question Generation Using T5
Question Generation Using T5 is an AI tool hosted on Hugging Face Spaces, designed to generate questions from provided text input. Leveraging the T5 model, this tool is useful for various applications such as creating quizzes, developing study materials, and generating educational content. However, the live website indicates that this Space is currently paused. Users interested in utilizing this tool would need to contact the author(s) via the community tab on Hugging Face to request its restart. While the core functionality is question generation, its current availability is limited.
morphsnakes
morphsnakes is an open-source Python library providing an implementation of Morphological Snakes for image segmentation and tracking. This tool is designed for both 2D images and 3D volumes, offering a robust alternative to traditional active contour methods like Geodesic Active Contours or Active Contours without Edges. Unlike these traditional approaches that rely on solving PDEs over floating-point arrays, morphsnakes utilizes morphological operators such as dilation and erosion on binary arrays, leading to faster execution and improved numerical stability. The library includes two main methods: Morphological Geodesic Active Contours (MorphGAC) for images with visible contours requiring preprocessing, and Morphological Active Contours without Edges (MorphACWE) which is more robust to noise and suitable when pixel values of inside and outside regions differ significantly. Installation is straightforward via pip or by directly copying the `morphsnakes.py` file.
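The dilation and erosion operators mentioned above are simple to state on binary arrays. The sketch below shows them in plain Python on nested lists; it is not the library's code (morphsnakes works on NumPy arrays and builds more elaborate operators on top of these), and the function names dilate and erode and the 3x3 square neighbourhood are assumptions for illustration.

```python
def _neighbourhood(mask, y, x):
    """Yield the in-bounds values of the 3x3 window centred on (y, x)."""
    h, w = len(mask), len(mask[0])
    for j in range(max(0, y - 1), min(h, y + 2)):
        for i in range(max(0, x - 1), min(w, x + 2)):
            yield mask[j][i]


def dilate(mask):
    """Binary dilation: a pixel becomes 1 if any neighbour in its window is 1."""
    return [
        [int(any(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
        for y in range(len(mask))
    ]


def erode(mask):
    """Binary erosion: a pixel stays 1 only if every in-bounds neighbour is 1."""
    return [
        [int(all(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
        for y in range(len(mask))
    ]


# A 5x5 mask with a single foreground pixel in the centre.
mask = [[0] * 5 for _ in range(5)]
mask[2][2] = 1

grown = dilate(mask)     # the single pixel grows into a 3x3 block
restored = erode(grown)  # eroding the block shrinks it back to one pixel
```

Both operators only compare and copy 0/1 values, which is why this family of methods avoids the floating-point PDE updates used by traditional active contours.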
openarm
OpenArm is a fully open-source 7DOF humanoid arm specifically engineered for physical AI research and deployment, particularly in contact-rich environments. Its design emphasizes high backdrivability and compliance, making it suitable for safe human-robot interaction while still providing practical payload capabilities for real-world applications. The arm features human-scale proportions and is available as a complete bimanual system for $6,500 USD, offering a flexible platform for teleoperation, imitation learning, simulation, and real-world data collection. OpenArm is under continuous development, actively seeking contributors, research partners, and company collaborators to advance practical humanoid systems.
robomimic
robomimic is a comprehensive, modular framework designed for robot learning from demonstration. It offers a wide array of demonstration datasets specifically collected for robot manipulation domains, alongside robust offline learning algorithms to effectively learn from these datasets. The primary goal of robomimic is to enhance the accessibility and reproducibility of robot learning research, enabling researchers and practitioners to benchmark tasks and algorithms consistently. This framework facilitates the development of the next generation of robot learning algorithms, supporting features like Diffusion Policy, multi-dataset training, language-conditioned policies, and integration with robosuite and DeepMind MuJoCo bindings. It also supports various observation modalities, pre-trained image representations, and logging with wandb.
SimpleVLA-RL
SimpleVLA-RL is an open-source reinforcement learning (RL) framework designed to efficiently scale the training of Vision-Language-Action (VLA) models. It provides an end-to-end RL pipeline built on veRL, incorporating VLA-specific optimizations such as multi-environment parallel rendering for accelerated trajectory sampling. The framework leverages state-of-the-art infrastructure for efficient distributed training, hybrid communication patterns, and optimized memory management. SimpleVLA-RL supports various VLA models like OpenVLA and OpenVLA-OFT, and benchmarks including LIBERO and RoboTwin 1.0/2.0. It emphasizes minimal reward engineering with binary outcome rewards and includes exploration strategies like dynamic sampling and adaptive clipping. The modular architecture allows for easy integration of new VLA models, benchmarks, and RL algorithms, making it a powerful tool for researchers and developers in the field.
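As a rough illustration of binary outcome rewards combined with dynamic sampling, the sketch below filters rollout groups and normalizes rewards within a group. It rests on two assumptions that are not confirmed by the description above: that dynamic sampling means discarding groups whose rollouts all succeed or all fail, and that advantages are computed by group-wise normalization. The function names are hypothetical and this is not SimpleVLA-RL's code.

```python
from statistics import mean, pstdev


def keep_mixed_groups(groups):
    """Keep only groups that contain both successes (1) and failures (0).

    Assumption: a group with identical rewards carries no learning signal
    after normalization, so it is dropped.
    """
    return [g for g in groups if 0 < sum(g) < len(g)]


def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Each inner list holds the binary outcomes of several rollouts of one task.
groups = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 0, 1]]
kept = keep_mixed_groups(groups)
advantages = [group_advantages(g) for g in kept]
```

With only a success or failure signal per trajectory, no task-specific reward shaping is needed, which is the "minimal reward engineering" point made above.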
semantic-segmentation-editor
Semantic Segmentation Editor is an open-source, web-based labeling tool designed for creating AI training datasets from both 2D bitmap images and 3D point clouds. Developed by Hitachi Automotive And Industry Lab, it is particularly useful for autonomous driving research. The tool supports various image formats like JPG and PNG, and point cloud formats including ASCII, Binary, and Binary compressed. For bitmap images, it offers polygon drawing, a magic tool for contrast detection, polygon manipulation, cutting/expanding, and contiguous polygon creation. For point clouds, it provides rotation, zooming, and point selection. The editor is built using Meteor, React, Paper.js, and three.js, and can be run via Docker Compose or from source.
SensorsCalibration
SensorsCalibration, also known as OpenCalib, is a comprehensive open-source toolbox designed for multi-sensor calibration in autonomous driving applications. Accurate sensor calibration is a foundational requirement for any autonomous system, enabling precise sensor fusion and subsequent processing steps like obstacle detection, localization, mapping, and control. This toolbox addresses the critical need for reliable calibration of various sensors, including IMU, LiDAR, Camera, and Radar. It offers both road scene-based calibration tools for parameters like camera intrinsics, lidar2imu, and surround-camera, as well as factory calibration tools supporting different board types such as chessboard, circle board, and Apriltag board. Additionally, it includes SensorX2car for online calibration of sensor-to-car coordinate systems.
SelfExSR
SelfExSR is a research code implementation for single image super-resolution, based on the paper "Single Image Super-Resolution from Transformed Self-Exemplars" (CVPR 2015). This algorithm stands out by achieving state-of-the-art performance in image super-resolution without requiring any external training dataset, complex feature extraction, or complicated learning algorithms. It operates by learning from transformed self-exemplars within the image itself. The repository provides the MATLAB source code, testing images for various datasets (Set5, Set14, Urban 100, BSD 100, Sun-Hays 80), and precomputed results for comparison with other state-of-the-art methods. While designed as educational code and not optimized for speed, users can adjust iteration numbers for a trade-off between speed and visual quality.
super-resolution
This open-source project provides a TensorFlow 2.x-based implementation of state-of-the-art models for single image super-resolution, including Enhanced Deep Residual Networks (EDSR), Wide Activation for Efficient and Accurate Image Super-Resolution (WDSR), and Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (SRGAN). It offers a high-level training API, enabling users to train models as described in the respective papers and fine-tune EDSR and WDSR models within an SRGAN context. The tool includes a DIV2K data provider for automatic dataset downloads and offers pre-trained weights for quick setup. It's ideal for developers and researchers working on image processing and computer vision tasks.
sphereface
SphereFace offers a comprehensive open-source implementation of the SphereFace algorithm, a deep hypersphere embedding method for face recognition. This tool provides a full pipeline covering face detection, alignment, and recognition, making it valuable for researchers and developers in computer vision. It includes detailed instructions for installation and usage, demonstrating how to train models on datasets like CASIA-WebFace and evaluate performance on LFW. The repository also features various network architectures, including SphereFace-20, and highlights its state-of-the-art verification performance in challenges like MegaFace. Additionally, it provides insights into the underlying mathematical concepts and practical considerations for training, such as gradient normalization and convergence difficulties, along with links to third-party re-implementations and related angular margin learning resources.
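For reference, the angular margin idea behind SphereFace is usually expressed through the A-Softmax loss from the SphereFace paper. The notation below follows that paper rather than this repository's documentation: x_i is the feature of sample i with label y_i, theta_{j,i} is the angle between x_i and the weight vector of class j, and m is the integer angular margin.

```latex
L = \frac{1}{N} \sum_{i} -\log
    \frac{e^{\lVert x_i \rVert \, \psi(\theta_{y_i, i})}}
         {e^{\lVert x_i \rVert \, \psi(\theta_{y_i, i})}
          + \sum_{j \neq y_i} e^{\lVert x_i \rVert \cos(\theta_{j, i})}}

\psi(\theta) = (-1)^{k} \cos(m\theta) - 2k,
\qquad \theta \in \left[ \frac{k\pi}{m}, \frac{(k+1)\pi}{m} \right],
\quad k \in \{0, \dots, m-1\}
```

Setting m = 1 recovers a softmax loss with normalized weights; larger m demands a wider angular gap between classes, which is also what makes training harder to converge.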