Research & Education
Browsing page 33 of AI tools for Scientific Computing in Research & Education. Sorted by confidence score — our independent quality rating.
HighwayEnv
HighwayEnv offers a comprehensive collection of environments specifically designed for autonomous driving and tactical decision-making tasks. Developed and maintained by Edouard Leurent, this tool is ideal for researchers and developers working on AI algorithms for self-driving vehicles. It includes diverse scenarios such as highway driving, merging traffic, roundabouts, parking, intersections, and racetracks. Users can implement and test various reinforcement learning agents like Deep Q-Network, Deep Deterministic Policy Gradient, Value Iteration, and Monte-Carlo Tree Search. The environment is compatible with Gymnasium and provides a flexible platform for simulating complex driving situations, making it a valuable resource for advancing autonomous driving research.
R1-V
R1-V is an open-source project that aims to improve the generalization ability of Vision Language Models (VLMs) at low computational cost. It uses reinforcement learning to strengthen the perception and reasoning capabilities of VLMs. The project provides new VLM-RL environments, a training codebase, and research papers. R1-V supports models such as Qwen2-VL and Qwen2.5-VL and offers training datasets for tasks such as item counting and geometry reasoning. It also includes evaluation scripts for benchmarks such as SuperCLEVR and GeoQA, which makes it useful for researchers and developers working on VLMs.
seq2seq-signal-prediction
seq2seq-signal-prediction is an open-source project designed to teach users how to implement Sequence-to-Sequence (seq2seq) Recurrent Neural Networks (RNNs) for time series forecasting using TensorFlow. The project includes a series of four exercises of increasing difficulty, starting with deterministic signal prediction and progressing to more complex tasks like denoising and Bitcoin price forecasting. It provides a Jupyter notebook and a Python script version, with instructions for running the code locally or on Google Colab with GPU support. The exercises guide users through adjusting hyperparameters and modifying network architectures to achieve accurate predictions, making it a practical learning resource for those with some prior knowledge of RNNs.
TextGAN-PyTorch
TextGAN-PyTorch is a PyTorch framework for text generation models based on Generative Adversarial Networks (GANs). It supports both general and category-specific text generation, and it serves as a benchmarking platform for evaluating and comparing GAN-based text generation models. It is aimed at users who already know PyTorch and want to get started quickly in the text generation field. The repository includes implementations of models such as SeqGAN, LeakGAN, and RelGAN, along with setup and usage instructions that cover real-data experiments and visualization tools.
ViT-pytorch
ViT-pytorch offers a PyTorch reimplementation of the Vision Transformer (ViT) model, based on the paper 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'. This tool allows users to leverage the power of Transformers for image recognition, demonstrating that applying them directly to image patches and pre-training on large datasets yields state-of-the-art results. It includes various pre-trained models like ViT-B_16, R50+ViT-B_16, and ViT-L_32, which can be downloaded and used for training. The repository provides scripts for training models on datasets like CIFAR-10 and CIFAR-100, with options for mixed precision training and gradient accumulation. Additionally, it supports visualization of attention maps, offering insights into how the model processes images.
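The idea of applying a Transformer directly to image patches can be illustrated with a little arithmetic. The sketch below is not taken from the ViT-pytorch repository: the helper names `patch_grid` and `split_into_patches` are hypothetical, and the 224x224 input size is an assumption chosen for illustration. It only shows how a patch size of 16 or 32 (as in ViT-B_16 and ViT-L_32) determines the number of tokens the model sees.

```python
def patch_grid(image_size, patch_size, channels=3):
    """Return (number of patches, values per flattened patch) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side, patch_size * patch_size * channels


def split_into_patches(image, patch_size):
    """Split a 2D list (H x W, single channel) into flattened patches, row-major."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            patches.append(patch)
    return patches


# Assumed 224x224 RGB input; the listing does not state an input size.
tokens_b16 = patch_grid(224, 16)   # 16x16 patches
tokens_l32 = patch_grid(224, 32)   # 32x32 patches

# Tiny 4x4 single-channel "image" split into 2x2 patches.
toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
toy_patches = split_into_patches(toy_image, 2)
```

Under these assumptions a 16x16 patch size yields 196 tokens of 768 values each, while a 32x32 patch size yields 49 tokens of 3072 values each, so larger patches mean shorter sequences.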
ViTPose
ViTPose is the official PyTorch implementation of the NeurIPS'22 paper "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and the TPAMI'23 paper "ViTPose++: Vision Transformer for Generic Body Pose Estimation." It reports 81.1 AP on the MS COCO Keypoint test-dev set. It supports both single-task and multi-task training, covering human, animal, and whole-body pose estimation. ViTPose provides pre-trained models, detailed configurations, and a web demo hosted on Hugging Face Spaces for experimenting with videos and images. It is built on PyTorch and uses mmcv, and it is aimed at researchers and developers in computer vision.
Flexcompute
Flexcompute is a physics intelligence platform offering high-fidelity physics simulation from the ground up, created by engineers from MIT and Stanford. The platform redefines simulation technology to help users innovate faster, cut costs, and reduce risks, making hardware development as easy as software. Key products include AutoInsight for AI-driven aerodynamic optimization, PhotonForge for Photonic Integrated Circuit (PIC) design, Flow for Computational Fluid Dynamics (CFD), RF for Electromagnetics, and Photonics for Integrated Photonics Simulation. It also features Geometry AI for automated geometry processing and Nexus for on-premises simulation. Flexcompute is trusted by over 250 companies and academic institutions, providing GPU-accelerated solutions for various engineering and scientific applications.
Metakosmos
Metakosmos specializes in the design and production of advanced planetary exploration suits, with its flagship product being the Kosmosuit. This innovative spacesuit incorporates patent-pending technology and proprietary software for real-time data analysis, including bioastronautics telemetry and environmental sensing. The Kosmosuit aims to democratize spacesuit access and astronaut training, offering variants customized for different environments to make the experience accessible to a wider audience. The accompanying software allows users to create a digital twin of their suit and experience simulations based on their unique body proportions, enhancing training and performance in extreme conditions.
OrdinaryDiffEq.jl
OrdinaryDiffEq.jl is a high-performance component package within the DifferentialEquations ecosystem, designed for solving ordinary differential equations (ODEs) and differential-algebraic equations (DAEs). It can be used independently, and it also integrates with DifferentialEquations.jl. The package offers a wide range of solvers, including solvers used for neural ordinary differential equations (neural ODEs), and it is integral to scientific machine learning (SciML) applications. It supports both out-of-place and in-place syntax for defining differential equations, with optimized versions for static arrays. It also provides specialized methods for refined ODE problem types such as dynamical ODEs and SecondOrderODEProblems, which enables the use of symplectic integrators for Hamiltonian dynamics. The package is written in Julia.
ANTsPy
ANTsPy is a Python library that wraps the well-established C++ ANTs (Advanced Normalization Tools) framework to provide fast medical image processing. It enables users to perform advanced operations such as image registration, segmentation, and statistical learning. The library also includes functions for efficient reading and writing of medical images, as well as tools to create publication-ready visualizations. For those interested in deep learning, ANTsPyNet is available for training and visualizing deep learning models on medical imaging datasets. ANTsPy can be installed from pre-compiled binaries or built from source.
Arabic Tokenizer Arena
Arabic Tokenizer Arena is a specialized platform designed for in-depth analysis of Arabic text tokenization. Users can input their own Arabic text or select from pre-made samples, then choose one or more tokenizers to observe how they split the text. The tool offers comprehensive metrics such as token count, fertility, and Out-Of-Vocabulary (OOV) rate, providing valuable insights into the tokenization process. Additionally, it generates visual representations to help users understand the tokenization results more intuitively. This tool is particularly useful for researchers, developers, and linguists working with Arabic language processing, offering a robust environment for comparing and evaluating different tokenization strategies.
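The metrics named above can be sketched in a few lines. The listing does not define them, so this example assumes common definitions: fertility is the average number of tokens per whitespace-separated word, and OOV rate is the share of words whose tokenization contains an unknown token. The greedy longest-match tokenizer, the tiny vocabulary, and the names `make_greedy_tokenizer` and `tokenizer_metrics` are hypothetical stand-ins, not the tokenizers or code used by the Arena.

```python
def make_greedy_tokenizer(vocab, unk_token="[UNK]"):
    """Build a toy tokenizer: greedy longest match against vocab, else unk_token."""
    def tokenize(word):
        tokens, i = [], 0
        while i < len(word):
            for j in range(len(word), i, -1):
                if word[i:j] in vocab:
                    tokens.append(word[i:j])
                    i = j
                    break
            else:
                # No vocabulary entry starts here: emit one unknown token.
                tokens.append(unk_token)
                i += 1
        return tokens
    return tokenize


def tokenizer_metrics(text, tokenize, unk_token="[UNK]"):
    """Compute token count, fertility, and OOV rate under the assumed definitions."""
    words = text.split()
    tokenized = [tokenize(w) for w in words]
    token_count = sum(len(t) for t in tokenized)
    oov_words = sum(1 for t in tokenized if unk_token in t)
    return {
        "token_count": token_count,
        "fertility": token_count / len(words),
        "oov_rate": oov_words / len(words),
    }


# Hypothetical vocabulary; the last word ends in a character it does not cover.
vocab = {"ال", "كتاب", "على", "طاول"}
tokenize = make_greedy_tokenizer(vocab)
metrics = tokenizer_metrics("الكتاب على الطاولة", tokenize)
```

Lower fertility generally means a tokenizer represents the text with fewer pieces per word, which is why the metric is useful when comparing tokenizers on the same input.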
GPU Poor LLM Arena
GPU Poor LLM Arena is a platform designed for the comparison and evaluation of compact language models, specifically those with up to 14 billion parameters. It offers a battle arena format where users can input a text prompt and receive side-by-side answers from two different language models. This setup facilitates direct comparison, allowing users to vote for the better reply and contribute to a community-driven ranking. The tool is ideal for researchers, developers, and enthusiasts interested in understanding the practical performance of smaller, more resource-efficient AI models without requiring extensive GPU resources. It provides insights into the capabilities of frugal AI options.
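The listing does not say how votes are turned into a ranking. As a sketch only, one common way to build a ranking from pairwise votes is an Elo-style rating update, shown below. The model names, the starting rating of 1000, and the K-factor of 32 are all assumptions made for illustration, not details of the Arena.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, winner, loser, k=32.0):
    """Apply one vote: the winner gains exactly what the loser gives up."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical models and votes, each vote recorded as (winner, loser).
ratings = {"model-a": 1000.0, "model-b": 1000.0, "model-c": 1000.0}
votes = [("model-a", "model-b"), ("model-a", "model-c"), ("model-b", "model-c")]

for winner, loser in votes:
    update_elo(ratings, winner, loser)

# Highest rating first.
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
```

Because each update is zero-sum, the total rating stays constant, and an upset against a higher-rated model moves the ratings more than an expected win.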
FLUX.1 Dev ControlNet Union Pro
FLUX.1 Dev ControlNet Union Pro is an AI tool designed for generating customized art from images using ControlNet technology. It allows users to upload an image and provide a descriptive prompt, then select from various control modes such as Canny, Depth, or OpenPose to guide the AI in creating the desired output. This tool leverages the power of ControlNet to offer precise control over the generated images, making it suitable for a range of creative applications. While the specific use cases are broad, its core functionality revolves around transforming existing images into new artistic interpretations based on user input and chosen control parameters.
cosine_metric_learning
cosine_metric_learning offers a repository with code for training a metric feature representation, specifically tailored for person re-identification tasks. This tool is intended to be used in conjunction with the deep_sort tracker, implementing the approach described in the 'Deep Cosine Metric Learning for Person Re-identification' paper. It includes functionalities to train models on datasets like Market1501 and MARS, with options for different loss modes such as cosine-softmax. Users can monitor training progress and evaluation metrics using TensorBoard, export features for testing, and freeze trained models for deployment with Deep SORT. The repository provides detailed instructions for setting up datasets, initiating training, and evaluating model performance.
SMARTS
SMARTS (Scalable Multi-Agent Reinforcement Learning Training School) is an open-source simulation platform developed by Huawei Noah's Ark Lab, designed for multi-agent reinforcement learning (RL) and autonomous driving research. It provides a robust environment for simulating complex traffic scenarios and testing autonomous vehicle algorithms. The platform emphasizes realistic and diverse interactions, making it a valuable tool for researchers and developers in the field. As part of the XingTian suite of RL platforms, SMARTS offers a scalable solution for training and evaluating RL agents in dynamic driving environments. It is available on GitHub, allowing for community contributions and widespread use.
Facetorch App
Facetorch App is a Hugging Face Space that demonstrates facetorch, a Python library for facial analysis. It allows users to upload photos or use a webcam to detect faces, generate 3D facial landmarks, and analyze various facial attributes. The app provides detailed reports on detected facial expressions, action units, and emotion scores. It also includes capabilities for extracting facial embeddings and performing face recognition. This tool is particularly useful for developers and researchers in computer vision who need facial analysis functionality for their projects.
Geocalc MCP
Geocalc MCP is an AI-powered geospatial tool developed during the Agents-MCP-Hackathon, designed to execute various geo-calculations independently, without relying on external third-party APIs. This application offers core functionalities such as converting addresses into precise geographical coordinates, calculating distances between points, and planning optimal routes. Users can also visualize these calculations and routes on maps, and identify nearby points of interest. It provides a self-contained solution for geospatial computations, making it suitable for projects requiring independent geo-processing capabilities.
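A distance calculation that needs no external API can be sketched with the haversine formula, which gives the great-circle distance between two coordinate pairs. Whether Geocalc MCP uses this formula is an assumption; the function name `haversine_km`, the mean Earth radius constant, and the sample coordinates are chosen here for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Approximate city-centre coordinates, used only as sample input.
paris = (48.8566, 2.3522)
london = (51.5074, -0.1278)
paris_to_london_km = haversine_km(*paris, *london)
```

The spherical model ignores the Earth's flattening, so results can differ from ellipsoidal methods by a fraction of a percent, and route distances along roads will be longer than this straight-line figure.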
K-Tech CoE Data Science & AI - NASSCOM
NASSCOM serves as the apex body for India's $315 billion technology industry, encompassing over 3,000 member companies across services, products, and startups. The organization plays a crucial role in policy advocacy, shaping regulations that foster innovation and technological advancement. It provides valuable industry knowledge through flagship publications and insights, empowering members with a deeper understanding of both the Indian tech landscape and the global economy. NASSCOM also emphasizes skilling and training, co-creating programs to develop industry-ready talent and establish India as a digital hub. Furthermore, it facilitates powerful connections among global innovators and visionaries through various events, promoting collaboration and growth opportunities within the tech sector.
sports
sports is an open-source project by SkalskiP dedicated to exploring the intersection of Computer Vision and Sports. It features various experiments, including football player tracking using YOLOv5 and ByteTrack, 3D football player pose estimation with YOLOv7, and assigning players to teams based on uniform color using GPT-4V. The project is designed for researchers and developers interested in applying advanced AI techniques to sports analytics, offering practical examples and code for implementing these vision-based solutions. It serves as a valuable resource for understanding and replicating complex computer vision tasks in a sports context.
SupContrast
SupContrast offers a PyTorch implementation of "Supervised Contrastive Learning" and also covers "A Simple Framework for Contrastive Learning of Visual Representations" (SimCLR). This repository serves as a reference implementation and illustrates both methods on the CIFAR datasets. It includes a `SupConLoss` loss that takes features and labels and reduces to the SimCLR loss if labels are not provided. The repository provides comparison results on CIFAR-10 and CIFAR-100 that show improved accuracy over standard cross-entropy. It also gives running instructions for standard cross-entropy, supervised contrastive learning, and SimCLR, including the pretraining and linear evaluation stages, and it supports custom datasets.
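The relationship between the supervised contrastive loss and the SimCLR loss can be shown with a small dependency-free sketch. This is not the repository's `SupConLoss`: the function `sup_con_loss`, its nested-list input format, and the temperature of 0.1 are assumptions made for illustration. Every embedding acts as an anchor, its positives are all other embeddings with the same label, and when no labels are given each sample becomes its own class, so only the other augmented views of the same sample count as positives.

```python
import math


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def sup_con_loss(views, labels=None, temperature=0.1):
    """views[i] holds the embedding vectors of the augmented views of sample i."""
    if labels is None:
        # No labels: each sample is its own class (SimCLR-style positives).
        labels = list(range(len(views)))
    feats, groups = [], []
    for sample_views, label in zip(views, labels):
        for vec in sample_views:
            feats.append(l2_normalize(vec))
            groups.append(label)

    total, anchors = 0.0, 0
    for i in range(len(feats)):
        others = [j for j in range(len(feats)) if j != i]
        logits = {j: sum(a * b for a, b in zip(feats[i], feats[j])) / temperature
                  for j in others}
        positives = [j for j in others if groups[j] == groups[i]]
        if not positives:
            continue
        # Numerically stable log of the softmax denominator over all non-anchors.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        total -= sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors


# Two views each of four toy samples: two point roughly along x, two along y.
views = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[1.0, 0.1], [0.95, 0.0]],
    [[0.0, 1.0], [0.1, 0.9]],
    [[0.0, 1.0], [0.05, 1.0]],
]
loss_matching_labels = sup_con_loss(views, labels=[0, 0, 1, 1])
loss_mismatched_labels = sup_con_loss(views, labels=[0, 1, 0, 1])
loss_without_labels = sup_con_loss(views)
```

Labels that agree with the geometry of the embeddings give a lower loss than labels that cut across it, which is the signal the supervised variant adds on top of the self-supervised one.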
AI Town on HuggingFace
AI Town on HuggingFace offers a unique web-based simulation environment where users can observe and interact with AI-driven characters. These characters are designed to live, move around, and engage in conversations with each other, creating a dynamic and evolving virtual town. Users have the ability to type messages to these AI characters and receive real-time replies, fostering an interactive experience. This tool provides a platform for experimenting with AI in a simulated setting, allowing for observation of AI behavior and interaction patterns. It's a project that showcases the capabilities of AI in creating autonomous, conversational agents within a virtual world.
tslearn
tslearn is an open-source machine learning toolkit specifically designed for time series analysis in Python. It provides a wide array of functionalities for tasks such as clustering, classification, and regression of time series data. The toolkit supports various data preprocessing steps, including scaling and resampling, and offers different distance metrics like Dynamic Time Warping (DTW). tslearn is built to be compatible with scikit-learn's API, allowing users to leverage familiar utilities for hyper-parameter tuning and pipelines. It also includes features for calculating barycenters, performing early classification, and working with UCR datasets, making it a versatile tool for researchers and practitioners in the field.
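Dynamic Time Warping compares two series by finding the cheapest monotonic alignment between their points, so series that differ only by local stretching score as close. The sketch below is a plain-Python illustration of the classic dynamic-programming recurrence, not tslearn's implementation; the function name `dtw_distance` is hypothetical, and it assumes one-dimensional series with a squared-difference local cost and a final square root.

```python
import math


def dtw_distance(series_a, series_b):
    """DTW distance between two 1-D sequences using squared local costs."""
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (series_a[i - 1] - series_b[j - 1]) ** 2
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return math.sqrt(cost[n][m])


# The second series repeats its first point, a purely temporal distortion.
stretched = dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0])
# Two constant series offset by 1 cannot be aligned any better than point by point.
offset = dtw_distance([0, 0], [1, 1])
```

The full table costs O(n*m) time and memory; libraries typically add constraints such as a warping window to reduce that cost on long series.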
transdim
transdim is an open-source machine learning project focused on transportation data imputation and prediction. It provides models for spatiotemporal data modeling, specifically for handling incomplete data and forecasting future traffic states. The project implements various machine learning models, mainly in Python using NumPy and Jupyter Notebooks, for tasks such as missing data imputation (e.g., random, non-random, and blockout missing patterns) and spatiotemporal prediction, both with and without missing values. It supports a range of publicly available transportation datasets, including traffic speed, volume, and passenger flow data from various cities. The project aims to deliver accurate and efficient solutions to these problems and offers practical examples and documentation for implementation and evaluation.
Transformer-MM-Explainability
Transformer-MM-Explainability is the official PyTorch implementation of "Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers". This open-source project offers a method to visualize and understand the decision-making processes of Transformer-based networks. It includes practical examples for models such as DETR, CLIP, and LXMERT, as well as for visual question answering (VQA), making it a useful resource for researchers and developers working with multi-modal and encoder-decoder architectures. The project provides notebooks for experimenting with the method and reproducing results, with instructions for setting up environments and running examples on GPUs, including Colab support.