AI Agents & Automation
Attendance-Management-system-using-face-recognition
Attendance-Management-system-using-face-recognition is an open-source project built with Python and OpenCV that automates attendance tracking through facial recognition. Users register new students by capturing multiple images of each one, which are then used to train the face recognition model. Once trained, the system marks attendance for registered individuals by detecting their faces. It writes attendance records to CSV files organized by subject and lets users view the data in a table. Setup is manual: users must configure their own environment and adjust file paths before running it.
luos_engine
Luos-engine is an open-source, lightweight library designed to manage hardware products as a collection of independent software features. It functions as a real-time orchestrator for cyber-physical systems, facilitating the design, testing, and deployment of embedded applications and digital twins. The tool can be utilized on any microcontroller or computer, across various networks, promoting free and fast development of multi-electronic-board connected products. By using Luos-engine, developers can leverage existing work, accelerate time-to-market, and ensure robustness and universality of their applications. It supports development, debugging, validation, monitoring, and management from anywhere, promoting organized and effective development practices for scalability and adaptability.
caffe-yolo
caffe-yolo offers a Caffe implementation of the YOLO (You Only Look Once) real-time object detection system. It supports YOLO v1 and includes batch normalization layers. The Caffe models are not trained within Caffe; they are converted from Darknet's original .weights files, so existing pre-trained models can be reused. The conversion has four steps: create .prototxt files from Darknet's .cfg files, initialize the Caffe network, read the weights from Darknet, and replace the initialized weights with the pre-trained ones. The repository provides scripts for creating the .prototxt and .caffemodel files and a main script for running object detection on images.
maml
Maml is an open-source code repository for Model-Agnostic Meta-Learning (MAML), a technique designed for the fast adaptation of deep networks. Developed by cbfinn, this repository provides the foundational code accompanying the paper "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks" (Finn et al., ICML 2017). It specifically includes implementations for few-shot supervised learning domain experiments, covering tasks such as sinusoid regression, Omniglot classification, and MiniImagenet classification. The project is built using Python 2.* or 3.* and TensorFlow v1.0+, making it accessible for researchers and developers working in meta-learning and few-shot learning. Users can access data preparation instructions for Omniglot and MiniImagenet, and detailed usage instructions are available within the `main.py` file.
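As a rough illustration of the sinusoid regression setting mentioned above, each "task" is a sine wave with its own amplitude and phase, and the learner sees only a few points from it. The sketch below is not the repository's code: the function names are hypothetical, and the sampling ranges (amplitude in [0.1, 5.0], phase in [0, pi], inputs in [-5, 5]) follow the setup described in the MAML paper and should be checked against the repository before relying on them.

```python
import math
import random


def sample_sinusoid_task(rng=random):
    """Draw one regression task: a sine wave with random amplitude and phase.

    The ranges are assumptions taken from the paper's description.
    """
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, math.pi)
    return lambda x: amplitude * math.sin(x - phase)


def sample_k_shot(task, k, rng=random):
    """Sample k (x, y) pairs from a task; this is the few-shot training set."""
    xs = [rng.uniform(-5.0, 5.0) for _ in range(k)]
    ys = [task(x) for x in xs]
    return xs, ys
```

A meta-learner would draw a fresh task per inner loop, adapt on the k sampled points, and be evaluated on further points from the same task.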
Cute Tarot
Cute Tarot is a digital platform and mobile application designed to make tarot card readings accessible and engaging. It offers 'Kawaii Tarot' and 'Spoopy Tarot' themes, each with its own aesthetic. Users can do digital pick-a-card readings, receive tailored interpretations, and set daily intentions. The platform also includes a free daily pentacle, the ability to upload IRL (in real life) tarot spreads, and a 'Cute Serendipity Rewards' system. It describes itself as integrating quantum science principles, and it lets users look up card meanings quickly.
DeepRL-Agents
DeepRL-Agents is an open-source repository offering a collection of Deep Reinforcement Learning algorithms, all implemented in TensorFlow. It covers RL techniques ranging from foundational Q-learning and policy gradient methods to more advanced ones such as Double-Dueling-DQN, Deep Recurrent Q-Networks, and Asynchronous Advantage Actor-Critic (A3C). The repository includes an IPython (Jupyter) notebook for each algorithm, often accompanied by a tutorial series published on Medium, which makes it useful both for learning and for practical experimentation.
DeepRL-TensorFlow2
DeepRL-TensorFlow2 is a GitHub repository offering straightforward implementations of Deep Reinforcement Learning (DRL) algorithms, all built with TensorFlow 2. The project prioritizes code clarity, which suits students and researchers who are new to DRL. Each algorithm is contained in a single Python script, so readers do not need to navigate multiple files. The repository describes itself as actively maintained. It currently includes implementations of DQN, DRQN, DoubleDQN, DuelingDQN, A2C, A3C, PPO, and DDPG, with TRPO, TD3, and SAC noted as planned additions. The project also provides code snippets illustrating the core idea behind each algorithm, such as target networks and replay buffers in DQN, or advantage functions in A2C.
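To make the two DQN ideas named above concrete, here is a minimal, framework-free sketch of a replay buffer and of the target computed from a separate target network. It is not taken from the repository; the class and function names are hypothetical, and network weights are stood in for by a plain dictionary.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def put(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.99):
    """DQN target r + gamma * max_a Q_target(s', a); no bootstrap at episode end.

    next_q_values are assumed to come from the target network, not the online one.
    """
    return reward if done else reward + gamma * max(next_q_values)


def sync_target(online_weights, target_weights):
    """Hard update: copy the online weights into the target network in place."""
    target_weights.clear()
    target_weights.update(online_weights)
```

Sampling uniformly from the buffer breaks the correlation between consecutive transitions, and computing targets from weights that are only synced periodically keeps the regression target from shifting on every gradient step.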
Mouse Hackathon
Mouse Hackathon is a dynamic platform designed for creative innovation using AI, specifically structured around 1-minute challenges. It serves as a Hugging Face Space by VIDraft, offering a collaborative environment for AI enthusiasts and innovators. The platform allows users to participate in the MOUSE-I Hackathon, providing clear information on dates, prize amounts, and participation steps. It also features language switching between English and Korean, alongside a news view, to keep participants informed and engaged. This tool is ideal for those looking to quickly experiment with AI concepts and engage in rapid prototyping within a competitive yet supportive hackathon setting.
Deep-reinforcement-learning-with-pytorch
Deep-reinforcement-learning-with-pytorch is an open-source GitHub repository that offers PyTorch implementations of classic and state-of-the-art deep reinforcement learning algorithms. The project includes implementations of popular methods such as DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, and TD3. Its primary goal is to provide clear and accessible code, making it easier for individuals to learn and experiment with deep reinforcement learning algorithms. The repository is actively maintained, with plans to add more advanced algorithms and update existing code. It also provides installation instructions and examples for testing the implementations.
mmskeleton
MMSkeleton is an open-source toolbox developed by OpenMMLab for skeleton-based human understanding. It offers an extensible framework that organizes code and projects systematically, so it can be adapted to different tasks and scaled to complex deep models. Key functionality includes 2D and 3D pose estimation, skeleton-based action recognition (such as ST-GCN), and action synthesis. The toolbox also supports building custom skeleton-based datasets and custom applications. It is part of the OpenMMLab project, builds on the ST-GCN research project, and is released under the Apache 2.0 license.
evolution-strategies-starter
evolution-strategies-starter offers a distributed implementation of the Evolution Strategies algorithm described in the paper "Evolution Strategies as a Scalable Alternative to Reinforcement Learning." This open-source project uses a master-worker architecture in which the master broadcasts parameters to the workers and the workers return results. The code is designed to run on AWS EC2 and is resilient to worker termination, which makes it suitable for spot instances. Humanoid experiments require a MuJoCo license, and AMIs are built with Packer. The project is provided as-is, with no further updates expected, and serves as a starting codebase for researchers and developers in reinforcement learning.
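The master-worker loop described above can be sketched in a single process. This is a simplified illustration, not the repository's code: the function name is hypothetical, workers are simulated by a list comprehension, and rewards are standardized rather than shaped in whatever way the original implementation uses.

```python
import random


def es_step(theta, fitness, sigma=0.1, lr=0.01, population=50, rng=random):
    """One Evolution Strategies update on a flat list of parameters."""
    # Master: draw one Gaussian noise vector per (simulated) worker.
    noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(population)]
    # Workers: each scores its own perturbed copy of the parameters.
    rewards = [fitness([t + sigma * e for t, e in zip(theta, eps)]) for eps in noises]
    # Master: standardize rewards so the step does not depend on reward scale.
    mean = sum(rewards) / population
    std = (sum((r - mean) ** 2 for r in rewards) / population) ** 0.5
    if std == 0.0:
        return list(theta)  # every perturbation scored the same, so no signal
    weights = [(r - mean) / std for r in rewards]
    # Reward-weighted sum of the noise vectors estimates the ascent direction.
    grad = [
        sum(w * eps[i] for w, eps in zip(weights, noises)) / (population * sigma)
        for i in range(len(theta))
    ]
    return [t + lr * g for t, g in zip(theta, grad)]
```

Calling es_step repeatedly on a simple objective, such as the negative squared distance to a fixed point, moves theta toward that point without ever computing a gradient of the objective. In a distributed setting only random seeds and scalar rewards need to cross the network, which is what makes the method cheap to parallelize.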
Skywork-R1V
Skywork-R1V is a multimodal AI model series developed by Skywork AI that specializes in vision-language reasoning. The series includes open-source versions with model weights and inference code, as well as closed-source offerings such as Skywork-R1V4-Lite. The models target vision understanding, code execution, and deep research tasks, and they have agentic capabilities. Key features include code execution for complex tasks, deep research integration with web search, multi-turn reasoning with tool usage, and streaming support for real-time responses. According to the project, the models achieve state-of-the-art results on various multimodal benchmarks, particularly in perception and deep research.
ner-annotator
ner-annotator is a Named Entity Recognition (NER) annotation tool for creating training data for custom NER models with spaCy. It provides a user interface for labelling entities in text and supports both word-level and character-level annotation. Users can define custom labels with color-coding. The tool exports training data in a generic JSON format that can be converted to tagging formats such as IO, IOB, or IOBES. Although it is no longer actively maintained, the web application and the desktop versions (Linux and Windows) remain functional and offer keyboard shortcuts and the ability to import existing annotations for review. It also includes light and dark themes.
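Converting span-based annotations to a tagging scheme can be sketched as follows. This is an illustration only: it assumes each annotation is a text plus a list of character-level (start, end, label) spans, which is an assumption about the export layout rather than a documented guarantee, and the function name is hypothetical.

```python
import re


def spans_to_iob(text, entities):
    """Turn character-level (start, end, label) spans into per-token IOB tags.

    Tokens are whitespace-delimited; a token is tagged only if it lies
    entirely inside a span, so spans that cut through a token are ignored.
    """
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    tags = []
    for _, tok_start, tok_end in tokens:
        tag = "O"
        for start, end, label in entities:
            if tok_start >= start and tok_end <= end:
                # The first token of a span gets B-, later tokens get I-.
                tag = ("B-" if tok_start == start else "I-") + label
                break
        tags.append(tag)
    return [tok for tok, _, _ in tokens], tags
```

For the text "Ada Lovelace lived in London" with spans (0, 12, "PERSON") and (22, 28, "GPE"), the tags come out as B-PERSON, I-PERSON, O, O, B-GPE. A real pipeline would use the same tokenizer as the downstream model instead of whitespace splitting.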
Online-3D-BPP-DRL
Online-3D-BPP-DRL is an open-source project that provides the implementation of the paper "Online 3D Bin Packing with Constrained Deep Reinforcement Learning." This tool is designed for researchers and developers interested in optimizing 3D bin packing problems using AI. It allows users to train new models on randomly generated sequences or test existing models with various data sets. The repository includes code for user-study applications, multi-bin algorithms, and MCTS for comparison, offering a comprehensive environment for experimentation and development in this domain. Users can adjust network architectures and parameters to suit their specific needs, making it a flexible platform for advanced AI research in logistics and optimization.
Online-3D-BPP-PCT
Online-3D-BPP-PCT is an open-source tool that implements a method for efficient online 3D bin packing. It leverages deep reinforcement learning (DRL) on a hierarchical packing configuration tree to enhance the practical applicability of the online 3D Bin Packing Problem (BPP). This approach makes the DRL model adept at dealing with practical constraints and performing well even in continuous solution spaces. Key features include arbitrary container and item sizes, support for continuous online 3D-BPP, algorithms for approximating stability, and improved performance with complex constraints. It also offers more adequate heuristic baselines for domain development and stable training.
python-docx2txt
python-docx2txt is a pure Python-based utility designed for extracting text and images from DOCX files. This open-source tool is adapted from python-docx but extends its capabilities to include content from headers, footers, and hyperlinks, offering a more comprehensive extraction solution. It can be run both from the command line for quick processing or integrated into Python scripts for automated document handling. Users can specify a directory to save extracted images, making it useful for tasks requiring both textual and visual data from DOCX documents. Its straightforward installation via pip and simple usage make it accessible for developers and data scientists working with document processing.
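The reason text extraction needs no heavy dependencies is that a DOCX file is a ZIP archive whose main body lives in word/document.xml. The sketch below shows that idea with the standard library only. It is not python-docx2txt's code and the function name is hypothetical; it reads only the document body, whereas headers, footers, and images live in other archive members.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the <w:p> and <w:t> elements.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(docx_file):
    """Return the body text of a DOCX file, one line per paragraph.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join every <w:t> inside it.
        paragraphs.append("".join(node.text or "" for node in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)
```

Extending this to headers and footers would mean parsing the other XML parts of the archive in the same way, which is the kind of extra coverage the description above attributes to the tool.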
pytorch-pose
pytorch-pose is an open-source PyTorch toolkit for single-person 2D human pose estimation. It offers a pipeline covering training, inference, and evaluation. The toolkit includes a dataloader with several data augmentation options that works with common human pose datasets such as MPII, LSP, and FLIC. Key features include multi-thread data loading, multi-GPU training support, a logger for tracking progress, and visualization of training and testing results. It is compatible with PyTorch 0.4.1/1.0 and provides instructions for installation, data preparation, and usage, including testing with pre-trained models and evaluating PCKh@0.5 scores.
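PCKh@0.5 counts a predicted keypoint as correct when it falls within half a head length of the ground-truth location. The following sketch shows the core of the metric under simplifying assumptions: the function name is hypothetical, head sizes are passed in directly, and joint visibility masks are ignored.

```python
import math


def pckh(predictions, ground_truth, head_sizes, threshold=0.5):
    """Fraction of keypoints within threshold * head size of the ground truth.

    predictions and ground_truth are lists of poses, each a list of (x, y)
    points; head_sizes holds one head length per pose, in the same units.
    """
    correct = total = 0
    for pred_pose, gt_pose, head in zip(predictions, ground_truth, head_sizes):
        for (px, py), (gx, gy) in zip(pred_pose, gt_pose):
            total += 1
            if math.hypot(px - gx, py - gy) <= threshold * head:
                correct += 1
    return correct / total if total else 0.0
```

Normalizing by head size rather than by a fixed pixel distance makes the score comparable across people who appear at different scales in the image.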
PyGCL
PyGCL is a PyTorch-based open-source library specifically designed for Graph Contrastive Learning (GCL). It provides a comprehensive framework for researchers and developers to implement and experiment with various GCL algorithms. The library features modularized GCL components, including graph augmentation techniques like Edge Adding, Feature Masking, and Node Dropping, as well as different contrasting architectures and modes (single-branch, dual-branch, bootstrapped, within-embedding). PyGCL also implements a variety of contrastive objectives such as InfoNCE, JSD, and Barlow Twins, alongside negative sampling strategies. It supports standardized evaluation with evaluators like Logistic Regression and SVM, and offers utilities for managing experiments, making it a valuable tool for advancing graph representation learning.
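As an illustration of one of the objectives listed above, here is a plain-Python sketch of an InfoNCE loss for the dual-branch case. It is not PyGCL's implementation: the function names are hypothetical, it uses only cross-view negatives, and it works on lists of floats rather than tensors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(view_a, view_b, tau=0.5):
    """Mean InfoNCE loss where node i in view_a should match node i in view_b.

    Every other node in view_b acts as a negative for anchor i.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / tau for other in view_b]
        log_denominator = math.log(sum(math.exp(value) for value in logits))
        # Negative log-softmax of the positive pair's similarity.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

The loss is small when each node's two augmented embeddings are more similar to each other than to any other node's embedding, and it grows when the pairing is scrambled. The temperature tau controls how sharply hard negatives are penalized.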
nitrain
Nitrain (formerly torchsample) is a framework-agnostic Python library designed for medical image analysis, enabling efficient training of AI models. It provides robust functionalities for sampling and augmenting medical images, supporting various frameworks like PyTorch, TensorFlow, and Keras. The library simplifies model training by offering reasonable defaults and a high level of abstraction. Users can visualize results within a medical imaging context, making it a comprehensive tool for medical imaging AI development. Full examples for segmentation, classification, and registration tasks are available, and it integrates with the ANTsPy package for advanced medical image processing.
SEAM
SEAM (Self-supervised Equivariant Attention Mechanism) is an open-source implementation designed for weakly supervised semantic segmentation. This tool addresses the challenge of generating accurate object masks from image-level supervision, a common limitation in advanced class activation map (CAM) solutions. SEAM introduces a self-supervised approach by enforcing consistency regularization on predicted CAMs across various transformed images, effectively narrowing the gap between full and weak supervisions. Additionally, it incorporates a pixel correlation module (PCM) to refine predictions by leveraging context appearance information and similar neighbors. Extensive experiments on the PASCAL VOC 2012 dataset demonstrate SEAM's superior performance compared to state-of-the-art methods using the same level of supervision, making it a valuable resource for AI researchers and computer vision engineers.
TextGrocery
TextGrocery is a short-text classification tool built on the LibLinear library. It is designed to categorize brief content such as news titles quickly and accurately. It integrates Jieba for Chinese tokenization, which is necessary for processing Chinese-language text. The project's benchmarks on news-title datasets report higher accuracy and shorter processing time than scikit-learn's SVM and Naive Bayes classifiers. TextGrocery offers a simple API for training models from lists or files, saving and loading models, and running predictions and tests.
Trading-Gym
Trading-Gym is an open-source project designed for the development and testing of reinforcement learning algorithms within the context of financial trading. It offers a flexible environment, currently featuring a SpreadTrading environment, which allows users to trade spreads based on bid and ask price time series for multiple products. A key feature is its generic data feeding mechanism, enabling users to create custom DataGenerators to input diverse price data. The environment's state includes prices, entry price, and position (long, short, or flat). Trading-Gym's API is inspired by OpenAI Gym, aiming for full compatibility to integrate as an additional OpenAI environment, making it accessible for researchers and developers familiar with the OpenAI Gym framework.
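The state described above (prices, entry price, and position) and the Gym-style interface can be illustrated with a toy environment. This sketch is not Trading-Gym's code: the class name, action encoding, and reward rule are hypothetical, and it trades a single product from a fixed list of (bid, ask) quotes rather than a spread over several products.

```python
class TinyTradingEnv:
    """Toy single-product trading environment with a Gym-style reset/step API."""

    HOLD, BUY, SELL = 0, 1, 2

    def __init__(self, quotes):
        self.quotes = list(quotes)  # sequence of (bid, ask) pairs
        self.reset()

    def _observation(self):
        bid, ask = self.quotes[self.t]
        return (bid, ask, self.entry_price, self.position)

    def reset(self):
        self.t = 0
        self.position = 0       # +1 long, -1 short, 0 flat
        self.entry_price = 0.0
        return self._observation()

    def step(self, action):
        bid, ask = self.quotes[self.t]
        reward = 0.0
        if action == self.BUY:
            if self.position == 0:       # open a long position at the ask
                self.position, self.entry_price = 1, ask
            elif self.position == -1:    # close the short at the ask
                reward = self.entry_price - ask
                self.position, self.entry_price = 0, 0.0
        elif action == self.SELL:
            if self.position == 0:       # open a short position at the bid
                self.position, self.entry_price = -1, bid
            elif self.position == 1:     # close the long at the bid
                reward = bid - self.entry_price
                self.position, self.entry_price = 0, 0.0
        self.t += 1
        done = self.t >= len(self.quotes) - 1
        return self._observation(), reward, done, {}
```

Buying at the ask and selling at the bid means a round trip only pays off when the price moves by more than the bid-ask gap, which is the cost an agent has to learn to overcome. A custom data generator would replace the fixed quotes list with a stream of prices.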
Timmy App
Timmy App is a domain name currently listed for sale on HugeDomains.com. The website content indicates that the domain is available for a one-time purchase of $4,295 or through a 24-month payment plan at $178.96 per month. HugeDomains.com offers a 30-day money-back guarantee and secure shopping with SSL encryption. They also provide quick delivery of the domain, typically within one to two hours of purchase, and offer zero percent financing for payment plans. The purchase includes only the domain name, with email packages and hosting services needing to be acquired separately.
YOLOv11-RGBT
YOLOv11-RGBT is a single-stage multispectral object detection framework that extends YOLO models (from YOLOv3 to YOLOv13) and RT-DETR to handle RGBT (Red, Green, Blue, Thermal) data. The project simplifies the configuration of visible and infrared datasets for multimodal object detection and provides three distinct configuration methods. It supports multispectral object detection, keypoint detection, and instance segmentation. Beyond multispectral data, the framework can be applied to other pixel-aligned image pairs, including depth maps and SAR images. Key features include support for TIFF images, 16-bit multispectral datasets with arbitrary channel counts, and image formats such as Gray, BGR, RGBT, and Multispectral with flexible channel configurations.