💻

Coding & Development

Browsing page 37 of AI tools for Testing & QA in Coding & Development. Sorted by confidence score — our independent quality rating.


apollo

55%

Apollo is an open-source autonomous driving platform designed to accelerate the development, testing, and deployment of autonomous vehicles. It provides a high-performance and flexible architecture, supporting a wide range of autonomous driving applications. The platform has evolved through numerous versions, each introducing new modules and features, from basic GPS waypoint following to complex urban road navigation with advanced perception and planning algorithms. Apollo emphasizes collaboration and innovation in the autonomous vehicle technology field, offering extensive documentation and quick-start guides for developers. It supports various hardware configurations and software environments, including different Ubuntu versions, NVIDIA GPUs, and Docker-CE, making it a comprehensive solution for autonomous driving development.

web-eval-agent

54%

web-eval-agent is an open-source MCP server designed to autonomously evaluate web applications. It uses a browser-use powered agent to execute and debug web applications directly within your code editor. Key features include navigating web apps, capturing network traffic, collecting console errors, and autonomous debugging. The tool generates UX reports that detail agent steps, console logs, network requests, and a chronological timeline of actions. It also offers a setup_browser_state tool for interactive browser sessions, allowing for single sign-on and cookie reuse. The project has since been sunset.

Checkie.AI

54%

Checkie.AI, now operating as TabSense.AI, functions as a confidential report index. Users must enter a specific passcode to decrypt and view the encrypted report content directly in their browser. Given its previous branding as Checkie.AI and Testers.AI, the reports likely relate to testing and analysis, though this is not confirmed by the available information. The website's primary function is to serve as a secure gateway to these reports, so that only individuals with the correct passcode can view them.

microprofile

54%

microprofile is an embeddable profiler designed for C++ projects, offering robust capabilities for performance analysis and bottleneck identification. It integrates easily into existing codebases, requiring just a few lines to start profiling. Key features include CPU and GPU timing across multiple APIs like OpenGL, D3D11, D3D12, and Vulkan, as well as support for multithreaded renderers. The tool also provides counter tracking, a timeline view for longer-duration events, and a live web view for real-time monitoring and capture generation. A standout feature is dynamic instrumentation for Intel x86-64, allowing injection of markers into running code without recompilation, though it's noted as experimental. Captures can be compared, and the tool supports compressed captures using miniz to manage file sizes.

great_expectations

54%

Great Expectations (GX Core) is an open-source data quality tool designed to help data teams ensure the reliability and integrity of their data. It allows users to define, document, and test 'Expectations' – essentially unit tests for data – to always know what to expect from their datasets. GX Core combines community wisdom with a super-simple package, making it easy to implement data quality checks. It supports Python 3.10 through 3.13, with experimental support for Python 3.14 and later. The tool fosters collaboration by providing a common language for data quality tests and automatically generating documentation for validation results, simplifying data quality processes and preserving institutional knowledge about data.
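The idea of an Expectation as a unit test for data can be illustrated with a minimal sketch in plain Python. This is not the GX Core API: the function names `check_not_null` and `check_between` and the result keys are hypothetical, chosen only to show a declarative check that returns a success flag plus the offending rows.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def check_not_null(rows: List[Row], column: str) -> Dict[str, Any]:
    """Hypothetical check: every row has a non-None value in `column`."""
    failed = [i for i, row in enumerate(rows) if row.get(column) is None]
    return {"success": not failed, "failed_rows": failed}


def check_between(rows: List[Row], column: str, low: float, high: float) -> Dict[str, Any]:
    """Hypothetical check: non-null values in `column` lie in [low, high].

    Null values are skipped here so that missing data is reported only by
    check_not_null (a design choice made for this sketch).
    """
    failed = [
        i
        for i, row in enumerate(rows)
        if row.get(column) is not None and not (low <= row[column] <= high)
    ]
    return {"success": not failed, "failed_rows": failed}


# Illustrative data: row 1 has a missing age, row 2 has an implausible one.
rows = [{"id": 1, "age": 34}, {"id": 2, "age": None}, {"id": 3, "age": 210}]
null_result = check_not_null(rows, "age")
range_result = check_between(rows, "age", 0, 120)
print(null_result)
print(range_result)
```

Returning the failing row indices rather than raising on the first error lets a validation run report every problem in one pass.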

continuous-eval

54%

continuous-eval is an open-source package designed for the data-driven evaluation of applications powered by Large Language Models (LLMs). It provides a modular approach to evaluation, allowing users to apply tailored metrics to each specific module within their LLM pipeline. The tool includes a comprehensive library of metrics to facilitate thorough assessment. It supports the evaluation of diverse LLM use cases, including Retrieval-Augmented Generation (RAG), code generation, and the utilization of agent tools.

lightweight-human-pose-estimation-3d-demo.pytorch

54%

This repository offers a real-time 3D multi-person pose estimation demo built with PyTorch. It leverages the Lightweight OpenPose and Single-Shot Multi-Person 3D Pose Estimation From Monocular RGB papers to detect and track 2D and 3D coordinates of up to 18 keypoints, including ears, eyes, nose, neck, shoulders, elbows, wrists, hips, knees, and ankles. The model was trained on MS COCO and CMU Panoptic datasets, achieving 100 mm MPJPE on the CMU Panoptic subset. For enhanced performance, it supports Intel OpenVINO for fast inference on CPUs and NVIDIA TensorRT for accelerated inference on Jetson devices, offering significant speedups.
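MPJPE (mean per-joint position error) is commonly defined as the average Euclidean distance between predicted and ground-truth joint positions. The sketch below computes it for a single pose in plain Python. The function name `mpjpe` and the sample coordinates are illustrative, and the repository's exact evaluation protocol (for example, any root-joint alignment) is not described here, so none is assumed.

```python
import math
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]


def mpjpe(predicted: Sequence[Point3D], ground_truth: Sequence[Point3D]) -> float:
    """Mean Euclidean distance between matching joints, in the input units."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("pose lists must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)


# Two joints, coordinates in millimetres: one is off by 100 mm, one is exact.
pred = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
truth = [(0.0, 0.0, 100.0), (100.0, 0.0, 0.0)]
error_mm = mpjpe(pred, truth)
print(error_mm)
```

A reported figure such as 100 mm therefore means that, averaged over joints and test poses, predictions land about 10 cm from the annotated positions.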

safe-control-gym

54%

safe-control-gym offers physics-based CartPole and Quadrotor Gym environments built using PyBullet, featuring symbolic a priori dynamics powered by CasADi. This framework is designed for learning-based control, as well as model-free and model-based reinforcement learning (RL). It includes symbolic safety constraints and implements input, parameter, and dynamics disturbances to rigorously test the robustness and generalizability of various control approaches. The tool provides a unified benchmark suite for safe learning-based control and RL in robotics, supporting a range of implemented controllers like PID, LQR, iLQR, MPC, SAC, and PPO, alongside safety filters such as MPSC and CBF. It also offers performance comparisons against other popular Gym environments.

Doclin

54%

Doclin is a real-time code discussion tool designed to enhance collaboration among developers. It allows users to comment on and discuss code directly within their development environment, fostering better understanding and knowledge sharing. All comments are securely stored in the cloud, which helps prevent clutter in Git repositories and keeps the codebase clean. A key feature of Doclin is its ability to automate knowledge base creation, eliminating the need for manual documentation efforts. Furthermore, it automatically updates this documentation to reflect any changes made to the code, ensuring that the documentation always remains current and accurate. This makes Doclin an efficient solution for maintaining up-to-date code documentation and streamlining development workflows.

ain

54%

Ain is a terminal HTTP API client designed as an alternative to graphical tools like Postman, Paw, and Insomnia. It enables developers to organize APIs flexibly using files and folders, promoting scripting of input and processing of output via pipes. The tool supports the use of shell scripts and executables for common tasks, and allows for dynamic configuration through environment variables or .env-files. Ain handles URL-encoding automatically and can share generated `curl`, `wget`, or `httpie` command-lines. It's built to be helpful with errors and targets users who interact with many APIs using a simple file format, leveraging existing command-line tools for actual API calls.
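The general idea of URL-encoding query parameters and emitting a shareable `curl` command line can be sketched in Python as follows. This does not reproduce Ain's file format or its implementation; the helper `build_curl_command` and the example URL are hypothetical.

```python
import shlex
from typing import Mapping
from urllib.parse import urlencode


def build_curl_command(base_url: str, params: Mapping[str, str]) -> str:
    """Return a curl command line with URL-encoded query parameters."""
    query = urlencode(params)  # encodes spaces and reserved characters
    url = f"{base_url}?{query}" if query else base_url
    # shlex.quote keeps characters such as & and ? from being read by the shell.
    return f"curl {shlex.quote(url)}"


command = build_curl_command(
    "https://example.com/search", {"q": "hello world", "lang": "en"}
)
print(command)
```

Quoting the final URL matters because an unquoted `&` would be interpreted by the shell as a background operator when the command is pasted into a terminal.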

HiDream Arena

54%

HiDream Arena provides a dedicated platform for users to compare different AI image generation models. It specifically supports the evaluation of models such as HiDream-I1-dev, HiDream-I1-full, and FLUX-dev, allowing users to see their outputs side-by-side. This tool is hosted on Hugging Face, making it accessible to a broad audience interested in AI image generation. Its primary function is to facilitate the comparison process, helping users identify the strengths and weaknesses of various models.

chatgpt-failures

54%

chatgpt-failures is a GitHub repository dedicated to collecting and documenting instances where ChatGPT and other large language models exhibit failures. This archive acts as a valuable resource for researchers, developers, and AI enthusiasts interested in understanding the limitations, biases, and vulnerabilities inherent in these advanced AI systems. Users can leverage this collection for comparative analysis with alternative models, to identify common failure patterns, and to generate synthetic data for robust testing and training of new AI models. It provides a practical dataset for improving the reliability and safety of language models.

HeadPoseEstimation-WHENet

54%

HeadPoseEstimation-WHENet is an end-to-end head-pose estimation network designed for real-time, fine-grained prediction of Euler angles across the full range of head yaws from a single RGB image. Unlike many existing methods that perform well only for frontal views, WHENet targets head poses from all viewpoints, making it suitable for applications in autonomous driving and retail. The network builds on multi-loss approaches with adapted loss functions and training strategies for wide-range estimation. It also uniquely extracts ground truth labelings of anterior views from a panoptic dataset. WHENet is compact and efficient, making it suitable for mobile devices and applications, and meets or beats state-of-the-art methods for frontal head pose estimation.

Image-Guided OWL-ViT Demo

54%

The Image-Guided OWL-ViT Demo is a Hugging Face Space designed to demonstrate the capabilities of the OWL-ViT model for image-guided object detection. While the current live website indicates a runtime error preventing the application from functioning, its intended purpose is to provide a platform for users to interact with and understand how the OWL-ViT model identifies objects within images based on specific guidance. This tool is valuable for researchers, developers, and AI enthusiasts interested in the practical application of advanced image recognition technologies and exploring the underlying mechanisms of models like OWL-ViT.

Quotient AI

54%

Quotient AI offers a monitoring solution specifically for AI systems, enabling teams to catch failures before they impact users. The platform is adept at identifying critical issues such as hallucinations, flawed reasoning, and irrelevant retrievals within AI applications. By running specialized detectors on logs and traces, Quotient AI pinpoints root causes and highlights important problems. It provides comprehensive visibility into the behavior of AI search, Retrieval-Augmented Generation (RAG), and AI agents.

Selenium Screenshot Gradio

54%

Selenium Screenshot Gradio is a tool hosted on Hugging Face that allows users to capture screenshots of web pages. By simply entering a website URL, the application leverages Selenium to navigate to the specified page and generate a screenshot, which is then displayed to the user. This functionality is useful for various purposes including web testing, UI testing, and creating documentation. The tool provides a straightforward interface for automating the process of web page capture, making it accessible for those who need quick visual records of web content. Although currently paused, its design indicates a focus on ease of use for obtaining web page snapshots.

DIWAMA

53%

DIWAMA is currently in maintenance mode while its systems are upgraded. The website states that the platform is not down; its backend systems are being improved, and users are advised to check back later for the updated version. Specific features and capabilities are not accessible during this period. An earlier description indicated that DIWAMA provided AI-powered image recognition for detecting and auditing municipal solid waste (MSW) streams, with the aim of making waste management more transparent and traceable.

Algomax

53%

Algomax is an evaluation platform specifically designed for Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) models. It offers robust tools to gain insights into qualitative metrics, which are crucial for understanding model performance beyond quantitative scores. The platform assists developers in the critical tasks of model refinement and performance benchmarking, ensuring their AI applications are optimized. Key features include detailed visualizations and real-time logging capabilities, which help in monitoring and debugging. Algomax is built for seamless integration into existing codebases, allowing for efficient and continuous model assessment.

Is It Huggable

52%

Is It Huggable is an image recognition tool designed to classify images and perform object detection. It offers functionalities for testing AI models, making it valuable for developers and researchers. Additionally, the tool serves educational purposes, providing a platform for learning about image recognition and AI. It is available for free, making it accessible to a broad audience interested in AI and computer vision.

promptflow

52%

Promptflow is an open-source tool specifically designed to streamline the development of large language model (LLM) applications. It provides comprehensive support across the entire application lifecycle, starting from initial prototyping and extending through rigorous testing phases. The tool also facilitates seamless production deployment and offers capabilities for ongoing monitoring of LLM applications, ensuring their robust and reliable operation. Its primary goal is to empower developers to create high-quality, AI-powered applications efficiently.

Tuvi.ai

52%

Tuvi.ai appears to be in a very early stage of development. Its website currently displays a default 'Welcome to nginx!' page, which indicates that the web server is installed but the application itself is not yet configured or publicly launched. A previous description indicated a focus on code and development, aiming to assist with various coding tasks, but no concrete features or use cases can be confirmed from the live website at this time.

benchm-ml

52%

benchm-ml is a GitHub repository designed to provide a minimal benchmark for various machine learning libraries. Its primary function is to assess the scalability, speed, and accuracy of open-source implementations, specifically focusing on binary classification algorithms. The benchmark includes popular methods such as random forests and neural networks. It covers a range of prominent machine learning tools and frameworks, including R packages, Python's scikit-learn, H2O, xgboost, and Spark MLlib, offering a comparative analysis across these platforms.

ADB Shell - Debug Toolbox

52%

ADB Shell - Debug Toolbox is a mobile application that lets developers debug and manage Android devices. Its ADB shell supports Android 4.x through Android 13, pair mode, wireless ADB over Wi-Fi, and a local ADB shell. The toolbox can launch, uninstall, and manage applications, list running apps, take screenshots, push and pull files, and control a device remotely. Because ADB commands run and system information is shown directly on a mobile device, it streamlines development and testing workflows for Android applications.

vuln-bank

52%

vuln-bank is a deliberately vulnerable banking application, specifically created to facilitate the practice of security testing for web applications, APIs, and AI-integrated apps. This tool is invaluable for security professionals and developers looking to enhance their skills in pentesting and secure coding practices. It features a comprehensive set of common vulnerabilities found in real-world applications, providing a realistic and safe environment to identify, exploit, and mitigate security flaws. By offering a hands-on learning experience, vuln-bank helps users understand the impact of various vulnerabilities and develop effective defense strategies, ultimately improving the security posture of their own projects.
