AI Agents & Automation
AI tools for Voice Agents in AI Agents & Automation, sorted by confidence score (our independent quality rating).
Airudit
Airudit specializes in voice AI technologies designed to enhance man-machine collaboration across various sectors. The platform aims to bridge communication gaps between front, middle, and back offices, fostering real-time and intuitive interactions. By enabling bidirectional communication between humans and machines, Airudit seeks to deliver significant financial benefits to clients. Its applications span critical industries such as defense, banking, and Industry 4.0, where efficient and seamless voice-driven interactions can optimize operations and improve decision-making processes. The technology focuses on creating a more integrated and responsive operational environment.
Callify.ai
Callify.ai is an AI-powered recruitment automation platform designed to cut hiring costs and improve efficiency. It automates every phone interaction from sourcing to joining, coordinating phone outreach, WhatsApp messages, and emails. Candidates' answers to recruiter questions are converted to text and matched against ideal hiring criteria, so recruiters can screen hundreds of candidates without manual phone calls. Callify.ai aims to improve metrics such as lead-to-hire conversion, hiring velocity, and quality of hires, while reducing recruiter operational time and cost per hire. It is built with privacy by design, is ISO 27001 and ISO 27701 compliant, and is certified under NYC Local Law 144 (New York City's bias-audit law for automated employment decision tools).
CNTXT AI
CNTXT AI provides comprehensive AI solutions and data services tailored for enterprises and governments, specializing in transforming raw data into AI-ready assets. Their offerings include data services for organizing, labeling, and preparing data, ensuring full in-region compliance and validation by Arabic-native experts. They also design and deliver custom AI systems, from rapid pilots to enterprise-wide deployments, optimized for measurable ROI. The AI Product Lab develops domain-specific models and AI-first applications, including Munsit, an accurate Arabic Speech-to-Text model, and TestAI, an AI validation platform. CNTXT AI emphasizes Arabic-first AI excellence, end-to-end AI readiness, and sovereign-hosted solutions, ensuring data security and compliance with local frameworks.
ConverseNow.AI
ConverseNow.AI offers advanced voice AI solutions specifically designed for the restaurant industry, enabling businesses to automate order taking and enhance customer interactions. The platform is highly customizable, allowing restaurants to configure the AI's tone, persona, upsell logic, and localization to match their unique brand and operational needs. It handles over 2,000,000 conversations and repurposes 83,000+ labor hours monthly. Key features include flexible usage, AI insights, coupon/discount integration, and multilingual voice support, ensuring a seamless experience for diverse customer bases. ConverseNow.AI aims to reduce staffing pressure, provide 24/7 service with live support, and continuously evolve its AI through dedicated engineering and customer input.
Navana.ai
Navana.ai is building India’s foundational voice AI infrastructure with proprietary speech models that understand India, not just linguistically, but contextually across 12 Indian languages and 40+ dialects. Their Bodhi speech models are purpose-built for Indian realities, handling real-world audio challenges like noise, overlaps, code-switching, and accents. The platform offers an end-to-end voice AI stack engineered for pan-India scale, complexity, and compliance, with enterprise-grade performance. Navana.ai provides solutions like an AI Contact Center, Bodhi Speech API for real-time and batch use cases, and an Audio Intelligence API for call center automation, delivering transcripts, summaries, and performance insights. It is designed for enterprise needs, offering on-prem architecture for data control and security, seamless integrations, and compliance with RBI guidelines, ISO 27001, and SOC 2 Type II standards.
iVoz Ai
iVoz Ai offers an advanced AI Voice Agent designed to empower businesses across various sectors by automating lead generation and customer interactions. This tool enables instant AI voice agents and real-time solutions, redefining business communication. It integrates seamlessly with existing CRM systems, ensuring a smooth and hassle-free experience. iVoz Ai helps businesses save up to 70% on costs by only charging for answered calls, making it a cost-effective solution for acquiring superior leads. The platform supports multilingual interactions with auto language detection and industry-specific vocabulary, catering to diverse customer bases in industries like real estate, insurance, financial services, healthcare, and education.
Passisto
Passisto is an AI-powered enterprise platform designed to revolutionize recruitment and knowledge management. It automates the entire hiring pipeline, from defining job offers and screening candidates at scale to conducting intelligent AI interviews and making data-driven decisions. The platform features automated CV screening, flexible phase management, AI interview templates, and automated communications to streamline the hiring process. Beyond recruitment, Passisto Enterprise includes a full suite of AI tools such as an AI Knowledge Base for unifying company documents, an AI Email Builder for generating context-aware emails, and an AI Form Builder for instant form creation. It aims to accelerate time-to-hire, enhance candidate quality, reduce recruitment overhead, and increase diversity and fairness in hiring.
Alcove Group
Alcove Group offers an AI-powered care technology service designed to promote independent living and provide peace of mind. Their comprehensive solution includes a proprietary AI-powered IoT data platform with integrated in-home sensors, smart wearables, and a one-touch video calling aid called the 'Alcove Video Carephone'. The service connects 20,000 older and disabled adults, their families, friends, and carers. Alcove's multi-award-winning technology revolutionizes TEC delivery for local authorities and the NHS, offering digital referral management, advanced analytics, and benefits capture. Additionally, they provide a Virtual Care Agency for virtual care delivery and an AI-enabled 24/7 ARC and managed responder and falls lifting service, fully integrated with their management suite.
Bossed
Bossed is an AI-powered, voice-interactive interview simulator designed to help job seekers prepare for any interview. Users can upload their CV/resume and import job listings from sites like Indeed or LinkedIn, allowing the AI interviewer to personalize questions for a realistic experience. The platform offers real-time interview practice with an AI coach, providing actionable feedback on communication skills, cultural fit, problem-solving, and technical ability. Users can choose difficulty levels from easy to hard, working their way up to master their interview technique. The realistic voice chat feature simulates a genuine interview setting, and the powerful AI interviewer can identify gaps in a candidate's CV. Bossed aims to provide constructive feedback to improve interview performance and help users secure their dream job.
Brilo AI
Brilo AI offers human-like AI phone call agents designed to automate customer support and streamline business operations. It allows businesses of any size to set up AI phone agents in minutes, boosting customer satisfaction and cutting costs without requiring a technical team. The platform provides detailed post-call analysis, integrates seamlessly with over 6,000 apps, and handles inbound calls 24/7. Key features include instant transfer to human agents when needed, 24/7 regulatory compliance, and appointment scheduling that syncs with existing tools. Brilo AI is trusted by fast-growing teams across various industries like healthcare, financial services, and high-volume customer operations.
OutcomesAI
OutcomesAI integrates AI voice agents with licensed nurses to provide scalable, safe, and cost-effective nursing care. Its core AI engine, Glia, automates routine patient interactions such as symptom triage, scheduling, and follow-ups, allowing nurses to focus on critical care. The platform supports healthcare settings including nurse triage, patient access, post-acute and transition care, virtual care, and pharma and specialty care. OutcomesAI aims to reduce operational costs, accelerate scheduling, and significantly increase care capacity through evidence-based, protocol-driven care. The platform is described as HIPAA compliant and SOC 2 certified.
Relyable
Relyable is a comprehensive platform designed for automated testing and monitoring of AI voice agents. It enables users to generate hundreds of realistic test conversations, evaluate every call against a custom rubric, and monitor production agents live to ensure high performance. The platform offers native integrations with Vapi, Retell, and ElevenLabs, allowing for quick setup. Users can create AI-assisted test cases from system prompts, define personas with over 200 presets, and assign them to conversation scenarios for extensive coverage. Relyable also provides real-time monitoring, logging and analyzing every live call, and sending alerts via various channels like Slack and PagerDuty when performance drifts. This ensures problems are addressed proactively, significantly accelerating the deployment of reliable AI voice agents.
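The rubric-and-alert loop described above can be illustrated with a minimal Python sketch. This is not Relyable's API: the rubric format, `score_call`, `DriftMonitor`, the example criteria, and the 0.8 threshold are all hypothetical names and values chosen for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict

# Hypothetical rubric: criterion name -> check run against a call transcript.
Rubric = Dict[str, Callable[[str], bool]]

RUBRIC: Rubric = {
    "greets_caller": lambda t: "hello" in t.lower(),
    "confirms_booking": lambda t: "confirmed" in t.lower(),
    "no_apology_loop": lambda t: t.lower().count("sorry") < 3,
}


def score_call(transcript: str, rubric: Rubric = RUBRIC) -> float:
    """Return the fraction of rubric criteria the transcript passes."""
    passed = sum(1 for check in rubric.values() if check(transcript))
    return passed / len(rubric)


class DriftMonitor:
    """Track a rolling mean of call scores and flag drops below a threshold."""

    def __init__(self, window: int = 3, threshold: float = 0.8) -> None:
        self.scores: Deque[float] = deque(maxlen=window)
        self.threshold = threshold

    def record(self, score: float) -> bool:
        """Add a score; return True once the window is full and its mean is below the threshold."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        mean = sum(self.scores) / len(self.scores)
        return window_full and mean < self.threshold
```

In a real deployment the boolean returned by `record` would be the point at which an alert is sent to a channel such as Slack or PagerDuty; that delivery step is left out here.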
ForEva
Foreva AI offers a restaurant voice AI solution designed to handle phone calls, manage orders, reservations, and provide customer service around the clock. It boasts 99% order accuracy and multilingual support (English, Spanish, Chinese). Restaurants can choose between a complete standalone solution, including a phone number and merchant processing, or a POS integration that syncs orders directly with systems like Square and Clover. Foreva aims to increase phone orders by an average of 30%, ensure 99.9% call answer rates, and offers a quick setup, with some integrations taking as little as 5 minutes. It's built to understand restaurant-specific language, including complex modifiers and dietary requests.
Uplift AI
Uplift AI offers advanced voice models, named Orator, specifically designed for Pakistani languages, including Urdu, Sindhi, Balochi, and Roman Urdu. These models provide human-like realism in speech synthesis and understanding, outperforming competitors like OpenAI and ElevenLabs in user preference evaluations while being significantly more cost-effective. The platform provides both a Creator Studio for direct use and a Developer API for integration into other applications. Uplift AI aims to make digital services accessible to everyone in their local language, especially in regions with lower literacy rates, by enabling interaction with technology through speech. Future language support includes Punjabi and Saraiki.
Caantin AI
Caantin AI delivers comprehensive voice AI solutions, spanning from initial data collection to final deployment. The platform is designed to provide essential data, thorough evaluations, and actionable outcomes to a diverse clientele, including AI laboratories, governmental bodies, and large enterprises. It specializes in offering compliant AI agents for various tasks, such as debt recovery, with a unique payment model contingent on the successful collection of debts. This approach highlights its focus on results-driven AI applications and its commitment to delivering tangible value to its clients.
awesome-speech-recognition-speech-synthesis-papers
awesome-speech-recognition-speech-synthesis-papers is an open-source GitHub repository that serves as a curated list of academic papers focused on various aspects of speech technology. It covers key areas such as Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis (TTS), Language Modelling, Singing Voice Synthesis (SVS), and Voice Conversion (VC). The repository is organized by topic, making it easy for researchers, academics, and students to find relevant literature. It includes papers ranging from foundational works to recent advancements, often providing direct links to PDF versions. This resource is invaluable for anyone looking to delve into the theoretical and practical developments in speech processing.
AI Virtual Therapist
AI Virtual Therapist is an AI-powered tool designed to help users understand and manage their emotions through text and voice analysis. Users can input text to discover its emotional tone or upload audio clips to detect the speaker's mood. Beyond analysis, the tool provides an interactive chat interface where an AI responds in both text and spoken voice, offering a dynamic and engaging experience. This application is hosted on Hugging Face Spaces, leveraging advanced AI models for emotion detection and conversational AI, making it accessible for personal use in exploring emotional well-being.
Send email with your Voice
Send email with your Voice is an AI Chrome extension designed to streamline email composition by enabling users to dictate messages. This tool offers live, editable transcriptions as you speak, ensuring accuracy and allowing for immediate corrections or refinements. Alongside the text, the system also captures and records the audio of your spoken email. Once you've finished speaking, the email is instantly ready to be sent, making it ideal for hands-free communication or multitasking. It enhances accessibility for users with mobility impairments and provides a quick way to send voice notes as emails.
Smart Dictate
Smart Dictate is an AI-powered dictation tool designed to provide highly accurate voice-to-text transcription across all websites. It uses context-aware AI to transcribe industry-specific terminology, technical abbreviations, complex names, and scientific notation in real time. The tool integrates with popular platforms such as email clients (Gmail, Outlook), social media, CRM systems, and documentation tools. A key differentiator is its dynamic long-term memory, which learns from user dictations, adapts to their vocabulary, and remembers technical terms so that users do not need to supply the same context repeatedly. The result is a fast dictation experience, described as often three times faster than typing, with smart punctuation and zero lag.
Mapwise
Mapwise is an AI-powered learning assistant designed to transform various study materials into structured, step-by-step learning roadmaps. Users can upload notes, PDFs, and videos, which Mapwise then processes to extract topics, structure concepts, and generate milestones. The platform offers a comprehensive suite of study tools, including AI-generated flashcards with spaced repetition, interactive AI quizzes, and voice tutor sessions directly tied to the learning roadmap. This integrated approach helps students, professionals, and self-learners break down complex topics, track progress, and reinforce learning effectively. Mapwise aims to provide a single solution for organized and adaptive study, eliminating the need to juggle multiple apps.
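The listing does not say which scheduling method Mapwise uses for spaced repetition. As a rough illustration of the idea, the sketch below uses the well-known Leitner box system as a stand-in; the `Flashcard` class, the `review` and `due_cards` functions, and the interval values are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Review interval in days for each Leitner box (illustrative values).
BOX_INTERVALS = [1, 3, 7, 14, 30]


@dataclass
class Flashcard:
    front: str
    back: str
    box: int = 0          # index into BOX_INTERVALS
    due: date = date.min  # next review date; new cards are due immediately


def review(card: Flashcard, correct: bool, today: date) -> Flashcard:
    """Promote the card one box on a correct answer, reset it to box 0 otherwise."""
    if correct:
        card.box = min(card.box + 1, len(BOX_INTERVALS) - 1)
    else:
        card.box = 0
    card.due = today + timedelta(days=BOX_INTERVALS[card.box])
    return card


def due_cards(cards: List[Flashcard], today: date) -> List[Flashcard]:
    """Return the cards whose review date has arrived."""
    return [c for c in cards if c.due <= today]
```

Cards answered correctly move to boxes with longer intervals, so well-known material is shown less often, while a miss sends the card back to daily review.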
FunASR
FunASR is a fundamental end-to-end speech recognition toolkit designed to bridge the gap between academic research and industrial applications. It offers a comprehensive suite of features including speech recognition (ASR), Voice Activity Detection (VAD), Punctuation Restoration, Language Models, Speaker Verification, Speaker Diarization, and multi-talker ASR. The toolkit provides convenient scripts and tutorials for both inference and fine-tuning of pre-trained models. FunASR boasts a vast collection of academic and industrial pre-trained models available on ModelScope and Hugging Face, including the highly accurate and efficient Paraformer-large. Recent updates include support for large models like Fun-ASR-Nano-2512 (31 languages), Whisper-large-v3-turbo, and Qwen-Audio multimodal models, alongside continuous improvements in real-time and offline transcription services, memory optimization, and multi-platform support.
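To give a sense of what Voice Activity Detection does, here is a minimal energy-threshold sketch in plain Python. It is not FunASR's implementation (the toolkit ships trained neural models for this); the function names, the 160-sample frame length, and the 0.1 threshold are assumptions chosen for illustration.

```python
import math
from typing import List, Optional, Tuple


def frame_energy(frame: List[float]) -> float:
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(samples: List[float], frame_len: int = 160,
                  threshold: float = 0.1) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for runs of frames at or above the threshold."""
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    # Only full frames are examined; a trailing partial frame is ignored.
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        active = frame_energy(samples[i:i + frame_len]) >= threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        # Audio ended while still active: close at the end of the last full frame.
        segments.append((start, (len(samples) // frame_len) * frame_len))
    return segments
```

A simple threshold like this breaks down in noisy audio, which is one reason toolkits rely on trained VAD models instead.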
sagittarius
Sagittarius is an innovative open-source tool designed for exploring the voice and video capabilities of GPT-4 and Gemini models. It provides an online platform where users can interact with these advanced AI models using both voice and video inputs, offering a real-time exploration of multimodal AI. The tool is accessible directly through a web browser, eliminating the need for any installation. Users simply require an API key from either OpenAI (with access to the gpt-4-vision-preview model) or Gemini to get started. Sagittarius is noted for its speed and support for multiple voices, making it a versatile option for developers and enthusiasts interested in cutting-edge AI interactions.
stephanie-va
Stephanie is an open-source platform designed for building voice-controlled applications and automating daily tasks, mimicking the functionality of a virtual assistant. It provides a flexible framework for developers to create and customize their own voice-controlled systems. The platform emphasizes its open-source nature, allowing for community contributions and extensive modification. Key features include voice control, task automation, and an intent prediction algorithm called Sounder. It supports Python and offers detailed documentation for installation, configuration, and usage, making it suitable for technical users looking to implement custom voice solutions.
GROWL
GROWL is the first physical AI coach designed to bring a full-size, human-form AI entity into the home. It moves, reacts, and speaks to users in real time, offering a supportive and fully embodied coaching experience that trains with you, not at you. The system combines hardware, software, and content to create an interactive, personalized boxing and fitness experience. Key features include immersive projection of a virtual coach, interactive sensing for precise punch tracking, AI-powered 3D motion tracking for form correction, and gamified workouts powered by Unreal Engine 5. A multi-layer composite boxing bag provides resistance suited to all skill levels, while dynamic lighting and sound add to the immersive atmosphere.