This is a submission for the AssemblyAI Voice Agents Challenge
What I Built
AgentForge is a platform for creating, managing, and scaling autonomous voice-enabled AI agents tailored for intelligent commerce automation. Designed for business automation, it leverages AssemblyAI’s Universal-Streaming technology to give agents real-time voice interaction capabilities in complex business workflows.
These agents can autonomously handle:
Customer inquiries about products, invoices, and payments
Payroll processing and employee payouts
Bounty funding and task management
Scheduling and revenue sharing across teams
Whether it's automated sales calls, customer support, or real-time lead qualification, AgentForge transforms human interactions into intelligent, scalable financial processes powered by speech.
Demo
Live Demo Link: https://agentforge-eta.vercel.app/
GitHub Repository
Link: https://github.com/RSN601KRI/Agent-Forge
Technical Implementation & AssemblyAI Integration
AgentForge integrates AssemblyAI’s Universal-Streaming API for real-time transcription and voice processing.
🎙️ AssemblyAI Universal-Streaming
We connected AssemblyAI’s WebSocket streaming API directly to our frontend, using LiveKit as the orchestration framework.
The transcription stream powers:
Live intent parsing for payment, invoice, and scheduling workflows
Voice-command execution via Payman SDK (e.g., “Generate invoice for Adnan for $1200”)
Real-time feedback and conversational agent interactions
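// Example: AssemblyAI WebSocket streaming setup (client-side)
// Authentication (e.g. a temporary token query parameter) is omitted for brevity
const socket = new WebSocket('wss://api.assemblyai.com/v2/realtime/ws?sample_rate=16000');

socket.onmessage = (message) => {
  const data = JSON.parse(message.data);
  // Act only on final transcripts so a partial result cannot trigger the same command twice
  if (data.message_type === 'FinalTranscript' && data.text) {
    handleVoiceCommand(data.text); // Intent parser + command trigger
  }
};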
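The snippet above hands each transcript to `handleVoiceCommand`. As a rough illustration of the intent-parsing step, here is a minimal TypeScript sketch that turns a phrase like “Generate invoice for Adnan for $1200” into a structured intent. The `Intent` type, `parseIntent`, and the regular expressions are hypothetical names and rules chosen for this example, not code from the AgentForge repository, and they only cover the two assumed phrasings noted in the comments.

```typescript
// Hypothetical intent shapes for illustration; not taken from the AgentForge codebase.
type Intent =
  | { kind: "generate_invoice"; recipient: string; amountUsd: number }
  | { kind: "send_payment"; recipient: string; amountUsd: number }
  | { kind: "unknown"; transcript: string };

// Assumed phrasings: "generate invoice for <name> for $<amount>" and "pay <name> $<amount>".
const INVOICE_RE =
  /\bgenerate (?:an? )?invoice for ([a-z][a-z' -]*?) for \$?([\d,]+(?:\.\d{1,2})?)/i;
const PAYMENT_RE = /\b(?:pay|send) ([a-z][a-z' -]*?) \$?([\d,]+(?:\.\d{1,2})?)/i;

// Strip thousands separators so "1,200" becomes 1200.
function parseAmount(raw: string): number {
  return Number(raw.replace(/,/g, ""));
}

function parseIntent(transcript: string): Intent {
  const text = transcript.trim();

  const invoice = INVOICE_RE.exec(text);
  if (invoice) {
    const [, recipient = "", amount = ""] = invoice;
    return {
      kind: "generate_invoice",
      recipient: recipient.trim(),
      amountUsd: parseAmount(amount),
    };
  }

  const payment = PAYMENT_RE.exec(text);
  if (payment) {
    const [, recipient = "", amount = ""] = payment;
    return {
      kind: "send_payment",
      recipient: recipient.trim(),
      amountUsd: parseAmount(amount),
    };
  }

  // Anything unrecognized is returned as-is so the agent can ask a follow-up question.
  return { kind: "unknown", transcript: text };
}

console.log(parseIntent("Generate invoice for Adnan for $1200"));
```

For the example phrase, this yields a `generate_invoice` intent with recipient `Adnan` and amount `1200`. Returning an explicit `unknown` intent, rather than throwing, lets the agent respond conversationally when it does not recognize a command.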
We integrated the Payman SDK to allow our voice agents to:
📤 Trigger real payments, bounties, and revenue splits
📄 Auto-generate and process invoices based on voice cues
👥 Execute payroll with smart contract support
These financial workflows are securely handled using programmable Web3 infrastructure, triggered by real-time voice interactions.
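Because these commands move real money, a parsed voice command would normally pass a validation step before it reaches the payment layer. The following TypeScript sketch shows one possible guard: it rejects malformed commands and asks for spoken confirmation above a threshold. Everything here is an assumption made for illustration: `reviewCommand`, the `InvoiceCommand` and `Review` types, and the $500 threshold are hypothetical and do not represent the Payman SDK's API or AgentForge's actual logic.

```typescript
// Hypothetical command shape produced by an intent parser; illustrative only.
interface InvoiceCommand {
  recipient: string;
  amountUsd: number;
}

type Review =
  | { status: "approved" }
  | { status: "needs_confirmation"; prompt: string }
  | { status: "rejected"; reason: string };

// Assumed policy value: amounts at or above this need an explicit spoken "yes".
const CONFIRMATION_THRESHOLD_USD = 500;

// Pure decision step with no side effects, so it is easy to test in isolation.
function reviewCommand(cmd: InvoiceCommand, confirmed: boolean): Review {
  if (cmd.recipient.trim() === "") {
    return { status: "rejected", reason: "missing recipient" };
  }
  if (!isFinite(cmd.amountUsd) || cmd.amountUsd <= 0) {
    return { status: "rejected", reason: "invalid amount" };
  }
  if (cmd.amountUsd >= CONFIRMATION_THRESHOLD_USD && !confirmed) {
    return {
      status: "needs_confirmation",
      prompt: `Create an invoice for ${cmd.recipient} for $${cmd.amountUsd}?`,
    };
  }
  return { status: "approved" };
}

console.log(reviewCommand({ recipient: "Adnan", amountUsd: 1200 }, false));
```

In this sketch, only an `approved` result would be forwarded to the payment call; a `needs_confirmation` result would be read back to the user, and the command re-reviewed with `confirmed` set to `true` once they agree.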
🛠️ Tech Stack
Frontend: React + Vite + TypeScript
UI: ShadCN/UI + TailwindCSS
Voice Stream: AssemblyAI Universal-Streaming API
Backend/Agent Logic: Node.js + Payman SDK
Deployment: Vercel
Streaming Orchestration: LiveKit
Highlights
✅ Built a practical B2B/B2C voice automation tool that actually executes transactions
🔄 Supported multi-step workflows with live voice interaction
🔒 Ensured secure and programmable backend actions through smart contract integration
🔊 Used AssemblyAI to bring accurate real-time transcription to finance-domain voice commands
Team Submissions:
👩‍💻 Developed by @rsn601kri (Roshni Kumari)
Thanks to AssemblyAI for powering the future of intelligent voice-enabled agents!
Let’s build a future where agents think, transact, and speak in real time.