DEV Community

Shimanta Krishna Bhuyan

GetAutoCue: A Hands-Free Teleprompter with Real-Time Voice Control built with AssemblyAI

AssemblyAI Voice Agents Challenge: Real-Time

This is a submission for the AssemblyAI Voice Agents Challenge.

What I Built

I built GetAutoCue, a professional, browser-based teleprompter designed to eliminate the most common frustration for presenters: unnatural pacing. It uses AssemblyAI's Universal-Streaming model to listen to your voice in real time and automatically scrolls the script to match your natural speaking speed.

This project is submitted under the Real-Time Performance prompt: it aims for a highly responsive, low-latency voice experience that makes presenting feel seamless and intuitive. Whether you're recording a video, giving a speech, or practicing a presentation, GetAutoCue keeps the script exactly where you need it to be.

Demo

Key Features in Action

1. Flawless Voice-Activated Scrolling
Voice-activated scrolling with dynamic highlighting

The app actively listens, highlighting the current word and dimming spoken words, while scrolling the viewport to keep you perfectly on track.

2. On-the-Fly Script Editing
Script editor dialog

Paste your script and make edits without ever leaving the teleprompter view.

3. Professional Display Controls
Control panel

Hotkeys for easy controls

Full control over font size, colors, and horizontal/vertical mirroring for professional beam-splitter glass setups.

GitHub Repository

Technical Implementation & AssemblyAI Integration

The core of GetAutoCue is the useVoiceMode hook, which seamlessly integrates AssemblyAI's real-time streaming capabilities into the React/Next.js front-end.

1. Establishing the Real-Time Connection

When the user clicks "Start Listening" in the app's voice mode, the hook first fetches a temporary authentication token from a Next.js API route, which in turn calls AssemblyAI's API endpoint so the API key stays on the server. This token is then used to establish a secure WebSocket connection to AssemblyAI.


    // Fetch a short-lived token from our own API route
    const response = await fetch('/api/assemblyai/token');
    if (!response.ok) throw new Error(`Token request failed: ${response.status}`);
    const data = await response.json();
    if (!data.token) throw new Error('Failed to get AssemblyAI token');

    // Open the streaming WebSocket; sample_rate must match the audio we send
    const wsUrl = `wss://streaming.assemblyai.com/v3/ws?sample_rate=16000&token=${encodeURIComponent(data.token)}`;
    socketRef.current = new WebSocket(wsUrl);

    socketRef.current.onopen = () => {
        setIsConnected(true);
    };
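
The server side of this handshake is not shown in the post. A minimal sketch of what such a route's logic could look like follows. It assumes AssemblyAI's v3 token endpoint (`GET https://streaming.assemblyai.com/v3/token` with an `expires_in_seconds` query parameter and the API key in the `Authorization` header); verify both against the current API reference. The helper names and the `ASSEMBLYAI_API_KEY` environment variable are hypothetical, not taken from the GetAutoCue code.

```javascript
// Hypothetical helper: builds the request for a short-lived streaming token.
// The endpoint URL and `expires_in_seconds` parameter are assumptions based on
// AssemblyAI's v3 streaming docs, not something shown in the original post.
function buildTokenRequest(apiKey, expiresInSeconds = 60) {
  if (!apiKey) throw new Error('Missing AssemblyAI API key');
  const url = new URL('https://streaming.assemblyai.com/v3/token');
  url.searchParams.set('expires_in_seconds', String(expiresInSeconds));
  return {
    url: url.toString(),
    options: { headers: { Authorization: apiKey } },
  };
}

// Hypothetical route logic. `fetchImpl` is injectable so the function can be
// exercised in tests without touching the network.
async function getStreamingToken(apiKey, fetchImpl = fetch) {
  const { url, options } = buildTokenRequest(apiKey);
  const res = await fetchImpl(url, options);
  if (!res.ok) throw new Error(`Token request failed with status ${res.status}`);
  const { token } = await res.json();
  return token;
}
```

In a Next.js route handler, this would be called with `process.env.ASSEMBLYAI_API_KEY` and the resulting token returned as JSON under a `token` key, which is the shape the client code above expects.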


2. Streaming Audio from the Browser

I used the navigator.mediaDevices.getUserMedia API to capture microphone input, requesting a 16000 Hz sample rate to match the sample_rate passed in the WebSocket URL. An AudioContext with a ScriptProcessorNode then delivers the audio in 4096-sample buffers, and each buffer is converted from 32-bit float to 16-bit PCM and sent through the WebSocket. This ensures a continuous, low-latency flow of data.


    // Set up audio processing for PCM data
    const stream = await navigator.mediaDevices.getUserMedia({
        audio: {
            sampleRate: 16000,
            channelCount: 1,
            echoCancellation: true,
            noiseSuppression: true
        }
    });

    // Create audio context for processing (ScriptProcessorNode is deprecated; AudioWorklet is its replacement)
    const audioContext = new AudioContext({ sampleRate: 16000 });
    const source = audioContext.createMediaStreamSource(stream);
    const processor = audioContext.createScriptProcessor(4096, 1, 1);

    processor.onaudioprocess = (event) => {
        if (socketRef.current?.readyState === WebSocket.OPEN) {
            const inputData = event.inputBuffer.getChannelData(0);

            // Convert float32 audio data to int16 PCM
            const pcmData = new Int16Array(inputData.length);
            for (let i = 0; i < inputData.length; i++) {
                // Clamp the value to [-1, 1] and convert to 16-bit integer
                const sample = Math.max(-1, Math.min(1, inputData[i]));
                pcmData[i] = sample * 0x7FFF;
            }

            socketRef.current.send(pcmData.buffer);
        }
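    };

    // Wire up the graph; onaudioprocess only fires once the processor is connected
    source.connect(processor);
    processor.connect(audioContext.destination);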
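
The conversion step is easy to get subtly wrong, so it can help to pull it out of the audio callback into a pure function that is testable on its own. The sketch below is one way to do that. `floatTo16BitPCM` and `chunkDurationMs` are hypothetical names, and the asymmetric scaling (negative samples by 0x8000, positive by 0x7FFF) is a common convention rather than what the snippet above does.

```javascript
// Convert Web Audio float samples (nominally in [-1, 1]) to signed 16-bit PCM.
// Out-of-range samples are clamped first so they cannot wrap around.
function floatTo16BitPCM(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // Int16 spans -32768..32767, so each sign gets its own scale factor.
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// How much audio one buffer represents, in milliseconds.
function chunkDurationMs(bufferSize, sampleRate) {
  return (bufferSize * 1000) / sampleRate;
}
```

With the values used above, `chunkDurationMs(4096, 16000)` is 256, so each WebSocket message carries roughly a quarter second of audio. A smaller buffer would send audio sooner at the cost of more messages.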


3. Processing Transcripts and Syncing the UI

The magic happens when the WebSocket sends back transcript data. The onmessage handler listens for both partial and final transcripts returned from AssemblyAI's Universal-Streaming model.

To achieve the dynamic highlighting and scrolling, I implemented a fuzzy matching algorithm using fuse.js. As transcripts arrive, the app:

  1. Identifies the last spoken word in the transcript.
  2. Finds its corresponding position in the full script.
  3. Updates the UI state to highlight the next word as "current" and dim all previous words as "spoken."
  4. Calculates the progress through the script and smoothly scrolls the container to keep the current word in view.

This approach creates a robust and natural-feeling experience, as the teleprompter doesn't jump but flows with the speaker.
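
The four steps above can be sketched without any dependencies. Note that this is not the fuse.js implementation the app uses: it swaps in a plain edit-distance check over a short look-ahead window, purely to illustrate the idea. All function names, the look-ahead size of 8, and the one-edit tolerance are assumptions made for this example.

```javascript
// Lowercase and strip punctuation so "Launch," and "launch" compare equal.
function normalize(word) {
  return word.toLowerCase().replace(/[^a-z0-9']/g, '');
}

// Levenshtein edit distance using a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Exact match, or at most one edit apart for words of four or more characters.
function isCloseMatch(scriptWord, spokenWord) {
  if (scriptWord === spokenWord) return true;
  if (scriptWord.length < 4 || spokenWord.length < 4) return false;
  return editDistance(scriptWord, spokenWord) <= 1;
}

// Steps 1-2: take the last spoken word and look for it just ahead of the
// current position. Returns the index of the word that should become "current".
function findScriptPosition(scriptWords, transcript, currentIndex, lookAhead = 8) {
  const spoken = transcript.split(/\s+/).map(normalize).filter(Boolean);
  if (spoken.length === 0) return currentIndex;
  const lastSpoken = spoken[spoken.length - 1];
  const end = Math.min(scriptWords.length, currentIndex + lookAhead);
  for (let i = currentIndex; i < end; i++) {
    if (isCloseMatch(normalize(scriptWords[i]), lastSpoken)) return i + 1;
  }
  return currentIndex; // no match nearby: stay put rather than jump
}

// Step 3: derive the display state of each word from the current index.
function wordState(index, currentIndex) {
  if (index < currentIndex) return 'spoken';
  return index === currentIndex ? 'current' : 'upcoming';
}

// Step 4: map progress through the script onto a scroll offset in pixels.
function scrollTopForIndex(currentIndex, totalWords, scrollHeight, clientHeight) {
  const progress = totalWords === 0 ? 0 : Math.min(1, currentIndex / totalWords);
  return Math.round(progress * Math.max(0, scrollHeight - clientHeight));
}
```

In the browser, the returned offset could be passed to `container.scrollTo({ top, behavior: 'smooth' })`. Searching only forward within a small window is what keeps the view from jumping when a common word such as "the" appears many times in the script. The trade-off is that a repeated word inside the window matches its first occurrence.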


    socketRef.current.onmessage = (event) => {
        const message = JSON.parse(event.data);

        // The v3 API sends Turn messages (handled below); the message_type
        // branches cover the older v2 message shape as a fallback
        if (message.message_type === 'PartialTranscript') {
            handlePartialTranscript(message.text);
        } else if (message.message_type === 'FinalTranscript') {
            handleFinalTranscript(message.text);
        } else if (message.transcript) {
            // Handle Turn events with transcript data
            if (message.end_of_turn) {
                handleFinalTranscript(message.transcript);
            } else {
                handlePartialTranscript(message.transcript);
            }
        }
    };

    socketRef.current.onerror = (error) => {
        console.error('AssemblyAI WebSocket error:', error);
        stopListening();
    };

    socketRef.current.onclose = (event) => {
        setIsConnected(false);
    };
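
The branching in the handler can also be expressed as a pure function, which makes it easy to unit test without a live socket. The sketch below assumes the v3 message shape described in AssemblyAI's streaming documentation: a `type` field of `Begin`, `Turn`, or `Termination`, with `Turn` messages carrying `transcript` and `end_of_turn`. Treat those field names as assumptions to check against the current API reference. `classifyMessage` is a hypothetical helper, not part of the useVoiceMode hook.

```javascript
// Turn a raw WebSocket frame into a small, predictable object,
// or null when the frame is not something the UI needs to react to.
function classifyMessage(rawData) {
  let message;
  try {
    message = JSON.parse(rawData);
  } catch {
    return null; // not JSON: ignore instead of crashing the handler
  }
  if (message === null || typeof message !== 'object') return null;

  if (message.type === 'Turn' && typeof message.transcript === 'string') {
    return {
      kind: message.end_of_turn ? 'final' : 'partial',
      text: message.transcript,
    };
  }
  if (message.type === 'Begin') return { kind: 'begin' };
  if (message.type === 'Termination') return { kind: 'end' };
  return null; // unknown message type
}
```

Inside `onmessage`, the result's `kind` would then decide whether to call `handlePartialTranscript` or `handleFinalTranscript` with `text`, and malformed frames are dropped quietly.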


By leveraging AssemblyAI's Universal-Streaming model, I was able to build a teleprompter that follows the speaker's natural pace instead of forcing the speaker to keep up with a machine.

Top comments (2)

Nikoloz Turazashvili (@axrisi)

Okay, this deserves attention!
good job!

Shimanta Krishna Bhuyan

Glad you liked it Nikoloz!