DEV Community

makepkg

Building a 60 FPS Audio Visualizer and Web UI on a Single-Core RISC-V MCU (ESP32-C3 Architecture Deep Dive)

Every developer loves pushing hardware to its absolute limits. Recently, I set out to build an advanced Internet Radio (v4.6) around the ESP32-C3, a cost-effective, single-core 32-bit RISC-V microcontroller running at 160 MHz with about 400 KB of on-chip SRAM.

The challenge? Stream high-bitrate MP3 audio over WiFi, decode it on the fly, output it via I2S, drive an asynchronous HTTP Web Server, handle hardware interrupts, and render a smooth 60 FPS audio spectrum visualizer with 9 complex mathematics-driven animation modes on a monochrome OLED display. All of this on a single core.

Here is a deep dive into the architectural decisions, FreeRTOS task scheduling, and rendering pipelines that made it possible.


🧠 1. FreeRTOS Task Architecture & Prioritization

On a single-core MCU there is no true parallelism; every task shares one core under FreeRTOS's preemptive, priority-based scheduler. A single stall in display rendering or the network stack can quickly cause audio stuttering (buffer underrun).

To solve this, the processing chain is decoupled into independent tasks separated by thread-safe queues and strict priority constraints:

Highest Priority ───>  [ Audio Decoder Task ]  Priority 4 (Time-critical)
                             │
                             ▼  (128KB Ring Buffer)
                       [ Network / WiFi Stack ] Priority 3 (Burst-driven)
                             │
                             ▼
                       [ Hardware Input ISR ]   Priority 2 (Interrupt debounced)
                             │
                             ▼
Lowest Priority  ───>  [ OLED Render & Web ]    Priority 1 (Frame-skipping allowed)

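
To make the priority rule concrete, here is a minimal host-side model in plain C++ (not FreeRTOS code) of how a fixed-priority preemptive scheduler decides what runs next. The `Task` struct and `pickNext` helper are hypothetical names used only for illustration; the priorities mirror the diagram above. The model ignores the round-robin time-slicing that FreeRTOS applies among tasks of equal priority.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical model of a task: a name, a fixed priority, and a ready flag.
struct Task {
    std::string_view name;
    int priority;  // Higher number = higher priority, as in FreeRTOS
    bool ready;    // false while the task is blocked (e.g. waiting on a queue)
};

// Returns the index of the highest-priority ready task, or -1 if none is ready.
// Ties go to the earliest entry in the array.
template <std::size_t N>
int pickNext(const std::array<Task, N>& tasks) {
    int best = -1;
    for (std::size_t i = 0; i < N; ++i) {
        if (!tasks[i].ready) continue;
        if (best < 0 || tasks[i].priority > tasks[best].priority) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

Whenever the audio task is ready it wins the core; the render task only runs when everything above it is blocked, which is why frame skipping at the bottom of the stack costs nothing in audio quality.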

The Audio Priority Safeguard

The MP3 decoding task takes absolute precedence. If the network drops packets or the web server hits a heavy asset request, FreeRTOS preempts those operations so the decoder can keep consuming the 128 KB audio ring buffer without gaps. We enforce a 2-second prebuffering threshold before starting I2S DMA output, which gives the buffer headroom to absorb network jitter.
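
As a rough sizing check, the sketch below computes how many bytes a prebuffer of a given duration needs and gates playback on that threshold. `bytesForDuration` and `PrebufferGate` are hypothetical names, and the bitrates in the note are example values rather than measurements from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kRingBufferBytes = 128 * 1024;  // Buffer size from the article

// Bytes of compressed audio consumed in `ms` milliseconds at `kbps` kilobits/s.
constexpr std::size_t bytesForDuration(std::uint32_t kbps, std::uint32_t ms) {
    // (kbps * 1000 / 8) bytes per second * (ms / 1000) seconds == kbps * ms / 8
    return static_cast<std::size_t>(kbps) * ms / 8;
}

// Hypothetical gate: playback starts once the buffer holds `thresholdBytes`
// and stays enabled until reset() is called (e.g. on a station change).
class PrebufferGate {
public:
    explicit PrebufferGate(std::size_t thresholdBytes) : threshold_(thresholdBytes) {
        assert(thresholdBytes <= kRingBufferBytes);  // Threshold must fit in the buffer
    }

    // Call after each network write with the current fill level.
    // Returns true once playback may run.
    bool update(std::size_t bufferedBytes) {
        if (!started_ && bufferedBytes >= threshold_) started_ = true;
        return started_;
    }

    void reset() { started_ = false; }

private:
    std::size_t threshold_;
    bool started_ = false;
};
```

Under these assumptions a 2-second prebuffer is 32,000 bytes at 128 kbit/s and 80,000 bytes at 320 kbit/s, so both fit inside the 128 KB ring buffer.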


🎨 2. The 60 FPS Visualizer Engine: Pushing I2C to the Limit

The standout feature of this system is the 9-mode real-time visualization matrix (including Cyberpunk Hexagons, 4D Tesseract projections, and Liquid Plasma).

Getting 60 frames per second on an SSD1306 OLED via I2C while decoding audio is historically a bottleneck. Here is how it was bypassed:

1. I2C Fast Mode & Frame Skipping

Standard-mode I2C runs at 100 kHz. We run the ESP32-C3 I2C hardware controller at 400 kHz (Fast Mode). To keep rendering from starving the audio path during CPU peaks, the render task skips frames whenever an atomic flag signals heavy load:

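// Sketch of the priority-aware render loop.
// audioDecoder, oled, renderVisualizerStyle() and current_style are defined
// elsewhere in the firmware.
#include <atomic>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

// Assumed declaration: set by the audio path when the decoder is short on CPU time
std::atomic<bool> cpu_heavy_flag{false};

void visualizerTask(void *pvParameters) {
    TickType_t lastWakeTime = xTaskGetTickCount();
    for (;;) {
        if (audioDecoder.isDecoding() && cpu_heavy_flag.load()) {
            // Skip this frame to preserve audio integrity
            vTaskDelayUntil(&lastWakeTime, pdMS_TO_TICKS(32)); // Back off to ~30 FPS
            continue;
        }
        renderVisualizerStyle(current_style);
        oled.display(); // Flushes the 1024-byte framebuffer over 400 kHz I2C
        // Target ~60 FPS; 16 ms resolution assumes a 1 kHz FreeRTOS tick
        vTaskDelayUntil(&lastWakeTime, pdMS_TO_TICKS(16));
    }
}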

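
It is worth checking the bus budget behind that flush. The sketch below estimates how long a framebuffer transfer occupies the I2C bus, assuming the whole frame goes out in a single transaction with two overhead bytes (address plus control byte) and ignoring START/STOP conditions. `i2cTransferMicros` and `maxFullFrameFps` are hypothetical helpers, and real drivers that split the frame into smaller chunks add more overhead.

```cpp
#include <cassert>
#include <cstdint>

// Microseconds needed to clock `payloadBytes` over I2C at `clockHz`.
// Each byte costs 9 clock cycles (8 data bits + 1 ACK bit). `overheadBytes`
// covers the address and control bytes; START/STOP conditions are ignored.
constexpr std::uint32_t i2cTransferMicros(std::uint32_t payloadBytes,
                                          std::uint32_t overheadBytes,
                                          std::uint32_t clockHz) {
    const std::uint64_t bits =
        static_cast<std::uint64_t>(payloadBytes + overheadBytes) * 9;
    return static_cast<std::uint32_t>(bits * 1'000'000 / clockHz);
}

// Highest full-frame rate the bus alone allows (whole frames per second).
constexpr std::uint32_t maxFullFrameFps(std::uint32_t frameMicros) {
    return 1'000'000 / frameMicros;
}

// Compile-time check of the 1024-byte SSD1306 frame at 400 kHz.
static_assert(i2cTransferMicros(1024, 2, 400'000) == 23'085, "about 23 ms per frame");
```

Under these assumptions a full 1024-byte frame takes about 23 ms at 400 kHz, which caps full-frame refreshes at roughly 43 per second. Holding a 16 ms frame period therefore implies sending only the pages that changed, or running the bus faster where both the controller and the display tolerate it.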

2. Fast Fourier Transform (FFT) Allocation

The 16-band spectrum analyzer reads the raw PCM data immediately after decoding. To prevent heap fragmentation, the arrays used by the visualization algorithms are allocated statically, so they live in the chip's internal SRAM for the lifetime of the firmware and the RISC-V core never waits on an allocator. In ESP-IDF terms, DRAM_ATTR is the attribute that pins data to internal RAM, while IRAM_ATTR applies to code such as hot functions and ISRs.
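
A minimal sketch of the band-folding step, assuming the FFT has already produced 256 magnitude bins: the bin count, the band edges, and the peak-per-band rule are illustrative assumptions, not the project's actual values. All buffers are static, so nothing touches the heap at run time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kBins = 256;   // Assumed number of FFT magnitude bins
constexpr std::size_t kBands = 16;   // Band count from the article

// Statically allocated working buffers: no heap use at run time.
static std::array<float, kBins> g_magnitudes{};
static std::array<float, kBands> g_bands{};

// Band b covers bins [kBandEdges[b], kBandEdges[b + 1]). Bin 0 (DC) is skipped,
// and the bands widen toward high frequencies (roughly logarithmic spacing).
constexpr std::array<std::size_t, kBands + 1> kBandEdges{
    1, 2, 3, 4, 5, 6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 192, 256};
static_assert(kBandEdges[kBands] == kBins, "last edge must equal the bin count");

// Folds the magnitude bins into 16 bands by taking the peak of each range.
void computeBands(const std::array<float, kBins>& mags,
                  std::array<float, kBands>& bands) {
    for (std::size_t b = 0; b < kBands; ++b) {
        float peak = 0.0f;
        for (std::size_t i = kBandEdges[b]; i < kBandEdges[b + 1]; ++i) {
            if (mags[i] > peak) peak = mags[i];
        }
        bands[b] = peak;
    }
}
```

Taking the peak rather than the average keeps narrow tones visible in the wide upper bands; an averaging variant would only change the inner loop.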


๐ŸŒ 3. Asynchronous Web UI Architecture (LittleFS)

The device hosts a fully responsive station manager, real-time logging, and system diagnostics page. Running a traditional synchronous web server would block the CPU for milliseconds during file reads, instantly killing the audio stream.

Solution: Async Web Server + Custom Web API

We use ESPAsyncWebServer combined with a LittleFS filesystem partition (1.4 MB).

  • Zero-Blocking Architecture: The async server handles HTTP requests from lwIP stack callbacks in the background instead of polling sockets in the main loop. When a user changes a station or moves a slider, the page sends a minimal JSON payload to a RESTful API rather than reloading heavy HTML pages.
  • Flash Protection Layer: Changing the volume or updating a configuration file triggers a flash write. To prevent premature flash wear and sudden performance drops, we implemented a 5-second write-debounce cache. If a user spams the volume encoder, the state is cached in RAM and only committed to state.json in LittleFS once the input has been idle for 5 seconds.
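
The write-debounce idea can be sketched in portable C++ as follows. `DebouncedSetting` is a hypothetical class, the timestamps stand in for Arduino's `millis()`, and the commit callback stands in for whatever writes state.json; none of these names come from the project.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Caches a value in RAM and commits it only after a quiet period.
class DebouncedSetting {
public:
    using Commit = std::function<void(int)>;

    DebouncedSetting(std::uint32_t quietMs, Commit commit)
        : quietMs_(quietMs), commit_(std::move(commit)) {}

    // Record a new value in RAM and restart the quiet-period timer.
    void set(int value, std::uint32_t nowMs) {
        value_ = value;
        lastChangeMs_ = nowMs;
        dirty_ = true;
    }

    // Call periodically. Commits once no change has arrived for quietMs_.
    // Returns true if a write was issued on this call.
    bool poll(std::uint32_t nowMs) {
        // Unsigned subtraction stays correct across timer wrap-around.
        if (!dirty_ || nowMs - lastChangeMs_ < quietMs_) return false;
        commit_(value_);
        dirty_ = false;
        return true;
    }

private:
    std::uint32_t quietMs_;
    Commit commit_;
    int value_ = 0;
    std::uint32_t lastChangeMs_ = 0;
    bool dirty_ = false;
};
```

Three volume changes within two seconds followed by five idle seconds result in a single commit of the last value, so a burst of encoder clicks costs one flash write instead of one per click.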

🔋 4. Low-Power Design: Deep Sleep Optimization

For a desktop device, standby efficiency matters. Long-pressing the hardware encoder puts the system into Deep Sleep, dropping current draw to roughly 10 µA.

Before entering sleep, the system runs a fixed shutdown sequence:

  1. Gracefully halts the I2S DMA engine (preventing speaker "pop" noises).
  2. Flushes all volatile system metrics and rolling logs to LittleFS.
  3. Configures the encoder button pin (GPIO2) as a deep-sleep wakeup source.
  4. Commands the SSD1306 display controller to enter charge-pump shutdown mode via I2C.
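
For the last step, the sketch below builds the I2C payload that powers down an SSD1306 panel. The command values come from the SSD1306 datasheet (0xAE turns the display off; 0x8D followed by 0x10 disables the charge pump); the function name is hypothetical, and the project's own shutdown code may send a different sequence. Writing the bytes to the bus is left to the platform's I2C driver.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint8_t kControlCommandStream = 0x00;  // Control byte: command bytes follow
constexpr std::uint8_t kDisplayOff = 0xAE;            // Set Display OFF (sleep mode)
constexpr std::uint8_t kChargePumpSetting = 0x8D;     // Charge Pump Setting command
constexpr std::uint8_t kChargePumpDisable = 0x10;     // Argument: disable the charge pump

// Bytes to send after the device address in one I2C write transaction.
constexpr std::array<std::uint8_t, 4> ssd1306PowerDownPayload() {
    return std::array<std::uint8_t, 4>{{kControlCommandStream, kDisplayOff,
                                        kChargePumpSetting, kChargePumpDisable}};
}
```

Turning the panel off before disabling the charge pump mirrors the power-up order in reverse, where the pump is enabled before the display is switched on.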

๐Ÿ› ๏ธ Memory Mapping & Footprint Breakdown

Partitioning the 4 MB of external flash carefully was crucial to balancing features and files:

Partition            Size    Purpose
App Partition (OTA)  2.5 MB  Compiled C++ Binary (FreeRTOS, Audio/Graphics engines)
LittleFS Filesystem  1.4 MB  Static Web UI assets, logs, station JSON configs
System NVS           100 KB  Non-volatile storage for runtime variables, WiFi calibration

🎯 Conclusion

By carefully managing task scheduling, optimizing peripheral buses (I2C/I2S), and adopting non-blocking, asynchronous I/O, it's entirely viable to turn a budget $3 single-core chip like the ESP32-C3 into a high-performance multimedia device.

The entire source code, along with wiring schematics, PlatformIO configuration profiles, and the bill of materials, is fully open-source.

👉 Check out the complete repository here: https://github.com/makepkg/ESP32-C3-Internet-Radio

Have you worked with audio streaming or complex UI rendering on single-core microcontrollers? Let's discuss task scheduling and optimization tricks in the comments below!
