<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: tagmybeat</title>
    <description>The latest articles on DEV Community by tagmybeat (@tagmybeat).</description>
    <link>https://dev.to/tagmybeat</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3949813%2Ffa0e11e6-d075-4e20-8291-946d4abf4531.png</url>
      <title>DEV Community: tagmybeat</title>
      <link>https://dev.to/tagmybeat</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tagmybeat"/>
    <language>en</language>
    <item>
      <title>I built a free music tool</title>
      <dc:creator>tagmybeat</dc:creator>
      <pubDate>Mon, 25 May 2026 02:43:54 +0000</pubDate>
      <link>https://dev.to/tagmybeat/i-built-a-free-music-tool-i4l</link>
      <guid>https://dev.to/tagmybeat/i-built-a-free-music-tool-i4l</guid>
      <description>&lt;h1&gt;
  
  
  Building TagMyBeat: An AI Producer Tag Generator and a Browser-Based BPM &amp;amp; Key Finder That Never Uploads Your Audio
&lt;/h1&gt;

&lt;p&gt;If you produce music, you know the drill: you finish a beat, export it, and then need a vocal tag to brand it. You also need to know the BPM and key of every sample in your library so everything fits together. These are two different problems, but they share the same DNA: audio processing, fast turnaround, and privacy for unreleased material.&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;&lt;a href="https://tagmybeat.com" rel="noopener noreferrer"&gt;TagMyBeat&lt;/a&gt;&lt;/strong&gt; to solve both, and in this post I want to walk through &lt;em&gt;how&lt;/em&gt; they work under the hood.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is TagMyBeat?
&lt;/h2&gt;

&lt;p&gt;TagMyBeat is two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A Producer Tag Generator&lt;/strong&gt; — type text, pick a voice and an effect preset, and get a professional vocal tag (think "DJ Khaled!" but yours).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A Free BPM and Key Finder&lt;/strong&gt; — drop in an MP3, WAV, or FLAC, and get tempo, musical key, and Camelot code back, &lt;strong&gt;entirely in your browser&lt;/strong&gt;. No upload. No server.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The tag generator is the main product (3 free generations per day, no login). The BPM/Key Finder is a free engineering-as-marketing tool that brings producers into the ecosystem. Neither is meant to be a black box, so let's get into the tech.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tech Stack at a Glance
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;Next.js 16 (App Router), React 19, TypeScript, Tailwind CSS 4, Zustand 5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend&lt;/td&gt;
&lt;td&gt;FastAPI (Python), FFmpeg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth&lt;/td&gt;
&lt;td&gt;Browser fingerprinting (FingerprintJS) — no accounts, no passwords&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audio Analysis (Client)&lt;/td&gt;
&lt;td&gt;essentia.js (WASM-compiled C++), Web Workers, Web Audio API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment&lt;/td&gt;
&lt;td&gt;Cloudflare (frontend), VPS (backend)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Part 1: How the Producer Tag Generator Works
&lt;/h2&gt;

&lt;p&gt;The generation pipeline is a classic producer-consumer pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User types text → quota check → task queue (asyncio.Queue) → worker pool (8 workers)
→ TTS generates speech → FFmpeg applies audio effects → file saved → frontend polls status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
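&lt;p&gt;The queue-and-worker shape above can be sketched in a few lines of asyncio. This is a minimal illustration, not the production code: &lt;code&gt;synthesize_speech&lt;/code&gt;, &lt;code&gt;apply_effects&lt;/code&gt;, &lt;code&gt;run_batch&lt;/code&gt;, and the in-memory &lt;code&gt;status&lt;/code&gt; dict are hypothetical stand-ins, and the quota check is left out.&lt;/p&gt;

```python
import asyncio

NUM_WORKERS = 8  # worker count taken from the pipeline description


async def synthesize_speech(text: str) -> bytes:
    """Hypothetical stand-in for the TTS call."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return text.encode("utf-8")


async def apply_effects(audio: bytes, preset: str) -> bytes:
    """Hypothetical stand-in for the FFmpeg step."""
    await asyncio.sleep(0)
    return preset.encode("utf-8") + b":" + audio


async def worker(queue: asyncio.Queue, status: dict) -> None:
    """Consume tasks until cancelled; one failed task never kills the worker."""
    while True:
        task_id, text, preset = await queue.get()
        try:
            status[task_id] = "processing"
            speech = await synthesize_speech(text)
            await apply_effects(speech, preset)
            status[task_id] = "done"
        except Exception:
            status[task_id] = "failed"
        finally:
            queue.task_done()


async def run_batch(jobs: list) -> dict:
    """Enqueue jobs, wait for the pool to drain the queue, return statuses."""
    queue: asyncio.Queue = asyncio.Queue()
    status: dict = {}
    workers = [asyncio.create_task(worker(queue, status)) for _ in range(NUM_WORKERS)]
    for task_id, text, preset in jobs:
        status[task_id] = "queued"
        queue.put_nowait((task_id, text, preset))
    await queue.join()  # returns once every queued task is marked done
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return status
```

&lt;p&gt;In a real service the workers would live for the lifetime of the app, and the &lt;code&gt;status&lt;/code&gt; mapping is what a polling endpoint would read from.&lt;/p&gt;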



&lt;h3&gt;
  
  
  Audio Effects with FFmpeg
&lt;/h3&gt;

&lt;p&gt;After the voice is generated, FFmpeg steps in. We have 5 built-in effect presets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Clean&lt;/strong&gt; — light compression and normalization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hype&lt;/strong&gt; — pitch shift up, short reverb, stereo widening&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chill&lt;/strong&gt; — pitch shift down, large reverb, warm EQ&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cinematic&lt;/strong&gt; — deep reverb, delay, dramatic EQ&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retro&lt;/strong&gt; — bitcrushing, tape saturation, vinyl noise&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each preset is a chain of FFmpeg audio filters (&lt;code&gt;atempo&lt;/code&gt;, &lt;code&gt;aecho&lt;/code&gt;, &lt;code&gt;equalizer&lt;/code&gt;, etc.) composed programmatically. Advanced mode lets you dial in 6 custom parameters: stutter speed, reverb size, reverb wet, delay time, delay feedback, and pitch shift.&lt;/p&gt;
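&lt;p&gt;As a rough sketch of what "composed programmatically" can look like, the snippet below builds an &lt;code&gt;-af&lt;/code&gt; filter string from a few parameters. The function names, the 44.1 kHz sample rate, and the specific gain and frequency values are assumptions for illustration, not the actual preset definitions. Pitch shifting here uses the common &lt;code&gt;asetrate&lt;/code&gt; + &lt;code&gt;aresample&lt;/code&gt; + &lt;code&gt;atempo&lt;/code&gt; combination.&lt;/p&gt;

```python
SAMPLE_RATE = 44100  # assumed sample rate of the generated speech


def pitch_shift(semitones: float) -> list[str]:
    """Shift pitch by resampling, then restore the original duration with atempo."""
    ratio = 2 ** (semitones / 12)
    return [
        f"asetrate={round(SAMPLE_RATE * ratio)}",
        f"aresample={SAMPLE_RATE}",
        f"atempo={1 / ratio:.4f}",
    ]


def build_filter_chain(semitones: float, delay_ms: int, decay: float, eq_gain_db: float) -> str:
    """Join the requested effects into one comma-separated FFmpeg filter chain."""
    filters: list[str] = []
    if semitones != 0:
        filters += pitch_shift(semitones)
    if delay_ms > 0:
        # aecho arguments: in_gain:out_gain:delays:decays
        filters.append(f"aecho=0.8:0.9:{delay_ms}:{decay}")
    if eq_gain_db != 0:
        # Illustrative low-mid band; a real preset would pick its own frequency.
        filters.append(f"equalizer=f=200:t=q:w=1:g={eq_gain_db}")
    # anull is FFmpeg's pass-through filter, used when no effect is requested.
    return ",".join(filters) or "anull"


def ffmpeg_command(src: str, dst: str, chain: str) -> list[str]:
    """Build the argument list to hand to subprocess.run."""
    return ["ffmpeg", "-y", "-i", src, "-af", chain, dst]
```

&lt;p&gt;Keeping each effect as a small function that returns filter strings makes it easy to express a preset as a list of parameters rather than a hand-written command line.&lt;/p&gt;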




&lt;h2&gt;
  
  
  Part 2: Deep Dive — How the BPM and Key Finder Works (Entirely Client-Side)
&lt;/h2&gt;

&lt;p&gt;This is the part I'm most excited about. The BPM/Key Finder runs &lt;strong&gt;100% in the browser&lt;/strong&gt; using WebAssembly and Web Workers. Your audio files never leave your device.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Core Engine: essentia.js
&lt;/h3&gt;

&lt;p&gt;Under the hood, we use &lt;strong&gt;&lt;a href="https://github.com/MTG/essentia.js" rel="noopener noreferrer"&gt;essentia.js&lt;/a&gt;&lt;/strong&gt;, a WebAssembly port of &lt;a href="https://essentia.upf.edu/" rel="noopener noreferrer"&gt;Essentia&lt;/a&gt;, an open-source C++ library for audio analysis developed by the Music Technology Group at Universitat Pompeu Fabra. Essentia has been battle-tested in academic research and in production systems.&lt;/p&gt;

&lt;p&gt;Loading essentia.js happens &lt;strong&gt;lazily&lt;/strong&gt; — the ~2MB WASM binary is fetched from our CDN only when you first open the BPM/Key Finder. We use a Web Worker to keep the main thread responsive during initialization and analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Smart Windowing
&lt;/h3&gt;

&lt;p&gt;Not all parts of a track are equally useful for analysis. Intros often have no drums, outros fade out, and breakdowns can confuse tempo detection. So we don't feed the entire track to the analyzer.&lt;/p&gt;

&lt;p&gt;Instead, we use a &lt;strong&gt;smart windowing strategy&lt;/strong&gt; that varies by track length:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Track Length&lt;/th&gt;
&lt;th&gt;BPM Windows&lt;/th&gt;
&lt;th&gt;Key Windows&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&amp;lt; 45 seconds&lt;/td&gt;
&lt;td&gt;Single window covering the full track&lt;/td&gt;
&lt;td&gt;Same full-track window&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;45–90 seconds&lt;/td&gt;
&lt;td&gt;One 20s window starting at 15% in&lt;/td&gt;
&lt;td&gt;One 40s window starting at 15% in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;gt; 90 seconds&lt;/td&gt;
&lt;td&gt;Three 20s windows at 25%, 50%, and 70%&lt;/td&gt;
&lt;td&gt;Two 40s windows at 35% and 60%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Except for the full-track window used on short tracks, every window stays clear of the first and last 10% of the track (the &lt;strong&gt;edge guard&lt;/strong&gt;) to dodge intros and outros. This might seem like a small detail, but it dramatically improves accuracy on real-world tracks.&lt;/p&gt;
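&lt;p&gt;Here is one way the window table could be turned into code. It is a sketch under two assumptions that the table does not spell out: each percentage marks the &lt;em&gt;start&lt;/em&gt; of a window, and a window that would cross the edge guard is shifted or shortened to fit. The names &lt;code&gt;place&lt;/code&gt; and &lt;code&gt;plan_windows&lt;/code&gt; are made up for this example.&lt;/p&gt;

```python
EDGE_GUARD = 0.10  # fraction of the track skipped at each end


def place(duration: float, fraction: float, length: float) -> tuple[float, float]:
    """Start a window at `fraction` of the track, clamped inside the edge guard."""
    lo = duration * EDGE_GUARD
    hi = duration * (1 - EDGE_GUARD)
    length = min(length, hi - lo)  # shorten if the guarded region is too small
    start = min(max(duration * fraction, lo), hi - length)
    return (round(start, 2), round(start + length, 2))


def plan_windows(duration: float) -> dict[str, list[tuple[float, float]]]:
    """Return (start, end) windows in seconds for BPM and key analysis."""
    if 45 > duration:
        # Short tracks: one window over the whole file, no edge guard.
        full = [(0.0, round(duration, 2))]
        return {"bpm": full, "key": full}
    if duration > 90:
        bpm_at, key_at = (0.25, 0.50, 0.70), (0.35, 0.60)
    else:
        bpm_at, key_at = (0.15,), (0.15,)
    return {
        "bpm": [place(duration, f, 20.0) for f in bpm_at],
        "key": [place(duration, f, 40.0) for f in key_at],
    }
```

&lt;p&gt;For a 200-second track this yields BPM windows starting at 50, 100, and 140 seconds and key windows starting at 70 and 120 seconds.&lt;/p&gt;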

&lt;h3&gt;
  
  
  Step 2: BPM Detection with PercivalBpmEstimator
&lt;/h3&gt;

&lt;p&gt;For tempo detection, we use essentia's &lt;code&gt;PercivalBpmEstimator&lt;/code&gt;, which is based on the tempo estimation method published by Percival and Tzanetakis. Here's how it works at a high level:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Onset Strength Signal&lt;/strong&gt;: The algorithm turns the audio into an onset strength signal based on spectral flux, which spikes wherever notes and drum hits begin.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autocorrelation&lt;/strong&gt;: It autocorrelates short frames of that signal. Regular rhythms produce peaks at the beat period and its multiples.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pulse-Train Scoring&lt;/strong&gt;: Each candidate period is cross-correlated with ideal pulse trains, and the best-matching candidate is kept for that frame.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accumulation and Octave Decision&lt;/strong&gt;: Per-frame candidates are accumulated into a single overall estimate, and the published method ends with a decision on whether that estimate sits in the wrong tempo octave.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;But here is the key insight: for tracks longer than 90 seconds, we run this algorithm on &lt;strong&gt;three different 20-second windows&lt;/strong&gt; at different positions and then &lt;strong&gt;aggregate the results&lt;/strong&gt;. This is critical because a single window might hit a breakdown or a double-time section.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Key Detection with KeyExtractor
&lt;/h3&gt;

&lt;p&gt;Key detection uses essentia's &lt;code&gt;KeyExtractor&lt;/code&gt;, which works as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Chroma Feature Extraction&lt;/strong&gt;: The audio is converted into a &lt;strong&gt;chromagram&lt;/strong&gt;, a 12-bin representation of how much energy is present at each pitch class (C, C#, D, ..., B), collapsed across all octaves. It is computed on 4096-sample frames with a 4096-sample hop, so consecutive frames do not overlap.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Profile Correlation&lt;/strong&gt;: The averaged chroma vector is correlated against &lt;strong&gt;24 key profiles&lt;/strong&gt; (12 major + 12 minor keys using the Krumhansl-Kessler tonal hierarchy). Each profile represents the expected chroma distribution for a given key.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strength&lt;/strong&gt;: The algorithm returns not just the best-matching key and scale, but also a &lt;strong&gt;strength score&lt;/strong&gt; (0–1) indicating how strongly the chroma matches the profile. Low strength often means the track modulates or has an ambiguous tonal center.&lt;/li&gt;
&lt;/ol&gt;
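&lt;p&gt;To make the profile-correlation step concrete, here is a small standalone sketch. It is not essentia's implementation: it takes an already-averaged 12-bin chroma vector, rotates the Krumhansl-Kessler major and minor profiles through all 12 tonics, and keeps the best Pearson correlation as the strength. The function names and the pitch-class spellings are choices made for this example.&lt;/p&gt;

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

# Krumhansl-Kessler probe-tone ratings, indexed from the tonic upward.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
KK_MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(a: list[float], b: list[float]) -> float:
    """Pearson correlation; returns 0.0 for a flat vector."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom else 0.0


def estimate_key(chroma: list[float]) -> tuple[str, str, float]:
    """Return (tonic, scale, strength) for a 12-bin chroma vector starting at C."""
    best = ("C", "major", -1.0)
    for tonic in range(12):
        for scale, profile in (("major", KK_MAJOR), ("minor", KK_MINOR)):
            # Rotate the profile so its tonic lines up with this pitch class.
            rotated = profile[-tonic:] + profile[:-tonic]
            r = pearson(chroma, rotated)
            if r > best[2]:
                best = (PITCH_CLASSES[tonic], scale, r)
    return best
```

&lt;p&gt;Feeding the C major profile back in as the chroma vector returns C major with a strength of about 1.0, which is a handy sanity check.&lt;/p&gt;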

&lt;p&gt;For tracks longer than 90 seconds, we sample two 40-second windows and apply an &lt;strong&gt;agreement-based aggregation&lt;/strong&gt;: if both windows agree on the key, we return it with the averaged strength. If they disagree, we pick the one with higher strength. With only two windows this is a tie-break rather than a true majority vote, and for tracks where the verse and chorus are in different keys it reports whichever section matches its key profile more strongly.&lt;/p&gt;
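&lt;p&gt;The two-window rule is short enough to write out in full. This is a sketch of the rule as described, with a hypothetical function name and a plain tuple of (key, scale, strength) per window.&lt;/p&gt;

```python
def aggregate_key(windows: list[tuple[str, str, float]]) -> tuple[str, str, float]:
    """Combine one or two (key, scale, strength) results into a single answer."""
    if len(windows) == 1:
        return windows[0]
    (key_1, scale_1, strength_1), (key_2, scale_2, strength_2) = windows
    if (key_1, scale_1) == (key_2, scale_2):
        # Both windows agree: keep the key and average the strengths.
        return (key_1, scale_1, (strength_1 + strength_2) / 2)
    # Disagreement: trust the window with the stronger profile match.
    return max(windows, key=lambda w: w[2])
```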

&lt;h3&gt;
  
  
  Step 4: Aggregation and Normalization
&lt;/h3&gt;

&lt;p&gt;Raw BPM estimates need cleanup. Here is our aggregation pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw BPM estimates from each window
→ Normalize into 80-200 BPM range (doubling/halving)
→ Cluster analysis: group estimates within ±2 BPM of each other
→ If largest cluster has ≥ 2 estimates: return median of that cluster
→ Otherwise: return median of all estimates
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



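&lt;p&gt;The &lt;strong&gt;cluster approach&lt;/strong&gt; is important. Imagine a trap beat at 140 BPM. Window 1 might detect 140, Window 2 detects 70 (half-time feel during a sparse section), and Window 3 detects 140. A simple average of the raw estimates gives 116.7, which is useless. Normalization folds the 70 up to 140, and clustering then confirms that all three windows agree. Clustering also protects against errors that are not a clean factor of two: if Window 2 had reported 105 instead, the 140 group would still win, and we get the correct answer.&lt;/p&gt;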

&lt;p&gt;The normalization step (multiply by 2 if below 80, divide by 2 if above 200) uses the fact that most music sits in 80–200 BPM. A 65 BPM estimate is almost certainly half of 130. The heuristic has a blind spot, though: a drum and bass track at ~174 BPM that is detected at half-time comes out as ~87, which already sits inside the range and is left unchanged.&lt;/p&gt;
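&lt;p&gt;The aggregation pipeline can be sketched as below. The function names are hypothetical, and "group estimates within ±2 BPM" is interpreted here as "for each estimate, collect every estimate within 2 BPM of it, then keep the largest group", which is one reasonable reading among several.&lt;/p&gt;

```python
from statistics import median


def normalize_bpm(bpm: float, low: float = 80.0, high: float = 200.0) -> float:
    """Fold an estimate into the low..high range by doubling or halving."""
    if not bpm > 0:
        raise ValueError("bpm must be positive")
    while low > bpm:
        bpm *= 2
    while bpm > high:
        bpm /= 2
    return bpm


def aggregate_bpm(raw: list[float], tolerance: float = 2.0) -> float:
    """Normalize per-window estimates, then prefer the largest agreeing cluster."""
    if not raw:
        raise ValueError("need at least one estimate")
    estimates = [normalize_bpm(b) for b in raw]
    # One candidate cluster per estimate: everything within the tolerance of it.
    clusters = [[e for e in estimates if tolerance >= abs(e - c)] for c in estimates]
    largest = max(clusters, key=len)
    if len(largest) >= 2:
        return median(largest)
    return median(estimates)
```

&lt;p&gt;With the trap example, &lt;code&gt;aggregate_bpm([140, 70, 140])&lt;/code&gt; returns 140, and &lt;code&gt;normalize_bpm(87)&lt;/code&gt; returns 87 unchanged.&lt;/p&gt;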

&lt;h3&gt;
  
  
  Step 5: Camelot Notation
&lt;/h3&gt;

&lt;p&gt;For DJs, we convert the detected key to &lt;strong&gt;Camelot notation&lt;/strong&gt;, the harmonic mixing system popularized by Mixed In Key and familiar to users of DJ software such as rekordbox:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Camelot&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Camelot&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ab minor&lt;/td&gt;
&lt;td&gt;1A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;B major&lt;/td&gt;
&lt;td&gt;1B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Eb minor&lt;/td&gt;
&lt;td&gt;2A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;F# major&lt;/td&gt;
&lt;td&gt;2B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bb minor&lt;/td&gt;
&lt;td&gt;3A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Db major&lt;/td&gt;
&lt;td&gt;3B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F minor&lt;/td&gt;
&lt;td&gt;4A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Ab major&lt;/td&gt;
&lt;td&gt;4B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;C minor&lt;/td&gt;
&lt;td&gt;5A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Eb major&lt;/td&gt;
&lt;td&gt;5B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G minor&lt;/td&gt;
&lt;td&gt;6A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Bb major&lt;/td&gt;
&lt;td&gt;6B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;D minor&lt;/td&gt;
&lt;td&gt;7A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;F major&lt;/td&gt;
&lt;td&gt;7B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A minor&lt;/td&gt;
&lt;td&gt;8A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;C major&lt;/td&gt;
&lt;td&gt;8B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E minor&lt;/td&gt;
&lt;td&gt;9A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;G major&lt;/td&gt;
&lt;td&gt;9B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B minor&lt;/td&gt;
&lt;td&gt;10A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;D major&lt;/td&gt;
&lt;td&gt;10B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F# minor&lt;/td&gt;
&lt;td&gt;11A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;A major&lt;/td&gt;
&lt;td&gt;11B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Db minor&lt;/td&gt;
&lt;td&gt;12A&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;E major&lt;/td&gt;
&lt;td&gt;12B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The rule of thumb: adjacent Camelot numbers mix harmonically (8A → 9A, 8A → 7A, with 12 wrapping around to 1), and same-number cross-scale transitions work (8A → 8B).&lt;/p&gt;
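&lt;p&gt;The table and the mixing rule fit in a small lookup. The sketch below builds the mapping from the table above; the enharmonic alias table (for example treating C# as Db) and the function names are assumptions added for convenience, since the spelling an analyzer returns may differ from the table's.&lt;/p&gt;

```python
# Keys in Camelot order 1..12, copied from the table above.
MINOR_KEYS = ["Ab", "Eb", "Bb", "F", "C", "G", "D", "A", "E", "B", "F#", "Db"]
MAJOR_KEYS = ["B", "F#", "Db", "Ab", "Eb", "Bb", "F", "C", "G", "D", "A", "E"]

CAMELOT = {}
for number, (minor, major) in enumerate(zip(MINOR_KEYS, MAJOR_KEYS), start=1):
    CAMELOT[(minor, "minor")] = f"{number}A"
    CAMELOT[(major, "major")] = f"{number}B"

# Assumed enharmonic spellings mapped onto the table's spellings.
ALIASES = {"C#": "Db", "D#": "Eb", "Gb": "F#", "G#": "Ab", "A#": "Bb"}


def to_camelot(key: str, scale: str) -> str:
    """Convert a (key, scale) pair such as ("A", "minor") to a Camelot code."""
    return CAMELOT[(ALIASES.get(key, key), scale)]


def compatible(code: str) -> list[str]:
    """Return the codes one step down, one step up, and across the scale."""
    number, letter = int(code[:-1]), code[-1]
    down = (number - 2) % 12 + 1  # 1 wraps to 12
    up = number % 12 + 1          # 12 wraps to 1
    other = "B" if letter == "A" else "A"
    return [f"{down}{letter}", f"{up}{letter}", f"{number}{other}"]
```

&lt;p&gt;For example, &lt;code&gt;to_camelot("A", "minor")&lt;/code&gt; gives &lt;code&gt;8A&lt;/code&gt;, and &lt;code&gt;compatible("8A")&lt;/code&gt; gives 7A, 9A, and 8B.&lt;/p&gt;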

&lt;h3&gt;
  
  
  Why No Server Upload?
&lt;/h3&gt;

&lt;p&gt;Three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Privacy&lt;/strong&gt; — producers work with unreleased material. They should not have to trust a server with their beats.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instant UX&lt;/strong&gt; — no upload progress bar, no network errors, no timeouts. Files are decoded locally in milliseconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero server cost&lt;/strong&gt; — audio analysis is CPU-intensive. Offloading it to the client means we can offer this as a free tool indefinitely.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The trade-off is that we cannot analyze YouTube or Spotify links (the browser would need to download the audio first, which raises copyright issues). That feature is on the roadmap, but for now the tool is file-based.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned Building This
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;essentia.js is incredible but heavy.&lt;/strong&gt; The WASM binary is ~2MB, and initialization takes 1-3 seconds on a cold load. The lazy-loading pattern (only fetch when the BPM/Key Finder page opens) was essential to keep the main tag generator fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smart windowing matters more than the algorithm.&lt;/strong&gt; Switching from full-track analysis to multi-window analysis with edge guards improved BPM accuracy by ~15-20% on real-world test tracks. The algorithms are mature; the preprocessing is where you win.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Web Workers are underrated for audio.&lt;/strong&gt; Running essentia on the main thread caused 200-500ms UI freezes per file. Moving to a Worker made batch analysis of 20+ files feel smooth. The message-passing overhead is negligible compared to the analysis time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Browser fingerprinting is a pragmatic middle ground.&lt;/strong&gt; No one wants to create an account for a free tool. Fingerprinting gives us abuse prevention and data isolation without signup friction. The trade-off is that clearing browser data resets your quota, but for a free tier that is acceptable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://tagmybeat.com/tools/bpm-key-finder" rel="noopener noreferrer"&gt;BPM and Key Finder&lt;/a&gt;&lt;/strong&gt; — free, no signup, works offline after the WASM loads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://tagmybeat.com" rel="noopener noreferrer"&gt;Producer Tag Generator&lt;/a&gt;&lt;/strong&gt; — 3 free generations per day&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The BPM/Key Finder source is visible in the browser (it is all client-side JS). The backend for the tag generator is in Python/FastAPI. If you are curious about any specific part, drop a comment — I am happy to dive deeper into the FFmpeg effects chains, the Edge TTS integration, or the essentia.js setup.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;TagMyBeat is built with Next.js 16, FastAPI, essentia.js, Edge TTS, FFmpeg, and Tailwind CSS. Deployed on Cloudflare + a VPS. No user accounts, no audio uploads for the BPM/Key Finder.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>showdev</category>
      <category>sideprojects</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
