DEV Community: ShaiZadok

I built a 100-point prompt scorer for SUNO AI — 16 checks, open-source on npm

ShaiZadok — Tue, 21 Apr 2026 06:24:25 +0000

Why deterministic prompt scoring?

A few months ago I was using SUNO AI and kept regenerating the same song idea 20-30 times before getting something close to what I imagined. The prompt syntax felt opaque. Genre close but sub-genre missed. Mood right but vocals wrong.

Turns out SUNO's prompt behavior is actually deterministic enough to score. So I wrote a scorer: suno-prompt-scorer on npm (MIT).

What the scorer checks — 16 signals

Each check is weighted; the weights sum to 100, so the total is a score from 0 to 100:

#	Check	Category	Weight
1	Character limit (v4: 200, v4.5+: 1000)	style	8%
2	Genre collisions (53 known pairs)	style	10%
3	Weak token detection (context-aware)	style	8%
4	Strong token reward (48 hardware/prod anchors)	style	8%
5	Tag ordering weight `2/(1+k)`	style	8%
6	Genre in position 1	style	7%
7	Mood in position 2	style	5%
8	Per-category limits (genre 1-2, mood 1-2, instruments 2-3)	style	8%
9	Invalid tag detection (49 known bad)	style	8%
10	Suspicious tag detection (28 unverified)	style	4%
11	Misclassified subgenres (100 mapped)	style	5%
12	Bracket syntax validation	lyrics	7%
13	Regional coherence	advanced	4%
14	Version-specific warnings	advanced	5%
15	Ready-package proximity to benchmarks	style	5%
16	Bracket verbatim check (informational, weight 0)	lyrics	—
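
The table implies a weighted sum. Here is a minimal sketch of how such a total could be computed, assuming each check reports a pass fraction between 0 and 1. The `CheckResult` shape and the `weightedTotal` name are hypothetical, chosen for illustration; they are not the package's actual internals.

```typescript
// Hypothetical shape: one entry per check, with its weight from the table
// and a pass fraction between 0 (failed) and 1 (fully passed).
interface CheckResult {
  name: string;
  weight: number; // percentage points, e.g. 10 for genre collisions
  passed: number; // 0..1
}

// Sum weight * pass fraction, rounded to a whole number of points.
function weightedTotal(checks: CheckResult[]): number {
  const raw = checks.reduce((sum, c) => sum + c.weight * c.passed, 0);
  return Math.round(raw);
}

// The 15 weighted checks from the table, in order (check 16 carries no weight).
const tableWeights: number[] = [8, 10, 8, 8, 8, 7, 5, 8, 8, 4, 5, 7, 4, 5, 5];
```

With every check passing, the weights add up to 100; failing only the genre-collision check (10 points) would leave 90.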

The anchor-based philosophy

The most interesting design decision: separate core nouns (must be verified) from modifiers (creative freedom).

So [shofar blast] passes — "Shofar" is a verified instrument, "blast" is a free modifier. But [QuantumSynth breakdown] fails — no verified anchor.

This preserves creativity (real producers combine real instruments in unexpected ways) while catching hallucinations.

// Core nouns: genres, instruments, keys, vocal types → verbatim
// Structural: [Intro], [Verse], [Chorus], [Drop] → verbatim
// Modifiers: blast, crystalline, thundering → free
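
To make the anchor rule concrete, here is a rough sketch of how a bracket tag could be checked. The tiny `VERIFIED_ANCHORS` and `STRUCTURAL_TAGS` lists and the `hasVerifiedAnchor` function are assumptions for illustration only; the real knowledge base is much larger and the package may work differently.

```typescript
// Assumed sample data standing in for the verified knowledge base.
const VERIFIED_ANCHORS: string[] = ["shofar", "moog bass", "808 bass", "cello"];
const STRUCTURAL_TAGS: string[] = ["intro", "verse", "chorus", "drop"];

// A bracket tag passes if it is a structural tag, or if it contains at
// least one verified core noun. Every other word is a free modifier.
function hasVerifiedAnchor(tag: string): boolean {
  // Strip the surrounding brackets and normalize case.
  const inner = tag.replace(/^\[|\]$/g, "").trim().toLowerCase();
  if (STRUCTURAL_TAGS.includes(inner)) return true;
  // Pad with spaces so anchors only match on whole-word boundaries.
  const padded = ` ${inner} `;
  return VERIFIED_ANCHORS.some((anchor) => padded.includes(` ${anchor} `));
}
```

Under these assumptions `hasVerifiedAnchor("[shofar blast]")` passes because "shofar" is in the list, while `hasVerifiedAnchor("[QuantumSynth breakdown]")` fails because neither word is a verified noun.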

What I learned about SUNO

Building this surfaced several non-obvious findings:

Position 1 is 60-70% of output DNA. The first tag dominates.
Tag weight follows 2/(1+k) for position k. Position 6 has ~29% of position 1's weight.
Some modifiers are weak in isolation but strong in context. "Modern" alone is weak, "polished modern production" is specific.
V4.5+ supports 1,000 chars in Style, not 200. The 200-char limit was V4 only, yet it is still a common misconception.
Collision pairs aren't obvious. "calm + aggressive" is easy; "minimal + orchestral" and "whisper + powerful vocals" are less so.
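
As an illustration of the collision idea, here is a small sketch that flags known conflicting pairs in a Style string. The three pairs are the ones named above; the `findCollisions` function and its substring matching are assumptions for illustration and may differ from what the package does.

```typescript
// Three collision pairs named in the findings; the full list of 53 is not shown here.
const COLLISION_PAIRS: [string, string][] = [
  ["calm", "aggressive"],
  ["minimal", "orchestral"],
  ["whisper", "powerful vocals"],
];

// Split a Style string on commas and normalize each tag.
function splitTags(style: string): string[] {
  return style
    .split(",")
    .map((t) => t.trim().toLowerCase())
    .filter((t) => t.length > 0);
}

// Return every known pair where both members appear somewhere in the tags.
// Substring matching is deliberately loose: "whispered verses" counts as "whisper".
function findCollisions(style: string): [string, string][] {
  const tags = splitTags(style);
  const present = (term: string): boolean => tags.some((t) => t.includes(term));
  return COLLISION_PAIRS.filter(([a, b]) => present(a) && present(b));
}
```

A prompt such as "Minimal Techno, orchestral strings, 128 BPM" would be flagged once, for the minimal + orchestral pair.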

Usage

npm install suno-prompt-scorer

import { scorePrompt } from 'suno-prompt-scorer';

const result = scorePrompt(
  "Electropop, 128 BPM, 808 Bass, Moog bass, Confident, Euphoric",
);
console.log(result.total);        // 99
console.log(result.breakdown);    // 16 checks with pct + message

Contributions welcome

The knowledge base (4,000+ verified tags across 13 categories) is the main area where contributions help most — especially regional genres, edge cases, and emerging subgenres.

Disclosure: I'm the creator of AceTagGen. The scorer npm package is a standalone MIT-licensed extraction of the scoring engine. The web tool at acetaggen.com uses the same engine with a larger server-side knowledge base.

— Shai Zadok

Why Your SUNO Songs Sound Generic (And How to Fix It)

ShaiZadok — Mon, 20 Apr 2026 05:53:30 +0000

Let's be honest: most SUNO songs sound the same. Not bad — SUNO's audio quality is impressive across the board. But listenable and memorable are very different things. If you've made 50 songs and they all blend together in your memory, this article is for you.

We've analyzed hundreds of SUNO generations — the forgettable ones and the ones that make you hit replay. The difference almost always comes down to five specific mistakes. Here they are, with exact before-and-after examples for each.

Problem 1: Using Only the Genre Name

This is the single most common cause of generic output. You type "Rock" or "Jazz" or "Electronic" into the Style field and expect SUNO to read your mind about which kind.

Why it fails: Genre names like "Rock" or "Pop" map to enormous training data clusters. "Rock" covers everything from Chuck Berry to Radiohead to Nickelback. When you give SUNO only a genre name, it targets the statistical center of that entire cluster — the average. And the average of all rock music is... generic rock music. It's the musical equivalent of asking for "food" and getting a plain sandwich.

Bad example:
`Style: Pop`

What you get: A perfectly adequate pop song that sounds like it was designed to play in an elevator of a trendy hotel. Pleasant. Forgettable. You've heard it a thousand times.

Fixed example:
`Style: 90s Garage Rock, dusty tape-saturated, surf guitar, raw vocals, 155 BPM`

What you get: A specific, character-rich sound with lo-fi warmth, driving energy, and a retro edge that immediately stands out.

The fix formula: Always add at least three specificity layers to your genre:

Sub-genre: Not "Rock" but "Garage Rock" or "Shoegaze" or "Math Rock"
Era: "90s" or "70s" or "Late 80s" — this changes everything about production, instruments, and feel
Texture: "dusty tape-saturated" or "crisp digital" or "vinyl crackle" — this defines the sonic character

These three additions move you from the center of a massive cluster to a specific corner of it. That's where the interesting sounds live.

More examples of the pattern:

Generic	Specific
Jazz	50s Cool Jazz, smoky club, upright bass, brush drums
Electronic	90s Acid House, gritty 303, warehouse reverb, 130 BPM
Hip Hop	90s Boom Bap, dusty vinyl samples, SP-1200, head-nod groove
Classical	Late Romantic Orchestral, soaring strings, French horn, 72 BPM
R&B	90s Neo-Soul, warm Rhodes, velvet vocals, 85 BPM

Notice how every specific version immediately conjures a sound in your head. That's the power of specificity.

Problem 2: Over-Tagging (The "More Is Better" Trap)

You learned that tags matter, so you wrote 15 of them. You covered every category twice. Your Style field is a wall of descriptors. And the result sounds... muddled. Generic. Somehow less distinctive than simpler prompts.

Why it fails: SUNO processes tags as probability signals. Each tag pulls the generation in a direction. With 5-8 coherent signals, they reinforce each other — the AI has a clear target. With 15+ signals, many of them compete or overlap, creating noise. The AI can't serve all masters, so it compromises on all of them. The result is the statistical average of 15 different directions — which is, by definition, the center. The generic zone.

Bad example:
`Style: Emotional powerful epic cinematic dramatic dark moody intense orchestral symphonic sweeping grand majestic beautiful atmospheric haunting`

16 adjectives. Zero specificity. Half of them are synonyms. The AI reads this as "something big and emotional" and gives you the most average "big and emotional" track in its training data.

Fixed example:
`Style: Dark Cinematic Orchestral, 60 BPM, cello lead, thunderous timpani, haunting female choir`

5 descriptors. Genre first (Dark Cinematic Orchestral). Numeric BPM (60). Three specific sonic elements (cello, timpani, female choir). Every tag does different work. Nothing overlaps.

The fix formula: For each tag in your prompt, ask: "Is this doing work that no other tag already covers?" If two tags point in the same direction — cut one. If a tag is abstract and vague — replace it with something specific. The goal is 5-8 tags where each one pulls a different lever.

Per-category limits to follow:

Genre: 1-2 (max 3)
Mood/Energy: 1-2 (max 2)
Instruments: 2-3 (max 4)
Vocal Style: 1 (max 2)
Production/Texture: 1-2 (max 3)
BPM: 1
Era: 0-1

If your prompt exceeds these limits in any category, you're over-tagging.
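
The limits above translate directly into a lookup table. The sketch below assumes the tags have already been classified into categories (classification itself is the hard part and is not shown); `ClassifiedTag` and `overLimitCategories` are hypothetical names.

```typescript
type Category = "genre" | "mood" | "instruments" | "vocal" | "production" | "bpm" | "era";

// Maximums copied from the list above.
const MAX_PER_CATEGORY: Record<Category, number> = {
  genre: 3,
  mood: 2,
  instruments: 4,
  vocal: 2,
  production: 3,
  bpm: 1,
  era: 1,
};

// Assumption: each tag arrives with a category already assigned.
interface ClassifiedTag {
  text: string;
  category: Category;
}

// Return the categories whose tag count exceeds the maximum.
function overLimitCategories(tags: ClassifiedTag[]): Category[] {
  const counts: Partial<Record<Category, number>> = {};
  for (const tag of tags) {
    counts[tag.category] = (counts[tag.category] ?? 0) + 1;
  }
  return (Object.keys(MAX_PER_CATEGORY) as Category[]).filter(
    (c) => (counts[c] ?? 0) > MAX_PER_CATEGORY[c],
  );
}
```

An empty result means the prompt stays within every maximum; it does not say the prompt is good, only that it is not over-tagged by this measure.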

Problem 3: Ignoring Brackets (The Biggest Missed Opportunity)

Most SUNO users spend 90% of their effort on the Style field and leave the Lyrics field as plain text. This is like hiring a film crew and then never directing them. The Lyrics field's bracket system is where the real power lives.

Why it fails: The Style field sets the global tone — it's the "casting director." But bracket tags in the Lyrics field are the "film director" — they control what happens moment to moment. Community testing confirms that bracket instructions are roughly 10x more powerful than Style field descriptors for arrangement control.

If you're only using the Style field, you're leaving 90% of SUNO's control system untouched.

Bad example:
```
Style: Emotional Pop Ballad, piano, soft, building

Lyrics:
I remember the days we shared
Walking through the autumn air
You smiled and the world stood still
Nothing else could match the thrill
```

No section tags. No dynamic changes. No performance direction. SUNO will generate a flat, single-energy track from start to finish.

Fixed example:
```
Style: Emotional Pop Ballad, piano, breathy vocals, 70 BPM

Lyrics:
[Soft Verse]
[intimate]
I remember the days we shared
Walking through the autumn air

[Crescendo]

[Powerful Chorus]
[belted vocals]
You smiled and the world stood still
Nothing else could match the thrill
(nothing else, nothing else)

[Decrescendo]

[Whispered Bridge]
And I wonder if you feel it too...
```

What changed: The same lyrics now have dynamic range — a soft, intimate verse that builds into a powerful, belted chorus with an echo ad-lib, then pulls back into a whispered bridge. The song breathes. It moves. It takes you somewhere.

The fix formula:

Always use section tags: `[Verse]`, `[Chorus]`, `[Bridge]`, `[Outro]`
Add 1-2 modifier tags per section: energy, instrument, or delivery
Use `[Crescendo]` / `[Decrescendo]` for dynamics between sections
Use parentheses for ad-libs and echoes: `(ooh)`, `(yeah!)`
Place tags directly before the lyrics they affect — proximity matters

Problem 4: Same Verse Twice (The Loop Trigger)

You wrote great lyrics for Verse 1 and then... copied them for Verse 2. Or you wrote nearly identical verses with minor word changes. Now SUNO is looping the same melody and energy for both — the song feels like it's going in circles.

Why it fails: SUNO tends to loop when it detects repeated patterns. If Verse 1 and Verse 2 have the same lyrics, the same syllable count, and the same structure, the AI interprets this as "repeat what I did before." It recycles the melody, the arrangement, and the energy. The result is a song that feels stuck — identical halves of the same section pasted together.

Bad example:
```
[Verse 1]
Walking down this empty road
Carrying a heavy load

[Verse 2]
Walking through this lonely night
Searching for a distant light
```

Both verses: same syllable count, same rhythm, same structure, even similar imagery. SUNO will almost certainly produce the same melody for both.

Fixed example:
```
[Verse 1]
[soft, acoustic]
Walking down this empty road
Carrying a heavy load

[Verse 2]
[building intensity]
The streetlights blur and the rain won't stop
I scream your name from the rooftop
But nobody's listening anymore
```

What changed: Verse 2 is structurally different: longer lines, different syllable count, different emotional energy, different imagery. The bracket tag `[building intensity]` tells SUNO to evolve the arrangement. The song progresses instead of looping.

The fix formula:

Make Verse 2 lyrically different — don't just swap a few words
Change the line length or syllable count between verses
Use different bracket modifiers for each verse
Evolve the emotional arc — Verse 1 is reflective, Verse 2 is urgent
Add an instrumental transition between verses (empty lines or parenthetical direction)
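
A quick way to spot look-alike verses before generating is to compare their line shapes. The sketch below is only a heuristic under a stated assumption: it uses words per line as a crude stand-in for syllable count, and `lineShape` and `versesLookAlike` are illustrative names, not part of any SUNO tooling.

```typescript
// Words per line, ignoring bracket tags and blank lines.
function lineShape(verse: string): number[] {
  return verse
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith("["))
    .map((line) => line.split(/\s+/).length);
}

// Two verses "look alike" when they have the same number of lines and
// every line is within one word of its counterpart.
function versesLookAlike(a: string, b: string): boolean {
  const shapeA = lineShape(a);
  const shapeB = lineShape(b);
  if (shapeA.length !== shapeB.length) return false;
  return shapeA.every((n, i) => Math.abs(n - (shapeB[i] ?? 0)) <= 1);
}
```

Run against the examples above, the two verses in the bad example look alike (two lines each, nearly equal word counts), while the fixed Verse 2 does not because it has three longer lines.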

Problem 5: No BPM (The Anchor That Holds Everything Together)

You set the genre, the mood, the instruments, and the vocal style. But you didn't specify a BPM. And the song came out at some default tempo that doesn't match the energy you wanted.

Why it fails: BPM isn't just tempo — it's one of SUNO's strongest anchors. A numeric BPM value influences:

Drum pattern selection
Rhythmic feel
Energy level
Genre alignment (a 170 BPM track will naturally lean toward drum and bass patterns even without explicit genre tags)
Vocal pacing and delivery

Without a BPM, SUNO guesses based on genre norms. But genre norms cover wide ranges — "Pop" can be 90 BPM or 130 BPM, and those are completely different vibes.

Bad example:
`Style: Dark Techno, menacing, heavy bass, industrial`

Without BPM, SUNO might generate anything from 110 (too slow for techno) to 150 (too fast for the "menacing" vibe). The result feels misaligned — the energy doesn't match the mood.

Fixed example:
`Style: Dark Techno, menacing, 135 BPM, heavy bass, gritty analog, industrial`

With `135 BPM`, SUNO locks into the right tempo range for dark techno. The drum patterns, bass rhythm, and overall energy all align because the BPM anchor tells the AI exactly where to sit on the energy spectrum.

The fix formula: Always include a numeric BPM. Here are reference ranges for common genres:

Genre	BPM Range	Sweet Spot
Ballad	60-80	70
R&B / Soul	75-95	85
Pop	100-130	120
Rock	110-140	128
House / EDM	120-130	128
Techno	125-145	135
Drum and Bass	170-180	174
Phonk	130-160	145

Pick a BPM within the range and commit to it. Don't leave it to chance.
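
The reference table can be turned into a small pre-flight check. This sketch assumes the BPM is written as a number followed by "BPM" in the Style string; `extractBpm` and `bpmFitsGenre` are hypothetical helper names.

```typescript
// Ranges copied from the table above: [min, max].
const BPM_RANGES: Record<string, [number, number]> = {
  "Ballad": [60, 80],
  "R&B / Soul": [75, 95],
  "Pop": [100, 130],
  "Rock": [110, 140],
  "House / EDM": [120, 130],
  "Techno": [125, 145],
  "Drum and Bass": [170, 180],
  "Phonk": [130, 160],
};

// Pull the first "<number> BPM" out of a Style string, or null if absent.
function extractBpm(style: string): number | null {
  const match = /(\d{2,3})\s*BPM/i.exec(style);
  return match ? Number(match[1]) : null;
}

// True when the BPM falls inside the reference range for the genre.
function bpmFitsGenre(genre: string, bpm: number): boolean {
  const range = BPM_RANGES[genre];
  if (!range) return false; // unknown genre: no reference range to compare against
  return bpm >= range[0] && bpm <= range[1];
}
```

For the Dark Techno example, `extractBpm` finds 135 in the fixed prompt and nothing in the bad one, and 135 sits inside the Techno range.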

The Honest Summary

If your SUNO songs sound generic, the fix isn't more creativity or better lyrics or a premium subscription. It's usually one (or more) of these five specific mistakes:

Vague genre — add sub-genre + era + texture
Too many tags — cut to 5-8 focused descriptors
No brackets — use section tags and dynamic modifiers in the Lyrics field
Identical verses — make Verse 2 structurally different
No BPM — always include a numeric tempo anchor

We've all been there. The first 20 songs anyone makes with SUNO sound roughly the same. The difference between casual users and people producing standout tracks is usually just these five fixes applied consistently.

AceTagGen was built specifically to prevent these mistakes. The Questionnaire enforces per-category limits so you can't over-tag, requires genre specificity beyond just a name, always includes BPM, and structures the Lyrics field with proper bracket tags and dynamic progression.

Stop making average songs. Start making yours.

Disclosure: I'm the creator of AceTagGen — the tool referenced throughout this article. Originally published at acetaggen.com/blog/why-your-suno-songs-sound-generic.

SUNO V5 vs V4.5: The Complete Comparison

ShaiZadok — Mon, 20 Apr 2026 05:47:31 +0000

One of the most common mistakes SUNO users make is treating all versions the same. They write a prompt, hit generate, and wonder why it sounds different (or worse) than what they got last time. The reason is usually that they switched versions without adjusting their approach.

Each SUNO version has a distinct personality — different strengths, different weaknesses, and different prompt styles that get the best results. This guide breaks down every version you can currently use, when to pick each one, and how to migrate prompts between them.

The Version Lineup

V3.5: The Obedient Soldier

Audio quality: Lowest of the current options
Personality: Follows complex structural demands literally
Best for: Avant-garde experiments, complex arrangements, structural precision

V3.5 is described by the community as "dumber but more obedient." It doesn't have the audio quality of newer models, but it follows your instructions with an almost mechanical precision that the newer, smarter models sometimes resist.

If you need a song with seven sections, three tempo changes, alternating vocal styles, and unconventional structure — V3.5 is your model. It won't second-guess your decisions or try to "fix" your arrangement into something more conventional.

When to use V3.5:

Experimental and avant-garde compositions
Songs with complex, non-standard structures
When you need the AI to follow instructions literally, even unusual ones
Structural prototyping before re-rendering in a higher-quality version

When to avoid V3.5:

Any production where audio quality matters
Commercial releases or portfolio pieces
Simple songs where newer versions will sound dramatically better

Prompt style for V3.5:
Be as detailed and specific as you want. V3.5 handles long, complex prompts better than any other version. Stack bracket tags, specify precise arrangements, give it a 15-section song structure — it will attempt all of it.

V4.5: The Heavy Hitter

Audio quality: Good (a clear step up from V3.5)
Personality: Excels at dense, aggressive, heavy music
Best for: Metal, Industrial, Hard Rock, Punk, and heavy electronic genres
Song length: Up to 8 minutes

V4.5 was where SUNO started getting serious about audio quality. But its real strength is genre-specific: it handles heavy genres better than any other version, including V5. The training data and tuning for V4.5 were optimized for dense, distorted, high-energy music.

V4.5 also supports conversational-style prompts — you can write more naturally instead of using rigid tag syntax. And it supports songs up to 8 minutes, which is critical for genres like progressive metal or extended EDM tracks.

When to use V4.5:

Metal (all sub-genres: death, black, progressive, doom, thrash)
Industrial and dark electronic
Hard rock and punk
Any track over 4 minutes that needs sustained intensity
Dense arrangements with multiple layers of distortion

When to avoid V4.5:

Clean, acoustic, or minimalist genres (V5/V5.5 handle these better)
When you want the highest possible audio resolution
Simple pop or singer-songwriter material

Prompt style for V4.5:
V4.5 responds well to descriptive, almost conversational prompts. Instead of pure tag lists, you can write things like "A crushing industrial metal track with mechanical drum patterns and distorted vocal screams over a wall of synthesizers." This conversational style often produces better results in V4.5 than strict tag formatting.

V5: The Clean Machine

Audio quality: Highest fidelity (48kHz)
Personality: Simplest prompts produce the best results
Best for: Clean genres, pop, singer-songwriter, acoustic, jazz, classical
Key feature: Negative prompting support

V5 represents a major leap in audio quality — 48kHz output, 10x faster generation, and noticeably cleaner audio across all genres. But it comes with a trade-off: V5 works best with simpler prompts.

Where V3.5 thrived on complexity and V4.5 handled conversational detail, V5 actually performs better when you give it fewer, clearer instructions. The model is smart enough to fill in the gaps, and over-specifying can lead to worse results than trusting the AI's interpretation.

V5 also introduced proper negative prompting support — the "no X" syntax in the Style field actually works reliably (though the dedicated Exclude Styles field is still more reliable).

When to use V5:

Pop, singer-songwriter, and acoustic music
Jazz, classical, and any genre where audio clarity matters
When you want the cleanest possible output
Productions where audio quality is the top priority
When using negative prompting / Exclude Styles

When to avoid V5:

Heavy metal and aggressive genres (V4.5 still handles the weight better)
Complex structural arrangements (V5 may simplify them)
When you want maximum expressiveness (V5.5 has the edge)

Prompt style for V5:
Less is more. The 5-8 tag sweet spot is especially important here. Write clean, focused prompts with strong tokens. Avoid redundant descriptors. Use the colon syntax for precision: `[Energy: High]`, `[Mood: Euphoric]`, `[Texture: Grainy]`.

Example V5 prompt:
`Indie Folk, fingerpicked acoustic guitar, warm analog, breathy female vocals, 95 BPM, intimate`

That's 6 descriptors, and V5 will produce a beautiful, detailed track from them. Add 10 more descriptors and you'll likely get something worse.

V5.5: The Artist

Audio quality: Very high (close to V5)
Personality: Most expressive and emotionally dynamic
Best for: Vocal-forward tracks, cross-genre experimentation, personalized sound
Key features: Voice Cloning, Custom Models, "My Taste" personalization

V5.5 is SUNO's most expressive model. Where V5 optimizes for clean audio, V5.5 optimizes for performance — the vocals are more dynamic, the emotional range is wider, and the AI takes more creative risks with interpretation.

The standout feature is Voice Cloning: you record a spoken phrase to verify rights to your voice, and V5.5 can then maintain that "vocal fingerprint" across all your tracks. Combined with Custom Models, this makes V5.5 the version for artists building a consistent brand.

V5.5 also has a known behavior called "pop-washing" — it tends to smooth niche genres toward pop conventions. A metal prompt in V5.5 may come out with cleaner vocals and more polished production than the same prompt in V4.5. This is a feature for some use cases and a bug for others.

When to use V5.5:

Vocal-forward tracks where emotional expression matters most
When using Voice Cloning or Custom Models
Cross-genre experimentation (the AI handles genre blending well)
Pop, R&B, soul, and any genre where vocal nuance is critical
Building a consistent sonic identity across multiple tracks

When to avoid V5.5:

Raw, unpolished genres (punk, lo-fi, garage rock) — V5.5 may over-polish
Pure instrumental work (voice cloning features are wasted)
When you want exact genre accuracy for niche styles (pop-washing risk)

Prompt style for V5.5:
Similar to V5 — clean and focused. But V5.5 responds particularly well to emotional and performance descriptors: "vulnerable," "intimate," "soaring," "desperate." These weak tokens that are inconsistent in other versions become much more reliable in V5.5 because the model was trained specifically for expressive interpretation.

The Quick Decision Matrix

I want...	Use...
Best audio quality	V5
Heavy metal / industrial	V4.5
Most expressive vocals	V5.5
Complex structure control	V3.5
Songs over 4 minutes	V4.5
Voice cloning	V5.5
Simplest workflow	V5
Clean pop / acoustic	V5 or V5.5
Avant-garde / experimental	V3.5
Cross-genre blending	V5.5

Migration Tips: Moving Prompts Between Versions

One of the most frustrating SUNO experiences is having a prompt that works perfectly in one version and produces garbage in another. Here's how to migrate:

V4.5 to V5

Problem: Your detailed V4.5 prompt sounds over-processed or confused in V5.

Solution: Simplify. Remove 30-50% of your descriptors. V5 doesn't need (and doesn't want) the level of detail V4.5 thrived on.

Before (V4.5):
`Aggressive industrial metal with mechanical percussive patterns, distorted screaming vocals layered with clean harmonies, heavy synthesizer bass, glitchy digital textures, dark and menacing atmosphere, 145 BPM, compressed and loud`

After (V5):
`Industrial Metal, distorted vocals, heavy synth bass, dark menacing, 145 BPM`

V5 to V5.5

Problem: Your clean V5 prompt comes out "pop-washed" in V5.5.

Solution: Add genre-anchoring specificity. If the genre is niche, reinforce it with era and sub-genre descriptors. Add "raw" or "unpolished" if you want to counter the pop-wash tendency.

Before (V5):
`Black Metal, blast beats, shrieking vocals, tremolo picking`

After (V5.5):
`Raw Black Metal, 90s Norwegian, lo-fi tape warmth, blast beats, shrieking vocals, tremolo picking, unpolished`

V3.5 to V5/V5.5

Problem: Your complex structural prompt doesn't work in newer versions.

Solution: Simplify the structure. Newer versions resist complex, non-standard arrangements. Reduce from 7 sections to 4-5 standard ones. Move structural complexity into the Lyrics field bracket tags rather than the Style field.

Any version: the universal fix

If a prompt stops working after a version switch, try this:

Cut the Style field in half — remove the weakest/most abstract tags
Move arrangement instructions to the Lyrics field — bracket tags are more reliable than Style field for structure
Add a numeric BPM if you don't have one — it anchors everything
Test with a 1:30 generation before committing to a full song

The Bottom Line

There's no single "best" SUNO version. Each one is a different tool optimized for different jobs:

V3.5 = precision at the cost of quality
V4.5 = power and weight
V5 = clarity and simplicity
V5.5 = expression and personality

The best producers don't pick one version — they pick the right version for each project. And they adjust their prompt style to match what that version responds to best.

AceTagGen's Questionnaire automatically optimizes tag output based on your selected version, so your prompts always match the model you're targeting.

Disclosure: I'm the creator of AceTagGen — the tool referenced throughout this article. Originally published at acetaggen.com/blog/suno-v5-vs-v4-complete-comparison.

The Science Behind Perfect SUNO Prompts

ShaiZadok — Mon, 20 Apr 2026 05:47:25 +0000

SUNO doesn't read your prompts like a human reads a sentence. It processes them as weighted probability signals inside a neural network — a layered system that converts your text into mathematical vectors, pairs them with weights, and renders audio in a single pass. Understanding how this engine actually works is the difference between getting lucky and getting consistent.

This article breaks down the real mechanics behind SUNO's prompt processing, based on verified community research and hundreds of hours of testing.

How SUNO Actually Processes Your Input

SUNO operates as a dual-brain model. There are two distinct input channels, and they do fundamentally different things:

The Style Field is the "Global Brain." It establishes your song's core DNA — genre, mood, instrumentation, vocal style, production quality. Think of it as the casting director: it decides who shows up to the recording session before a single note is played.

The Lyrics Field is the "Local Brain" — or more precisely, the Timeline Architect. It triggers state changes and arrangement shifts at specific moments in the song. This is where you direct the performance in real time.

Here is the critical insight most people miss: these two brains have very different levels of influence. A bracketed instruction in the Lyrics field is roughly 10x more powerful than the same instruction placed in the Style field for arrangement control. The Style field sets the global tone; the Lyrics field overrides it locally.

Left-to-Right Priority: The Most Important Rule You'll Learn

SUNO follows a left-to-right priority system. The first tag in your Style field carries significantly more weight than the second, which carries more than the third, and so on. Community testing suggests the weight drops steeply over the first few positions and then flattens out, approximately following the formula:

`Weight = 2 / (1 + position)`

Position 1 gets a weight of 1.0. Position 2 gets ~0.67. Position 3 gets 0.5. By position 6, a tag carries only about 29% of the influence of the first tag.
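
The formula is easy to tabulate. The sketch below simply evaluates 2 / (1 + position) for the first few positions; `positionWeight` and `relativeWeights` are illustrative names, and the formula itself is the community approximation quoted above, not a documented SUNO internal.

```typescript
// Weight = 2 / (1 + position), with position starting at 1.
function positionWeight(position: number): number {
  if (!Number.isInteger(position) || position < 1) {
    throw new RangeError("position must be a positive integer");
  }
  return 2 / (1 + position);
}

// Weights for positions 1..count, rounded to two decimals.
// Position 1 has weight 1, so these also read as fractions of the first tag's weight.
function relativeWeights(count: number): number[] {
  return Array.from({ length: count }, (_, i) =>
    Math.round(positionWeight(i + 1) * 100) / 100,
  );
}
```

For six positions this gives 1, 0.67, 0.5, 0.4, 0.33, 0.29, so the sixth tag carries a bit under a third of the first tag's weight.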

This means your first tag is your most powerful decision. If you write `Pop, Electric Guitar, Aggressive, 140 BPM`, you get a Pop song with some electric guitar flavor. But write `Electric Guitar, Aggressive, 140 BPM, Pop` and you get a guitar-driven track that happens to have pop structure.

The first 20-30 words in your Style field serve as "anchors" — they define the core DNA that everything else modifies. After that, you're adding seasoning to an already-set dish.

The Recommended Order

Based on how the weighting system works, the optimal tag order is:

Genre + Era (the foundation — this gets the most weight)
Mood / Energy (emotional direction)
Key Instruments (sonic palette)
Vocal Style (performance character)
Production Quality / Texture (sonic finish)
BPM (tempo anchor)

This order isn't arbitrary. Genre tags map to the largest training data clusters. Placing genre first ensures SUNO pulls from the right pool of musical DNA before applying modifications.
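
If your tags are already labeled by category, the recommended order can be applied mechanically. This is a sketch under that assumption; `CategorizedTag` and `buildStyle` are hypothetical names, and era is assumed to be folded into the genre tag as in "Genre + Era".

```typescript
// Recommended order from the list above, highest priority first.
const CATEGORY_ORDER = ["genre", "mood", "instruments", "vocal", "production", "bpm"] as const;
type OrderedCategory = (typeof CATEGORY_ORDER)[number];

interface CategorizedTag {
  text: string;
  category: OrderedCategory;
}

// Sort a copy of the tags by category rank, then join into a Style string.
// Array.prototype.sort is stable, so tags within a category keep their order.
function buildStyle(tags: CategorizedTag[]): string {
  const rank = (c: OrderedCategory): number => CATEGORY_ORDER.indexOf(c);
  return [...tags]
    .sort((a, b) => rank(a.category) - rank(b.category))
    .map((t) => t.text)
    .join(", ");
}
```

Feeding it tags in any order yields a Style string that leads with the genre and ends with the BPM.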

The 5-8 Tag Sweet Spot (And Why It Exists)

One of the most common mistakes is over-tagging. People stuff 15-20 descriptors into the Style field thinking more detail means better results. The opposite is true.

5-8 focused descriptors across all categories is the verified sweet spot. Here's why:

SUNO processes tags as probability signals. Each tag pulls the generation in a certain direction. When you have 5-8 clear signals, they reinforce each other — the AI has a strong, coherent target to aim at.

When you have 15+ signals, many of them compete. "Ethereal" pulls one direction while "Punchy" pulls another. "Lo-fi" wants tape warmth while "Crisp" wants digital clarity. The AI averages these conflicting signals and produces something generic — not because it's incapable, but because you asked for everything at once.

Per-Category Limits

The 5-8 total recommendation breaks down like this per category:

Category	Recommended	Maximum
Genre	1-2	3
Mood/Energy	1-2	2
Instruments	2-3	4
Vocal Style	1	2
Production/Texture	1-2	3
Tempo/BPM	1	1
Era	0-1	1

Notice how even the maximums add up to 16, but you should never use all maximums simultaneously. Pick 2-3 categories to emphasize and keep the rest minimal.

The 1,000-Character Limit: How to Maximize Every Character

The Style field supports up to 1,000 characters in V4.5 and later. (The old 200-character limit was a UI constraint in legacy V4 — it no longer applies.)

Most people waste this space with redundant descriptors. Here's how to make every character count:

Use strong tokens instead of weak ones. Strong tokens like "TR-909 kick," "Moog bass," or "120 BPM" pull from specific training data and produce consistent results. Weak tokens like "beautiful," "ethereal," or "amazing" are abstract — the AI interprets them loosely and inconsistently.

Be specific about era and texture. "Rock" maps to a massive, unfocused cluster. "90s Garage Rock, dusty tape-saturated" maps to a very specific sound. The more precise your anchors, the less the AI has to guess.

Include numeric BPM. A BPM number is one of the strongest anchors available. It doesn't just set tempo — it influences the entire rhythmic feel, drum pattern selection, and energy level of the generation.
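
A first pass at weak-token detection can be sketched in a few lines. The word list below is a small assumed sample (the words called out as weak in these articles), and the rule that a weak word only counts when it stands alone is a simplification of the context-aware idea; `isWeakTag` and `weakTags` are hypothetical names.

```typescript
// Assumed sample of abstract words named as weak in these articles.
const WEAK_WORDS: string[] = ["beautiful", "ethereal", "amazing", "modern"];

// A tag counts as weak only when it is a single abstract word on its own.
// Inside a longer phrase the same word is treated as context, not a flaw.
function isWeakTag(tag: string): boolean {
  const words = tag.trim().toLowerCase().split(/\s+/);
  return words.length === 1 && WEAK_WORDS.includes(words[0] ?? "");
}

// Return the comma-separated tags in a Style string that are weak on their own.
function weakTags(style: string): string[] {
  return style
    .split(",")
    .map((t) => t.trim())
    .filter((t) => isWeakTag(t));
}
```

With this rule, "Modern" alone is flagged while "polished modern production" is not, which matches the distinction between a bare adjective and a specific phrase.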

Real Examples: Bad Prompt vs. Good Prompt

Example 1: The Over-Tagger

Bad prompt:
`Beautiful emotional powerful epic cinematic orchestral dramatic inspiring uplifting majestic soaring sweeping grand triumphant`

Why it fails: 14 abstract adjectives, zero specificity. SUNO has no genre anchor, no era, no instruments, no tempo. Every tag is a "weak token" that the AI interprets loosely. Result: generic cinematic music that sounds like stock audio.

Good prompt:
`Epic Orchestral, Late Romantic era, 72 BPM, soaring strings, French horn melody, thunderous timpani, triumphant`

Why it works: Genre first (Epic Orchestral), era for specificity (Late Romantic), numeric BPM anchor (72), three specific instruments pulling from defined training clusters, one mood descriptor to unify the emotional direction. 7 descriptors, each one doing real work.

Example 2: The Genre-Only Prompt

Bad prompt:
`Rock`

Why it fails: "Rock" maps to one of the largest training clusters in SUNO's model. Without any narrowing descriptors, the AI picks the statistical average of all rock music — which is generic, mid-tempo, and forgettable.

Good prompt:
`90s Garage Rock, raw and distorted, dusty tape-saturated, fuzz guitar, driving drums, snarling vocals, 155 BPM`

Why it works: Sub-genre + era (90s Garage Rock) narrows the cluster dramatically. Texture descriptors (raw, distorted, tape-saturated) tell the AI what production quality to target. Specific instruments (fuzz guitar, driving drums) define the sonic palette. Vocal style (snarling) sets the performance character. BPM (155) anchors the energy. Every tag is a strong signal pointing in the same direction.

Example 3: The Conflicting Prompt

Bad prompt:
`Calm Aggressive Lo-fi Heavy Metal whispered screaming acoustic electric 60 BPM 180 BPM`

Why it fails: The tags come in contradictory pairs. Calm vs. Aggressive. Lo-fi vs. Heavy Metal. Whispered vs. Screaming. Acoustic vs. Electric. 60 BPM vs. 180 BPM. The AI averages all conflicts and produces incoherent mush.

Good prompt:
`Dark Ambient Metal, 75 BPM, whispered verses, droning bass guitar, eerie atmospherics, lo-fi vinyl crackle`

Why it works: It picks ONE direction and commits. The "dark ambient" and "metal" combine into a coherent sub-genre. The low BPM sets a brooding pace. Whispered vocals fit the mood. The lo-fi texture descriptor includes "vinyl crackle" — because lo-fi tags without a specific texture (vinyl crackle, tape warmth, or analog hiss) produce results that sound "too clean" and miss the mark entirely.

The Lyrics Field: Where the Real Power Lives

While the Style field sets the foundation, the Lyrics field is where arrangement control happens. Bracket tags in the Lyrics field are the most powerful tools SUNO offers for shaping a song.

Key rules for bracket tags:

Maximum 2-4 tags per section (more and SUNO ignores most of them)
Keep each tag to 1-3 words (long phrases may be sung as lyrics)
Each tag on its own line in separate brackets
One instrument cue per section (3+ instruments per section = muddy)
One delivery cue per section (`[whisper]` OR `[rap]`, not both)

The ideal formula:

```
[Section Name]
[instruction1]
[instruction2]
Lyrics here...
```

Place tags directly before the lyrics they affect — not at the top of a long section. SUNO has a short attention span for bracket instructions, so proximity matters.
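
Some of these rules can be linted before you paste lyrics into SUNO. The sketch below checks only two of them (tag length and the number of stacked tags) and treats the first tag in a run as the section name; that reading of the rules, along with the `lintBrackets` name and the `BracketIssue` shape, is an assumption for illustration.

```typescript
interface BracketIssue {
  line: number; // 1-based line number in the lyrics
  message: string;
}

function lintBrackets(lyrics: string): BracketIssue[] {
  const issues: BracketIssue[] = [];
  let run = 0; // bracket-only lines seen since the last lyric line
  lyrics.split("\n").forEach((raw, index) => {
    const line = raw.trim();
    const match = /^\[([^\]]+)\]$/.exec(line);
    if (!match) {
      if (line.length > 0) run = 0; // a lyric line ends the current tag group
      return;
    }
    run += 1;
    const wordCount = (match[1] ?? "").trim().split(/\s+/).length;
    if (wordCount > 3) {
      issues.push({ line: index + 1, message: "tag longer than 3 words" });
    }
    // First tag in a run is the section name; allow up to 4 instructions after it.
    if (run > 5) {
      issues.push({ line: index + 1, message: "more than 4 instruction tags in one section" });
    }
  });
  return issues;
}
```

An empty result only means these two checks passed; instrument and delivery cues per section would need a tag classifier and are left out here.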

Putting It All Together

The difference between amateur and professional SUNO prompts comes down to three things:

Understanding priority — your first tag matters 3-5x more than your last
Using strong tokens — specific instruments, numeric BPM, and era descriptors beat vague adjectives
Respecting the limits — 5-8 focused tags outperform 15 scattered ones every time

This is exactly what AceTagGen automates for you. Our questionnaire walks you through each category in the right order, enforces per-category limits, and uses our database of 3,000+ community-verified tags to ensure every descriptor is a strong token that SUNO actually responds to. No guesswork, no wasted characters, no conflicting signals.

Build your first prompt the right way — try the Questionnaire now.

Disclosure: I'm the creator of AceTagGen — the tool referenced throughout this article. Originally published at acetaggen.com/blog/science-behind-perfect-suno-prompts.