
Suno v5 vs v5.5: What Actually Changed in the Audio?

March 2026  ·  15 min read  ·  Audio Analysis & AI Music  ·  By Petri Korhonen

Suno v5.5 arrived with Voices, Custom Models, and My Taste. The community consensus was immediate: “it sounds better.” Forums filled with praise. Comparisons were based on vibes, memory, and enthusiasm.

We wanted something more concrete. So we generated the same brief on both versions for five tracks across five genres, using identical prompts, identical settings, and no fixed seeds. Then we ran spectral analysis on every pair: 15 metrics per track, scipy STFT, middle 60% of each clip, M/S decomposition.

The result? v5.5 is not simply “cleaner.” It’s doing something far more interesting. It’s adapting its output to match the genre — making intelligent decisions about which frequencies to emphasize and which to discard. But it also introduces new challenges that your mastering chain needs to handle.

This is what 15 metrics and 5 genres actually tell you.

How We Tested

We generated five tracks spanning a wide range of complexity and genre conventions:

| Track | Genre | Voices | BPM |
| --- | --- | --- | --- |
| Glass River | Piano ballad | 2 | 72 |
| District | Indie rock | 3–4 | 128 |
| Iron Doctrine | Thrash metal | 5+ | 168 |
| Reactor Core | Hard techno | 3–4 | 148 |
| Count the Days | Dark hip-hop | 2–3 | 85 |

For each track, we used identical style prompts, exclude lists, Weirdness, and Style settings on both v5 and v5.5. No seeds — we wanted to test the model’s own generation capability. Four generations were made per version, and the closest-matching pair was selected for analysis.

Important Limitation

Different generations are different performances. These are not controlled A/B codec tests — they’re two AI performances of the same brief. Arrangement differences between generations are expected and documented. We compensate by analyzing spectral characteristics (which reflect model behavior) rather than waveform alignment (which reflects arrangement).

Analysis pipeline: Python 3, scipy.signal.stft with nperseg=4096 (~11.7 Hz frequency resolution at 48 kHz), middle 60% of each clip to avoid intro/outro artifacts, and M/S decomposition for stereo analysis. Same methodology as our Guide A artifact analysis.
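A minimal sketch of how a pipeline like this could be set up, for readers who want to reproduce the preprocessing steps. The function names, the Hann window and 50% overlap (scipy defaults), and the noise placeholder input are assumptions for illustration; the article only specifies nperseg=4096, the 48 kHz rate, the middle-60% crop, and M/S decomposition.

```python
import numpy as np
from scipy.signal import stft

SAMPLE_RATE = 48_000   # Hz, as stated in the pipeline description
NPERSEG = 4096         # STFT window length from the article

def middle_section(audio, keep=0.60):
    """Return the central `keep` fraction of the samples (time on axis 0)."""
    n = audio.shape[0]
    start = int(round(n * (1.0 - keep) / 2.0))
    stop = start + int(round(n * keep))
    return audio[start:stop]

def mid_side(stereo):
    """Split an (n_samples, 2) array into Mid and Side signals."""
    left, right = stereo[:, 0], stereo[:, 1]
    return 0.5 * (left + right), 0.5 * (left - right)

def power_spectrogram(signal, sample_rate=SAMPLE_RATE, nperseg=NPERSEG):
    """STFT power; window and overlap are scipy defaults (an assumption)."""
    freqs, times, spectrum = stft(signal, fs=sample_rate, nperseg=nperseg)
    return freqs, times, np.abs(spectrum) ** 2

if __name__ == "__main__":
    # Placeholder input: 5 s of stereo noise standing in for a decoded clip.
    rng = np.random.default_rng(0)
    clip = rng.standard_normal((5 * SAMPLE_RATE, 2))
    mid, side = mid_side(middle_section(clip))
    freqs, _, power = power_spectrogram(mid)
    print(f"bin spacing: {freqs[1] - freqs[0]:.2f} Hz")  # 48000 / 4096 = 11.72 Hz
```

The bin spacing is simply the sample rate divided by the window length, which is where the ~11.7 Hz figure comes from.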

The Simple Story (And Why It’s Wrong)

If you only looked at shimmer — the metallic high-frequency artifact we measured in Guide A — you might conclude that v5.5 is a mixed bag. Two tracks improved, three got worse:

Shimmer comparison across 5 tracks: v5 vs v5.5
Shimmer (6–14 kHz energy ratio) across all five tracks. Lower is typically better — but not always.

Glass River’s shimmer dropped 54.8% — a massive cleanup. District improved by 12.7%. But Iron Doctrine’s shimmer increased 258%, Reactor Core went up 64%, and Count the Days rose 16%.
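For readers who want to compute a comparable number on their own files, here is a rough sketch of a band energy ratio. It assumes the ratio is band power over total power across the whole spectrum; the article does not state the denominator, so absolute values may not match ours. `band_energy_ratio` and `percent_change` are hypothetical helper names.

```python
import numpy as np
from scipy.signal import stft

def band_energy_ratio(signal, sample_rate, f_lo, f_hi, nperseg=4096):
    """Fraction of total STFT power that falls inside [f_lo, f_hi)."""
    freqs, _, spectrum = stft(signal, fs=sample_rate, nperseg=nperseg)
    power = np.abs(spectrum) ** 2
    in_band = (freqs >= f_lo) & (freqs < f_hi)
    total = power.sum()
    # Assumption: the denominator is full-band power; silence returns 0.
    return float(power[in_band].sum() / total) if total > 0 else 0.0

def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old

def shimmer(signal, sample_rate):
    """Shimmer as defined in the article: energy ratio in 6-14 kHz."""
    return band_energy_ratio(signal, sample_rate, 6_000.0, 14_000.0)
```

As a sanity check on the arithmetic, `percent_change(3.48, 12.45)` gives roughly +258%, matching the Iron Doctrine figure. Deltas recomputed from the rounded values in this post can differ from the published ones by a few tenths of a percent.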

Case closed? v5.5 only helps simple arrangements?

Not even close. This is where most comparisons stop, and where ours begins.

When Better Numbers Mean Worse Sound

Here is the single most important finding in this entire analysis.

Iron Doctrine is a dense thrash metal track: 5+ simultaneous voices, 168 BPM, dual distorted guitars, double bass drum, screamed vocals. On v5, the shimmer metric read 3.48% — suspiciously low for the densest track in our test. For context, the simpler District (3–4 voices) measured 8.49%.

How can denser metal be “cleaner” than mid-complexity rock?

The Answer

v5 didn’t clean up the high frequencies. It killed them. The v5 spectrum drops off sharply above 2 kHz, losing 10–15 dB compared to v5.5 across the entire 2–14 kHz range. v5 “solved” shimmer by making the metal sound muffled — removing artifacts AND legitimate high-frequency content together. Pick attack, cymbal presence, vocal edge — all gone.

Iron Doctrine spectrograms: v5 vs v5.5
Iron Doctrine spectrograms. Left: v5 (dark above 2 kHz — suppressed HF). Right: v5.5 (visible energy across full range — restored HF).

The spectrum overlay makes this unmistakable:

Iron Doctrine spectrum overlay: v5 vs v5.5
Iron Doctrine average frequency spectrum. The two curves cross at ~2 kHz. Below: v5 has more energy (mid-range mud). Above: v5.5 restores the full high-frequency range.
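An overlay like this can be reduced to a single number per band. The sketch below averages STFT power over time and reports how many dB one clip sits above another inside a frequency range. It assumes both clips are already level-matched, which is our simplification here, and the helper names are hypothetical.

```python
import numpy as np
from scipy.signal import stft

def average_power_spectrum(signal, sample_rate, nperseg=4096):
    """Time-averaged STFT power per frequency bin."""
    freqs, _, spectrum = stft(signal, fs=sample_rate, nperseg=nperseg)
    return freqs, (np.abs(spectrum) ** 2).mean(axis=1)

def band_level_difference_db(reference, candidate, sample_rate, f_lo, f_hi):
    """Mean band power of `candidate` relative to `reference`, in dB."""
    freqs, p_ref = average_power_spectrum(reference, sample_rate)
    _, p_cand = average_power_spectrum(candidate, sample_rate)
    band = (freqs >= f_lo) & (freqs < f_hi)
    return float(10.0 * np.log10(p_cand[band].mean() / p_ref[band].mean()))

if __name__ == "__main__":
    # Placeholder signals: the "v5-like" clip is the same noise, 12 dB quieter.
    rng = np.random.default_rng(0)
    v55_like = rng.standard_normal(2 * 48_000)
    v5_like = 0.25 * v55_like
    print(band_level_difference_db(v5_like, v55_like, 48_000, 2_000, 14_000))
```

A positive result means the candidate carries more energy in that band than the reference. Running it on the 2–14 kHz range is one way to check a "10–15 dB" style claim against your own generations.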

v5.5’s shimmer reads 12.45%, roughly 3.6 times the v5 figure. But when we put on headphones, the v5.5 version was dramatically better: more detail, better bass definition, more aggression. It actually sounds like metal instead of metal heard through a pillow.

🎧 Listening notes — Sennheiser hi-fi headphones

v5: Terrible quality. Classic v5 failure on dense material. Muffled, lifeless.
v5.5: Significantly better. Genre-typical metallic shimmer present (this is correct for the style). Much better bass definition and detail across the board.

This is the blog’s core argument: a metric alone doesn’t tell you quality. The shimmer number measures energy in 6–14 kHz, but it cannot distinguish between codec artifact noise and legitimate musical content. In genres with naturally high HF energy — metal, electronic, bright pop — a higher shimmer reading may actually mean the model is doing its job better.

This is why “measurement + listening” beats both “just vibes” and “just numbers.”

The Real Story: v5.5 Is Genre-Adaptive

Once we looked past shimmer alone, a clear pattern emerged across all five tracks. v5.5 doesn’t apply a single processing change to everything. It adapts its spectral profile to match genre conventions.

Spectral centroid tells the story

The spectral centroid — the “center of gravity” of a track’s frequency content — moved in opposite directions depending on genre:

| Track | Genre | Centroid v5 | Centroid v5.5 | Direction |
| --- | --- | --- | --- | --- |
| Glass River | Piano ballad | 2,414 Hz | 1,809 Hz | ↓ 25% darker |
| District | Indie rock | 4,381 Hz | 3,781 Hz | ↓ 14% warmer |
| Iron Doctrine | Thrash metal | 3,578 Hz | 5,157 Hz | ↑ 44% brighter |
| Reactor Core | Hard techno | 3,909 Hz | 4,499 Hz | ↑ 15% brighter |
| Count the Days | Dark hip-hop | 3,264 Hz | 3,817 Hz | ↑ 17% brighter |

Piano ballad gets warmer. Rock gets warmer. Metal gets brighter and more aggressive. These are exactly the directions a human mix engineer would take each genre. v5.5 isn’t just applying a blanket EQ — it’s making genre-informed spectral decisions.
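The centroid itself is a one-line formula: the power-weighted mean frequency, sum(f · P(f)) / sum(P(f)). Here is a small sketch under the assumption that the weighting uses time-averaged STFT power. Magnitude weighting is equally common and yields different absolute values, so compare like with like.

```python
import numpy as np
from scipy.signal import stft

def spectral_centroid_hz(signal, sample_rate, nperseg=4096):
    """Power-weighted mean frequency of the time-averaged spectrum."""
    freqs, _, spectrum = stft(signal, fs=sample_rate, nperseg=nperseg)
    power = (np.abs(spectrum) ** 2).mean(axis=1)  # average over frames
    total = power.sum()
    if total == 0:
        raise ValueError("silent input has no defined centroid")
    return float((freqs * power).sum() / total)

if __name__ == "__main__":
    sr = 48_000
    t = np.arange(2 * sr) / sr
    # A pure 1 kHz tone should have its centroid very close to 1 kHz.
    print(spectral_centroid_hz(np.sin(2 * np.pi * 1000 * t), sr))
```

The direction column in the table is then plain arithmetic. For Iron Doctrine, (5,157 − 3,578) / 3,578 ≈ +44%.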

The Memphis proof

The strongest single piece of evidence came from Count the Days, our dark hip-hop track. Given the same prompt, v5 generated a modern dark trap production. v5.5 generated a Memphis / Three 6 Mafia-style interpretation — a fundamentally different sound world from the same words.

Count the Days spectrum overlay: v5 vs v5.5
Count the Days spectrum overlay. v5.5’s Memphis interpretation has dramatically more sub-bass energy and a different spectral profile than v5’s modern trap version.

The numbers confirm the listening experience: sub-bass increased 56.5%, presence dropped 53.9%, stereo width collapsed to near-mono (−85.9%), and dynamic range exploded by 75.5% (15.1 → 26.5 dB). Every one of these changes is genre-authentic for Memphis hip-hop: heavy 808s, dark frequency balance, mono imaging, empty space between hits.

v5.5 didn’t just generate a “better” hip-hop track. It made a creative genre decision that happened to be more faithful to the prompt’s “dark hip hop with heavy 808 sub-bass” description.

What Actually Improved Across the Board

Dynamic range: 4 out of 5 tracks

The most consistent genuine improvement was in dynamic range. v5.5 generates music with more breathing room — less aggressive compression, more contrast between loud and quiet sections.

Dynamic range comparison across 5 tracks
Dynamic range (P95–P5 of RMS frames) across all five tracks. v5.5 improved dynamics in four out of five cases.

| Track | DR v5 | DR v5.5 | Change |
| --- | --- | --- | --- |
| Glass River | 19.2 dB | 17.3 dB | −9.9% (already dynamic) |
| District | 5.7 dB | 9.2 dB | +61.4% |
| Iron Doctrine | 6.5 dB | 7.5 dB | +15.4% |
| Reactor Core | 9.4 dB | 11.1 dB | +18.1% |
| Count the Days | 15.1 dB | 26.5 dB | +75.5% |

For mastering, this means less work fighting over-compressed source material. v5.5 tracks have more natural dynamics that respond better to limiting and loudness optimization.
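The P95–P5 measure can be sketched in a few lines: slice the signal into frames, take each frame's RMS in dB, and subtract the 5th percentile from the 95th. The 400 ms non-overlapping frame length and the −120 dB floor below are assumptions for illustration, since the article does not state them.

```python
import numpy as np

def dynamic_range_db(signal, sample_rate, frame_seconds=0.4, floor_db=-120.0):
    """P95 minus P5 of per-frame RMS levels, in dB."""
    frame_len = int(round(frame_seconds * sample_rate))
    n_frames = len(signal) // frame_len
    if n_frames < 2:
        raise ValueError("need at least two full frames")
    # Drop the trailing partial frame, then reshape to (frames, samples).
    frames = np.reshape(signal[: n_frames * frame_len], (n_frames, frame_len))
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Clamp to a floor so digital silence does not produce -inf.
    rms_db = 20.0 * np.log10(np.maximum(rms, 10.0 ** (floor_db / 20.0)))
    p5, p95 = np.percentile(rms_db, [5, 95])
    return float(p95 - p5)
```

Using percentiles rather than max minus min keeps a single loud hit or a moment of silence from dominating the result.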

Bass leak improved in electronic and metal genres

Bass leak (Side/Mid energy ratio below 200 Hz) — a key indicator of mono compatibility — improved significantly where it matters most:

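| Track | Genre | Bass Leak v5 | Bass Leak v5.5 | Change |
| --- | --- | --- | --- | --- |
| Iron Doctrine | Metal | 0.260 | 0.114 | −56.1% |
| Reactor Core | Techno | 0.003 | 0.002 | −37.8% |
| Count the Days | Hip-hop | 0.004 | 0.003 | −21.7% |
| District | Rock | 0.043 | 0.046 | +6.4% (stable) |
| Glass River | Piano | 0.199 | 0.390 | +96.0% (wider piano imaging) |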

v5.5 keeps bass centered for genres where mono compatibility matters (club, PA systems, phone speakers). The Glass River exception may reflect more realistic piano stereo imaging in the low register — real pianos do have stereo bass. Genre-adaptive behavior again.
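If you want to check bass leak on your own stereo files, a sketch along these lines should get close. It follows the definition above (Side energy over Mid energy below 200 Hz); the STFT settings and the function name are assumptions.

```python
import numpy as np
from scipy.signal import stft

def bass_leak(stereo, sample_rate, cutoff_hz=200.0, nperseg=4096):
    """Side/Mid energy ratio below `cutoff_hz` for an (n_samples, 2) array."""
    left, right = stereo[:, 0], stereo[:, 1]
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    freqs, _, mid_spec = stft(mid, fs=sample_rate, nperseg=nperseg)
    _, _, side_spec = stft(side, fs=sample_rate, nperseg=nperseg)
    low = freqs < cutoff_hz
    mid_energy = (np.abs(mid_spec[low]) ** 2).sum()
    side_energy = (np.abs(side_spec[low]) ** 2).sum()
    # No Mid energy in the bass band: the ratio is undefined, report infinity.
    return float(side_energy / mid_energy) if mid_energy > 0 else float("inf")
```

A perfectly centered bass gives 0, and bass present in only one channel gives 1, so the values in the table sit between "fully mono" and "hard-panned".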

What Didn’t Change (And What Got Worse)

Fog remains: 3 worse, 1 unchanged, 1 improved

Fog (spectral flatness in the 400 Hz–2 kHz range, the “muddy blanket” over the mid-range) was not addressed by v5.5. In fact, most tracks got foggier:

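| Track | Fog v5 | Fog v5.5 | Change |
| --- | --- | --- | --- |
| Glass River | 0.037 | 0.038 | +1.9% |
| District | 0.107 | 0.126 | +17.6% |
| Iron Doctrine | 0.211 | 0.302 | +43.0% |
| Count the Days | 0.088 | 0.108 | +23.9% |
| Reactor Core | 0.339 | 0.301 | −11.1% |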

The only fog improvement came from Reactor Core — hard techno with discrete spectral peaks (kick fundamental, acid resonance, pad drone). Synthetic signals may be easier for the codec to preserve than broadband acoustic content. If your track sounds muddy in the mid-range, v5.5 won’t fix it. Arrangement is still the primary defense against fog.
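Spectral flatness is the geometric mean of the power spectrum divided by its arithmetic mean: close to 1 for noise-like content, close to 0 for a few sharp peaks. The sketch below restricts it to the 400 Hz–2 kHz band and assumes it is computed on the time-averaged spectrum; a per-frame average is another reasonable choice and would give different numbers.

```python
import numpy as np
from scipy.signal import stft

def band_spectral_flatness(signal, sample_rate, f_lo=400.0, f_hi=2000.0,
                           nperseg=4096):
    """Geometric mean / arithmetic mean of time-averaged power in a band."""
    freqs, _, spectrum = stft(signal, fs=sample_rate, nperseg=nperseg)
    mean_power = (np.abs(spectrum) ** 2).mean(axis=1)
    in_band = (freqs >= f_lo) & (freqs < f_hi)
    band = mean_power[in_band] + 1e-20  # tiny offset keeps log() finite
    geometric_mean = np.exp(np.mean(np.log(band)))
    return float(geometric_mean / band.mean())
```

This also illustrates why discrete synthetic peaks, like the kick and acid line in Reactor Core, tend to score lower than broadband acoustic material.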

Presence consistently reduced: 4 out of 5 tracks

Presence comparison across 5 tracks
Presence energy (2–5 kHz) across all five tracks. v5.5 reduced presence in 4 out of 5 cases — significantly.

v5.5 systematically pulls back 2–5 kHz energy — the vocal presence and clarity range. Glass River: −84.4%. District: −66.8%. Count the Days: −53.9%. Reactor Core: −30.1%. Only Iron Doctrine got a presence boost (+48.8%) — the genre that needs aggression.

Together with the sub-bass and stereo-width shifts, this is one of the most consistent spectral changes in the test, showing up in 4 out of 5 tracks. Your mastering chain likely needs a presence boost around 2–5 kHz when working with v5.5 material.

Sub-bass increased in 4 out of 5 tracks

v5.5 produces significantly more low-end weight in most genres: Glass River +192%, Iron Doctrine +123%, Count the Days +57%, District +33%. Only Reactor Core saw a decrease (−27%), likely due to a different kick character.

More sub-bass means more energy fighting for headroom during mastering. High-pass filtering and careful low-end management become more important with v5.5 sources.

Stereo: narrower by default, but prompt-responsive

Four out of five tracks came out narrower on v5.5 — some dramatically so (Count the Days: −85.9%, Glass River: −55.4%, Iron Doctrine: −50.9%). The single exception was District, whose prompt explicitly requested “wide stereo image” — and v5.5 delivered (+41.9%).

This suggests v5.5 defaults to tighter stereo imaging but responds better to stereo instructions in the style prompt. If you want width, ask for it explicitly.

The Cleanest Improvement: Glass River

For a straightforward “did it get better” story, Glass River is the clearest example. A simple piano ballad with two voices — the kind of track where codec artifacts are most audible.

Glass River spectrograms: v5 vs v5.5
Glass River spectrograms. v5 (left) shows diffuse energy above 6 kHz — shimmer haze. v5.5 (right) has cleaner harmonic separation.

Glass River spectrum overlay
Glass River average spectrum. v5 is 10+ dB louder in the 2–5 kHz presence range. v5.5 is warmer, with more sub-bass and less brightness.

🎧 Listening notes

v5: Audible background hiss and noise. Sound feels thinner, more fragile, emptier.
v5.5: Background is very quiet. Piano and transients are clearer. Better instrument separation. Sound is fuller and warmer.

The data confirms the ears: shimmer dropped 54.8% (2.31% → 1.05%). The v5.5 reading is actually below the 1.5% we measured on Resonance, our Guide A reference track. v5.5 piano ballads are approaching “mastered” levels of high-frequency cleanliness straight out of the generator.

Open Questions

Two observations from our testing that we can’t fully explain yet:

District’s quality degradation. The v5.5 version of District (indie rock) sounded noticeably worse toward the end of the track — more artifacts, more hiss. Our middle-60% analysis didn’t capture this because it focuses on the core section. Does v5.5’s codec “budget” run out in long, dense arrangements? This needs further investigation with full-track analysis.

Reactor Core’s minimal improvement. Hard techno showed the smallest subjective difference between v5 and v5.5. The fog improvement (−11.1%) was real but barely audible. Techno may be the genre where v5 → v5.5 matters least — or it may be that synthetic signals were already well-served by v5’s codec.

What This Means for Your Mastering Chain

If you’re mastering v5.5 material, your processing needs to adapt. Based on our five-track analysis:

1. Boost presence (2–5 kHz)

v5.5 pulls back the clarity range in 4 out of 5 genres. A gentle shelf or bell boost around 3 kHz will restore vocal presence and instrument definition.
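As one illustration of a gentle bell boost, the sketch below builds a peaking biquad from the widely used RBJ Audio EQ Cookbook formulas. The +2 dB gain and Q of 0.7 are assumptions chosen to mean "gentle"; tune them by ear. This is a generic filter, not a description of any particular mastering product.

```python
import numpy as np
from scipy.signal import freqz, lfilter

def peaking_eq_coefficients(sample_rate, center_hz=3000.0, gain_db=2.0, q=0.7):
    """Biquad bell filter (RBJ Audio EQ Cookbook peaking EQ)."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * center_hz / sample_rate
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1.0 + alpha * amp, -2.0 * np.cos(w0), 1.0 - alpha * amp])
    a = np.array([1.0 + alpha / amp, -2.0 * np.cos(w0), 1.0 - alpha / amp])
    return b / a[0], a / a[0]  # normalise so a[0] == 1

def apply_presence_boost(audio, sample_rate, **eq_settings):
    """Filter along the time axis; works for mono or (n_samples, 2) audio."""
    b, a = peaking_eq_coefficients(sample_rate, **eq_settings)
    return lfilter(b, a, audio, axis=0)

def gain_db_at(b, a, freq_hz, sample_rate):
    """Filter gain in dB at a single frequency."""
    _, h = freqz(b, a, worN=np.array([freq_hz]), fs=sample_rate)
    return float(20.0 * np.log10(np.abs(h[0])))
```

Checking `gain_db_at` at 3 kHz and at 100 Hz is a quick way to confirm the boost lands where intended and leaves the low end untouched.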

2. Manage sub-bass

More low-end energy means more headroom competition. High-pass filtering below 30–40 Hz and careful sub-bass compression will keep the bottom end tight without sacrificing weight.
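A high-pass in this range can be sketched with a standard Butterworth design from scipy. The 35 Hz cutoff sits in the middle of the 30–40 Hz range mentioned above, and the second-order slope (12 dB per octave) is an assumption; steeper slopes remove more rumble but affect more of the audible sub.

```python
import numpy as np
from scipy.signal import butter, sosfilt, sosfreqz

def design_sub_highpass(sample_rate, cutoff_hz=35.0, order=2):
    """Butterworth high-pass as second-order sections."""
    return butter(order, cutoff_hz, btype="highpass", fs=sample_rate,
                  output="sos")

def remove_sub_rumble(audio, sample_rate, cutoff_hz=35.0, order=2):
    """Filter along the time axis; works for mono or (n_samples, 2) audio."""
    sos = design_sub_highpass(sample_rate, cutoff_hz, order)
    return sosfilt(sos, audio, axis=0)

def response_db(sos, freqs_hz, sample_rate):
    """Filter gain in dB at the requested frequencies."""
    _, h = sosfreqz(sos, worN=np.asarray(freqs_hz, dtype=float),
                    fs=sample_rate)
    return 20.0 * np.log10(np.abs(h))
```

With these settings the filter is about 3 dB down at the cutoff and essentially flat by 1 kHz, so the weight of an 808 around 50–60 Hz is only mildly affected.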

3. Check stereo width

v5.5 defaults to narrower stereo imaging. If your track feels too centered, a mid-side widener on the high frequencies can open it back up. But check mono compatibility first — the tighter default may actually be an improvement.
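One simple way to widen only the high frequencies is to scale the Side signal above a crossover and leave the Mid untouched. The sketch below does that with a zero-phase high-pass on the Side channel. The 4 kHz crossover and 1.3× width factor are assumptions for illustration, not recommended values.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def widen_highs(stereo, sample_rate, crossover_hz=4000.0, width=1.3):
    """Scale Side content above `crossover_hz` by `width`; Mid is unchanged."""
    left, right = stereo[:, 0], stereo[:, 1]
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    sos = butter(2, crossover_hz, btype="highpass", fs=sample_rate,
                 output="sos")
    # Zero-phase filtering keeps the extracted highs aligned with the original.
    side_high = sosfiltfilt(sos, side)
    side_wide = side + (width - 1.0) * side_high
    return np.column_stack([mid + side_wide, mid - side_wide])
```

Because only the Side signal changes, the mono sum (L + R) is identical before and after, which is why this kind of widening is relatively safe for mono playback. Bass below the crossover keeps its original width.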

4. Don’t trust shimmer numbers blindly

In bright genres (metal, electronic), higher shimmer may mean the model preserved legitimate high-frequency content that v5 was suppressing. Listen before you de-ess. If the high end sounds correct for the genre, leave it alone.

5. Leverage the better dynamics

v5.5’s improved dynamic range means less need for dynamic expansion. Your limiter can work more naturally with source material that already breathes.

Honest Conclusion

v5.5 is a genuine improvement over v5 — but not in the way most people think. It’s not just “cleaner audio.” It’s a smarter model that makes genre-informed decisions about spectral balance, dynamics, and stereo imaging.

For simple, acoustic arrangements (ballads, singer-songwriter, minimal productions), v5.5 delivers a clear and measurable quality improvement. Less shimmer, cleaner backgrounds, fuller sound.

For dense, bright genres (metal, aggressive electronic), v5.5 restores high-frequency content that v5 was suppressing. The metrics look “worse” but the music sounds better. This is arguably the bigger win — v5 was hiding its limitations, v5.5 is honest about the complexity.

What v5.5 doesn’t fix: fog in the mid-range. If your track sounds muddy between 400 Hz and 2 kHz, the answer is still arrangement — fewer simultaneous voices, better frequency separation between instruments. The codec’s bitrate budget is still finite, and v5.5 spends it more wisely but doesn’t expand it.

The Bottom Line

Upgrade to v5.5 — it’s better. But don’t expect magic. The rules from our Guide A still apply: arrangement drives quality. v5.5 makes smarter decisions with what you give it, but what you give it still matters most.

Appendix: Complete Metrics

Core metrics across all 5 tracks

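| Track | Voices | BPM | Shimmer v5 | Shimmer v5.5 | Δ |
| --- | --- | --- | --- | --- | --- |
| Glass River | 2 | 72 | 2.31% | 1.05% | −54.8% |
| District | 3–4 | 128 | 8.49% | 7.41% | −12.7% |
| Iron Doctrine | 5+ | 168 | 3.48% | 12.45% | +257.7% |
| Reactor Core | 3–4 | 148 | 4.83% | 7.95% | +64.4% |
| Count the Days | 2–3 | 85 | 4.71% | 5.48% | +16.4% |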

Dynamics and stereo

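| Track | DR v5 | DR v5.5 | Δ | Width v5 | Width v5.5 | Δ |
| --- | --- | --- | --- | --- | --- | --- |
| Glass River | 19.2 dB | 17.3 dB | −10% | 0.251 | 0.112 | −55% |
| District | 5.7 dB | 9.2 dB | +61% | 0.123 | 0.175 | +42% |
| Iron Doctrine | 6.5 dB | 7.5 dB | +15% | 0.460 | 0.226 | −51% |
| Reactor Core | 9.4 dB | 11.1 dB | +18% | 0.021 | 0.014 | −31% |
| Count the Days | 15.1 dB | 26.5 dB | +76% | 0.057 | 0.008 | −86% |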
