graysonwerner100-commits

Posted on May 9

What 123 million simulated CS2 case openings taught me about modeling RNG

#cs2casesimulator #csgocasesimulator #nextjs #webdev

I run case-sim.com, a free CS2 case opening simulator. As of this week the global counter ticked past 123 million openings. That's a weirdly humbling number — it's also enough rolls that any subtle bug in my probability code would have been screaming at me for years.

I want to share what I actually learned building this thing, because most "how to model case openings" tutorials I've seen are wrong in at least one quiet, important way. If you're building anything with weighted RNG — loot tables, gacha, slot mechanics, A/B traffic splitting — some of this will probably save you a bug.

This is going to be code-heavy. Sorry/not sorry.

The drop rate model (and the off-by-one I shipped for two weeks)

Valve disclosed these case odds back in 2017, in the CS:GO days, and as far as I can tell they haven't changed in CS2:

Tier	Color	Drop chance
Mil-Spec	Blue	79.92%
Restricted	Purple	15.98%
Classified	Pink	3.20%
Covert	Red	0.64%
Knife/Glove	Gold	0.26%

Add those up: exactly 100%. That's not a coincidence — that's an invariant you should be enforcing in tests, not assuming.

The naive implementation is simple:

const TIER_WEIGHTS = {
  milSpec:    0.7992,
  restricted: 0.1598,
  classified: 0.0320,
  covert:     0.0064,
  rare:       0.0026, // knife/glove
};

function rollTier() {
  const r = Math.random();
  let cumulative = 0;
  for (const [tier, weight] of Object.entries(TIER_WEIGHTS)) {
    cumulative += weight;
    if (r < cumulative) return tier;
  }
  return 'milSpec'; // floating-point fallback
}

That fallback at the bottom is not me being paranoid. 0.7992 + 0.1598 + 0.0320 + 0.0064 + 0.0026 should equal 1.0, but in JavaScript:

0.7992 + 0.1598 + 0.0320 + 0.0064 + 0.0026
// 0.9999999999999999

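If Math.random() returns its largest possible value and your cumulative caps out at 0.9999999999999999, you fall off the end. To be fair, that window is roughly 1e-16 wide, so even across 123M rolls the expected number of hits is effectively zero. But "effectively zero" is not an invariant, and the fix is free. So you either return a fallback or you switch to integer weights: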

const TIER_WEIGHTS_INT = {
  milSpec:    7992,
  restricted: 1598,
  classified:  320,
  covert:       64,
  rare:         26,
}; // sums to exactly 10000

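function rollTier() {
  // Integer roll in [0, 9999]; every comparison below is exact.
  const r = Math.floor(Math.random() * 10000);
  let cumulative = 0;
  for (const [tier, weight] of Object.entries(TIER_WEIGHTS_INT)) {
    cumulative += weight;
    if (r < cumulative) return tier;
  }
  // Unreachable while the weights sum to 10000; fail loudly if they ever don't.
  throw new Error(`rollTier: roll ${r} exceeded total weight ${cumulative}`);
}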

I shipped the float version. It worked. But the integer version is just better and I switched.
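If you want to test the tier boundaries deterministically, one option is to pass the random source in instead of calling Math.random() directly. This is a minimal sketch, not what the site runs: `rollWeighted`, the `rng` parameter, and the `at` helper are names made up for this example.

```javascript
// Hypothetical helper: weighted pick over integer weights with an injectable RNG.
// `rng` must return a float in [0, 1), like Math.random.
function rollWeighted(weights, rng = Math.random) {
  const entries = Object.entries(weights);
  const total = entries.reduce((sum, [, w]) => sum + w, 0);
  const r = Math.floor(rng() * total); // integer in [0, total)
  let cumulative = 0;
  for (const [key, weight] of entries) {
    cumulative += weight;
    if (r < cumulative) return key;
  }
  // Only reachable if rng() breaks its [0, 1) contract.
  throw new Error(`roll ${r} fell outside total weight ${total}`);
}

const TIERS = { milSpec: 7992, restricted: 1598, classified: 320, covert: 64, rare: 26 };

// Fake RNG that lands in the middle of integer slot n (the +0.5 avoids
// floating-point rounding pushing the roll into the neighbouring slot).
const at = (n) => () => (n + 0.5) / 10000;

console.log(rollWeighted(TIERS, at(7991))); // last milSpec slot
console.log(rollWeighted(TIERS, at(7992))); // first restricted slot
console.log(rollWeighted(TIERS, at(9999))); // last slot overall
```

The throw at the end replaces the silent fallback: with integer weights, falling off the end means something upstream is broken, and I'd rather hear about it.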

Within a tier, items are uniform. So once rollTier() returns 'covert', you just pick uniformly from the covert items in that case. Don't overthink it.
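For completeness, here is roughly what a uniform pick can look like. The `pickUniform` name matches the call used later in this post, but the body, the `rng` parameter, and the empty-array guard are assumptions for illustration.

```javascript
// Sketch of a uniform pick from a non-empty array.
// `rng` is an assumed parameter returning a float in [0, 1).
function pickUniform(items, rng = Math.random) {
  if (!Array.isArray(items) || items.length === 0) {
    // A case with no items in the rolled tier is a data bug, not a roll result.
    throw new Error('pickUniform needs a non-empty array');
  }
  return items[Math.floor(rng() * items.length)];
}

const covertItems = ['item-a', 'item-b', 'item-c']; // placeholder IDs
console.log(pickUniform(covertItems));
```

The guard matters more than it looks: not every case has items in every tier, and `undefined` leaking into the UI is harder to debug than a thrown error.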

The 10% StatTrak roll

StatTrak is a separate roll after the item is picked, not a sixth tier:

function open(caseDef) {
  const tier = rollTier();
  const item = pickUniform(caseDef.items[tier]);
  const isStatTrak = Math.random() < 0.10;
  return { item, isStatTrak };
}

I see a lot of clones get this wrong: they roll StatTrak as part of the tier weights, and unless every tier is split exactly 90/10 the rates drift. It's conditional, not categorical.
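Because the StatTrak roll is independent of the tier roll, the joint odds are just a product. A quick sketch of that arithmetic in integer parts-per-million (the constant and function names here are made up for the example):

```javascript
const TIER_WEIGHTS_PER_10K = { milSpec: 7992, restricted: 1598, classified: 320, covert: 64, rare: 26 };
const STATTRAK_PER_100 = 10; // the 10% roll

// parts-per-10,000 times parts-per-100 gives parts-per-1,000,000
function jointOddsPerMillion(tier) {
  const weight = TIER_WEIGHTS_PER_10K[tier];
  if (weight === undefined) throw new Error(`unknown tier: ${tier}`);
  return {
    statTrak: weight * STATTRAK_PER_100,
    normal: weight * (100 - STATTRAK_PER_100),
  };
}

console.log(jointOddsPerMillion('covert')); // { statTrak: 640, normal: 5760 }
```

The two numbers always add back up to the tier's own weight (here 6,400 per million, i.e. 0.64%), which is exactly the property you lose when StatTrak gets bolted on as an extra tier.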

Float values: where most tutorials die

Each skin has a wear value from 0.0 (Factory New) to 1.0 (Battle-Scarred). The wear conditions:

Wear	Float range
Factory New	0.00 – 0.07
Minimal Wear	0.07 – 0.15
Field-Tested	0.15 – 0.38
Well-Worn	0.38 – 0.45
Battle-Scarred	0.45 – 1.00

Here's the gotcha that wrecks most sims: float ranges aren't 0 to 1 for every skin. Each skin has its own min_float and max_float fixed in its item definition. The AK-47 Vulcan has min=0.0, max=0.9. The AWP Asiimov has min=0.18, max=1.0, meaning it cannot drop Factory New or Minimal Wear, ever. If your sim rolls AWP Asiimov FN, your sim is wrong.

Correct version:

function rollFloat(skin) {
  const raw = Math.random();
  return raw * (skin.maxFloat - skin.minFloat) + skin.minFloat;
}

function wearFromFloat(f) {
  if (f < 0.07) return 'Factory New';
  if (f < 0.15) return 'Minimal Wear';
  if (f < 0.38) return 'Field-Tested';
  if (f < 0.45) return 'Well-Worn';
  return 'Battle-Scarred';
}

The skin-specific min/max is the entire reason low-float skins (like a Factory New AK-47 Fire Serpent, whose range only starts at around 0.06) trade for absurd money. They're rarer than the wear table alone suggests because only a thin slice of the skin's clamped range falls inside the Factory New band.
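To see how much the clamp changes things, you can compute the share of a skin's range that falls in each wear band. This sketch assumes the raw roll is uniform, as in rollFloat; `wearOdds` and the example range are made up for illustration and are not real item data.

```javascript
const WEAR_BANDS = [
  ['Factory New', 0.00, 0.07],
  ['Minimal Wear', 0.07, 0.15],
  ['Field-Tested', 0.15, 0.38],
  ['Well-Worn', 0.38, 0.45],
  ['Battle-Scarred', 0.45, 1.00],
];

// Probability of each wear band for a skin, given a uniform raw roll
// scaled into [minFloat, maxFloat].
function wearOdds(skin) {
  const span = skin.maxFloat - skin.minFloat;
  if (!(span > 0)) throw new Error('maxFloat must be greater than minFloat');
  const odds = {};
  for (const [name, lo, hi] of WEAR_BANDS) {
    // Length of the overlap between the wear band and the skin's range.
    const overlap = Math.min(hi, skin.maxFloat) - Math.max(lo, skin.minFloat);
    odds[name] = Math.max(0, overlap) / span;
  }
  return odds;
}

const exampleSkin = { minFloat: 0.06, maxFloat: 0.80 }; // made-up range
console.log(wearOdds(exampleSkin)['Factory New']); // 0.01 / 0.74, about 1.35%
```

A skin whose range starts above 0.15 gets exactly zero for both Factory New and Minimal Wear, which is a handy thing to assert in a test.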

The spin animation bias trap

This is the one I want every developer building a loot reveal UI to internalize, because I've seen it shipped in production at sites that should know better.

You've seen the CS2 case opening animation: items scroll horizontally, decelerate, and stop on the result. The wrong way to build that:

// ❌ WRONG: roll position, derive result
function openCase() {
  const trackLength = items.length * itemWidth;
  const stopPosition = Math.random() * trackLength;
  animateTo(stopPosition);
  return getItemAt(stopPosition); // result depends on visual layout
}

Why it's wrong: the items on the track aren't uniformly distributed by tier. Most slots are common items, very few slots are knives. If you roll a uniform position over the track, your effective drop rate is just whatever fraction of the visible track is occupied by each tier — not Valve's 79.92/15.98/3.20/0.64/0.26.

The right way:

// ✅ RIGHT: roll result, then animate to it
function openCase(caseDef) {
  const result = rollResult(caseDef);             // 1. determine outcome
  const targetIndex = placeResultOnTrack(result); // 2. position it
  const targetPx = targetIndex * itemWidth + jitter();
  animateTo(targetPx, { duration: 5000, easing: 'ease-out' });
  return result;                                  // 3. show what we already decided
}

The track is just visual theater. The result is decided first. The animation lands on it.

This sounds obvious but if you grep enough open-source case opener clones on GitHub you'll find the wrong pattern in real code. It feels right because it looks random. It just isn't the right random.
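The "position it" step can be as simple as filling the strip with cosmetic filler and overwriting one slot with the item you already rolled. A minimal sketch: `buildTrack`, its options, and the default sizes are assumptions for this example, not the site's actual implementation.

```javascript
// Hypothetical sketch: build the visible strip around a pre-decided result.
// `fillerPool` is any non-empty array of items used purely for decoration.
function buildTrack(result, fillerPool, { length = 60, targetIndex = 52, rng = Math.random } = {}) {
  if (fillerPool.length === 0) throw new Error('fillerPool must not be empty');
  if (targetIndex < 0 || targetIndex >= length) {
    throw new RangeError('targetIndex must be inside the track');
  }
  // Filler is random, but it never influences the outcome.
  const track = Array.from({ length }, () =>
    fillerPool[Math.floor(rng() * fillerPool.length)]
  );
  track[targetIndex] = result; // the only slot that matters
  return { track, targetIndex };
}

const { track, targetIndex } = buildTrack('rolled-item', ['filler-a', 'filler-b']);
console.log(track[targetIndex]); // rolled-item
```

You can weight the filler however you like for drama (a knife sliding past one slot early is a classic), precisely because nothing about the filler feeds back into the roll.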

Pattern indices and the Doppler problem

Each item also rolls a pattern index from 0 to 999. For most skins this affects the visual layout but not the value. For some skins — Case Hardened (blue gems), Crimson Web (full webs), Fade (100% fades), and especially Dopplers — pattern is everything.

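Dopplers are nasty because the same skin name maps to different "phases". As far as I can tell, the game's item data stores each phase as its own finish (paint kit) ID rather than as a range of pattern seeds, but for a sim it comes down to the same thing, one more lookup after the roll:

// Phase lookup for the original Doppler finish (illustrative; Gamma Doppler uses different IDs).
// These are the commonly reported per-phase finish IDs; verify them against your own item data.
// Attach this table to the item as `item.phases` for the lookup below.
const DOPPLER_PHASES = {
  ruby:        [415],
  sapphire:    [416],
  blackPearl:  [417],
  phase1:      [418],
  phase2:      [419],
  phase3:      [420],
  phase4:      [421],
};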

function dopplerPhase(item, patternIndex) {
  for (const [phase, indices] of Object.entries(item.phases)) {
    if (indices.includes(patternIndex)) return phase;
  }
  return null;
}

Phase distribution isn't uniform: Ruby/Sapphire/Black Pearl are ~5% combined, and the rest is split among phases 1-4. If you implement Dopplers as a single skin with a uniform pattern roll, your sim will show Ruby Karambits at something like a 0.1% rate (one pattern index out of a thousand) and people will (correctly) call you out.

I model each Doppler variant as essentially seven distinct items at the right relative weights. It's annoying but it's the only way to get realistic pulls. You can see the full case list at case-sim.com/cases if you want to compare.
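In code, "seven distinct items at relative weights" is just another integer-weighted roll that runs after the Doppler itself has been picked. The weights below are placeholders chosen only to match the rough "~5% combined" figure above; they are not real odds, and `rollPhase` is a name made up for this sketch.

```javascript
// PLACEHOLDER weights in parts-per-10,000. Not real drop data.
const DOPPLER_PHASE_WEIGHTS = {
  phase1: 2375,
  phase2: 2375,
  phase3: 2375,
  phase4: 2375,
  ruby: 170,
  sapphire: 170,
  blackPearl: 160,
}; // sums to 10000

// Conditional sub-roll: only called once the item is known to be a Doppler.
function rollPhase(weights = DOPPLER_PHASE_WEIGHTS, rng = Math.random) {
  const entries = Object.entries(weights);
  const total = entries.reduce((sum, [, w]) => sum + w, 0);
  const r = Math.floor(rng() * total);
  let cumulative = 0;
  for (const [phase, weight] of entries) {
    cumulative += weight;
    if (r < cumulative) return phase;
  }
  throw new Error('rng() returned a value outside [0, 1)');
}

console.log(rollPhase());
```

Same shape as the StatTrak roll: a second, conditional draw, with its own weights and its own sum-to-10000 test.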

What scale actually breaks

123M rolls means a lot of things you wouldn't think about start to matter:

Database writes. I do not write a row per opening. That would be insane. Aggregate counters per case, per item, per day. The "global unbox feed" people see is a sliding window of recent rolls held in memory and trimmed aggressively. If you're building anything in this space, design your schema around aggregates from day one.
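A rough sketch of that shape, assuming an in-memory aggregator that gets flushed to the database on a timer. The class name, the key format, and the feed size are all made up for illustration, and the key format assumes IDs never contain a `|`.

```javascript
// Hypothetical in-memory aggregator: counters per case/item/day plus a capped feed.
class OpeningStats {
  constructor(feedSize = 50) {
    this.counts = new Map(); // "caseId|itemId|YYYY-MM-DD" -> count
    this.feed = [];          // most recent openings, oldest first
    this.feedSize = feedSize;
  }

  record(caseId, itemId, when = new Date()) {
    const day = when.toISOString().slice(0, 10); // UTC calendar day
    const key = `${caseId}|${itemId}|${day}`;
    this.counts.set(key, (this.counts.get(key) ?? 0) + 1);

    this.feed.push({ caseId, itemId, at: when.getTime() });
    if (this.feed.length > this.feedSize) this.feed.shift(); // trim the oldest
  }

  // Drain the counters into rows suitable for one bulk upsert.
  flush() {
    const rows = [...this.counts].map(([key, count]) => {
      const [caseId, itemId, day] = key.split('|');
      return { caseId, itemId, day, count };
    });
    this.counts.clear();
    return rows;
  }
}

const stats = new OpeningStats(2);
stats.record('example-case', 'item-a');
console.log(stats.flush());
```

A million openings of the same item on the same day become one row with a count of a million, which is the whole point.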

Rare-item flexing. When someone pulls a Dragon Lore (1 in 3,906 in a Souvenir Cobblestone package), they take a screenshot. They post it. The screenshot shows your URL. This is great for you, but it also means the pages behind rare pulls get traffic spikes. Cache pulled-item OG images and pre-render the share cards.

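Float floors. I had a bug where extremely low floats would render as scientific notation in the UI: something like 1.234e-7 instead of a plain decimal. (JavaScript's default number-to-string conversion switches to exponent form below 1e-6.) Looked like garbage on share cards. Always format floats explicitly, with a fixed number of decimals: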

const formatFloat = (f) => f.toFixed(6).replace(/0+$/, '').replace(/\.$/, '');

Trust. Once you publish drop rates, people will run their own statistical tests. Someone DM'd me a chi-square goodness-of-fit on 50,000 of their personal openings. It checked out (thank god). Have your random source be auditable — prefer crypto.getRandomValues() over Math.random() if you can, even though for sim purposes Math.random is mathematically fine.

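function secureRandom() {
  const arr = new Uint32Array(1);
  crypto.getRandomValues(arr);
  // Divide by 2^32, not 0xFFFFFFFF, so the result stays in [0, 1) like Math.random().
  // Dividing by 0xFFFFFFFF can return exactly 1, which overflows an integer roll.
  return arr[0] / 0x100000000;
}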
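If you want to run the same kind of goodness-of-fit check yourself, the Pearson chi-square statistic is a few lines. This is a sketch with a made-up function name; it only computes the statistic and leaves the p-value lookup to a stats table or library.

```javascript
// Pearson chi-square statistic: sum over tiers of (observed - expected)^2 / expected.
// `observed` maps tier -> count, `weights` maps tier -> relative weight.
function chiSquare(observed, weights) {
  const n = Object.values(observed).reduce((a, b) => a + b, 0);
  const totalWeight = Object.values(weights).reduce((a, b) => a + b, 0);
  let stat = 0;
  for (const [tier, weight] of Object.entries(weights)) {
    const expected = (n * weight) / totalWeight;
    const diff = (observed[tier] ?? 0) - expected;
    stat += (diff * diff) / expected;
  }
  return stat;
}

const WEIGHTS = { milSpec: 7992, restricted: 1598, classified: 320, covert: 64, rare: 26 };

// Made-up sample of 10,000 openings, for illustration only.
const sample = { milSpec: 8010, restricted: 1585, classified: 318, covert: 60, rare: 27 };

// Five tiers means 4 degrees of freedom; the 5% critical value is about 9.488.
const stat = chiSquare(sample, WEIGHTS);
console.log(stat < 9.488 ? 'consistent with published odds' : 'suspicious');
```

One caveat: the usual rule of thumb wants an expected count of at least 5 or so in every bucket, so small samples need the rare tiers merged before the test means much.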

A few things I'd do differently

If I rebuilt today:

Integer weights from day one. Every probability lives as parts-per-10000 or parts-per-1000000. Floating point in probability code is just asking for it.
Decouple the item pool from the case definition. I had cases hard-referencing item objects. When Valve added new wear ranges or float caps, I had to touch every case. Now the case is just a list of item IDs + tier weights, and the item registry is the source of truth.
Test the invariant. Every case definition has a unit test that asserts tier weights sum to exactly 10000. Cheap to write, catches stupid copy-paste mistakes when adding new cases.

test.each(allCaseDefs)('case %s tier weights sum to 10000', (caseDef) => {
  const total = Object.values(caseDef.tiers).reduce((a, b) => a + b, 0);
  expect(total).toBe(10000);
});

That test has caught me twice.
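The "decouple the item pool" point looks roughly like this in practice. Everything here is a hypothetical shape (IDs, field names, the `resolveItems` helper), not the site's real schema.

```javascript
// Hypothetical item registry: the single source of truth for item data.
const ITEM_REGISTRY = new Map([
  ['rifle-example', { name: 'Rifle | Example', minFloat: 0.0, maxFloat: 0.9 }],
  ['sniper-example', { name: 'Sniper | Example', minFloat: 0.18, maxFloat: 1.0 }],
]);

// A case only holds tier weights and item IDs, never item objects.
const EXAMPLE_CASE = {
  id: 'example-case',
  tiers: { milSpec: 7992, restricted: 1598, classified: 320, covert: 64, rare: 26 },
  items: { covert: ['rifle-example', 'sniper-example'] },
};

// Resolve IDs at read time so float caps live in exactly one place.
function resolveItems(caseDef, tier, registry = ITEM_REGISTRY) {
  return (caseDef.items[tier] ?? []).map((id) => {
    const item = registry.get(id);
    if (!item) throw new Error(`case ${caseDef.id} references unknown item ${id}`);
    return { id, ...item };
  });
}

console.log(resolveItems(EXAMPLE_CASE, 'covert').map((item) => item.name));
```

The unknown-ID throw is worth running as its own test over every case definition, right next to the weight-sum test.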

Wrap

If you're modeling weighted-random anything (loot, gacha, traffic splitting, A/B variants), the lessons from 123M simulated CS2 case openings are basically:

Use integer weights, not floats. Probability sums need to be exact, not "close enough."
Roll the result first, then animate to it. Visual position should never derive the outcome.
Per-skin float ranges matter more than tier weights for determining real-world rarity.
Conditional rolls (StatTrak, Doppler phase) are not extra tiers. Don't shoehorn them in.
Test that your weights sum to your invariant. Every time. In CI.

If you want to see how this all pans out in practice — opening real cases with real Valve odds, no money, no signup — that's literally what I built case-sim for. Otherwise, take the code patterns and go build your own thing. The world has room for more weighted RNG done right.

Hit me up in the comments if you've shipped something similar and ran into different gotchas. Always curious what other people hit.

DEV Community