WMI hands you 'Intel(R) Core(TM) i7-12700H CPU @ 2.60GHz' and your pricing table says 'i7-12700H'

#powershell #windows #scripting #programming

i have a PowerShell script that estimates what a Windows PC would sell for. it reads the hardware, prices each part, applies an age curve, prints a number. the valuation math gets all the attention. the part that actually ate my weekend was matching what Windows told me to what my pricing table knew.

here's the gap. i ask WMI for the CPU:

$cpu = (Get-CimInstance Win32_Processor | Select-Object -First 1).Name
# Intel(R) Core(TM) i7-12700H CPU @ 2.60GHz

my pricing table has exactly one key for that chip: i7-12700H. those two strings do not match, and string equality doesn't care how close they are. there's the trademark junk, the word CPU sitting in the middle, and a clock speed that WMI reports as base (2.60 GHz) when the chip actually boosts to 4.7 GHz. none of that is in my table and all of it is in the WMI string.

first instinct was a pile of -replace calls. strip "(R)", strip "(TM)", strip "CPU", strip the "@ x.xxGHz" tail. that worked until it didn't. AMD strings are shaped completely differently: "AMD Ryzen 7 5800H with Radeon Graphics". that "with Radeon Graphics" tail means a laptop APU shows up looking like both a CPU and a GPU, so my GPU detection double-counted it. clean up one vendor's format and you break the other's.
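
for reference, the strip-it-all approach looks roughly like the sketch below. Remove-CpuNoise and its exact patterns are made up for illustration, not lifted from the script. the Intel string comes out cleaner (though still carrying the vendor prefix), while the AMD string passes through untouched because none of the patterns apply to it.

```powershell
# Illustrative only: the function name and patterns are assumptions.
function Remove-CpuNoise {
    param([string]$RawName)

    $s = $RawName -replace '\((R|TM)\)', ''    # trademark markers
    $s = $s -replace '\bCPU\b', ''             # the literal word CPU
    $s = $s -replace '@\s*[\d.]+\s*GHz', ''    # clock-speed tail
    $s = $s -replace '\s+', ' '                # collapse leftover whitespace
    return $s.Trim()
}

Remove-CpuNoise 'Intel(R) Core(TM) i7-12700H CPU @ 2.60GHz'
# Intel Core i7-12700H

Remove-CpuNoise 'AMD Ryzen 7 5800H with Radeon Graphics'
# AMD Ryzen 7 5800H with Radeon Graphics   (nothing matched, tail still there)
```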

fuzzy matching, with a cutoff

what i landed on: pull the model token with a regex, then fuzzy match it against the table keys by edit distance.

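function Resolve-CpuKey {
    param([string]$RawName, [string[]]$Keys)

    # grab the model token: i7-12700H, Ryzen 7 5800H, etc
    # anything else (ARM, Apple) falls through and is matched as the raw string
    if     ($RawName -match '(i[3579]-\w+)')      { $candidate = $Matches[1] }
    elseif ($RawName -match '(Ryzen \d+ \w+)')    { $candidate = $Matches[1] }
    else                                          { $candidate = $RawName }

    # pick the table key with the smallest edit distance to the candidate
    # (Get-LevenshteinDistance is not a built-in; it's a helper defined elsewhere)
    $best = $null; $bestDist = [int]::MaxValue
    foreach ($k in $Keys) {
        $d = Get-LevenshteinDistance $candidate $k
        if ($d -lt $bestDist) { $bestDist = $d; $best = $k }
    }

    # accept only near matches: more than 5 edits away means no answer
    if ($bestDist -le 5) { return $best }
    return $null
}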

the Levenshtein cutoff of 5 is a number i picked by hand and never had a reason to change. at 5 or under it's almost always the right chip off by a stray character or two. over 5 and it starts matching garbage, so i'd rather return null than confidently price an i7 as an i3 because that key happened to be the nearest thing in the table. a wrong-but-confident answer is worse than no answer in a tool whose whole job is giving you a number to trust.
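
Resolve-CpuKey depends on Get-LevenshteinDistance, which isn't shown in this post. below is a minimal sketch of such a helper, assuming the textbook two-row dynamic-programming version and a case-insensitive comparison. both are assumptions, and the version in the repo may differ.

```powershell
# Sketch of an edit-distance helper; parameter names are illustrative.
function Get-LevenshteinDistance {
    param([string]$Source, [string]$Target)

    # assumption: compare case-insensitively so 'i7-12700h' equals 'i7-12700H'
    $s = $Source.ToLowerInvariant()
    $t = $Target.ToLowerInvariant()
    if ($s.Length -eq 0) { return $t.Length }
    if ($t.Length -eq 0) { return $s.Length }

    # $prev[$j] holds the distance between the first ($i - 1) chars of $s
    # and the first $j chars of $t
    $prev = 0..($t.Length)
    for ($i = 1; $i -le $s.Length; $i++) {
        $curr = New-Object 'int[]' ($t.Length + 1)
        $curr[0] = $i
        for ($j = 1; $j -le $t.Length; $j++) {
            $cost = if ($s[$i - 1] -ceq $t[$j - 1]) { 0 } else { 1 }
            $del  = $prev[$j] + 1            # delete a char from $s
            $ins  = $curr[$j - 1] + 1        # insert a char into $s
            $sub  = $prev[$j - 1] + $cost    # substitute (or keep) a char
            $curr[$j] = [Math]::Min([Math]::Min($del, $ins), $sub)
        }
        $prev = $curr
    }
    return $prev[$t.Length]
}

Get-LevenshteinDistance 'i7-12700H' 'i7-12700H'   # 0
Get-LevenshteinDistance 'i7-12700H' 'i7-12700'    # 1
Get-LevenshteinDistance 'i7-12700H' 'i3-12100'    # 3
```

the last line is worth keeping in mind: sibling model numbers sit only a few edits apart, so a cutoff of 5 mostly filters out unrelated strings. it does not protect against a near-neighbour key when the exact one is missing from the table.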

when Resolve-CpuKey returns null i fall back to a tier default. i3, i5, i7, i9 each get a floor price, Ryzen 3/5/7/9 too, and the generation number nudges it up. coarse, but it never spits out something insane.
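
a rough sketch of what that fallback could look like. Get-TierDefault is a hypothetical name, and every price here, including the per-generation nudge, is a placeholder and not a value from the real table. it also assumes the generation can be read straight off the model number, which breaks for names like i7-1165G7.

```powershell
# Sketch only: function name, floor prices, and nudge are all placeholders.
function Get-TierDefault {
    param([string]$RawName)

    $floor = @{ i3 = 40; i5 = 70; i7 = 110; i9 = 160
                ryzen3 = 40; ryzen5 = 70; ryzen7 = 110; ryzen9 = 160 }
    $nudgePerGen = 5

    if ($RawName -match '\b(i[3579])-(\d{4,5})') {
        $tier   = $Matches[1].ToLowerInvariant()
        $digits = $Matches[2]
        # Intel: drop the last three digits to get the generation (12700 -> 12)
        $gen = [int]($digits.Substring(0, $digits.Length - 3))
    }
    elseif ($RawName -match 'Ryzen ([3579]) (\d)\d{3}') {
        $tier = 'ryzen' + $Matches[1]
        # AMD: use the leading digit of the model number (5800 -> 5)
        $gen = [int]$Matches[2]
    }
    else {
        return $null    # unknown vendor or naming scheme: no tier default
    }

    return $floor[$tier] + ($gen * $nudgePerGen)
}

Get-TierDefault 'Intel(R) Core(TM) i7-12700H CPU @ 2.60GHz'   # 170 with these placeholders
Get-TierDefault 'AMD Ryzen 7 5800H with Radeon Graphics'      # 135 with these placeholders
```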

the part i didn't see coming

once matching worked i summed the components and the total was too high. every single time.

the sample machine i test on is an Acer Nitro, i7-12700H, RTX 3050 Ti, 16GB DDR5, 512GB NVMe. priced part by part:

  CPU ........  $240
  GPU ........   $90
  RAM ........   $48
  Storage ....   $31
              ───────
  parts gross   $409

eBay sold listings for that same machine: around $320. the parts add up to $409 and the actual market is about ninety dollars under that. and this is before any age hit.

the reason is obvious in hindsight. nobody buys a whole working laptop to part it out. the resale price of an assembled used computer is not the sum of its sellable parts, it's whatever a nervous stranger will pay for a machine they can't fully test before handing over cash. that fear discount is real and it's worth about 22% here.
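
for the sample machine, the gap between the parts sum and the sold-listing price works out like this (variable names are just for illustration):

```powershell
$partsGross  = 240 + 90 + 48 + 31     # CPU + GPU + RAM + storage = 409
$marketPrice = 320                    # approximate eBay sold-listing price from above

$gap         = $partsGross - $marketPrice                    # 89
$discountPct = [Math]::Round(100 * $gap / $partsGross, 1)    # 21.8

"parts gross: $partsGross, market: $marketPrice, gap: $gap ($discountPct%)"
```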

so the script doesn't trust the parts sum on its own. when the online check is on, it blends the offline estimate and the eBay average 60/40, offline weighted heavier.

$final = ($offlineEstimate * 0.60) + ($ebayAverage * 0.40)

which i now think is backwards.

what's rough

that 60/40 blend leans on the offline number, but the eBay average is closer to ground truth. i should weight eBay heavier, maybe 60/40 the other way, and only fall back to offline-heavy when there are too few sold listings to trust. it's a one-line change i keep not making.
the regex token grab assumes Intel and AMD naming. an Apple chip or anything ARM falls straight through to the raw-string match and almost certainly returns null.
the pricing table is a JSON file i update by hand every few months. it goes stale fast. the honest fix is a scraper, which is its own project.
the APU double-count is patched, not solved. i special-case "with Radeon Graphics" and similar tails. a cleaner detector would key off whether the GPU shares a PCI path with the CPU.
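
the blend fix from the first item could be sketched like this. Get-BlendedEstimate, the SoldListingCount parameter, and the MinListings threshold of 5 are all assumptions for illustration. the script as described only has the single 60/40 line.

```powershell
# Sketch only: names and the listing threshold are hypothetical.
function Get-BlendedEstimate {
    param(
        [double]$OfflineEstimate,
        [double]$EbayAverage,
        [int]$SoldListingCount,
        [int]$MinListings = 5     # below this, don't trust the eBay average much
    )

    # no sold listings at all: the offline number is all there is
    if ($SoldListingCount -le 0) { return $OfflineEstimate }

    # enough listings: weight eBay heavier; otherwise keep the offline-heavy blend
    if ($SoldListingCount -ge $MinListings) { $ebayWeight = 0.60 }
    else                                    { $ebayWeight = 0.40 }

    return ($EbayAverage * $ebayWeight) + ($OfflineEstimate * (1 - $ebayWeight))
}

# using the sample machine's numbers, with made-up listing counts
Get-BlendedEstimate -OfflineEstimate 409 -EbayAverage 320 -SoldListingCount 12   # about 355.6
Get-BlendedEstimate -OfflineEstimate 409 -EbayAverage 320 -SoldListingCount 2    # about 373.4
```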

it runs on PowerShell 5.1, which ships with Windows 10 and 11, so there's nothing to set up.

https://github.com/TiltedLunar123/pc-worth