DEV Community

Cover image for From ASP.NET + MSSQL to PHP + MySQL: Migrating a Names Site Without Mangling the Accents
Marvin Tang
Marvin Tang

Posted on

From ASP.NET + MSSQL to PHP + MySQL: Migrating a Names Site Without Mangling the Accents

My baby-name site started life on ASP.NET with a MSSQL backend. It worked. The reason I rewrote the whole thing in PHP + MySQL had nothing to do with the code being bad — it was everything around the code.

This is the migration writeup: why I left a stack that was technically fine, the MSSQL → MySQL gotchas that actually cost me time (the worst one was thematically perfect for a names site), and what the new stack let me build that the old one was quietly blocking.

The real reason: hosting, not code

ASP.NET is a perfectly good framework. The problem is its habitat. For a solo dev shipping small sites:

  • Windows hosting is scarce and pricier. The cheap, plentiful end of the hosting market is overwhelmingly Linux/LAMP. Windows + IIS plans are fewer, cost more, and the budget ones I could find were the least stable.
  • The servers were worse. More downtime, slower support, fewer knobs I was allowed to touch.
  • Changes were awkward. Deploying and tweaking a .NET app on shared Windows hosting was consistently more friction than scp + a PHP file on a LAMP box.

PHP + MySQL inverts all three: hosting is everywhere, it's cheap, the quality bar at the same price is higher, and extending the site later is far less ceremony. For a content/data site I plan to keep adding to, "easy to extend on cheap, stable hosting" beats "technically elegant on expensive, fragile hosting" every time.

So the port was a means to an end. The end was the database.

The schema migration: MSSQL → MySQL

A names site is mostly database. The dictionary is ~19,862 names; the popularity layer is 63,890 US Social Security Administration records spanning 1880–2024; the geographic layer is ~6.5M state-level rows. Porting that is where the work lived.

The type and syntax mapping that mattered:

MSSQL                      ->  MySQL
-------------------------------------------------------
INT IDENTITY(1,1)          ->  INT AUTO_INCREMENT
NVARCHAR(n)                ->  VARCHAR(n)        (utf8mb4)
NVARCHAR(MAX)              ->  TEXT / LONGTEXT
BIT                        ->  TINYINT(1)
DATETIME2                  ->  DATETIME
GETDATE()                  ->  NOW()
ISNULL(x, y)               ->  IFNULL(x, y) / COALESCE
LEN()                      ->  CHAR_LENGTH()
[bracketed].[columns]      ->  `backticked`.`columns`
SELECT TOP 10 ...          ->  SELECT ... LIMIT 10
... OFFSET n FETCH m       ->  ... LIMIT m OFFSET n
a + b   (string concat)    ->  CONCAT(a, b)
Enter fullscreen mode Exit fullscreen mode

None of these is hard individually. The cost is that they're scattered through every query and stored procedure, so a "find and replace" mindset misses things — especially string concatenation with +, which silently becomes numeric addition in MySQL instead of erroring.

The collation trap (where the accents died)

Here's the one that bit me, and it's almost funny that it happened on a names site of all things.

Names are full of non-ASCII characters: José, Zoë, Renée, François, Søren. The original MSSQL database stored these fine under its own collation. When I did the first bulk import into MySQL, a chunk of names came back as Jos?, Ren?e, or worse — mojibake like José.

Two separate things were wrong:

  1. The column character set. Defaulting to utf8 in MySQL is a trap — historical MySQL utf8 is only 3 bytes and can't store the full Unicode range. The correct choice is utf8mb4 with a utf8mb4_unicode_ci (or 0900_ai_ci) collation, end to end: table, connection, and client.
  2. The import encoding. Exporting from MSSQL and loading into MySQL without pinning the encoding on both sides re-interpreted the bytes. The fix was exporting as UTF-8 explicitly and telling LOAD DATA INFILE the same with CHARACTER SET utf8mb4.

The lesson I'd tattoo on past-me: set utf8mb4 on the database, the table, and the connection (SET NAMES utf8mb4 / PDO charset=utf8mb4) before importing a single row. Fixing encoding after the data's already mangled means re-importing from source, because you can't always tell a correctly-stored é from a double-encoded one after the fact.

Moving 6.5M rows

For the bulk data — especially the state-level records — row-by-row inserts were a non-starter. The path that worked:

  • Export each MSSQL table to UTF-8 CSV.
  • LOAD DATA INFILE into MySQL with the charset pinned, indexes added after the load (building indexes during a multi-million-row insert is dramatically slower).
  • Spot-check a sample of accented names against the source before trusting the whole load.

The app layer

The PHP rewrite itself was the least surprising part. Data access went from ADO.NET to PDO with prepared statements (parameterized everywhere — a data site is a giant SQL-injection surface if you're lazy). Paging went from OFFSET/FETCH to LIMIT ... OFFSET. The routing and templating got rebuilt but conceptually mapped one-to-one.

I'll be honest that I leaned on Claude to accelerate the mechanical parts of the port — translating stored-procedure logic and grinding through the query rewrites. It's good at the tedious 1:1 translation; the judgment calls (the collation decision, the index-after-load ordering) were still mine to get wrong first.

What the new stack unlocked

Here's the payoff, and the reason the migration was worth it. On the old stack, every new feature was a fight with the hosting. On PHP + MySQL, adding to the site got cheap enough that I could turn a static name dictionary into something closer to a data product:

  • Query-driven tools straight off the SSA tables — popularity by birth year, year-over-year trending movers, a name → US-state lookup over those 6.5M rows, side-by-side comparison of two names across 145 years.
  • Research reports generated from the same dataset — decade-by-decade breakdowns, per-state rankings, long-arc analyses like which mid-century names have effectively gone extinct.

None of that was impossible on the old stack. It was just expensive and annoying enough that I never did it. That's the quiet cost of a high-friction stack: not the features you can't build, but the ones you don't bother to.

You can see where it landed at 9babynames.com. The migration is invisible to visitors — which is exactly the point.


Solo indie dev, writing these up as I go. If you've done a MSSQL → MySQL move, I'm curious whether the collation/charset step bit you too, or whether I just walked into it.

Top comments (0)