DEV Community

Brian Cariveau


She had a PhD from MIT. She quit after 6 months because nobody knew what sls_txn_f47 meant.

Dr. Jennifer Park walked into my office on her last day.

"I'm sorry," she said. "I really wanted this to work."

I'd recruited her personally. PhD in Machine Learning from MIT. Five years at Spotify building recommendation engines. Above-market salary. Equity. The works.

She lasted six months and four days.

"What happened?"

"I spent six months trying to do one thing: build a recommendation engine. At Spotify, I built similar systems in six weeks."

"And here?"

"Here, I spent six months just trying to understand the data."

She opened our Snowflake warehouse. 847 tables.

sls_txn_f47

usr_bhv_ag_01

car_lst_vw_2

bid_hist_tmp

"Nobody knows what these mean," she said. "The engineer who built them left two years ago. I spent three months reverse-engineering the schema. Then I discovered we have seven different definitions of user_id across tables. Seven."

"I'm not a bad data scientist," she said. "Your data is just impossible to work with."


Four months later

We hired Alex.

Same challenge: "Build a recommendation engine."

He understood the data model in 15 minutes.

Had a working prototype by end of day.

Shipped an upgraded version the next week. Clickthrough rate up 18%.


What changed?

We rebuilt the foundation.

Killed 535 zombie tables nobody was querying.
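For anyone wanting to run the same cleanup, here is a minimal sketch of how zombie tables could be flagged. It assumes you can export, for each table, the date it was last queried; the `find_zombie_tables` helper, the 90-day threshold, and the sample dates are all hypothetical and not taken from our actual warehouse.

```python
from datetime import date, timedelta


def find_zombie_tables(all_tables, last_queried, today, max_idle_days=90):
    """Return tables never queried, or not queried within max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    zombies = []
    for table in sorted(all_tables):
        last_seen = last_queried.get(table)  # None means no recorded query
        if last_seen is None or last_seen < cutoff:
            zombies.append(table)
    return zombies


# Hypothetical inventory and query-log summary; the dates are invented.
tables = {"sls_txn_f47", "usr_bhv_ag_01", "car_lst_vw_2", "bid_hist_tmp"}
last_queried = {
    "sls_txn_f47": date(2024, 5, 30),
    "usr_bhv_ag_01": date(2024, 5, 28),
    "car_lst_vw_2": date(2024, 6, 1),
    # bid_hist_tmp has no entry: no query was ever recorded for it
}

zombies = find_zombie_tables(tables, last_queried, today=date(2024, 6, 2))
print(zombies)  # ['bid_hist_tmp']
```

A list like this is only a set of candidates. Something that runs once a year, such as an annual report, would look dead under a 90-day window, so it is worth confirming with the owners before dropping anything.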

Renamed everything:

  • sls_txn_f47 → auction_transactions
  • usr_bhv_ag_01 → user_behavior_daily
  • car_lst_vw_2 → car_listings_current
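The renames themselves are easy to script once the mapping is agreed on. Below is a rough sketch that uses an in-memory SQLite database as a stand-in for the warehouse. The `rename_tables` helper and the single-column placeholder tables are assumptions for illustration, and all three objects are treated as plain tables even though some may really be views.

```python
import sqlite3

# Old-to-new names from the rename list above.
RENAMES = {
    "sls_txn_f47": "auction_transactions",
    "usr_bhv_ag_01": "user_behavior_daily",
    "car_lst_vw_2": "car_listings_current",
}


def rename_tables(conn, renames):
    """Rename each table in the mapping from its old name to its new name."""
    for old, new in renames.items():
        conn.execute(f'ALTER TABLE "{old}" RENAME TO "{new}"')


# Stand-in database with placeholder tables under the cryptic names.
conn = sqlite3.connect(":memory:")
for old in RENAMES:
    conn.execute(f'CREATE TABLE "{old}" (id INTEGER)')

rename_tables(conn, RENAMES)

names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(names)  # ['auction_transactions', 'car_listings_current', 'user_behavior_daily']
```

On a real warehouse the rename statement itself is the smallest part of the job. Every dashboard, pipeline, and saved query that still points at the old names has to move too, so keeping the mapping in one reviewed file is probably the more useful part of this sketch.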

Created one source of truth for every entity.

Documented everything.

Asked "Is this stupidly simple yet?" until the answer was yes.


The test:

Old model: 30 minutes to find last month's revenue

New model: 30 seconds
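To show what the 30-second version can look like, here is a small sketch against a table named `auction_transactions`. The `sold_at` and `sale_price` columns, the sample rows, and the `last_month_bounds` helper are all invented for illustration, and SQLite stands in for the warehouse.

```python
import sqlite3
from datetime import date


def last_month_bounds(today):
    """Return (first day of last month, first day of this month) as ISO strings."""
    this_month = today.replace(day=1)
    if this_month.month == 1:
        last_month = this_month.replace(year=this_month.year - 1, month=12)
    else:
        last_month = this_month.replace(month=this_month.month - 1)
    return last_month.isoformat(), this_month.isoformat()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auction_transactions (sold_at TEXT, sale_price REAL)")
conn.executemany(
    "INSERT INTO auction_transactions VALUES (?, ?)",
    [
        ("2024-04-30", 9000.0),   # month before last: excluded
        ("2024-05-03", 12500.0),  # last month: included
        ("2024-05-21", 8000.0),   # last month: included
        ("2024-06-01", 15000.0),  # current month: excluded
    ],
)

start, end = last_month_bounds(date(2024, 6, 10))
(revenue,) = conn.execute(
    "SELECT COALESCE(SUM(sale_price), 0) FROM auction_transactions "
    "WHERE sold_at >= ? AND sold_at < ?",
    (start, end),
).fetchone()
print(revenue)  # 20500.0
```

The query is one table and one filter, with no joins to decode. That is the whole point of the test: when the names explain themselves, the question fits in a single statement.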

Alex understood the structure in 15 minutes because the naming was self-explanatory. Actually building the recommendation engine took the rest of the day.

But he wasn't stuck for weeks reverse-engineering cryptic schemas like Jennifer was.


The lesson

You can't build on top of chaos.

Jennifer was brilliant. The data was just impossible to work with.

How many great engineers have you lost because your schema looked like tbl_usr_tmp_20220304?


This is a scene from The Auction Block — a business fable I wrote about what data/analytics teams get wrong (and how to fix it). Think The Phoenix Project but for data & AI teams. I promise you will become a better version of yourself if you thumb through it!

If you've ever inherited a data graveyard and had to rebuild it, you might find it useful.

Available on Kindle & paperback - https://www.amazon.com/Auction-Block-Novel-About-Teams-ebook/dp/B0GM8BRVWC
