Dr. Jennifer Park walked into my office on her last day.
"I'm sorry," she said. "I really wanted this to work."
I'd recruited her personally. PhD in Machine Learning from MIT. Five years at Spotify building recommendation engines. Above-market salary. Equity. The works.
She lasted six months and four days.
"What happened?"
"I spent six months trying to do one thing: build a recommendation engine. At Spotify, I built similar systems in six weeks."
"And here?"
"Here, I spent six months just trying to understand the data."
She opened our Snowflake warehouse. 847 tables.
sls_txn_f47
usr_bhv_ag_01
car_lst_vw_2
bid_hist_tmp
"Nobody knows what these mean," she said. "The engineer who built them left two years ago. I spent three months reverse-engineering the schema. Then I discovered we have seven different definitions of user_id across tables. Seven."
"I'm not a bad data scientist," she said. "Your data is just impossible to work with."
Four months later
We hired Alex.
Same challenge: "Build a recommendation engine."
He understood the data model in 15 minutes.
Had a working prototype by end of day.
Shipped an upgraded version the next week. Clickthrough rate up 18%.
What changed?
We rebuilt the foundation.
Killed 535 zombie tables nobody was querying.
Renamed everything:
- sls_txn_f47 → auction_transactions
- usr_bhv_ag_01 → user_behavior_daily
- car_lst_vw_2 → car_listings_current
Created one source of truth for every entity.
Documented everything.
Asked "Is this stupidly simple yet?" until the answer was yes.
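As a rough illustration only (not the actual migration), the "kill the zombies" and "rename everything" steps could be sketched in Python like this. The rename mapping comes straight from the list above. Everything else is an assumption: the function names, the 180-day idle threshold, and the sample last-queried dates are hypothetical. In practice the dates would come from the warehouse's query history. The generated statements use the standard ALTER TABLE ... RENAME TO form and assume every object is a table rather than a view.

```python
from datetime import date, timedelta

# Old-to-new names, taken from the rename list in the story.
RENAMES = {
    "sls_txn_f47": "auction_transactions",
    "usr_bhv_ag_01": "user_behavior_daily",
    "car_lst_vw_2": "car_listings_current",
}


def rename_statements(renames):
    """Build one ALTER TABLE ... RENAME TO statement per mapping entry."""
    return [f"ALTER TABLE {old} RENAME TO {new};" for old, new in renames.items()]


def zombie_tables(last_queried, today, max_idle_days=180):
    """Return tables not queried within max_idle_days (None means never queried)."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(
        name
        for name, last in last_queried.items()
        if last is None or last < cutoff
    )


if __name__ == "__main__":
    for stmt in rename_statements(RENAMES):
        print(stmt)

    # Hypothetical last-queried dates, made up for this example.
    sample = {
        "sls_txn_f47": date(2024, 5, 20),
        "bid_hist_tmp": date(2022, 3, 4),
        "usr_bhv_ag_01": None,
    }
    print(zombie_tables(sample, today=date(2024, 6, 1)))
```

Keeping the mapping in one dictionary means the same source can drive the migration script and the documentation, so the two never drift apart.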
The test:
Old model: 30 minutes to find last month's revenue
New model: 30 seconds
Alex understood the structure in 15 minutes because the naming was self-explanatory. Actually building the recommendation engine took the rest of the day.
But he wasn't stuck for months reverse-engineering cryptic schemas the way Jennifer was.
The lesson
You can't build on top of chaos.
Jennifer was brilliant. The data was just impossible to work with.
How many great engineers have you lost because your schema looked like tbl_usr_tmp_20220304?
This is a scene from The Auction Block — a business fable I wrote about what data/analytics teams get wrong (and how to fix it). Think The Phoenix Project but for data & AI teams. I promise you will become a better version of yourself if you thumb through it!
If you've ever inherited a data graveyard and had to rebuild it, you might find it useful.
Available on Kindle & paperback - https://www.amazon.com/Auction-Block-Novel-About-Teams-ebook/dp/B0GM8BRVWC