A week ago, I had a problem. My hair was a mess, and I needed a cut, like, badly. But finding a good barber when you have curly hair? That’s a whole other challenge. Living in Turkey, I’ve learned that while most barbers confidently say, “Yeah, I can cut your hair,” the mirror usually tells a different story (I learned this the hard way). So, like always, I hopped on the metro and made my way across the city to the one barber I trust.
He’s the only one who gets it right every time, so no matter how far his shop is, I go. As soon as I walked in, he grinned. “You need me, huh?”
I laughed. “You already know.”
I sat in the chair, and as he wrapped the cape around me, he asked, “So, what do you do again?”
“I work in tech, specifically AI/ML,” I said.
“Oh, so you build robots?”
I smirked. “Not exactly. But actually, there’s something in AI that relates to this haircut right now: reinforcement learning.”
He raised an eyebrow. “Alright, explain it to me.”
The basics of reinforcement learning
Reinforcement Learning (RL) is all about learning through experience. You take an action, get feedback, and use that feedback to make better choices over time. Imagine training a puppy. If it sits when you say ‘sit,’ you give it a treat. If it jumps instead, no treat. Over time, the puppy figures out that sitting = treats, so it keeps doing it.
“So, like trial and error?” he asked, running the clippers along my fade.
“Exactly. Just like how I had to go through way too many bad barbers before finding you, haha.”
The Agent: Me, trying to get a decent haircut
In RL, the agent is the one making decisions. That’s me, desperately looking for someone who won’t mess up my hair.
The Environment: The maze of barbershops
The environment is where the agent operates. In my case, that’s Ankara, full of barbers, each with different levels of skill (or lack of it).
Actions: Trying different barbers
Every time I walked into a new shop and sat in the chair, that was an action. Some led to fresh, clean cuts. Others… well, let’s just say I had to wear a hat for a week.
Rewards: The outcome of the haircut
In RL, feedback comes in the form of rewards or penalties. A perfect fade? Positive reward. A lopsided lineup? Negative reward. My brain quickly learned: avoid that barber, try another.
He nodded. “So, I’m the reward?”
I grinned. “You’re the jackpot.”
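For readers who like to see it in code, here is a minimal sketch of that agent, environment, action, and reward loop. Everything in it is an assumption made up for illustration: the barber names, the success probabilities, and the helper names `get_haircut` and `run_visits` are hypothetical, not from any real library.

```python
import random

# Hypothetical barbers and the chance each one delivers a good cut.
# These numbers are invented purely for illustration.
BARBERS = {"corner_shop": 0.2, "mall_salon": 0.5, "trusted_barber": 0.9}

def get_haircut(barber, rng):
    """The environment: respond to an action with +1 (clean cut) or -1 (hat for a week)."""
    return 1 if rng.random() < BARBERS[barber] else -1

def run_visits(n_visits, seed=0):
    """The agent: pick a barber at random each time and record the feedback."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    history = []
    for _ in range(n_visits):
        barber = rng.choice(list(BARBERS))  # action: walk into a shop
        reward = get_haircut(barber, rng)   # reward: how the cut turned out
        history.append((barber, reward))
    return history

history = run_visits(10)
for barber, reward in history:
    print(barber, reward)
```

At this stage the agent is not learning anything yet. It only acts and collects feedback, which is roughly where I was during my hat-wearing era.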
Learning from experience
Just like an RL agent, I had to learn through trial and error. At first, I was randomly choosing barbers, hoping for the best. That’s called exploration: trying different options to gather information. But once I found you, I stopped experimenting and just stuck with what works. That’s called exploitation: using what you already know to get the best result.
He laughed. “So you figured out the best strategy?”
“Exactly. In RL, we call that finding the optimal policy, the strategy that earns the most reward over time.”
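Here is a rough sketch of how an agent could balance trying new barbers against sticking with the best one so far. It uses epsilon-greedy, one common and simple strategy that I am picking for illustration. The barber names, probabilities, the `epsilon` value, and the function name `find_best_barber` are all assumptions, not a definitive implementation.

```python
import random

# Hypothetical barbers and their chance of a good cut (made-up numbers).
BARBERS = {"corner_shop": 0.2, "mall_salon": 0.5, "trusted_barber": 0.9}

def find_best_barber(n_visits=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy: explore a random barber with probability epsilon,
    otherwise go to the barber with the best average reward so far."""
    rng = random.Random(seed)
    estimates = {b: 0.0 for b in BARBERS}  # running average reward per barber
    visits = {b: 0 for b in BARBERS}       # how often each barber was tried

    for _ in range(n_visits):
        if rng.random() < epsilon:
            barber = rng.choice(list(BARBERS))        # explore
        else:
            barber = max(estimates, key=estimates.get)  # exploit
        reward = 1 if rng.random() < BARBERS[barber] else -1
        visits[barber] += 1
        # Incremental mean: nudge the estimate toward the new reward.
        estimates[barber] += (reward - estimates[barber]) / visits[barber]

    # The learned policy: always choose the barber with the best estimate.
    policy = max(estimates, key=estimates.get)
    return policy, estimates, visits

policy, estimates, visits = find_best_barber()
print("Go to:", policy)
print("Visits per barber:", visits)
```

With enough visits, the agent should end up spending most of its time with the barber who has the highest success rate, while the small `epsilon` keeps it occasionally checking the others in case something changed.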
Reinforcement learning isn’t just about AI. It’s how people learn every day. We try things, make mistakes, adjust, and eventually figure out what works. Just like I learned: never trust a barber who says ‘trust me.’
My barber shook his head, smiling. “So, I’m officially AI-approved?”
I nodded. “Certified.”
Final thoughts
I wanted to explain reinforcement learning this way because I think the best way to make tech concepts approachable is by framing them in everyday language and experiences. AI can feel intimidating, but at its core, it mirrors how we navigate life: trial, error, and improvement. Whether it's an algorithm learning from feedback or me figuring out where to get a proper haircut, the process is pretty much the same.
Thank you for reading! If you have any thoughts or constructive feedback on how I can improve my writing (or just want to chat about AI and machine learning), please leave them in the comments.
Also, if you’re working on building a system, training a model, or need help figuring out the right approach, feel free to reach out at hello@fotiecodes.com.
BTW, here’s a photo from my last visit to the barber :)