DEV Community: Nilavya Das

Byte Pair Encoding (BPE) Tokenizer

Nilavya Das — Thu, 07 Aug 2025 20:36:37 +0000

Ever wondered how models like GPT understand text? It all starts with tokenization — and one of the most powerful techniques behind it is called Byte Pair Encoding (BPE). In this post, I’ll explain BPE like you’re five, and then show you how to build it from scratch in Python.

🧠 What is a Tokenizer?

Before a machine learning model can work with language, it needs to convert text into numbers.

But how?

By breaking the text into small pieces called tokens, and giving each piece a number.

For example:
"I love cats" → ["I", "love", "cats"] → [101, 204, 999]

But here’s the twist:

What if the model has never seen the word "cats" before?

Should it just give up? Not with BPE.
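
To see why unseen words are a problem, here is a minimal sketch of a word-level tokenizer. The vocabulary, the IDs, the `UNKNOWN_ID` placeholder, and the `word_tokenize` helper are all made up for illustration; the point is only that a whole-word lookup has no good answer for a word it has never seen.

```python
# Toy word-level vocabulary (IDs are arbitrary, reused from the example above)
word_to_id = {"I": 101, "love": 204, "cats": 999}
UNKNOWN_ID = -1  # hypothetical placeholder for words missing from the vocabulary

def word_tokenize(text):
    """Split on whitespace and look each word up in the vocabulary."""
    return [word_to_id.get(word, UNKNOWN_ID) for word in text.split()]

print(word_tokenize("I love cats"))  # [101, 204, 999]
print(word_tokenize("I love dogs"))  # [101, 204, -1]
```

The word "dogs" collapses into a single "unknown" ID, so everything the word could have told the model is lost. BPE avoids this by falling back to smaller pieces.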

🔍 What is Byte Pair Encoding (BPE)?

BPE is a clever way to build up tokens from characters, by looking at what combinations show up the most in real text.

It works like this:

1. Split every word into characters: "hello" → ['h', 'e', 'l', 'l', 'o']
2. Find the most common pair of symbols and merge them (e.g., ('l', 'l') → 'll')
3. Repeat until you reach your vocab size (like 1000 tokens)

What’s cool is that BPE doesn’t understand the meaning of words — but it accidentally learns real words, just because they appear a lot.
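
A single round of the loop described above can be sketched as follows. The three-word corpus and the `merge_pair` helper are assumptions chosen for illustration, not part of any library.

```python
from collections import Counter

# Toy corpus, already split into characters (assumed for illustration)
words = [list("hello"), list("ball"), list("tall")]

# Count every pair of neighbouring symbols across the corpus
pair_counts = Counter()
for symbols in words:
    for left, right in zip(symbols, symbols[1:]):
        pair_counts[(left, right)] += 1

# Pick the most frequent pair
best_pair = max(pair_counts, key=pair_counts.get)

def merge_pair(symbols, pair):
    """Replace each occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

words = [merge_pair(symbols, best_pair) for symbols in words]
print(best_pair)  # ('l', 'l')
print(words)      # [['h', 'e', 'll', 'o'], ['b', 'a', 'll'], ['t', 'a', 'll']]
```

Here ('l', 'l') appears three times, more than any other pair, so it is merged first. Running the same two steps again on the updated lists gives the next merge.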

🧱 BPE in Plain English

Imagine you're building with LEGO blocks.

At first, you only have tiny bricks — one for each letter.

But as you build more and more sentences, you notice:

"h" and "e" are often together → make a "he" block
"he" and "l" → "hel"
"hel" and "l" → "hell"
"hell" and "o" → "hello"

Eventually, common words like “hello” become a single token.

Rare words like "circumnavigation" might stay as chunks like ["circum", "navi", "gation"].

🧪 Build Your Own BPE Tokenizer (from Scratch!)

Here’s a minimal BPE tokenizer in Python using just functions — works great in Google Colab too.

from collections import Counter

def get_stats(token_lists):
    pairs = Counter()
    for tokens in token_lists:
        for i in range(len(tokens) - 1):
            pairs[(tokens[i], tokens[i+1])] += 1
    return pairs

def merge_tokens(token_lists, pair):
    merged = []
    bigram = ''.join(pair)
    for tokens in token_lists:
        new_tokens = []
        i = 0
        while i < len(tokens):
            if i < len(tokens)-1 and (tokens[i], tokens[i+1]) == pair:
                new_tokens.append(bigram)
                i += 2
            else:
                new_tokens.append(tokens[i])
                i += 1
        merged.append(new_tokens)
    return merged

def train_bpe(corpus, vocab_size=100):
    token_lists = [[chr(b) for b in text.encode('utf-8')] for text in corpus]
    vocab = set(t for tokens in token_lists for t in tokens)
    merges = []
    while len(vocab) < vocab_size:
        stats = get_stats(token_lists)
        if not stats:
            break
        best = max(stats, key=stats.get)
        token_lists = merge_tokens(token_lists, best)
        new_token = ''.join(best)
        vocab.add(new_token)
        merges.append(best)
    token_to_id = {token: idx for idx, token in enumerate(sorted(vocab))}
    return merges, token_to_id

def apply_bpe(text, merges):
    tokens = [chr(b) for b in text.encode('utf-8')]
    for pair in merges:
        tokens = merge_tokens([tokens], pair)[0]
    return tokens

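def encode(text, merges, token_to_id):
    tokens = apply_bpe(text, merges)
    # Bytes never seen during training are not in the vocab and map to -1
    return [token_to_id.get(token, -1) for token in tokens]

def decode(token_ids, token_to_id):
    id_to_token = {i: t for t, i in token_to_id.items()}
    # Unknown ids (-1) become '?' instead of raising a KeyError
    tokens = [id_to_token.get(i, '?') for i in token_ids]
    # Each character stands for one byte, so latin1 recovers the raw bytes
    byte_str = ''.join(tokens).encode('latin1')
    return byte_str.decode('utf-8', errors='replace')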

corpus = ["hello world", "hello again", "GPT is powerful"]
merges, token_to_id = train_bpe(corpus, vocab_size=100)

text = "hello world again"
encoded = encode(text, merges, token_to_id)
decoded = decode(encoded, token_to_id)

print("Tokens:", apply_bpe(text, merges))
print("Encoded:", encoded)
print("Decoded:", decoded)

Output:
Tokens: ['hello world', ' ', 'a', 'g', 'a', 'i', 'n']
Encoded: [38, 0, 18, 22, 18, 39, 41]
Decoded: hello world again

Because the corpus is tiny and the vocab budget is generous, BPE kept merging until "hello world" became a single token!

🧠 Recap

🔍 Feature	✅ What BPE Does
Compression	Fewer tokens for common words
Flexibility	Can break rare words into parts
Byte-level	Can handle any text (emojis, code, other languages)
No dictionary	Learns purely from frequency, not meaning
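
The "Byte-level" row is easy to check for yourself. This small sketch (the sample string is just an illustration) shows that any text, emoji included, turns into bytes between 0 and 255, so a byte-level tokenizer needs at most 256 base symbols to represent any input.

```python
text = "café 🙂"

# UTF-8 turns every character into one or more bytes in the range 0..255
base_tokens = list(text.encode("utf-8"))

print(len(text), "characters ->", len(base_tokens), "bytes")  # 6 characters -> 10 bytes
print(all(0 <= b <= 255 for b in base_tokens))                # True

# The bytes carry everything needed to rebuild the original text
restored = bytes(base_tokens).decode("utf-8")
print(restored == text)                                       # True
```

The accented "é" takes 2 bytes and the emoji takes 4, which is why the byte count is larger than the character count.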

💬 Final Thoughts

BPE is a simple but brilliant idea that powers some of the most advanced models like GPT. It’s efficient, flexible, and surprisingly effective — all by learning patterns in text without knowing what words “mean”.

👉 Give it a try! Build your own tokenizer and see what kinds of word pieces it discovers.

If you liked this post, follow me for more machine learning and LLM internals! 🧠✨

Unlocking the Power of Language Models: A Deep Dive into LangChain 🤖💻

Nilavya Das — Sun, 16 Feb 2025 20:27:43 +0000

Introduction

In recent years, language models have revolutionized the way we interact with technology. From conversational AI to text generation, these models have shown incredible promise in a variety of applications. But what's behind their power? In this blog, we'll be exploring the world of LangChain, an open-source framework that's pushing the boundaries of what's possible with language models.

What is LangChain? 🤔

LangChain is an open-source framework for building applications powered by language models. It provides a flexible and modular way to work with models from many providers, including models served through Hugging Face, allowing developers to easily integrate them into their applications. With LangChain, you can create custom workflows that combine models, prompts, and external data to achieve complex tasks.

The Power of Language Models 💪

Language models are trained on vast amounts of text data, learning patterns and relationships between words. This allows them to generate text, answer questions, and even engage in conversation. But what makes language models so powerful?

Contextual understanding: Language models can understand the context in which a piece of text is being used.
Semantic reasoning: Models can make connections between words and ideas.
Scalability: With LangChain, you can scale your language model to meet the needs of your application.

How Does LangChain Work? 🤔

LangChain uses a modular architecture to combine language models with prompts, tools, and data sources. This allows developers to create custom workflows that take advantage of the strengths of each component.

Chain: The core abstraction of LangChain, which links together steps such as formatting a prompt, calling a model, and parsing the output.
Components: Self-contained units, such as prompt templates, models, and output parsers, that can be used to build your workflow.
Configuration: Parameters, such as the model name or temperature, that define how the chain and its components behave.
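
The chaining idea itself can be sketched in plain Python. To be clear, this is not LangChain's actual API: `make_prompt`, `fake_model`, `clean_output`, and `chain` are hypothetical names, and the "model" is a stand-in that only echoes its input. It shows the shape of the idea, where each step's output feeds the next step.

```python
def make_prompt(question):
    """Step 1: wrap the user's question in a prompt template."""
    return f"Answer briefly: {question}"

def fake_model(prompt):
    """Step 2: stand-in for a real language model call (it only echoes)."""
    return f"[model reply to: {prompt}]"

def clean_output(reply):
    """Step 3: post-process the raw reply by trimming the brackets."""
    return reply.strip("[]")

def chain(*steps):
    """Link steps so that each one receives the previous step's output."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run

qa_chain = chain(make_prompt, fake_model, clean_output)
print(qa_chain("What is a token?"))
# model reply to: Answer briefly: What is a token?
```

Swapping `fake_model` for a real model call would leave the other steps untouched, which is the main appeal of a modular design.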

Real-World Applications 🌎

LangChain has a wide range of applications, from chatbots and text generation to content creation and even education. Here are just a few examples:

Chatbots: Use LangChain to create conversational interfaces that can understand user intent.
Text Generation: Generate high-quality text with the help of LangChain's language models.
Content Creation: Automate content generation with the power of LangChain.

Getting Started 🚀

Ready to start exploring LangChain? Here are some next steps:

Install: Install LangChain using pip: pip install langchain
Explore: Check out the LangChain documentation for more information.
Build: Start building your own workflow with LangChain.

Conclusion 🤝

LangChain is an exciting new framework that's pushing the boundaries of what's possible with language models. With its modular architecture and flexible design, LangChain makes it easy to create custom workflows that take advantage of multiple language models. Whether you're a developer or researcher, LangChain is definitely worth checking out.

How to Build a Simple AI Agent: A Step-by-Step Guide

Nilavya Das — Tue, 13 Aug 2024 20:13:53 +0000

Artificial Intelligence is everywhere, from chatbots that answer your questions to smart assistants that manage your schedule. But did you know you can build your own AI agent in just a few steps? Whether you're a developer or a curious enthusiast, this guide will show you how to create a simple AI agent that can perform basic tasks—all while keeping things fun and easy. 😄

🛠️ Step 1: Define Your AI Agent’s Mission

First, decide what you want your AI agent to do. Think of it as your agent’s mission. It could be something simple, like answering basic questions, fetching weather updates, or setting reminders. For example, let’s build a personal assistant that can tell you the weather and manage your to-do list. ☁️📅

🔧 Step 2: Gather Your Tools

Next, you'll need some tools to bring your AI agent to life. Here’s your starter pack:

✨ Python: The go-to programming language for AI.
🗣️ Natural Language Processing (NLP): Libraries like NLTK or spaCy help your agent understand text.
🔗 APIs: Services like OpenWeatherMap for weather updates or Google Calendar for scheduling.

🧠 Step 3: Build the Brain of Your AI Agent

Now, let’s get into the fun part—coding! Your AI agent needs a brain that can:

1. Understand Commands: 🗨️

Your agent will listen to user input and figure out what they’re asking. For instance, if someone asks, “What’s the weather today?” your agent should recognize this as a weather request.

Here’s a simple Python function to get started:

import re

def process_input(user_input):
    # Lowercase once so matching is case-insensitive
    text = user_input.lower()
    if re.search(r"weather", text):
        return "weather"
    elif re.search(r"to-?do", text):  # matches both "todo" and "to-do"
        return "todo"
    else:
        return "unknown"

2. Make Decisions: 🤔

Once the command is understood, your agent needs to decide what to do next. Should it fetch the weather, add a task, or do something else?

Here’s how you might code that:

def decide_action(input_type):
    if input_type == "weather":
        return "Fetching weather data..."
    elif input_type == "todo":
        return "Adding to your to-do list..."
    else:
        return "I’m not sure how to help with that."

3. Take Action: 💪

Finally, your agent needs to do what it decided. This could involve calling an API to get the weather or adding an item to your to-do list.

Here’s an example for fetching the weather:

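import requests

def get_weather(city="New York"):
    # Replace your_api_key with your own OpenWeatherMap API key
    response = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": city, "appid": "your_api_key"},
        timeout=10,  # don't hang forever if the service is unreachable
    )
    response.raise_for_status()  # fail loudly on a bad key or unknown city
    weather_data = response.json()
    return f"The weather in {city} is {weather_data['weather'][0]['description']}."

def execute_action(action):
    if action == "Fetching weather data...":
        return get_weather()
    else:
        return "Action not implemented."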
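
The to-do side of the assistant is not implemented in the code above. A minimal sketch, assuming an in-memory list is good enough for a first version, might look like this. The names `todo_list`, `add_todo`, and `show_todos` are hypothetical and not part of the original code.

```python
todo_list = []  # in-memory only; the list is lost when the program exits

def add_todo(task):
    """Add a task and confirm it back to the user."""
    task = task.strip()
    if not task:
        return "Please tell me what to add."
    todo_list.append(task)
    return f"Added '{task}'. You now have {len(todo_list)} item(s)."

def show_todos():
    """Return the current tasks as a numbered list."""
    if not todo_list:
        return "Your to-do list is empty."
    return "\n".join(f"{n}. {task}" for n, task in enumerate(todo_list, start=1))

print(add_todo("buy milk"))  # Added 'buy milk'. You now have 1 item(s).
print(show_todos())          # 1. buy milk
```

To wire this in, `execute_action` could call `add_todo` when the action is the to-do one, passing along the text the user typed.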

🎮 Step 4: Test and Play

With the basics in place, it’s time to play around with your new AI agent. Try different commands and see how it responds. Is it doing what you expected? If not, tweak the code and make it better. 🚀

Here’s a quick test run:

user_input = input("Ask me something: ")
input_type = process_input(user_input)
action = decide_action(input_type)
response = execute_action(action)
print(response)

🌐 Step 5: Deploy Your AI Agent

When you’re happy with how your agent works, consider deploying it so others can use it too. You could integrate it into a messaging app or turn it into a web service. The possibilities are endless! 🌍

🎉 Conclusion: The Fun is Just Beginning

Congratulations! You've just built your first AI agent. While this one is pretty simple, it opens the door to more exciting projects. You can expand its capabilities, teach it new tricks, and make it smarter over time. Building AI agents is not just about coding—it’s about creating something that interacts with the world in meaningful ways. So, go ahead and explore the endless possibilities! 🚀🤖

Now that you’ve got the basics down, what will your next AI agent do? The sky's the limit! 🌟

The Future of Work: How AI Agents Are Redefining Careers and Job Roles 🤖

Nilavya Das — Tue, 13 Aug 2024 13:31:12 +0000

Hey there! Have you ever wondered how the rise of AI is going to affect our jobs? If you’re like me, you’ve probably thought about it a lot. AI agents—those super-smart digital helpers we’ve come to rely on—are transforming not just our daily tasks but the entire landscape of work. Let’s dive into how AI agents are reshaping careers, creating new opportunities, and what it all means for the future of work. 🌟

🤝 AI as Collaborators, Not Replacements

First things first, let’s clear up a big misconception: AI isn’t here to take over all our jobs. Instead, think of AI agents as collaborators that enhance what we do. They’re here to make our work easier and more efficient, not replace us.

For example, in healthcare, AI agents can analyze medical data faster than any human could, helping doctors diagnose patients more accurately and quickly. But it’s still the doctors who interpret that data and make the final decisions. AI is doing the heavy lifting with data, while humans handle the nuanced, empathetic care that only people can provide. 💡

💼 New Job Roles Are Emerging

As AI agents take over more routine tasks, we’re seeing new job roles emerge that didn’t even exist a few years ago. These include AI trainers, who teach AI systems how to perform their tasks better, and AI ethicists, who ensure these systems are used responsibly.

In finance, AI is handling complex calculations and predicting market trends, freeing up financial analysts to focus on strategy and big-picture planning. In customer service, AI agents manage basic queries, while human agents tackle the more complicated, emotionally charged issues. These are just a few examples of how AI is creating opportunities for new, specialized careers. 🌐

🧠 Skills of the Future

So, what does all this mean for us and our skill sets? As AI continues to evolve, the skills that will be in demand are changing too. Sure, technical skills are important, but the future workplace will also value creativity, emotional intelligence, and problem-solving abilities—things AI can’t easily replicate.

For instance, being able to interpret data that AI provides is a crucial skill. But beyond that, being able to think creatively about how to use that data or how to solve problems AI identifies is where humans will truly shine. 🌟

❤️ Balancing Automation and Human Touch

As AI agents take over more of the technical and routine aspects of work, the human touch becomes even more valuable. Imagine a customer service scenario where an AI agent handles the initial query, but a human steps in to resolve a more complex issue. It’s the perfect blend of efficiency and empathy.

In fields like marketing, AI can analyze consumer behavior and predict trends, but humans are the ones who craft the messages that resonate on an emotional level. The future of work isn’t about AI vs. humans; it’s about how we can best work together. 🤖🤲

📚 Preparing for the AI-Driven Workplace

So, how do we get ready for this AI-driven future? Continuous learning is key. As AI tools evolve, staying updated on new technologies and how they can be applied in your field is essential. Upskilling in areas like data analysis, digital communication, and creative problem-solving will keep you ahead of the curve.

And it’s not just about technical skills—adaptability and a willingness to embrace change will be just as important. The workplace is evolving rapidly, and those who can navigate this change with a positive attitude will thrive. 🌱

🌟 Wrapping Up

The future of work with AI agents is exciting, full of possibilities, and yes, a little bit daunting. But by seeing AI as a collaborator rather than a competitor, we can unlock new opportunities and redefine what work means. Whether it’s through new job roles, evolving skill sets, or finding the right balance between automation and the human touch, there’s a place for everyone in this AI-enhanced world.

So, let’s embrace the changes AI brings and get ready for a future where work is smarter, more efficient, and full of new possibilities. The future is bright, and it’s just around the corner! 🌈

AI Agents: The Invisible Helpers Transforming Your Life

Nilavya Das — Mon, 12 Aug 2024 17:18:13 +0000

Ever feel like your day could use an extra pair of hands? 👐 Imagine having a personal assistant who’s always available, never takes a break, and gets smarter the more you use them. Sound like science fiction? It’s not. Welcome to the world of AI agents—your invisible helpers working behind the scenes to make life easier, smoother, and way more fun. 🎉

What Are AI Agents, Anyway? 🤔

AI agents are like digital wizards. 🧙‍♂️ They live inside your phone, computer, or even your car, quietly doing things for you. They learn from you, adapt to your habits, and help out in ways you might not even notice. Think Siri reminding you to pick up groceries, or Netflix suggesting that perfect show for your Friday night binge. 📺 These aren’t just coincidences—they’re AI agents doing their thing.

Why Everyone’s Talking About AI Agents 🚀

1. They’re the Ultimate Time-Savers ⏳:

Imagine never having to sift through hundreds of emails 📧 or juggle a dozen reminders again. AI agents can handle all the boring stuff—answering routine questions, scheduling appointments, even ordering your coffee just the way you like it. ☕ They’re like having a super-efficient, tireless assistant who’s always on call.

2. They’re Crazy Smart 🧠:

Need help making a tough decision? 🤷‍♂️ AI agents can analyze loads of data in seconds and give you smart recommendations. Whether it’s figuring out which stocks to invest in 💹 or what to cook for dinner based on what’s in your fridge, these digital brains have got you covered.

3. They Know You Better Than Your Best Friend 🤓:

Ever wonder how Spotify knows exactly what song you’re in the mood for? 🎶 Or how your online shopping cart always seems to suggest that one thing you didn’t know you needed? 🛒 That’s AI at work, learning your preferences and delivering a personalized experience that feels almost like magic. ✨

4. They’re Lightning Fast ⚡:

In a world where speed is everything, AI agents are the ultimate multitaskers. 💪 They can handle countless tasks at once—whether it’s processing transactions, detecting fraud, or managing a busy customer service line. All while you sit back and relax. 😌

5. They Get Smarter Every Day 📈:

The more you interact with them, the better they get. It’s like having an assistant who not only keeps up with you but actually improves over time. Today’s AI agents are good, but tomorrow’s? They’ll blow your mind. 🤯

The Future: More AI, More Awesome 🌟

AI agents are just getting started. Soon, they’ll be even more integrated into our lives—helping with everything from personalized education 📚 to environmental conservation 🌍. But as they evolve, it’s up to us to make sure we use them responsibly, keeping an eye on issues like privacy and fairness.

So, the next time you get a spot-on recommendation, or your smart home anticipates your needs before you even realize them, give a little nod to the AI agents quietly making your life better. They’re the unsung heroes of the digital age, and they’re just getting started. 🚀

Ready to let AI agents take your life to the next level? The future is here, and it’s pretty incredible. 🌈

Revolutionizing Coding with Devin: The World's First AI Software Engineer

Nilavya Das — Thu, 14 Mar 2024 17:03:50 +0000

Forget free coffee ☕️ and ping pong tables 🏓, the hottest perk in Silicon Valley might just be your new coworker – Devin, the world's first AI software engineer. But is Devin here to revolutionize coding or steal your paycheck 👀?

Imagine this:🤔 You're neck-deep in a complex software project 🤯, drowning in repetitive tasks ⏳ and bug fixes😵‍💫. Enter Devin, your tireless AI teammate. Devin can churn out code 👨‍💻, build websites 🌎, and tackle intricate engineering challenges – all while you focus on the big picture. Sounds like a dream come true 🤩, right?

Devin's skills are impressive, but it is no Terminator with a coding degree. Here's the reality:

Devin automates repetitive tasks, freeing up your time to focus on creative problem-solving. Work with Devin to avoid errors and make your job more interesting.🤖💼
Devin's AI constantly learns and adapts, becoming a better coder with every project. Think of it as having a coding prodigy by your side, eager to learn from your expertise.🌟👨‍💻
Collaboration is key. Devin works well with human engineers by sharing ideas, receiving feedback, and keeping everyone informed. It's like having a coding partner who never needs a nap.🤝👨‍💻

So, is Devin coming for your job 👀? Not likely. 😮‍💨 The creators envision Devin as a force multiplier, not a replacement. Think of it as having a super-powered coding assistant who lets you focus on the strategic aspects of software engineering. 💪👨‍💻

While Devin is a groundbreaking development, AI in software engineering is still young. True human-level creativity and the ability to handle the unexpected are still very much human strengths. 🧠🛠️

But make no mistake, Devin is a sign of things to come. As AI continues to evolve, we can expect even more powerful tools to emerge, ushering in a new era of human-AI collaboration in the world of software engineering. 🌟🚀

The question isn't whether AI will change coding, it's how. So, buckle up, grab your keyboard ⌨️, and get ready to code alongside your new AI teammate. The future of software engineering is here, and it's looking collaborative. 🤖🤝🐛

Backpropagation: Unveiling the Mystery Behind Neural Network Learning

Nilavya Das — Thu, 07 Mar 2024 21:17:55 +0000

Imagine you're teaching your AI buddy to tell cats from dogs in pictures. You show it a photo of a fluffy Persian cat, and it confidently declares "Dog!" Oops! But fear not, because here comes backpropagation to the rescue! It's like the superhero of the neural network world, swooping in to save the day. Let's unravel the mystery behind this magical process without diving into complex math.

No Need for Fancy Numbers, Just Simple Analogies!

Here's the scoop without the headache:

Show and Tell: You show your AI a cat pic (the "show"). It takes a guess (the "tell"), maybe goofing up with "dog."
Uh-Oh, Mistake Alert: Just like correcting your pet's blunder, the AI spots the mistake (the difference between its guess and the real deal).
The Blame Game (The Fun Part!): Backpropagation steps in, figuring out which parts of the AI messed up. Imagine a web of connected neurons like a game of tug-of-war. Backpropagation navigates through this web, assigning blame (proportional to the goof-up) to each involved neuron.
Learning from Mistakes: With this blame game info, the AI adjusts the connections between its neurons (like tweaking rope lengths in tug-of-war). The neurons that led it astray weaken a bit, while those that nudged it in the right direction get a boost.
Rinse, Repeat, Remember! This whole cycle (show, tell, oops, blame, adjust) repeats with different cat and dog pics. With each round, the AI fine-tunes its connections, becoming a cat-detecting pro!
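
For readers who want to see the cycle in code, here is a tiny sketch with a single "neuron", using only Python's standard library. The starting weight, the input value, and the learning rate are arbitrary assumptions. Real networks have many neurons and layers, but each connection gets the same kind of nudge.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# One neuron: guess = sigmoid(w * x + b). A guess near 1.0 means "cat".
w, b = -0.5, 0.0        # arbitrary starting connection strength and bias
x, target = 2.0, 1.0    # toy input; the picture really is a cat (1.0)
learning_rate = 1.0

guesses = []
for step in range(5):
    guess = sigmoid(w * x + b)   # show and tell
    error = guess - target       # uh-oh: how wrong was the guess?
    # Blame game: with a sigmoid output and cross-entropy loss, the gradient
    # for w is error * x and the gradient for b is error.
    grad_w = error * x
    grad_b = error
    w -= learning_rate * grad_w  # learn: nudge against the blame
    b -= learning_rate * grad_b
    guesses.append(guess)

print([round(g, 3) for g in guesses])
```

The first guess is well below 0.5 (leaning "dog"), and each repetition pushes the guess closer to 1.0 (confidently "cat").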

The Superpowers of Backpropagation

All-Rounder: Backpropagation works for all sorts of neural network setups, making it the go-to for many deep learning tasks.
Continuous Learning on Cruise Control: This algorithm lets neural networks keep learning and improving, just like how we get better with practice.

The Bottom Line

Backpropagation is the engine driving neural network learning. Understanding this core concept gives you a peek into the magic of these incredible machines.

Neural Networks: Building a "Brain" from Scratch

Nilavya Das — Fri, 01 Mar 2024 14:25:02 +0000

Introduction:

Welcome to our step-by-step guide on building a neural network from scratch! Neural networks are at the forefront of artificial intelligence and have revolutionized various industries, from healthcare to finance. In this blog post, we'll walk you through the process of creating a simple neural network using Python and NumPy, without relying on any deep learning frameworks.

Understanding Neural Networks:

Before diving into the implementation, let's understand what neural networks are. Neural networks are a computational model inspired by the human brain's structure. They consist of interconnected nodes, organized into layers. Each node performs a simple computation and passes its output to the next layer. Neural networks can learn from data and make predictions by adjusting their internal parameters.

Setting Up the Environment:

To start, we'll need Python and NumPy installed on your system. You can install NumPy using pip:

pip install numpy

Building Blocks of the Neural Network:

We'll start by implementing the fundamental building blocks of our neural network. This includes the sigmoid activation function and its derivative, which are essential for introducing non-linearity into the model. We'll also initialize the parameters of the neural network: weights and biases.

import numpy as np

def sigmoid(x):
    '''
    The sigmoid function, often used as an activation function in neural networks,
    especially in binary classification problems. It squashes the input values between 0 and 1,
    facilitating non-linear transformations in the network.
    '''
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    '''
    The derivative of the sigmoid function, which is important for the backpropagation algorithm
    in neural networks. It is used to calculate gradients for weights and biases updates,
    enabling the network to learn from the training data.

    Note: x is expected to be the output of sigmoid (an activation), not the raw input.
    '''
    return x * (1 - x)

def initialize_parameters(input_size, hidden_size, output_size):
    '''
    Function to initialize weights & biases for a feedforward neural network with one hidden layer.

    input_size: The number of neurons in the input layer, which corresponds to the number of features in the dataset.
    hidden_size: The number of neurons in the hidden layer.
    output_size: The number of neurons in the output layer. For binary classification, this would typically be 1.
    '''
    np.random.seed(2)
    # Initialize weights between input layer and hidden layer
    weights_input_hidden = np.random.rand(input_size, hidden_size) - 0.5
    # Initialize weights between hidden layer and output layer
    weights_hidden_output = np.random.rand(hidden_size, output_size) - 0.5
    # Initialize bias for the hidden layer to zeros
    bias_hidden = np.zeros((1, hidden_size))
    # Initialize bias for the output layer to zeros
    bias_output = np.zeros((1, output_size))
    return weights_input_hidden, weights_hidden_output, bias_hidden, bias_output

I have also added comments in each line for more clarification.
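
One detail worth checking is that sigmoid_derivative computes x * (1 - x), which equals the true derivative only when x is already a sigmoid output. The sketch below uses only the standard math module, with a test point picked arbitrarily, and compares that formula against a numerical estimate of the slope.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

z = 0.7                 # arbitrary test point
s = sigmoid(z)          # the activation, i.e. the sigmoid output

# Formula from the code above, applied to the activation
analytic = s * (1 - s)

# Central finite difference: an independent estimate of the slope at z
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

# The same formula applied (incorrectly) to the raw input
wrong = z * (1 - z)

print(abs(analytic - numeric) < 1e-8)  # True
print(abs(wrong - numeric) > 1e-3)     # True
```

So when these helpers are used during backpropagation, the value passed to sigmoid_derivative should be a layer's activation rather than its pre-activation input.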

Forward Propagation:

Forward propagation is the process of computing the output of the neural network given an input. We'll pass the input through the network, apply the activation function, and obtain the final output.

def forward_propagation(X, weights_input_hidden, weights_hidden_output, bias_hidden, bias_output):
    '''
    Function to perform forward propagation through the neural network.

    X: Input data
    weights_input_hidden: Weights between input layer and hidden layer
    weights_hidden_output: Weights between hidden layer and output layer
    bias_hidden: Bias for the hidden layer
    bias_output: Bias for the output layer
    '''
    # Calculate hidden layer input
    hidden_input = np.dot(X, weights_input_hidden) + bias_hidden
    # Apply sigmoid activation function to hidden layer input
    hidden_output = sigmoid(hidden_input)
    # Calculate output layer input
    final_input = np.dot(hidden_output, weights_hidden_output) + bias_output
    # Apply sigmoid activation function to output layer input
    final_output = sigmoid(final_input)
    return hidden_output, final_output
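To see how the shapes flow through the network, here is a small sketch of the same two-step computation on made-up data. The sizes (4 samples, 2 features, 3 hidden neurons, 1 output) and the seed are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)       # hypothetical seed
X = rng.random((4, 2))               # 4 samples, 2 features
W1 = rng.random((2, 3)) - 0.5        # input -> hidden weights
b1 = np.zeros((1, 3))                # hidden bias, broadcast over samples
W2 = rng.random((3, 1)) - 0.5        # hidden -> output weights
b2 = np.zeros((1, 1))                # output bias

hidden = sigmoid(np.dot(X, W1) + b1)        # (4, 2) . (2, 3) -> (4, 3)
output = sigmoid(np.dot(hidden, W2) + b2)   # (4, 3) . (3, 1) -> (4, 1)

print(hidden.shape, output.shape)
```

Each row of `output` is the prediction for one sample, and the sigmoid keeps every value strictly between 0 and 1.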

Loss Calculation:

To train the neural network, we need to define a loss function that quantifies the difference between the predicted output and the actual output. We'll use the cross-entropy loss function, which is commonly used for binary classification problems.

def compute_loss(Y, Y_hat):
    '''
    Function to compute the loss between predicted output and actual output.

    Y: Actual output
    Y_hat: Predicted output
    '''
    m = Y.shape[0]
    # Compute cross-entropy loss
    loss = -(1/m) * np.sum(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat))
    return loss
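One practical caveat: if a prediction is exactly 0 or 1, `np.log` returns `-inf` and the loss can become `nan`. A common workaround is to clip the predictions first. The sketch below is a hypothetical variant (`compute_loss_clipped` and its `eps` parameter are my own names, not part of the code above).

```python
import numpy as np

def compute_loss_clipped(Y, Y_hat, eps=1e-12):
    # Hypothetical variant of compute_loss: keep predictions away from
    # exactly 0 and 1 so that np.log never sees 0.
    Y_hat = np.clip(Y_hat, eps, 1 - eps)
    m = Y.shape[0]
    return -(1 / m) * np.sum(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat))

Y = np.array([[1.0], [0.0]])
perfect = np.array([[1.0], [0.0]])     # would trigger log(0) without clipping
uncertain = np.array([[0.5], [0.5]])   # the network is undecided

loss_perfect = compute_loss_clipped(Y, perfect)
loss_uncertain = compute_loss_clipped(Y, uncertain)
print(loss_perfect, loss_uncertain)
```

A perfect prediction gives a loss very close to 0, while predicting 0.5 everywhere gives ln(2), roughly 0.693.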

Backpropagation:

Backpropagation is the algorithm used to compute the gradients of the loss function with respect to the parameters of the neural network. These gradients tell us how to adjust each weight and bias, which is what enables the network to learn from the training data and improve its predictions over time.

def backpropagation(X, Y, hidden_output, final_output, weights_hidden_output, weights_input_hidden):
    '''
    Function to perform backpropagation to calculate gradients for weights and biases updates.

    X: Input data
    Y: Actual output
    hidden_output: Output of the hidden layer
    final_output: Predicted output
    weights_hidden_output: Weights between hidden layer and output layer
    weights_input_hidden: Weights between input layer and hidden layer
    '''
    # Compute error at output layer
    error = final_output - Y
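    # Compute gradients for weights between hidden and output layers.
    # Note: multiplying the error by sigmoid_derivative gives the gradient of a
    # squared-error loss. For the cross-entropy loss used in compute_loss, the
    # output-layer term would be just `error`.
    d_weights_hidden_output = np.dot(hidden_output.T, error * sigmoid_derivative(final_output))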
    # Compute gradients for bias at output layer
    d_bias_output = np.sum(error * sigmoid_derivative(final_output), axis=0, keepdims=True)

    # Compute error at hidden layer
    error_hidden = np.dot(error * sigmoid_derivative(final_output), weights_hidden_output.T)
    # Compute gradients for weights between input and hidden layers
    d_weights_input_hidden = np.dot(X.T, error_hidden * sigmoid_derivative(hidden_output))
    # Compute gradients for bias at hidden layer
    d_bias_hidden = np.sum(error_hidden * sigmoid_derivative(hidden_output), axis=0, keepdims=True)

    return d_weights_input_hidden, d_weights_hidden_output, d_bias_hidden, d_bias_output
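A quick way to sanity-check a gradient formula is to compare it with a finite-difference estimate. The sketch below does this for the output layer only. With a sigmoid output and cross-entropy loss, the derivative of the loss with respect to the pre-activation `z` is `(sigmoid(z) - Y) / m`, with no extra sigmoid-derivative factor (that factor appears when the loss is squared error). The values of `Y` and `z` are arbitrary illustrations.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def cross_entropy(Y, Y_hat):
    m = Y.shape[0]
    return -(1 / m) * np.sum(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat))

Y = np.array([[1.0], [0.0], [1.0]])
z = np.array([[0.3], [-1.2], [2.0]])   # arbitrary output pre-activations
m = Y.shape[0]

# Analytic gradient of the cross-entropy loss with respect to z
analytic = (sigmoid(z) - Y) / m

# Numerical gradient: nudge one entry of z at a time
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(m):
    step = np.zeros_like(z)
    step[i, 0] = eps
    loss_plus = cross_entropy(Y, sigmoid(z + step))
    loss_minus = cross_entropy(Y, sigmoid(z - step))
    numeric[i, 0] = (loss_plus - loss_minus) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))
```

The same trick works for any weight or bias in the network, at the cost of two forward passes per parameter, so it is only practical as a debugging aid.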

Updating Parameters:

After computing the gradients using backpropagation, we'll update the parameters of the neural network using gradient descent.

def update_parameters(parameters, grads, learning_rate=1.0):
    '''
    Function to update weights and biases based on gradients computed during backpropagation.

    parameters: Tuple containing weights and biases
    grads: Tuple containing gradients for weights and biases
    learning_rate: Learning rate for gradient descent
    '''
    weights_input_hidden, weights_hidden_output, bias_hidden, bias_output = parameters
    d_weights_input_hidden, d_weights_hidden_output, d_bias_hidden, d_bias_output = grads

    # Update weights between input and hidden layers
    weights_input_hidden -= learning_rate * d_weights_input_hidden
    # Update weights between hidden and output layers
    weights_hidden_output -= learning_rate * d_weights_hidden_output
    # Update bias at hidden layer
    bias_hidden -= learning_rate * d_bias_hidden
    # Update bias at output layer
    bias_output -= learning_rate * d_bias_output

    return weights_input_hidden, weights_hidden_output, bias_hidden, bias_output
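One thing to be aware of: `-=` on a NumPy array updates it in place, so the arrays passed in are modified as well as returned. Here is a small sketch of the difference, using two hypothetical helpers (`step_in_place` and `step_copy`) on a toy weight vector.

```python
import numpy as np

def step_in_place(w, grad, learning_rate=1.0):
    # Same style as update_parameters: modifies the caller's array
    w -= learning_rate * grad
    return w

def step_copy(w, grad, learning_rate=1.0):
    # Builds a new array and leaves the caller's array untouched
    return w - learning_rate * grad

w = np.array([1.0, 2.0])
grad = np.array([0.5, 0.5])
original = w.copy()

copied = step_copy(w, grad)
unchanged_after_copy = np.array_equal(w, original)   # w is still [1.0, 2.0]

updated = step_in_place(w, grad)
same_object = updated is w                            # w itself was changed

print(copied, unchanged_after_copy, same_object, w)
```

For the training loop in this post the in-place behaviour is harmless, but it matters if you want to keep a copy of the old weights around.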

Training the Neural Network:

With all the components in place, we can now train the neural network using a sample dataset. We'll iterate through multiple epochs, updating the parameters at each step to minimize the loss.

def train(X, Y, input_size, hidden_size, output_size, learning_rate, epochs):
    '''
    Function to train the neural network using gradient descent.

    X: Input data
    Y: Actual output
    input_size: Number of features in the input data
    hidden_size: Number of neurons in the hidden layer
    output_size: Number of neurons in the output layer
    learning_rate: Learning rate for gradient descent
    epochs: Number of iterations for training
    '''
    # Initialize parameters
    weights_input_hidden, weights_hidden_output, bias_hidden, bias_output = initialize_parameters(input_size, hidden_size, output_size)

    for epoch in range(epochs):
        # Perform forward propagation
        hidden_output, final_output = forward_propagation(X, weights_input_hidden, weights_hidden_output, bias_hidden, bias_output)
        # Compute loss
        loss = compute_loss(Y, final_output)
        # Perform backpropagation
        grads = backpropagation(X, Y, hidden_output, final_output, weights_hidden_output, weights_input_hidden)
        # Update parameters
        weights_input_hidden, weights_hidden_output, bias_hidden, bias_output = update_parameters((weights_input_hidden, weights_hidden_output, bias_hidden, bias_output), grads, learning_rate)

        # Print loss every 1000 epochs
        if epoch % 1000 == 0:
            print(f"Epoch {epoch}, Loss: {loss}")

    return weights_input_hidden, weights_hidden_output, bias_hidden, bias_output
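To try the whole loop end to end, here is a compact, self-contained sketch on the XOR problem. The dataset, the hidden size of 4, the seed, and the epoch count are all assumptions chosen for illustration. It uses the same gradient form as the backpropagation function above, so the loss it tracks is mean squared error. How well it ends up fitting XOR depends on the random initialization.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# XOR inputs and targets (an assumed toy dataset)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(2)     # hypothetical seed
W1 = rng.random((2, 4)) - 0.5      # input -> hidden
b1 = np.zeros((1, 4))
W2 = rng.random((4, 1)) - 0.5      # hidden -> output
b2 = np.zeros((1, 1))
learning_rate = 1.0

def predict(X):
    hidden = sigmoid(X @ W1 + b1)
    return hidden, sigmoid(hidden @ W2 + b2)

def mse(Y, Y_hat):
    return float(np.mean((Y - Y_hat) ** 2))

loss_before = mse(Y, predict(X)[1])

for epoch in range(5000):
    hidden, out = predict(X)
    # Output and hidden deltas, using s * (1 - s) as the sigmoid derivative
    delta_out = (out - Y) * out * (1 - out)
    delta_hidden = (delta_out @ W2.T) * hidden * (1 - hidden)
    # Gradient descent step (in place)
    W2 -= learning_rate * hidden.T @ delta_out
    b2 -= learning_rate * delta_out.sum(axis=0, keepdims=True)
    W1 -= learning_rate * X.T @ delta_hidden
    b1 -= learning_rate * delta_hidden.sum(axis=0, keepdims=True)

loss_after = mse(Y, predict(X)[1])
print(f"loss before: {loss_before:.4f}, loss after: {loss_after:.4f}")
```

Try changing the hidden size, learning rate, or number of epochs and watch how the final loss responds.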

Conclusion:

Congratulations! You've successfully built a neural network from scratch using Python and NumPy. For the full code, please follow the Neural-network link, and feel free to experiment with different architectures, datasets, and hyperparameters to deepen your understanding of neural networks.

Happy Coding!👨‍💻

What is Prompt Engineering?

Nilavya Das — Tue, 10 Oct 2023 06:44:29 +0000

Today, we're unravelling the mysteries of Prompt Engineering, a fascinating technique that gives us the power to guide AI's behaviour and make it do some pretty amazing tricks.

The Basics

Suppose you want AI to do a job for you. You have to give it proper instructions, or else it may create a mess. That's where Prompt Engineering comes into play!

Let's begin with: what is a prompt?

A prompt is a set of instructions given to an AI model. It can be a phrase, a sentence, or any text that helps the AI generate a specific output or response.

Why does that matter?

Imagine you're trying to teach the robot (AI) a new language. If you just throw random words at it, it won't learn. But if you structure your words like a language lesson, the robot (AI) can learn and become a master linguist! Prompt Engineering helps us get the best out of AI by structuring our "lessons" effectively.

Key Points

Be Clear and Specific: Just like talking to a friend, being clear with AI helps you get the best responses.
Guide the AI: You're the driver, and the prompt is your car. Craft it to get the desired performance from AI.
Experiment and Learn: The more you play with prompts, the better you'll become at instructing AI to do what you want.
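To make "clear and specific" a bit more concrete, here is a small sketch in Python that contrasts a vague prompt with a structured one. The `build_prompt` helper and its fields are hypothetical, just one possible way to organize a prompt, and no particular AI service is assumed.

```python
def build_prompt(task, audience, output_format, constraints):
    """Assemble a structured prompt from its parts (hypothetical helper)."""
    lines = [
        f"Task: {task}",
        f"Audience: {audience}",
        f"Output format: {output_format}",
        "Constraints:",
    ]
    # One bullet per constraint keeps each rule easy for the model to follow
    lines += [f"- {constraint}" for constraint in constraints]
    return "\n".join(lines)

vague_prompt = "Write about dogs."

specific_prompt = build_prompt(
    task="Write a short guide to choosing a first dog",
    audience="First-time pet owners living in apartments",
    output_format="Five bullet points",
    constraints=["Keep it under 120 words", "Use a friendly tone"],
)

print(vague_prompt)
print(specific_prompt)
```

The second prompt tells the AI what to do, who it is for, and what shape the answer should take, which leaves far less to guesswork.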

As I dig deeper into Prompt Engineering, I can't wait to see how it transforms the way we interact with AI and the potential it holds for countless applications.

Have you explored Prompt Engineering yet? Share your experiences and let's geek out about this together!

Unleashing the Power of Statistical Analysis in Machine Learning: Exploring Common Methods and Techniques

Nilavya Das — Wed, 03 May 2023 20:28:16 +0000

Hello readers! Today I want to talk about the importance of statistical analysis in machine learning.

Statistical analysis involves

collecting,
organizing,
summarizing, and
interpreting data

to find patterns, trends, and relationships. This helps us better understand the data and make informed decisions based on evidence.

Machine learning uses statistical techniques to learn from data and provide predictions or recommendations, especially for challenging problems that traditional programming methods find difficult to handle.

Statistical analysis helps us prepare data for machine learning, select the right features, evaluate model performance, and interpret results. Machine learning helps us discover new insights from the data, test hypotheses, and make predictions.

There are some common statistical methods used in machine learning such as

descriptive statistics,

inferential statistics,

correlation,

regression,

classification,

clustering,

and dimensionality reduction.
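As a small taste of the first few methods on that list, here is a sketch using NumPy on made-up data (study hours versus test scores, constructed to be perfectly linear so the results are easy to check by eye).

```python
import numpy as np

hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = 2.0 * hours + 1.0          # made-up data: 3, 5, 7, 9, 11

# Descriptive statistics
mean_score = scores.mean()
std_score = scores.std()            # population standard deviation

# Correlation between the two variables (Pearson's r)
r = np.corrcoef(hours, scores)[0, 1]

# Simple linear regression: fit scores = slope * hours + intercept
slope, intercept = np.polyfit(hours, scores, deg=1)

print(mean_score, std_score, r, slope, intercept)
```

With real data the correlation will rarely be exactly 1, and the fitted line will only approximate the points.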

I hope you enjoyed reading the blog and gained some knowledge from it. Please leave a comment below if you have any questions or feedback. Thank you for your time and attention!

Find the Script Execution Time in Python

Nilavya Das — Mon, 19 Jul 2021 15:22:35 +0000

As programmers, one of our main aims is to optimize our programs so that they take less time to execute.

Using the time module in Python, we can find the execution time of a program.

Let's start by writing a simple Python script.

def BinarySearch(arr, val):
    first = 0
    last = len(arr)-1
    index = -1
    while (first <= last) and (index == -1):
        mid = (first+last)//2
        if arr[mid] == val:
            index = mid
        else:
            if val<arr[mid]:
                last = mid -1
            else:
                first = mid +1
    return index

array = [1, 2, 3, 4, 7, 8, 10]  # binary search requires a sorted list
result = BinarySearch(array, 4)

So now let's import the time module and:

Initialize a variable to store the time at the beginning of the execution,

Initialize another variable at the end to store the time after the execution,

Then find the difference between the start and end times to get the time required for the execution.

import time  #import the module

def BinarySearch(arr, val):
  first = 0
  last = len(arr)-1
  index = -1
  while (first <= last) and (index == -1):
      mid = (first+last)//2
      if arr[mid] == val:
          index = mid
      else:
          if val<arr[mid]:
              last = mid -1
          else:
              first = mid +1
  return index


start = time.time() #store the starting time 

a=[1,2,3,4,5,6,7,8,9,10]
result = BinarySearch(a, 5)
print(f'Number found at {result}')
time.sleep(1)  # sleeping for 1 sec to get a runtime of about 1 sec

end = time.time() #store the ending time
elapsed = "{:.3f}".format(end - start) #time required for the execution (named elapsed so the time module is not overwritten)
print(f"time required : {elapsed} sec")

The output will look like this:

@nilavya~/Desktop
> python script.py 
Number found at 4
time required : 1.001 sec
@nilavya~/Desktop
>

In this way we can get the time required for the execution.

Measuring the execution time of a program is very useful when optimizing your Python script to perform better.
The technique comes in handy when you have to optimize a complex algorithm in Python.
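For very short pieces of code, a single `time.time()` measurement can be too coarse. As an alternative sketch, the standard library also offers `time.perf_counter()`, a high-resolution clock meant for measuring short durations, and the `timeit` module, which repeats a call many times. The `binary_search` function below is my own compact rewrite of the search above, and the repeat count of 10,000 is an arbitrary choice.

```python
import time
import timeit

def binary_search(arr, val):
    # Compact rewrite of the BinarySearch function above
    first, last = 0, len(arr) - 1
    while first <= last:
        mid = (first + last) // 2
        if arr[mid] == val:
            return mid
        if val < arr[mid]:
            last = mid - 1
        else:
            first = mid + 1
    return -1

data = list(range(1, 11))   # sorted list 1..10

# Time a single call with the high-resolution clock
start = time.perf_counter()
result = binary_search(data, 5)
elapsed = time.perf_counter() - start

# Time many repeated calls to smooth out noise
total = timeit.timeit(lambda: binary_search(data, 5), number=10_000)

print(f"index {result}, single run {elapsed:.6f} s, 10,000 runs {total:.6f} s")
```

The exact numbers will differ from machine to machine and from run to run, so it is best to compare timings taken on the same machine.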

Do drop your views in the comment section 😊.