DEV Community: Sandip Subedi

Teaching a Webcam to See: Building a Real-Time Object Detection Pipeline with YOLOv8

Sandip Subedi — Thu, 16 Jul 2026 03:51:12 +0000

A few weeks ago, my portfolio was mostly tabular data — CSVs, groupby(), Seaborn charts, the occasional RAG pipeline. Useful skills, but all of it lived inside static datasets. I wanted to try something that moved: a model that could look at a live video feed and tell me what it was seeing, frame by frame, in real time.

This post walks through what I built, what broke along the way, and what I'd tell someone starting the same project today.

The goal

Detect people and vehicles — starting from a single photo, then working up to a live webcam feed. Nothing exotic. The interesting part wasn't the end result; it was the gap between "it works on one image" and "it works live, continuously, without falling over."

Stack: YOLOv8 (Ultralytics) for detection, OpenCV for video I/O and drawing, Python holding it all together.

Stage 1 — Prove it works on a single image first

Before touching a webcam, I ran YOLOv8 on one static test photo. This sounds like a trivial step, and it is — which is exactly why I didn't skip it.

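import cv2
import matplotlib.pyplot as plt
from ultralytics import YOLO

image = cv2.imread("bike_test.jpg")  # returns None if the file can't be read
if image is None:
    raise FileNotFoundError("bike_test.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; matplotlib expects RGB

plt.imshow(image)
plt.axis("off")
plt.show()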

One thing that trips up almost everyone the first time: OpenCV reads images in BGR, not RGB. Skip the cv2.cvtColor(..., cv2.COLOR_BGR2RGB) step and your colors come out looking sunburnt — reds and blues swapped. It's a one-line fix, but only if you know to look for it.
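To make the swap concrete, here is a minimal pure-Python sketch of what the conversion does to a 3-channel image: it reverses the channel order of every pixel. The nested-list image and the helper name bgr_to_rgb are illustrative assumptions, not part of the project; real code would keep calling cv2.cvtColor on the NumPy array.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a rows x cols x 3 nested list."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A made-up 1x2 image: one pure-blue and one pure-red pixel, stored as (B, G, R).
bgr_image = [[(255, 0, 0), (0, 0, 255)]]
rgb_image = bgr_to_rgb(bgr_image)
print(rgb_image)
```

Applying the same reversal twice returns the original image, which is why displaying a BGR array as if it were RGB only swaps reds and blues and leaves greens alone.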

From there, loading and running YOLOv8 is almost anticlimactic:

model = YOLO("yolov8n.pt")  # the "n" (nano) variant — fastest, good enough for a first pass
results = model("bike_test.jpg")
results[0].save(filename="output.jpg")

Three lines, and I had bounding boxes around a bicycle in the test image. Sanity-checking on a single frame here saved me from debugging model-loading issues and real-time performance issues at the same time later — one problem at a time.

Stage 2 — Take it live

Once the model worked on a photo, the next step was pointing it at a webcam instead of a file path. Conceptually it's the same call, just wrapped in a loop:

model = YOLO('yolov8n.pt')
cap = cv2.VideoCapture(0)  # 0 = default webcam

while True:
    success, frame = cap.read()
    if not success:
        break

    results = model(frame, stream=True)

    for result in results:
        for box in result.boxes:
            label = result.names[int(box.cls)]
            if label == 'person':
                x1, y1, x2, y2 = map(int, box.xyxy[0])
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                cv2.putText(frame, label, (x1, y1 - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

    cv2.imshow('Live Person Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

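stream=True matters here: it makes Ultralytics return results as a generator instead of building a list. The memory savings show up mostly when you hand the model a whole video file or stream as its source, where a list would hold every frame's results at once. With one frame per call, as in this loop, each call yields a single result, so the flag mainly keeps things lazy. The first time I ran this and saw a green box lock onto me and follow me around the frame, it didn't feel like "a few lines of code"; it felt like the thing had noticed me.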

Stage 3 — Reaching too far, and what I learned from pulling back

The natural next step felt obvious: why stop at "there's a person," when I could also guess their emotion and gender? I wired in DeepFace on top of the YOLO output, cropping each detected person and running facial analysis on the crop.

It worked — briefly, on paper. In practice, DeepFace pulls in TensorFlow as a dependency, and TensorFlow on Windows has a well-earned reputation for being finicky about versions, CUDA, and DLLs. I lost more time fighting installation errors than I spent on the actual computer vision.

I tried a leaner alternative — swapping DeepFace for two small models loaded directly through OpenCV's own dnn module (a Caffe model for gender, an ONNX model for emotion), which avoids the TensorFlow dependency entirely. It's a genuinely good pattern, and I'd recommend it to anyone who wants this feature. But for this project, I made a deliberate call to cut it. Not every feature that's technically possible is worth the added fragility, especially in something meant to be a clean, demonstrable portfolio piece. I'd rather ship two stages that run reliably every time than three stages where the last one might not run on your machine.

That's the real lesson from this project, more than any single line of code: knowing when to cut scope is part of the engineering, not a failure of it.

What I'd tell someone starting this today

Validate on one static example before going real-time. It isolates whether the problem is your model setup or your real-time loop.
cv2.cvtColor(..., cv2.COLOR_BGR2RGB) is not optional if you're displaying OpenCV images anywhere outside OpenCV's own imshow.
stream=True on video keeps Ultralytics from silently ballooning memory usage over a long-running loop.
Clamp your bounding boxes. max(0, x1) and min(frame.shape[1], x2) cost you two lines and save you from a crash the moment a detection box clips the edge of the frame.
Heavier isn't always better. A dependency like TensorFlow-via-DeepFace can cost you more in reliability than it gives you in features. Lighter, more surgical tools (OpenCV's own dnn module, in this case) are often the better trade for a portfolio project meant to just work when someone clones it.
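The clamping tip above can be sketched as a small helper. The function name clamp_box and the example numbers are assumptions for illustration; in the webcam loop, the width and height would come from frame.shape[1] and frame.shape[0].

```python
def clamp_box(x1, y1, x2, y2, frame_width, frame_height):
    """Clip box corners to the frame so they are safe to use as slice bounds."""
    x1 = max(0, x1)
    y1 = max(0, y1)
    x2 = min(frame_width, x2)
    y2 = min(frame_height, y2)
    return x1, y1, x2, y2


# A made-up detection that spills past the left, right, and bottom edges of a 640x480 frame.
print(clamp_box(-12, 30, 700, 500, 640, 480))
```

One detail worth knowing: a negative x1 does not raise on its own, because NumPy reads it as an offset from the end of the axis. The crop comes back wrong or empty, and the failure usually surfaces later, when that crop is passed to another function.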

What's next

I'm keeping the gender/emotion layer as a documented "next step" rather than shipping it half-working — I'd rather build it properly with the OpenCV DNN approach once I've got a clean way to package the model downloads. Beyond that, I want to add an on-screen FPS counter so I can actually quantify the cost of each added layer, instead of eyeballing "does this feel slower."
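As a starting point for that FPS counter, here is a minimal sketch that averages over the most recent frames instead of timing a single frame, which tends to jitter. The class name FpsMeter, the 30-frame window, and the optional now argument (there only so the example can be checked with fixed timestamps) are all assumptions, not code from the notebook.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the current estimate (0.0 until two frames exist)."""
        self.stamps.append(time.perf_counter() if now is None else now)
        if len(self.stamps) < 2:
            return 0.0
        elapsed = self.stamps[-1] - self.stamps[0]
        # N timestamps span N - 1 frame intervals.
        return (len(self.stamps) - 1) / elapsed if elapsed > 0 else 0.0


# Simulated frames arriving every 0.1 s.
meter = FpsMeter(window=5)
for stamp in (0.0, 0.1, 0.2, 0.3):
    fps = meter.tick(now=stamp)
print(round(fps, 1))
```

In the live loop, one call to meter.tick() per frame would give a value that could be drawn with the same cv2.putText call already used for the labels.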

If you want to see the full notebook, it's on my GitHub — feedback and issues welcome, especially if you spot something I'd benefit from unlearning early rather than later.

Repo: github.com/sandipsubedi0
Connect: linkedin.com/in/sandip-subedi01

🤖 I Built a Semantic FAQ Bot That Understands Meaning Instead of Keywords | Project #4

Sandip Subedi — Fri, 03 Jul 2026 13:53:09 +0000

Project #4 of my AI & Machine Learning journey

Most beginner FAQ chatbots work only when the user's question exactly matches the stored question.

Ask:

"What is Machine Learning?"

and it works.

But ask:

"ML"

"Can you explain machine learning?"

and many traditional FAQ bots completely fail.

I wanted to solve this problem by building a chatbot that understands the meaning behind a question instead of simply matching keywords.

That's exactly why I built my Semantic FAQ Bot.

🚀 What is a Semantic FAQ Bot?

A Semantic FAQ Bot uses sentence embeddings instead of keyword matching.

Rather than checking whether two sentences contain the same words, it converts both the user's query and every FAQ question into numerical vectors (embeddings).

It then finds the FAQ whose meaning is most similar to the user's question using Cosine Similarity.

This allows the chatbot to understand:

abbreviations
paraphrased questions
casual language
differently worded queries

without needing exact text matches.

🎯 Problem with Traditional FAQ Bots

Imagine your FAQ contains:

What is Machine Learning?

A traditional bot may fail for questions like:

ML
Explain Machine Learning
What does ML mean?
Tell me about Machine Learning

because none of them are exact matches.

Semantic Search solves this problem beautifully.

🧠 How My Bot Works

The workflow is surprisingly simple.

Step 1

Every FAQ question is converted into a 384-dimensional embedding using the all-MiniLM-L6-v2 Sentence Transformers model.

Step 2

When a user asks a question, that question is also converted into an embedding.

Step 3

The bot calculates the similarity between the user's embedding and every FAQ embedding using Cosine Similarity.

Step 4

The highest-scoring question is selected.

Step 5

If the similarity score is above a confidence threshold, the corresponding answer is returned.

Otherwise, the bot politely says it doesn't know the answer instead of giving incorrect information.
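Steps 3 to 5 can be sketched in plain Python. This is an illustration under stated assumptions rather than the bot's actual code: the 3-dimensional vectors stand in for the real 384-dimensional embeddings, the placeholder answers and the 0.5 threshold are made up, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query_vec, faq_vecs, answers, threshold=0.5):
    """Return (answer, score) for the closest FAQ, or (None, score) below the threshold."""
    scores = [cosine_similarity(query_vec, vec) for vec in faq_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None, scores[best]
    return answers[best], scores[best]


# Toy stand-ins for pre-computed FAQ embeddings and their answers.
faq_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
answers = ["answer about machine learning", "answer about NLP"]

print(best_match([0.9, 0.1, 0.0], faq_vecs, answers))  # close to the first FAQ
print(best_match([0.0, 0.0, 1.0], faq_vecs, answers))  # unrelated to both
```

The threshold is the knob that trades wrong answers for "I don't know" replies, so it is worth tuning against real paraphrases instead of trusting a default.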

⚙️ Tech Stack

Python
Sentence Transformers
all-MiniLM-L6-v2
NumPy
Scikit-learn
Cosine Similarity

✨ Features

✅ Semantic Search instead of keyword matching

✅ Confidence Score for every prediction

✅ Around 90 built-in AI, Python, Data Science and Machine Learning FAQs

✅ Fast response using pre-computed embeddings

✅ Easily expandable knowledge base

✅ Clean and beginner-friendly implementation

📌 Example

User asks

ML

Bot understands

What is Machine Learning?

Response

Machine Learning is a field of AI where computers learn patterns from data.

Another example:

User asks

NLP stands for?

Bot correctly matches

What is NLP?

Even though the wording is completely different.

💡 What I Learned

While building this project, I learned about:

Sentence Embeddings
Vector Representations
Semantic Search
Cosine Similarity
Text Similarity
Efficient Embedding Reuse
Confidence Thresholding
Building Intelligent FAQ Systems

This project gave me a much deeper understanding of how modern AI systems retrieve relevant information.

🔥 Future Improvements

This project is only the beginning.

Some upgrades I plan to implement include:

Loading FAQs from CSV or JSON files
Integrating FAISS for large-scale vector search
Building a FastAPI backend
Creating a Streamlit web interface
Converting it into a Retrieval-Augmented Generation (RAG) chatbot using Large Language Models

📚 Why This Project Matters

Semantic Search is one of the core building blocks behind many modern AI applications.

Understanding embeddings and similarity search opens the door to building:

AI Chatbots
Document Search Systems
Recommendation Engines
RAG Applications
AI Knowledge Bases
Enterprise Search Systems

Building this project helped me move beyond basic Machine Learning and into practical NLP applications.

🎯 Final Thoughts

This is Project #4 in my AI & Machine Learning learning journey.

Every project I build teaches me something new, and this one introduced me to the power of semantic understanding.

Instead of matching words, the chatbot understands meaning—a small but important step toward building more intelligent AI systems.

There is still a long road ahead, but every project gets me closer to becoming a skilled AI Engineer.

Thanks for reading!

👨‍💻 About Me

Hi! I'm Sandip Subedi, an aspiring AI & Machine Learning Engineer from Nepal. I'm documenting my journey by building practical projects in Python, Machine Learning, NLP, and Retrieval-Augmented Generation (RAG), sharing everything I learn along the way.

📬 Let's Connect

GitHub: https://github.com/sandipsubedi0/semantic-faq-bote
LinkedIn: www.linkedin.com/in/sandip-subedi-5694b136a
Hashnode: https://hashnode.com/edit/cmr4zr5u100000akm6uavg0oc
Email: sandipsubedi012@gmail.com

If you enjoy following real-world AI projects, feel free to connect with me. I'm always excited to learn, collaborate, and grow with the developer community. 🚀

Titanic Survival Analysis — What the Data Reveals About Who Lived and Who Died

Sandip Subedi — Thu, 11 Jun 2026 09:41:42 +0000

The Titanic disaster of 1912 is one of the most studied events in history. Over 1,500 people lost their lives when the ship sank in the North Atlantic. But when you look at the passenger data, a clear pattern emerges — survival was not random. Your chances of surviving depended heavily on who you were and where you sat on the ship.

In this project, I analyzed the Titanic passenger dataset to answer one central question: What factors determined whether a passenger survived?

This is my second data analysis project. My first project was an HR Employee Attrition Analysis — if you haven't read that one yet, check it out. For this project, I followed the same structured 5-phase approach and pushed myself to go deeper with the visualizations.

Full notebook on GitHub: github.com/sandipsubedi0/titanic-survival-analysis

Dataset Overview
Source: Kaggle — Titanic: Machine Learning from Disaster

Rows: 891 passengers

Columns: 12 features including Age, Sex, Pclass, Fare, Cabin, Embarked, and Survived

Tools used: Python, Pandas, NumPy, Matplotlib, Seaborn

Phase 1 — Setup and Data Loading
I started by importing the necessary libraries and loading the dataset using a relative file path. A simple but important habit — never use a hardcoded local path like C:\Users... in a shared notebook, because it will break on every other computer.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set_style("whitegrid")

data = pd.read_csv("Titanic-Dataset.csv")
data.head()

First look at the data: 891 rows, 12 columns. The dataset includes passenger demographics, ticket details, cabin information, and whether they survived.

Phase 2 — Data Exploration (Before Cleaning)
Before touching anything, I explored the raw data to understand what I was working with.

Missing values:

Column      Missing Count   Missing %
Cabin       687             77.1%
Age         177             19.9%
Embarked    2               0.2%

The missing values heatmap made this visually clear: Cabin had a massive gap running through the entire column.
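The percentages are just the missing counts divided by the 891 rows. A quick check in plain Python, using only the counts reported above (the variable names are illustrative):

```python
TOTAL_ROWS = 891
missing_counts = {"Cabin": 687, "Age": 177, "Embarked": 2}

# Share of rows with a missing value, as a percentage rounded to one decimal.
missing_pct = {
    column: round(100 * count / TOTAL_ROWS, 1)
    for column, count in missing_counts.items()
}
print(missing_pct)
```

On the DataFrame itself, data.isnull().mean() * 100 should give the same figures in one step, since the mean of a True/False column is the fraction of True values.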

I also ran data.describe() to check the numerical columns. A few things stood out immediately:

Age ranged from 0.42 to 80 years — the youngest passenger was less than 1 year old

Fare had a huge range — minimum 0, maximum 512 — signaling strong economic inequality among passengers

Only 38% of passengers survived (Survived mean = 0.38)

For value_counts(), I only checked meaningful categorical columns: Survived, Pclass, Sex, and Embarked. Columns like PassengerId, Name, and Ticket are identifiers (unique or nearly so), so counting their values produces no useful insight.

Phase 3 — Data Cleaning
I made a working copy of the original data before applying any changes — always keep the raw data intact as a reference.

df = data.copy()

Three cleaning decisions:

Age: filled with median

Age had 177 missing values (19.9%). I filled with the median, not the mean. Why? Age has outliers (very young children, elderly passengers) that pull the mean away from the typical passenger. The median is more robust.

df["Age"] = df["Age"].fillna(df["Age"].median())  # assign back; inplace fillna on df["Age"] no longer updates df under pandas copy-on-write

Cabin: dropped entirely

77.1% of values were missing. That's too high to fill reliably; any filling method would be guesswork on that scale. Dropping it was the right call.

df.drop(columns=["Cabin"], inplace=True)

Embarked: filled with mode

Only 2 values missing. With such a small gap, filling with the most common value (mode) is perfectly safe.

df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])  # assign back instead of inplace on a column selection

Verification: After cleaning, df.isnull().sum() showed zero missing values across all columns. The after-cleaning heatmap confirmed this — completely blank, exactly what we want to see.
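The median-over-mean choice for Age is easy to see on a tiny example. The ages below are made up for illustration and are not taken from the dataset:

```python
from statistics import mean, median

ages = [22, 24, 26, 28, 30]        # made-up, evenly spread ages
ages_with_outlier = ages + [80]    # add one elderly passenger

mean_shift = mean(ages_with_outlier) - mean(ages)
median_shift = median(ages_with_outlier) - median(ages)
print(mean_shift, median_shift)
```

One extreme value moves the mean by 9 years but the median by only 1, which is the sense in which the median is the more robust fill value.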

Phase 4 — Exploratory Data Analysis and Visualizations
This is where the real story begins. I built 7 charts, each designed to answer a specific question.

Chart 1 — Overall Survival Count

The first question: how many people actually survived?

Out of 891 passengers, 342 survived (38.4%) and 549 did not (61.6%). More than 6 in 10 people on the Titanic did not make it. That sets the baseline for everything that follows.

Chart 2 — Survival Rate by Gender
This is where the data gets striking.

Female survival rate: ~74%

Male survival rate: ~19%

Women were nearly 4x more likely to survive than men. This is the clearest pattern in the entire dataset. The "women and children first" evacuation protocol was not just a phrase — the data confirms it was actually followed.

I used a bar chart here (not a pie chart). Survival rates are separate values for two groups — they don't add to 100% of anything, so a pie chart would be misleading.

Chart 3 — Survival Rate by Passenger Class

Passenger class tells us where on the ship you were located — and how close you were to the lifeboats.

1st Class: ~63% survival rate

2nd Class: ~47% survival rate

3rd Class: ~24% survival rate

The survival gap between 1st and 3rd class is enormous. Third-class passengers were housed in the lower decks — further from the lifeboats and with less time to reach the top deck. Economic status directly influenced survival chances.

Chart 4 — Age Distribution of All Passengers

A histogram of all passenger ages shows a right-skewed distribution. Most passengers were between 20 and 40 years old. There were relatively few children and elderly passengers compared to working-age adults.

The youngest passenger recorded was under 1 year old. The oldest was 80.

Chart 5 — Age by Survival (Overlapping Histogram)
This is one of the most informative charts in the project. By plotting two histograms on the same axes — one for survivors, one for non-survivors — with alpha=0.6 on both so they're visible through each other, the overlap pattern becomes clear.

plt.hist(df[df["Survived"]==1]["Age"], alpha=0.6, label="Survived", bins=20)
plt.hist(df[df["Survived"]==0]["Age"], alpha=0.6, label="Did not survive", bins=20)
plt.legend()

Young children (under ~10) show a higher proportion of survivors relative to non-survivors — consistent with "children first." For adults aged 20–40, non-survivors heavily outnumber survivors, reflecting the large number of 3rd-class male passengers in that age group.

Chart 6 — Fare Distribution

The fare histogram reveals extreme economic inequality on board. The distribution is heavily right-skewed: the vast majority of passengers paid low fares (under £50), while a small number paid extremely high amounts (up to about £512).

This roughly maps to passenger class: 3rd-class passengers paid low fares, 1st-class passengers paid high fares. And as we saw in Chart 3, class directly correlated with survival.

Chart 7 — Gender × Class Survival Heatmap

This is the most powerful chart in the project. Instead of looking at gender and class separately, I combined them into a single heatmap using a pivot table.

pivot = df.pivot_table(values="Survived", index="Sex", columns="Pclass", aggfunc="mean")
sns.heatmap(pivot, annot=True, fmt=".2f", cmap="Blues")

The results:

          1st Class   2nd Class   3rd Class
Female    ~0.97       ~0.92       ~0.50
Male      ~0.37       ~0.16       ~0.14

1st-class females had a ~97% survival rate. They were almost certain to survive. 3rd-class males had a ~14% survival rate. They had almost no chance.

The difference between those two groups is 83 percentage points — from the same disaster, on the same ship, at the same time.
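It helps to see what the pivot table is computing: because Survived is 0 or 1, its mean within each (Sex, Pclass) group is that group's survival rate. Below is a minimal plain-Python sketch of the same grouping. The rows are made up for illustration and are not Titanic records.

```python
from collections import defaultdict

# Made-up rows in the shape (Sex, Pclass, Survived).
rows = [
    ("female", 1, 1), ("female", 1, 1),
    ("female", 3, 1), ("female", 3, 0),
    ("male", 1, 0), ("male", 1, 1),
    ("male", 3, 0), ("male", 3, 0), ("male", 3, 0), ("male", 3, 1),
]

# (sex, pclass) -> [survivors, passengers]
totals = defaultdict(lambda: [0, 0])
for sex, pclass, survived in rows:
    totals[(sex, pclass)][0] += survived
    totals[(sex, pclass)][1] += 1

# Mean of a 0/1 column = survivors / passengers.
survival_rate = {group: s / n for group, (s, n) in totals.items()}
for group, rate in sorted(survival_rate.items()):
    print(group, round(rate, 2))
```

Each cell in the heatmap is one of these group means, so a cell backed by only a handful of passengers deserves more caution than one backed by hundreds.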

Phase 5 — Key Findings and Conclusion
Key Findings
Only 38% of passengers survived — the majority of people on board did not make it.

Gender was the strongest single factor — female passengers survived at ~74% vs ~19% for males, confirming the "women and children first" evacuation protocol was followed.

Passenger class determined access to lifeboats — 1st class survived at ~63%, 3rd class at only ~24%. Where you sat on the ship directly affected your survival.

The combined effect was extreme — 1st-class females had a ~97% survival rate while 3rd-class males had only ~14%. The gap between best and worst case is 83 percentage points.

Children showed higher survival rates — the overlapping age histogram showed young children were more likely to survive relative to adults.

Fare inequality mirrored class inequality — most passengers paid very little, a few paid enormous amounts, and higher fare strongly correlated with higher survival.

Conclusion
The Titanic data tells a clear story: survival was not random. Gender and passenger class were the two dominant factors, and when combined, they produced an extreme range of outcomes. A 1st-class female passenger had near-certain survival. A 3rd-class male passenger had almost no chance.

The "women and children first" protocol was real — the data proves it. But access to the upper decks, proximity to lifeboats, and crew assistance were all filtered through socioeconomic status. Wealthier passengers had structural advantages that translated directly into survival.

This project taught me how to move from raw data to real insight — not just running code, but understanding what the numbers actually mean about human lives.

What's Next
This is Project 2 of my data analyst portfolio. I'm continuing to build projects that cover real-world datasets and develop my skills in Python, Pandas, and visualization.

Connect with me:

🔗 GitHub: github.com/sandipsubedi0

💼 LinkedIn: linkedin.com/in/sandip-subedi-5694b136a

📸 Instagram: @sandipsubedi0

Thanks for reading. If you found this useful, share it with someone learning data analysis.