<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: mungaime-25</title>
    <description>The latest articles on DEV Community by mungaime-25 (@mungaime25).</description>
    <link>https://dev.to/mungaime25</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2761089%2F4b268d4d-c9fb-42b6-824d-aea786fac2d7.jpg</url>
      <title>DEV Community: mungaime-25</title>
      <link>https://dev.to/mungaime25</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mungaime25"/>
    <language>en</language>
    <item>
      <title>UNSUPERVISED LEARNING</title>
      <dc:creator>mungaime-25</dc:creator>
      <pubDate>Sun, 27 Jul 2025 17:55:55 +0000</pubDate>
      <link>https://dev.to/mungaime25/unsupervised-learning-41b3</link>
      <guid>https://dev.to/mungaime25/unsupervised-learning-41b3</guid>
      <description>&lt;p&gt;&lt;strong&gt;INTRODUCTION TO UNSUPERVISED LEARNING&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Unsupervised learning is a type of machine learning in which the model is not given any labels. Instead, it tries to find patterns, structures, or relationships in the input data without any human supervision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Main characteristics of unsupervised learning&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;No labeled outputs.&lt;/li&gt;
&lt;li&gt;The system learns patterns from raw data.&lt;/li&gt;
&lt;li&gt;Focuses on data exploration and dimensionality reduction.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;TYPES OF UNSUPERVISED LEARNING&lt;/strong&gt;&lt;br&gt;
There are two main types of unsupervised learning:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;CLUSTERING&lt;br&gt;
Clustering is the process of grouping similar data points together such that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Points in the same cluster are very similar.&lt;/li&gt;
&lt;li&gt;Points in different clusters are very different.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;DIMENSIONALITY REDUCTION&lt;br&gt;
Reducing the number of input variables while preserving key information (e.g., PCA, t-SNE).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Common Clustering Algorithms&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;1. K-Means Clustering&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
K is the number of clusters to form.&lt;/p&gt;

&lt;ul&gt;
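&lt;li&gt;The algorithm tries to find K centroids (central points).&lt;/li&gt;
&lt;li&gt;It assigns each data point to the nearest centroid, then moves each centroid to the mean of its assigned points, repeating until the assignments stop changing.
&lt;/li&gt;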
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.cluster import KMeans
import pandas as pd

# Sample data
data = pd.DataFrame({
    'Income': [45, 54, 67, 120, 130, 150],
    'Spending': [50, 60, 65, 90, 85, 95]
})

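# Fix the seed and the number of restarts so the result is reproducible
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
kmeans.fit(data)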

print("Cluster centers:\n", kmeans.cluster_centers_)
print("Labels:", kmeans.labels_)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
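
&lt;p&gt;To make the assign-and-update idea concrete, here is a minimal from-scratch sketch of the K-Means loop in plain Python. It is not how scikit-learn implements the algorithm: the kmeans helper, the fixed 10 iterations, and the choice of the first K points as starting centroids are all assumptions made for illustration.&lt;/p&gt;

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means on a list of (x, y) tuples; returns (centroids, labels)."""
    # Assumption: use the first k points as starting centroids (simple, not robust)
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels

# Same Income/Spending pairs as the scikit-learn example
points = [(45, 50), (54, 60), (67, 65), (120, 90), (130, 85), (150, 95)]
centroids, labels = kmeans(points, k=2)
print("Centroids:", centroids)
print("Labels:", labels)
```

&lt;p&gt;On this data the loop settles after a few passes, with the three lower-income customers in one cluster and the three higher-income customers in the other.&lt;/p&gt;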


&lt;p&gt;&lt;strong&gt;&lt;em&gt;2. Hierarchical Clustering&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Doesn’t require you to specify the number of clusters in advance&lt;/li&gt;
&lt;li&gt;Creates a tree of clusters (dendrogram)&lt;/li&gt;
&lt;li&gt;You can "cut" the tree at any level to decide how many clusters you want&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Types:&lt;br&gt;
Agglomerative (Bottom-Up): Start with individual points and merge them&lt;br&gt;
Divisive (Top-Down): Start with one cluster and split&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import matplotlib.pyplot as plt
import pandas as pd
from scipy.cluster.hierarchy import dendrogram,linkage
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Sample data
data = pd.DataFrame({
    'Age': [25, 30, 45, 35, 50, 23, 40, 60],
    'Income': [30000, 40000, 50000, 45000, 80000, 32000, 60000, 90000]
})

# Standardize first: Ward linkage uses Euclidean distance, so the much
# larger Income values would otherwise dominate Age
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data[['Age', 'Income']])

# Build the merge tree (linkage matrix) with Ward's method
link = linkage(scaled_data, method='ward')

# Plot the dendrogram
plt.figure(figsize=(10, 6))
dendrogram(link, labels=list(range(1, len(data) + 1)), orientation='top',
           distance_sort='ascending', show_leaf_counts=True)
plt.title('Hierarchical dendrogram')
plt.xlabel('Data point')
plt.ylabel('Distance')
plt.show()

# Apply Agglomerative Clustering with 3 clusters
model = AgglomerativeClustering(n_clusters=3, linkage='ward')
labels = model.fit_predict(scaled_data)
data['Cluster'] = labels

# Visualize the clusters on the original (unscaled) axes
plt.scatter(data['Age'], data['Income'], c=data['Cluster'], cmap='Accent')
plt.xlabel('Age')
plt.ylabel('Income')
plt.title('Agglomerative Clustering')
plt.show()

# Evaluate: the silhouette score ranges from -1 to 1, higher is better
score = silhouette_score(scaled_data, labels)
print(f'Silhouette Score: {score:.4f}')

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
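
&lt;p&gt;To show what "bottom-up" merging means, here is a small plain-Python sketch that repeatedly merges the two closest clusters. Note the assumptions: it works on the Age column only and uses single linkage (distance between the nearest members), not the Ward linkage used in the SciPy and scikit-learn code, and the agglomerative helper is a hypothetical name.&lt;/p&gt;

```python
def agglomerative(values, n_clusters):
    """Bottom-up merging of 1-D values until n_clusters groups remain."""
    # Start with every point in its own cluster
    clusters = [[v] for v in values]
    while len(clusters) > n_clusters:
        best = None
        # Find the closest pair of clusters (single linkage: nearest members)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gap = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or best[0] > gap:
                    best = (gap, i, j)
        _, i, j = best
        # Merge the pair; j is larger than i, so popping j leaves index i valid
        clusters[i] = clusters[i] + clusters.pop(j)
    return clusters

ages = [25, 30, 45, 35, 50, 23, 40, 60]
for group in agglomerative(ages, n_clusters=3):
    print(sorted(group))
```

&lt;p&gt;Several pairs here are equally close, so which of them merge first depends on the order in which they are checked. Treat the exact groups as illustrative.&lt;/p&gt;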



&lt;p&gt;&lt;strong&gt;Dimensionality Reduction – Finding Simplicity in Complexity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Principal Component Analysis (PCA)&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduces many variables into fewer that still capture most of the information.&lt;/li&gt;
&lt;li&gt;Helps visualize high-dimensional data in 2D or 3D.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.decomposition import PCA
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data

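# Project the 4 original features onto the 2 directions of highest variance
pca = PCA(n_components=2)
reduced = pca.fit_transform(X)

print("Reduced shape:", reduced.shape)  # (150, 2)
# Share of the total variance kept by each component
print("Explained variance ratio:", pca.explained_variance_ratio_)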
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
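
&lt;p&gt;For intuition about what PCA computes, the sketch below finds the first principal component of two-column data by hand: center the data, build the covariance matrix, and take the eigenvector with the largest eigenvalue. This is a simplified illustration, not scikit-learn's SVD-based implementation; the first_component helper is a hypothetical name, and the closed-form eigenvalue formula only applies to the 2x2 case.&lt;/p&gt;

```python
import math

def first_component(points):
    """Direction of highest variance for 2-D points, plus its share of the variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form
    top = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    # Matching eigenvector (assumes sxy != 0, i.e. the columns are correlated)
    vx, vy = sxy, top - sxx
    norm = math.hypot(vx, vy)
    direction = (vx / norm, vy / norm)
    return direction, top / (sxx + syy), centered

# Income/Spending pairs reused from the K-Means example
points = [(45, 50), (54, 60), (67, 65), (120, 90), (130, 85), (150, 95)]
direction, share, centered = first_component(points)
# Project every centered point onto the direction: 2 columns become 1
scores = [x * direction[0] + y * direction[1] for x, y in centered]
print("Direction:", direction)
print("Share of variance kept:", round(share, 3))
print("1-D scores:", [round(s, 1) for s in scores])
```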



</description>
    </item>
    <item>
      <title>🔍 Understanding Supervised Learning: A Guide for Beginners</title>
      <dc:creator>mungaime-25</dc:creator>
      <pubDate>Mon, 21 Jul 2025 15:15:17 +0000</pubDate>
      <link>https://dev.to/mungaime25/understanding-supervised-learning-a-guide-for-beginners-4j10</link>
      <guid>https://dev.to/mungaime25/understanding-supervised-learning-a-guide-for-beginners-4j10</guid>
      <description>&lt;p&gt;In today’s world of data and artificial intelligence, Supervised Learning is one of the most commonly used techniques in machine learning. It powers everything from predicting house prices to detecting spam emails. But what exactly is supervised learning, and how does it work?&lt;/p&gt;

&lt;p&gt;Let’s break it down in simple terms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📘 What is Supervised Learning?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Supervised Learning is a type of machine learning where the model learns from labeled data. That means, for every input in the dataset, we already know the correct output.&lt;/p&gt;

&lt;p&gt;Think of it like teaching a child using flashcards:&lt;/p&gt;

&lt;p&gt;You show them a picture of a cat and tell them, “This is a cat.”&lt;/p&gt;

&lt;p&gt;You show a picture of a dog and say, “This is a dog.”&lt;br&gt;
After seeing many such examples, the child begins to recognize the difference and can identify new animals on their own.&lt;/p&gt;

&lt;p&gt;Similarly, in supervised learning, the algorithm is trained on data with known answers (labels), so it can later predict outcomes for new, unseen data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⚙️ How Does It Work?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Supervised learning works in two main stages:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Training Phase:&lt;/strong&gt;&lt;br&gt;
The model is fed a dataset containing inputs (also called features) and their correct outputs (labels). It tries to find a pattern or relationship between them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Testing or Prediction Phase:&lt;/strong&gt;&lt;br&gt;
Once trained, the model is given new inputs it hasn’t seen before, and it uses what it has learned to predict the outputs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧠 Types of Supervised Learning Problems&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There are two main types of problems in supervised learning:&lt;/p&gt;

&lt;p&gt;Regression: Predicts a continuous value&lt;br&gt;
Example: Predicting the price of a car based on mileage, brand, and model year.&lt;/p&gt;

&lt;p&gt;Classification: Predicts a category or label&lt;br&gt;
Example: Classifying emails as “Spam” or “Not Spam”.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔍 Popular Supervised Learning Models&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let’s explore a few common models used in supervised learning:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. 📈 Linear Regression&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use: For predicting numeric values.&lt;/p&gt;

&lt;p&gt;How it works: It fits the straight line that best represents the relationship between the input and the output, usually by minimizing the squared distance between the line and the data points.&lt;/p&gt;
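
&lt;p&gt;As a rough sketch of how such a line can be fitted, the snippet below applies ordinary least squares to a single feature in plain Python. The fit_line helper and the house sizes and prices are hypothetical values chosen for illustration, not real data.&lt;/p&gt;

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    # The best-fit line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: house size in square metres, price in thousands
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]
slope, intercept = fit_line(sizes, prices)
print(f"price = {slope:.2f} * size + {intercept:.2f}")
print("Predicted price for a 90 m2 house:", slope * 90 + intercept)
```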

&lt;p&gt;Example: Predicting house prices based on the size of the house.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. 🌳 Decision Trees&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use: Can be used for both classification and regression.&lt;/p&gt;

&lt;p&gt;How it works: Think of it like a flowchart. It splits the data based on decision rules (e.g., “Is the age &amp;gt; 30?”), forming a tree-like structure.&lt;/p&gt;
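
&lt;p&gt;A decision rule like this is usually learned from the data. The sketch below shows one common way to pick a single split, assuming Gini impurity as the quality measure: try each candidate threshold and keep the one that leaves the purest groups. The helper names and the customer data are hypothetical, and a real tree would repeat this search recursively on each side.&lt;/p&gt;

```python
def gini(labels):
    """Gini impurity for 0/1 labels: 0 means the group is pure."""
    if not labels:
        return 0.0
    share_yes = sum(labels) / len(labels)
    return 2 * share_yes * (1 - share_yes)

def best_split(values, labels):
    """Try each value as a threshold; keep the lowest weighted impurity."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values)):
        left = [lab for v, lab in zip(values, labels) if threshold >= v]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best_score > score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical customers: age and whether they bought (1) or not (0)
ages = [22, 25, 28, 31, 35, 42, 50]
bought = [0, 0, 0, 1, 1, 1, 1]
threshold, impurity = best_split(ages, bought)
print(f"Rule: is the age > {threshold}?  (weighted Gini {impurity:.2f})")
```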

&lt;p&gt;Example: Classifying whether a customer will buy a product based on age, income, and past behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. 🚀 Gradient Boosting Machines (GBM)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use: For complex regression and classification tasks.&lt;/p&gt;

&lt;p&gt;How it works: GBM builds models in a sequence. Each new model tries to correct the errors of the previous one, gradually improving the performance.&lt;/p&gt;
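
&lt;p&gt;The "correct the previous errors" idea can be sketched in a few lines. The example below assumes squared-error loss, where the errors to correct are simply the residuals, and uses one-split trees (stumps) as the weak models. The function names, the toy data, the 20 rounds, and the 0.5 learning rate are illustrative assumptions, not the settings of any particular GBM library.&lt;/p&gt;

```python
def fit_stump(xs, residuals):
    """Best single-threshold regressor: predicts the mean residual on each side."""
    best = None
    # Skip the largest value so the right-hand side is never empty
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if threshold >= x]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left) + sum(
            (r - right_mean) ** 2 for r in right
        )
        if best is None or best[0] > error:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Each round fits a stump to the current residuals and adds a damped correction."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if threshold >= x else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, predictions

xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 3.1, 3.0, 5.2, 5.1]
base, stumps, predictions = boost(xs, ys)
start_error = sum((y - base) ** 2 for y in ys)
final_error = sum((y - p) ** 2 for y, p in zip(ys, predictions))
print(f"Squared error: {start_error:.3f} before boosting, {final_error:.3f} after")
```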

&lt;p&gt;Example: Predicting loan default risk in financial applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. 🧮 K-Nearest Neighbors (KNN)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use: Simple and effective for small datasets.&lt;/p&gt;

&lt;p&gt;How it works: It looks at the ‘k’ closest points (neighbors) to a new input and assigns the most common label (for classification) or average value (for regression).&lt;/p&gt;
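
&lt;p&gt;A minimal sketch of the classification case is shown below, assuming Euclidean distance and a simple majority vote. The knn_predict helper and the flower measurements are hypothetical examples, not taken from a real dataset.&lt;/p&gt;

```python
import math
from collections import Counter

def knn_predict(train, new_point, k=3):
    """Classify new_point by majority vote among its k nearest training points."""
    # train is a list of ((feature1, feature2), label) pairs
    nearest = sorted(train, key=lambda item: math.dist(item[0], new_point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical flowers: (petal length, petal width) in cm and a species label
flowers = [
    ((1.4, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((1.3, 0.2), "setosa"),
    ((4.5, 1.5), "versicolor"),
    ((4.7, 1.4), "versicolor"),
    ((4.4, 1.3), "versicolor"),
]
print(knn_predict(flowers, (1.6, 0.3)))  # setosa
print(knn_predict(flowers, (4.6, 1.4)))  # versicolor
```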

&lt;p&gt;Example: Classifying a flower species based on petal length and width.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. 🧠 Support Vector Machines (SVM)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use: Mainly for classification tasks.&lt;/p&gt;

&lt;p&gt;How it works: SVM finds the boundary (or hyperplane) that separates the different classes in the data with the widest possible margin.&lt;/p&gt;

&lt;p&gt;Example: Detecting whether an email is spam or not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧪 Real-Life Example: Predicting Student Grades&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let’s say we want to predict a student’s final grade based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hours studied&lt;/li&gt;
&lt;li&gt;Attendance rate&lt;/li&gt;
&lt;li&gt;Participation in class&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We would:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Collect data from past students with their actual grades.&lt;/li&gt;
&lt;li&gt;Train a regression model using this data.&lt;/li&gt;
&lt;li&gt;Use the model to predict the grade of a current student.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;🎯 Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Supervised learning is like learning with a teacher — the answers are given, and the model learns by example. It’s powerful, widely used, and forms the basis of many AI systems today.&lt;/p&gt;

&lt;p&gt;From predicting prices to classifying emails and diagnosing diseases, supervised learning is everywhere. By understanding its models, such as Linear Regression, Decision Trees, and Gradient Boosting, we unlock the potential to turn data into valuable predictions.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>datascience</category>
      <category>luxdev</category>
      <category>devops</category>
    </item>
    <item>
      <title>Introduction to SQL for data science</title>
      <dc:creator>mungaime-25</dc:creator>
      <pubDate>Wed, 16 Apr 2025 09:48:42 +0000</pubDate>
      <link>https://dev.to/mungaime25/introduction-to-sql-for-data-science-mom</link>
      <guid>https://dev.to/mungaime25/introduction-to-sql-for-data-science-mom</guid>
      <description>&lt;p&gt;Structured Query Language is a fundamental tool for any data scientist. It allows you to efficiently retrieve, manipulate and analyze structured data stored in  relational databases. SQL provides capabilities to extract insights.&lt;br&gt;
In this article we will cover the basics of SQL and essential queries for data science.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why SQL for Data Science&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Retrieval&lt;/strong&gt; - SQL enables efficient extraction of data from databases.&lt;br&gt;
&lt;strong&gt;Data Manipulation&lt;/strong&gt; - SQL lets us filter, aggregate, and transform data before analysis.&lt;br&gt;
&lt;strong&gt;Performance&lt;/strong&gt; - Relational databases are optimized to run SQL queries over large datasets.&lt;br&gt;
&lt;strong&gt;Integration&lt;/strong&gt; - SQL works seamlessly with Python, R, and BI tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic SQL Queries&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SELECT statement&lt;/strong&gt;&lt;br&gt;
The SELECT statement is used to retrieve data from a database.&lt;br&gt;
Example&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmyrnlrba4zzumniyp03p.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmyrnlrba4zzumniyp03p.PNG" alt="Image description" width="330" height="41"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This query returns the first and second names from the customer table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WHERE clause&lt;/strong&gt;&lt;br&gt;
Used to filter the rows of a table based on a condition.&lt;br&gt;
Example&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F960965moeifhrbju6g29.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F960965moeifhrbju6g29.PNG" alt="Image description" width="356" height="71"&gt;&lt;/a&gt;&lt;br&gt;
This query counts only the customers who are from Kisumu.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HAVING clause&lt;/strong&gt;&lt;br&gt;
Used to filter aggregated data, that is, groups produced by GROUP BY.&lt;br&gt;
Example&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5fmdjfdb9xt6ev4nxb2.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5fmdjfdb9xt6ev4nxb2.PNG" alt="Image description" width="231" height="134"&gt;&lt;/a&gt;&lt;br&gt;
This query counts the total orders, keeping only the customers who have more than one order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ORDER BY&lt;/strong&gt;&lt;br&gt;
Used to sort data in a specified order, either ascending (ASC) or descending (DESC).&lt;br&gt;
&lt;strong&gt;N/B&lt;/strong&gt; - The default sort order is ascending.&lt;br&gt;
Example&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq23m5qam79om9ts6s7a7.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq23m5qam79om9ts6s7a7.PNG" alt="Image description" width="327" height="52"&gt;&lt;/a&gt;&lt;br&gt;
This query lists the prices from the lowest to the highest.&lt;/p&gt;
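
&lt;p&gt;Since the queries in this article are shown as screenshots, here is a small runnable sketch that tries all four clauses from Python using the built-in sqlite3 module. The table layout, column names, and sample rows are hypothetical and are not the schema from the screenshots.&lt;/p&gt;

```python
import sqlite3

# In-memory database with hypothetical customers and orders tables
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT, second_name TEXT, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, price REAL);
    INSERT INTO customers VALUES (1, 'Amina', 'Otieno', 'Kisumu'), (2, 'Brian', 'Mwangi', 'Nairobi'), (3, 'Carol', 'Achieng', 'Kisumu');
    INSERT INTO orders VALUES (1, 1, 500.0), (2, 1, 120.0), (3, 2, 300.0), (4, 3, 80.0), (5, 3, 950.0);
""")

# SELECT: pick specific columns
names = conn.execute("SELECT first_name, second_name FROM customers").fetchall()

# WHERE: filter rows before counting
kisumu_count = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE city = 'Kisumu'"
).fetchone()[0]

# HAVING: filter groups after aggregation
repeat_buyers = conn.execute(
    "SELECT customer_id, COUNT(*) FROM orders "
    "GROUP BY customer_id HAVING COUNT(*) > 1 ORDER BY customer_id"
).fetchall()

# ORDER BY: ascending is the default
prices = [row[0] for row in conn.execute("SELECT price FROM orders ORDER BY price")]

print(names)
print(kisumu_count)
print(repeat_buyers)
print(prices)
conn.close()
```

&lt;p&gt;WHERE filters individual rows before any grouping happens, while HAVING filters the groups after COUNT has been computed, which is why the two cannot be swapped.&lt;/p&gt;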

&lt;p&gt;&lt;strong&gt;SUMMARY&lt;/strong&gt;&lt;br&gt;
In summary, SQL is a crucial tool for data scientists, enabling efficient data retrieval, manipulation, and analysis from a relational database.&lt;br&gt;
In this article, we have covered key SQL concepts, including the basic SELECT, WHERE, HAVING, and ORDER BY queries for retrieving, filtering, and sorting data.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>luxdev</category>
      <category>datascience</category>
    </item>
  </channel>
</rss>
