<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ezeana Micheal</title>
    <description>The latest articles on DEV Community by Ezeana Micheal (@ezeanamichael).</description>
    <link>https://dev.to/ezeanamichael</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F795718%2F3885e716-3ecd-4c6e-9b7b-360badddf8f2.jpeg</url>
      <title>DEV Community: Ezeana Micheal</title>
      <link>https://dev.to/ezeanamichael</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ezeanamichael"/>
    <language>en</language>
    <item>
      <title>LegalCheck: Built for freelancers and remote workers after an issue I ran into.</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 28 Feb 2026 18:28:27 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/legalcheck-built-for-freelancers-and-remote-workers-after-an-issue-i-ran-into-3f88</link>
      <guid>https://dev.to/ezeanamichael/legalcheck-built-for-freelancers-and-remote-workers-after-an-issue-i-ran-into-3f88</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/weekend-2026-02-28"&gt;DEV Weekend Challenge: Community&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Community
&lt;/h2&gt;

&lt;p&gt;I built LegalCheck for freelancers, startup founders, indie builders, and non-legal tech practitioners who sign contracts without fully understanding what they’re agreeing to.&lt;/p&gt;

&lt;p&gt;In places like Nigeria, legal support isn’t always accessible or affordable. A lot of people copy templates from the internet, sign client agreements quickly, or accept vendor contracts under pressure without truly knowing the risks (as I once did).&lt;/p&gt;

&lt;p&gt;LegalCheck is for that community. The builders. The side-hustlers. The agency owners. The developers. The designers. The people shipping fast but still needing protection.&lt;/p&gt;

&lt;p&gt;I built (and am still scaling) this for people like me.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;LegalCheck is an AI-powered contract analysis tool.&lt;/p&gt;

&lt;p&gt;You upload a contract.&lt;br&gt;
It reads it.&lt;br&gt;
It highlights risky clauses.&lt;br&gt;
It explains them in plain English.&lt;/p&gt;

&lt;p&gt;No legal jargon. No intimidation.&lt;/p&gt;

&lt;p&gt;What it does:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Analyzes contracts (freelance agreements, NDAs, service contracts, etc.)&lt;/li&gt;
&lt;li&gt;Flags risky or one-sided clauses&lt;/li&gt;
&lt;li&gt;Explains complex legal language in simple terms&lt;/li&gt;
&lt;li&gt;Highlights missing protections&lt;/li&gt;
&lt;li&gt;Breaks down what each key clause actually means for you&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It doesn’t replace a lawyer.&lt;br&gt;
It gives you clarity before you sign.&lt;/p&gt;

&lt;p&gt;And that alone can save people from very expensive mistakes.&lt;/p&gt;
&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;


&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
      &lt;div class="c-embed__body flex items-center justify-between"&gt;
        &lt;a href="https://legalcheck.buildswithmike.com/" rel="noopener noreferrer" class="c-link fw-bold flex items-center"&gt;
          &lt;span class="mr-2"&gt;legalcheck.buildswithmike.com&lt;/span&gt;
          

        &lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;br&gt;
(Scanning a contract takes some time, about 5-10 minutes, but it'll be worth the wait.)

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqeubecgou1dwfpt42pd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqeubecgou1dwfpt42pd.png" alt="legalcheck1" width="800" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1055yry9hefbq52o98pa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1055yry9hefbq52o98pa.png" alt="legalcheck2" width="800" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9hbi1abetuwv3afpftdi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9hbi1abetuwv3afpftdi.png" alt="legalcheck3" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh6iyafl6jyhrwiugxmxu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh6iyafl6jyhrwiugxmxu.png" alt="legalcheck4" width="800" height="403"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fut499qwgpu4j6vxog6ju.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fut499qwgpu4j6vxog6ju.png" alt="legalcheck5" width="800" height="438"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Just-Mike4" rel="noopener noreferrer"&gt;
        Just-Mike4
      &lt;/a&gt; / &lt;a href="https://github.com/Just-Mike4/legal-check" rel="noopener noreferrer"&gt;
        legal-check
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      AI-Legal Contract Analyzer
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;LegalCheck - AI-Powered Legal Document Analysis&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;LegalCheck is an intelligent legal document analysis platform that extracts, identifies, and explains specific clauses in legal documents. It translates complex legal language into plain English and provides visual representations where helpful.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Features&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Instant Clause Detection&lt;/strong&gt;: Automatically identify all key clauses in legal documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plain English Explanations&lt;/strong&gt;: Complex legalese translated to simple language&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Risk Assessment&lt;/strong&gt;: Identify potentially risky clauses with severity levels&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual Representations&lt;/strong&gt;: Timelines and flowcharts for complex terms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document Comparison&lt;/strong&gt;: Compare clauses across similar documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Q&amp;amp;A System&lt;/strong&gt;: Ask questions about specific clauses&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Tech Stack&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: Flask, SQLAlchemy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authentication&lt;/strong&gt;: Google OAuth (Authlib)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI/ML&lt;/strong&gt;: Google Gemini AI for clause analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document Processing&lt;/strong&gt;: pypdf, python-docx, pytesseract (OCR)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Tailwind CSS, Vanilla JavaScript&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database&lt;/strong&gt;: SQLite (development), PostgreSQL (production)&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Setup Instructions&lt;/h2&gt;

&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Prerequisites&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Python 3.9 or higher&lt;/li&gt;
&lt;li&gt;Google OAuth credentials&lt;/li&gt;
&lt;li&gt;Gemini API key&lt;/li&gt;
&lt;li&gt;Tesseract OCR (for…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Just-Mike4/legal-check" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  How I Built It
&lt;/h2&gt;

&lt;p&gt;I built LegalCheck using Flask for the backend, Jinja2 for templating, Tailwind CSS for UI, and Google GenAI (Gemini 2.5) for the contract analysis.&lt;/p&gt;
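&lt;p&gt;For a flavor of the analysis step, here is a minimal sketch, assuming the google-genai SDK; the function name, prompt, and model id are illustrative, not the actual LegalCheck code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical sketch of the clause-analysis step, not the actual LegalCheck code
from google import genai

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

def analyze_contract(contract_text):
    prompt = (
        "Flag risky or one-sided clauses in the contract below and "
        "explain each one in plain English:\n\n" + contract_text
    )
    response = client.models.generate_content(
        model="gemini-2.5-flash",  # assumed model id; the post only says "Gemini 2.5"
        contents=prompt,
    )
    return response.text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;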

&lt;p&gt;Thanks for reading!&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>weekendchallenge</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Custom QuerySets in Django: Writing Cleaner, Reusable Queries</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 20 Sep 2025 07:38:57 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/custom-querysets-in-django-writing-cleaner-reusable-queries-d3a</link>
      <guid>https://dev.to/ezeanamichael/custom-querysets-in-django-writing-cleaner-reusable-queries-d3a</guid>
      <description>&lt;h1&gt;
  
  
  Custom QuerySets in Django: Writing Cleaner, Reusable Queries
&lt;/h1&gt;

&lt;p&gt;When building Django applications, especially as they scale, it's easy to find yourself repeating the same queries across views and similar components. That repetition makes the code harder to maintain and easier to get wrong. &lt;/p&gt;

&lt;p&gt;This is where QuerySets come in. Custom QuerySets let us write cleaner code, keeping query logic close to the models while avoiding duplication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But first, what is a QuerySet?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A QuerySet is simply a collection of database queries in Django; it lets us query the database through Django’s ORM. By default, &lt;em&gt;objects&lt;/em&gt; is Django’s built-in Manager, and its methods return QuerySets. But what if you always need the same filtered query? Rewriting the filter everywhere breaks the DRY (Don’t Repeat Yourself) principle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from blog.models import Post

# Default QuerySets  
all_posts = Post.objects.all()  
published_posts = Post.objects.filter(status="published")  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's get into the programming aspect. How do you define a custom QuerySet? We import &lt;em&gt;models&lt;/em&gt; from django.db and inherit from models.QuerySet. Here’s an example: suppose you have a Post model and repeatedly need the same queries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db import models

class PostQuerySet(models.QuerySet):  
    def published(self):  
        return self.filter(status="published")

    def drafts(self):  
        return self.filter(status="draft")

    def by_author(self, author):  
        return self.filter(author=author)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's now attach this to our model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Post(models.Model):  
    STATUS_CHOICES = (  
        ("draft", "Draft"),  
        ("published", "Published"),  
    )

    title = models.CharField(max_length=200)  
    content = models.TextField()  
    status = models.CharField(max_length=10, choices=STATUS_CHOICES)  
    author = models.ForeignKey("auth.User", on_delete=models.CASCADE)

    # Attach the custom QuerySet  
    objects = PostQuerySet.as_manager()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can make queries like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# All published posts  
Post.objects.published()

# All drafts by a specific author  
Post.objects.drafts().by_author(user)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can even chain them,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Get published posts by a specific author  
Post.objects.published().by_author(user)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;QuerySets and managers are similar in some ways and different in others. How do you know when to use which?&lt;br&gt;&lt;br&gt;
Use Custom QuerySets when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want reusable filters (like .published(), .active()).
&lt;/li&gt;
&lt;li&gt;You need chainable queries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use Custom Managers when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want to override get_queryset() itself.
&lt;/li&gt;
&lt;li&gt;You need queries that return something other than a QuerySet (like creating objects or aggregations), as sketched below.&lt;/li&gt;
&lt;/ul&gt;
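&lt;p&gt;For instance, here is a minimal sketch of a custom manager (the names are illustrative) that both overrides get_queryset() and returns a non-QuerySet value:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db import models

class DraftPostManager(models.Manager):
    def get_queryset(self):
        # Overriding get_queryset() itself: every query through
        # this manager sees only drafts
        return super().get_queryset().filter(status="draft")

    def publish_all(self):
        # Returns an int (number of rows updated), not a QuerySet
        return self.get_queryset().update(status="published")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Attached to the model as &lt;em&gt;drafts = DraftPostManager()&lt;/em&gt;, this gives you Post.drafts.all() and Post.drafts.publish_all().&lt;/p&gt;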

&lt;p&gt;In this article, we’ve seen how QuerySets help us keep to the DRY principle and write cleaner code. In the next one, we’ll consider when and how the three powerhouses in our models.py can be used together, in &lt;em&gt;Custom Model Methods vs. Managers vs. QuerySets: When to Use Each.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>django</category>
      <category>drf</category>
      <category>backend</category>
      <category>database</category>
    </item>
    <item>
      <title>Understanding Django Managers: The Gateway to Your Data</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 13 Sep 2025 12:53:06 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/understanding-django-managers-the-gateway-to-your-data-1cih</link>
      <guid>https://dev.to/ezeanamichael/understanding-django-managers-the-gateway-to-your-data-1cih</guid>
      <description>&lt;p&gt;When working with Django models, you’re not just defining tables, you’re also defining how to interact with them. In this article, we’re going to use one of Django’s powerful tools when making queries, which is Django managers. Django managers define the interface for the interaction between your code and the database. The default one most are used to is the “&lt;em&gt;objects&lt;/em&gt;” as in “&lt;em&gt;students.objects.all()&lt;/em&gt;”.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Student(models.Model):  
    name = models.CharField(max_length=100)  
    age = models.PositiveIntegerField()

# Using the default manager  
all_students = Student.objects.all()  
adults = Student.objects.filter(age__gte=18)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Creating a Custom Manager
&lt;/h2&gt;

&lt;p&gt;Alongside this default manager, we can also create custom managers. But why would we need to?&lt;br&gt;&lt;br&gt;
While &lt;em&gt;objects&lt;/em&gt; is enough for basic queries, in larger projects custom managers let you define reusable query logic in one place. Let's look at a practical example: we define a class that inherits from &lt;em&gt;models.Manager&lt;/em&gt; and overrides the &lt;em&gt;get_queryset&lt;/em&gt; method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class ActiveStudentManager(models.Manager):  
    def get_queryset(self):  
        return super().get_queryset().filter(is_active=True)

class Student(models.Model):  
    name = models.CharField(max_length=100)  
    is_active = models.BooleanField(default=True)

    objects = models.Manager()        # Default manager  
    active = ActiveStudentManager()   # Custom manager  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;How then do you use this custom manager? Look below,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Student.objects.all()   # returns all students  
Student.active.all()    # returns only active students  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Adding Custom Methods to Managers
&lt;/h2&gt;

&lt;p&gt;You can also add methods to model managers to perform specific actions or queries. An example is filtering students into age groups.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class StudentManager(models.Manager):  
    def teenagers(self):  
        return self.filter(age__gte=13, age__lte=19)

    def adults(self):  
        return self.filter(age__gte=18)

class Student(models.Model):  
    name = models.CharField(max_length=100)  
    age = models.PositiveIntegerField()

    objects = StudentManager()  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then using,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Student.objects.teenagers()  # all students aged 13–19  
Student.objects.adults()     # all students aged 18+  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Multiple Managers in a Model
&lt;/h2&gt;

&lt;p&gt;As seen in the first example, you can add more than one manager to a model to ease and streamline usage.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class ActiveStudentManager(models.Manager):  
    def get_queryset(self):  
        return super().get_queryset().filter(is_active=True)

class Student(models.Model):  
    name = models.CharField(max_length=100)  
    is_active = models.BooleanField(default=True)

    objects = models.Manager()        # Default manager  
    active = ActiveStudentManager()   # Custom manager  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Managers vs. QuerySets
&lt;/h2&gt;

&lt;p&gt;Managers are not QuerySets. Managers are entry points to the database, while QuerySets represent the actual collection of records. In the next article, we’ll see how to use QuerySets. &lt;/p&gt;
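&lt;p&gt;In code, the distinction looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Student.objects                  # a Manager: the entry point to the table
Student.objects.all()            # a QuerySet: a lazy collection of records
Student.objects.filter(age__gte=18).order_by("name")  # QuerySets chain freely
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;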

&lt;h2&gt;
  
  
  The Default Manager
&lt;/h2&gt;

&lt;p&gt;A model can have multiple managers, but only one default manager, which is &lt;em&gt;objects&lt;/em&gt; by default (though this can be changed).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Student(models.Model):  
    name = models.CharField(max_length=100)  
    is_active = models.BooleanField(default=True)

    objects = models.Manager()       # default  
    active = ActiveStudentManager()  # custom  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; The default manager is used in admin and migrations, so always keep &lt;em&gt;objects = models.Manager()&lt;/em&gt; unless you know what you’re doing.&lt;/p&gt;

&lt;p&gt;In this article we have seen how Django managers are used, and how they can hold business logic and reusable queries. In the next article, we will explore QuerySets in Django and how they are used alongside managers.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>django</category>
      <category>backend</category>
      <category>data</category>
    </item>
    <item>
      <title>ForeignKey vs ManyToMany vs OneToOne: When to Use Each in Django</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 06 Sep 2025 07:21:35 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/foreignkey-vs-manytomany-vs-onetoone-when-to-use-each-in-django-44eg</link>
      <guid>https://dev.to/ezeanamichael/foreignkey-vs-manytomany-vs-onetoone-when-to-use-each-in-django-44eg</guid>
      <description>&lt;p&gt;In common Relational Database concepts, relationships between tables are nothing new; these help to represent connections between tables and link them together. In this article, I’ll explore some common DB relationships and how they are used in Django. These include One-to-One, One-to-Many (foreign key), and Many-to-Many Relationships.&lt;/p&gt;

&lt;h2&gt;
  
  
  One to One
&lt;/h2&gt;

&lt;p&gt;In a one-to-one relationship, one record in a model corresponds to exactly one record in another model. A classic example is the relationship between a user and their profile: using Django’s built-in User model, we can extend it with a Profile model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.contrib.auth.models import User

class Profile(models.Model):  
    user = models.OneToOneField(User, on_delete=models.CASCADE)  
    bio = models.TextField()  
    birth_date = models.DateField(null=True, blank=True)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
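&lt;p&gt;Once linked, each side is reachable from the other through a single attribute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;user = User.objects.first()
user.profile.bio        # reverse access (the lowercased model name)

profile = Profile.objects.first()
profile.user.username   # forward access through the OneToOneField
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;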



&lt;h2&gt;
  
  
  One to Many
&lt;/h2&gt;

&lt;p&gt;In a one-to-many relationship, one record in a model corresponds to many records in another model. A record in one table can be linked to multiple records in another, and this is done using a foreign key. A practical example is an author having many books.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db import models

class Author(models.Model):  
    name = models.CharField(max_length=100)

class Book(models.Model):  
    title = models.CharField(max_length=200)  
    author = models.ForeignKey(Author, on_delete=models.CASCADE)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
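&lt;p&gt;With the foreign key in place, access works in both directions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;book = Book.objects.first()
book.author.name         # forward: each book has exactly one author

author = Author.objects.first()
author.book_set.all()    # reverse: all books by this author (default _set name)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;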



&lt;h2&gt;
  
  
  Many to Many
&lt;/h2&gt;

&lt;p&gt;In a many-to-many relationship, multiple records in one table are related to multiple records in the other. An example is the relationship between students and courses: a student can enroll in many courses, and a course can have many students.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Student(models.Model):  
    name = models.CharField(max_length=100)

class Course(models.Model):  
    title = models.CharField(max_length=200)  
    students = models.ManyToManyField(Student)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
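&lt;p&gt;Django manages the join table for us, so enrollment looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;student = Student.objects.first()
course = Course.objects.first()

course.students.add(student)   # enroll a student on the course
course.students.all()          # all students in the course
student.course_set.all()       # reverse: all courses the student is enrolled in
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;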



&lt;p&gt;Each of these relationship fields in Django accepts some common parameters. Here is what they mean. &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;on_delete: The ForeignKey and OneToOneField take an on_delete parameter, which tells Django what to do with related objects when the referenced object is deleted. The common options are:

&lt;ul&gt;
&lt;li&gt;models.CASCADE: delete related objects too.&lt;/li&gt;
&lt;li&gt;models.SET_NULL: set the field to NULL (requires null=True).&lt;/li&gt;
&lt;li&gt;models.PROTECT: prevent deletion if related objects exist.&lt;/li&gt;
&lt;li&gt;models.SET_DEFAULT: set the field to its default value.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;related_name: This defines the name used to access the reverse relationship; without it, Django automatically adds _set, i.e., (model_name)_set.&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   class Course(models.Model):  
       title = models.CharField(max_length=200)  
       students = models.ManyToManyField(  
           "Student",   
           related_name="enrolled_courses"  
       )  

   #referenced here using  

   student = Student.objects.first()  
   student.enrolled_courses.all()  # instead of student.course_set.all()  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;blank and null: These are boolean parameters; null=True means the database can store NULL values, while blank=True means validation allows empty values.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   class Profile(models.Model):  
       bio = models.TextField(blank=True, null=True)  

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt;through: This is unique to the many-to-many field; it lets you define a custom intermediate model for the relationship.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   class Enrollment(models.Model):  
       student = models.ForeignKey("Student", on_delete=models.CASCADE)  
       course = models.ForeignKey("Course", on_delete=models.CASCADE)  
       enrolled_at = models.DateTimeField(auto_now_add=True)  
       grade = models.CharField(max_length=2, blank=True)  

   class Course(models.Model):  
       title = models.CharField(max_length=200)  
       students = models.ManyToManyField("Student", through="Enrollment")  

   # usage below  
   student = Student.objects.first()  
   course = Course.objects.first()  
   Enrollment.objects.create(student=student, course=course, grade="A")  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From this article, we’ve been able to break down relationships in Django, their parameters, and how they’re referenced. Next, we’ll move into understanding Django managers.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>django</category>
      <category>backend</category>
      <category>programming</category>
    </item>
    <item>
      <title>Designing Better Django Models: Tips for Scalability and Clarity</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 30 Aug 2025 10:05:17 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/designing-better-django-models-tips-for-scalability-and-clarity-3ed2</link>
      <guid>https://dev.to/ezeanamichael/designing-better-django-models-tips-for-scalability-and-clarity-3ed2</guid>
      <description>&lt;p&gt;I’ve been using Django for a while now, and I’ve seen countless guides on how to write better code, structured code, and the list goes on. I decided to write mine to cover one major aspect, which is simplicity, for this guide, and following ones, I'll make use of simple wordings and easy to understand python code that can guide you (or anyone) in writing simple and powerful django applications, and to start with, the models, which are the backbone of traditional backend applications, a poorly written model design will cause a lot of problems in your application like optimization and scaling difficulty, lets explore practices and use cases, for this article ill make use of a blog model to explain the process. To begin, we start with defining a clear data model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Start with a Clear Data Model
&lt;/h2&gt;

&lt;p&gt;Before you start writing code, it is essential to understand what you want the code to do first. It's always best to get a clear picture of what you want your database to look like through design, including the relationships. Tools like Draw.io can help with that. (This article won't cover how to design your model; it's more on writing code using Django)&lt;/p&gt;

&lt;p&gt;After designing your database and relationships, we're ready to hop into the code. Keep 2 things in mind: &lt;strong&gt;naming matters,&lt;/strong&gt; and &lt;strong&gt;keep models simple.&lt;/strong&gt; For example, here's our post model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db import models  
class Post(models.Model):  
    title = models.CharField(max_length=255)  
    content = models.TextField()  
    published_at = models.DateTimeField(auto_now_add=True)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's always best to use meaningful, singular model names like Post, not Posts, and we limit it to three fields: title, content, and published date. Next, let's look at choosing the right type of fields.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing the Right Field Types
&lt;/h2&gt;

&lt;p&gt;Django’s models offer several field types, like the CharField, TextField, and DateTimeField used above, but aside from these there are others, like URLField and EmailField. Let's take a look at the user profile model below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class UserProfile(models.Model):  
    email = models.EmailField(unique=True)     
    website = models.URLField(blank=True)  
    bio = models.TextField(blank=True)  
    reputation = models.DecimalField(max_digits=6, decimal_places=2, default=0.00)    
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's always best to use a specific field for some of the following reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Django provides built-in validation for the field (e.g., EmailField validates email format)
&lt;/li&gt;
&lt;li&gt;It maps the field to an appropriate database column type, which helps retrieval
&lt;/li&gt;
&lt;li&gt;It saves space and optimizes storage in the database (e.g., using a CharField with a max length set for fields like name, rather than a TextField, which is better suited for descriptions)
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now, let's consider primary keys and identifiers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Primary Keys &amp;amp; Identifiers
&lt;/h2&gt;

&lt;p&gt;When talking of primary keys, we consider the IDs of these tables, since these are what link them up. That raises the question: when is it best to use Django’s default ID, and when a UUID?&lt;br&gt;&lt;br&gt;
By default, Django creates an ID this way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;id = models.BigAutoField(primary_key=True)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It auto-increments, and it's useful for small to medium-scale projects. The downside of this is that it's predictable, and it can expose user count or allow enumeration attacks in APIs.&lt;br&gt;&lt;br&gt;
UUIDs, on the other hand, are especially useful in &lt;strong&gt;distributed systems&lt;/strong&gt; or &lt;strong&gt;microservices&lt;/strong&gt;, where you can’t rely on a central database to generate sequential IDs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import uuid  
class UserProfile(models.Model):  
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)  
    email = models.EmailField(unique=True)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;UUIDs are harder to guess and good for public-facing URLs, but they require larger storage and are slower to index. Let's take a step into relationships.&lt;/p&gt;

&lt;h2&gt;
  
  
  Relationships Done Right
&lt;/h2&gt;

&lt;p&gt;In Django, there are 3 major types of relationships: one-to-one, one-to-many (Foreign key), and many-to-many. Below are examples of such relationships.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Post(models.Model):  
    author = models.ForeignKey("Author", on_delete=models.CASCADE, related_name="posts")  
    title = models.CharField(max_length=255)  
    content = models.TextField()

class Comment(models.Model):  
    post = models.ForeignKey("Post", on_delete=models.CASCADE, related_name="comments")  
    author = models.CharField(max_length=100)  
    body = models.TextField()

class Tag(models.Model):  
    name = models.CharField(max_length=50, unique=True)  
    posts = models.ManyToManyField("Post", related_name="tags")  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These use models.ForeignKey, models.ManyToManyField, and models.OneToOneField.&lt;br&gt;&lt;br&gt;
You’ll notice the related_name parameter in these relationships; it is the &lt;strong&gt;reverse relation name&lt;/strong&gt; Django will use when looking up related objects.&lt;br&gt;&lt;br&gt;
For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;post = Post.objects.first()  
post.comments.all()   # instead of post.comment_set.all()  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The on_delete argument tells Django what to do with related objects when the referenced object is deleted. In our case, models.CASCADE means the dependent rows are deleted too. This matters because it keeps orphaned dependent rows out of the database.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Methods &amp;amp; Business Logic
&lt;/h2&gt;

&lt;p&gt;When writing business logic, the models.py file is not where all our methods should live; only simple, commonly used logic belongs here. For example, here is a sample model method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Post(models.Model):  
    title = models.CharField(max_length=255)  
    content = models.TextField()  
    published_at = models.DateTimeField(auto_now_add=True)

    def is_published_recently(self):  
        from django.utils import timezone  
        return self.published_at &amp;gt;= timezone.now() - timezone.timedelta(days=7)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The post.is_published_recently() can now be called and works as intended from here.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance Considerations
&lt;/h2&gt;

&lt;p&gt;Finally, for some performance considerations, here is a tip: add indexes to fields you filter or look up frequently; indexed lookups are much faster.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class BlogPost(models.Model):  
    slug = models.SlugField(unique=True, db_index=True)  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
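&lt;p&gt;Note that unique=True already creates an index on its own. For other frequently filtered fields, you can also declare indexes in the model’s Meta; a small sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db import models

class BlogPost(models.Model):
    slug = models.SlugField(unique=True)  # unique=True already indexes this column
    published_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        indexes = [
            # Speeds up frequent lookups and ordering by publish date
            models.Index(fields=["published_at"]),
        ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;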



&lt;h2&gt;
  
  
  Keeping Models Maintainable
&lt;/h2&gt;

&lt;p&gt;Using some of Django models’ built-in hooks improves readability and reusability: add &lt;strong&gt;__str__&lt;/strong&gt; for readable object representations, and a Meta class for the verbose name and default ordering.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Post(models.Model):  
    title = models.CharField(max_length=255)  
    body = models.TextField()  
    created_at = models.DateTimeField(auto_now_add=True)

    class Meta:  
        ordering = ["-created_at"]  
        verbose_name = "Blog Post"

    def __str__(self):  
        return self.title  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From this article, we’ve seen the need for simplicity in creating Django models and utilizing functionalities to optimize performance. In the next article, we’ll consider in-depth the use of foreign keys, one-to-one and many-to-many.&lt;/p&gt;

</description>
      <category>django</category>
      <category>backenddevelopment</category>
      <category>python</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Distributed Systems: Designing Scalable Python Backends</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Mon, 27 Jan 2025 07:08:27 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/distributed-systems-designing-scalable-python-backends-1lod</link>
      <guid>https://dev.to/ezeanamichael/distributed-systems-designing-scalable-python-backends-1lod</guid>
      <description>&lt;p&gt;Almost all systems today connected through the World Wide Web are distributed systems. Distributed systems are a group of multiple computers or servers that work together and functionalize optimally. This allows multiple users to utilize such software or services without facing slow loading times and poor performance. For example, imagine you build a website and host it on a single-user server, this would perform well until user traffic increases, demanding more resources and speed. Distributed systems aid performance and flexibility by splitting the application into individual services on separate servers that interact with each other. This would seem like a simple software or application to the user, but on the backend, it's multiple interconnected nodes talking to each other.&lt;/p&gt;

&lt;p&gt;Python is one of the slower programming languages, yet one of the most useful today. Since the advent of artificial intelligence, machine learning, and large language models, Python has been the go-to language for them, but no one wants a chatbot or an ML service that takes ages to respond; distributed systems are a key to optimizing such applications. In this article, we’ll consider the key features of distributed systems, why you should use them, and how to scale a distributed system with Python.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Features Of Distributed Systems
&lt;/h2&gt;

&lt;p&gt;The following are key features of distributed systems that make them work optimally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nodes&lt;/strong&gt;: Individual computers or processes that work together as part of the system; each node performs certain tasks and connects with others to ensure the system functions properly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Communication Protocols&lt;/strong&gt;: Nodes can communicate and share information thanks to protocols like HTTP, gRPC, or TCP/IP, which guarantee dependable communication between components even when they are on different networks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shared Resources&lt;/strong&gt;: Distributed systems frequently rely on resources like databases, file systems, or message queues; proper management enables consistent and efficient access across all nodes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fault Tolerance&lt;/strong&gt;: Even if a node fails, distributed systems continue running, eliminating a single point of failure. Redundancy and replication techniques ensure reliability and high availability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: The ability to handle increased load by adding more nodes (horizontal scaling) or enhancing the capacity of existing nodes (vertical scaling). Scalability ensures the system remains responsive under high demand.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Scalability Matters?
&lt;/h2&gt;

&lt;p&gt;Scalability, as mentioned earlier, is a system’s ability to handle increased load by adding resources. This keeps the system at optimal performance during traffic spikes. There are two major types of scaling:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Horizontal scaling&lt;/strong&gt;: This involves using more machines and servers for the application to work smoothly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vertical scaling&lt;/strong&gt;: This involves increasing the system's RAM, storage, and capacity.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  How to Design Scalable Python Backends
&lt;/h2&gt;

&lt;p&gt;Knowledge of the right tools is required to design scalable Python backends to allow the system to grow and remain efficient. Below are some key tools and strategies for building scalable Python backends.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;APIs&lt;/strong&gt;: Use lightweight frameworks like Flask or FastAPI to build scalable backend APIs. They are relatively easy to use when creating REST APIs. FastAPI is best for performance thanks to its support for asynchronous programming (a minimal sketch follows this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Asynchronous processing&lt;/strong&gt;: To support the main application, it is wise to offload background tasks (like emails or data processing) using Celery with Redis as the message broker.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load Balancing&lt;/strong&gt;: Use a load balancer like Nginx or HAProxy to distribute incoming requests evenly across the backend servers in the distributed system.&lt;/li&gt;
&lt;/ul&gt;
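&lt;p&gt;Here is a minimal FastAPI sketch of such an endpoint (the route and names are illustrative, not a full service):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from fastapi import FastAPI

app = FastAPI()

@app.get("/orders/{order_id}")
async def get_order(order_id: int):
    # An async handler yields the event loop while awaiting I/O,
    # letting one worker serve many concurrent requests
    return {"order_id": order_id, "status": "processing"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;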

&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: A task queue with Celery and Redis.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# tasks.py
from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def process_order(order_id):
    print(f"Processing order {order_id}")

# Adding a task to the queue
process_order.delay(123)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Data Management in Distributed Systems
&lt;/h2&gt;

&lt;p&gt;When managing data in distributed systems, keep the CAP theorem in mind. It concerns three properties, of which a system can fully guarantee only two at once:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt;: All nodes in the distributed system see the same data. If data is updated in a node, all nodes should reflect the updated value immediately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability&lt;/strong&gt;: The system responds even during a node failure. The system should always be operational.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partition Tolerance&lt;/strong&gt;: It should work despite network failures between nodes. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Useful databases are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL Databases like PostgreSQL for transactional consistency.&lt;/li&gt;
&lt;li&gt;NoSQL Databases like MongoDB for scalable, flexible schemas.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An example is a case setting up a distributed MongoDB cluster to store and retrieve user data across multiple nodes, ensuring high availability and fault tolerance.&lt;/p&gt;
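&lt;p&gt;Connecting to such a replica set from Python might look like this (a sketch using pymongo; the host names are placeholders):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from pymongo import MongoClient

# Placeholder hosts; a real cluster would list its own members
client = MongoClient(
    "mongodb://node1:27017,node2:27017,node3:27017/?replicaSet=rs0"
)
users = client.appdb.users

users.insert_one({"name": "Ada", "email": "ada@example.com"})
print(users.find_one({"name": "Ada"}))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;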

&lt;h2&gt;
  
  
  Tools for Deployment and Scaling
&lt;/h2&gt;

&lt;p&gt;Deployment and scaling are where tools like Docker and Kubernetes come in.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Docker&lt;/strong&gt;: Docker is used to containerize Python backend applications for consistent environments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes&lt;/strong&gt;: This helps automate the deployment, scaling, and management of the containerized application.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: Below is a basic example of deploying a Python backend application using Docker and Kubernetes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dockerfile:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; FROM python:3.10
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Kubernetes Deployment:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: flask-backend
  template:
    metadata:
      labels:
        app: flask-backend
    spec:
      containers:
      - name: flask-backend
        image: flask-app:latest
        ports:
        - containerPort: 5000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Monitoring and Maintenance
&lt;/h2&gt;

&lt;p&gt;In distributed systems, it is important to monitor and maintain the nodes across the system as they interact and function as one. This helps to identify and fix faults early. &lt;/p&gt;

&lt;p&gt;Examples of tools that do this are Prometheus and Grafana.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prometheus&lt;/strong&gt;: Prometheus helps to collect metrics on API performance, database latency, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grafana&lt;/strong&gt;: Grafana visualizes metrics with customizable dashboards.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Case Study: Scalable Ecommerce Backend
&lt;/h2&gt;

&lt;p&gt;Before I conclude this article, let's look at how to apply distributed systems during the development of an e-commerce system.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;FastAPI can be used for the backend system to handle the APIs for order processing.&lt;/li&gt;
&lt;li&gt;Celery with Redis handles asynchronous background processing of tasks like payments or inventory updates.&lt;/li&gt;
&lt;li&gt;The application is deployed with Docker and Kubernetes to ensure the system can scale.&lt;/li&gt;
&lt;li&gt;The application is monitored using tools like Prometheus.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;By using Python tools like Flask, FastAPI, Celery, Docker, and Kubernetes, developers can build robust and scalable systems. In this article, we’ve covered the general terms related to distributed systems and how they can help, as well as some basic examples with Python. You can do advanced research on these tools and how they can work together and help. Start experimenting with these tools and create backends that can handle the challenges of real-world traffic and growth. Happy hacking.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>python</category>
      <category>distributedsystems</category>
      <category>fastapi</category>
    </item>
    <item>
      <title>Advanced Database Query Optimization Techniques: A Practical Approach with Django</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Mon, 20 Jan 2025 08:00:00 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/advanced-database-query-optimization-techniques-a-practical-approach-with-django-i8a</link>
      <guid>https://dev.to/ezeanamichael/advanced-database-query-optimization-techniques-a-practical-approach-with-django-i8a</guid>
      <description>&lt;p&gt;Quick information retrieval is necessary in today’s fast-paced world as it affects productivity and efficiency. This is also true about applications and databases. Many applications developed work hand in hand with databases through the backend interface. Understanding query optimization is essential for maintaining scalability, lowering latency, and assuring lower expenses. This article will reveal advanced techniques for optimizing database queries, specifically on Django, and their impact on query performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is query optimization?
&lt;/h2&gt;

&lt;p&gt;Query optimization improves database speed and effectiveness by selecting the most efficient way to perform a given query. Think of it in a problem-solving context: there are many ways to solve a problem, but the most efficient one saves the most time and energy. Query optimization is just like that; improving the quality of our queries increases database performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why optimize queries?
&lt;/h2&gt;

&lt;p&gt;Query optimization is important because it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Improves application speed.&lt;/li&gt;
&lt;li&gt;Reduces server load.&lt;/li&gt;
&lt;li&gt;Enhances user experience.&lt;/li&gt;
&lt;li&gt;Reduces operating expenses by using fewer resources.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Key Query Optimization Techniques in Django
&lt;/h2&gt;

&lt;p&gt;The following are, but not limited to, optimization techniques in Django:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Use Database Indexes:
&lt;/h3&gt;

&lt;p&gt;Querying on unindexed fields can cause the database to scan the entire table, leading to slower performance. Indexed queries, on the other hand, are faster, especially on larger datasets.&lt;/p&gt;

&lt;h4&gt;
  
  
  Without an indexed field
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Querying without an index
class Book(models.Model):
    title = models.CharField(max_length=200)
    #other fields
books = Book.objects.filter(title="Django Optimization")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  With an indexed field
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Book(models.Model):
    title = models.CharField(max_length=200, db_index=True) 
     #other fields
books = Book.objects.filter(title="Django Optimization")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Use Select Related and Prefetch Related:
&lt;/h3&gt;

&lt;p&gt;select_related and prefetch_related are query optimization techniques for fetching related objects. They help avoid the N+1 query problem. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;select_related&lt;/strong&gt;: This method retrieves related data via a single SQL JOIN query. It is great for single-valued relationships such as ForeignKey or OneToOneField. It returns the actual instances of the linked data without issuing several queries.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;prefetch_related&lt;/strong&gt;:&lt;br&gt;
Runs separate queries for related objects (for multi-valued relationships like ManyToManyField or reverse ForeignKey), but Django caches and joins the related data in Python to avoid repeating the queries.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Without select related
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# N+1 queries: Fetches each related Author object for every Book
books = Book.objects.all()
for book in books:
    print(book.author.name)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  With select related
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Optimized: Single query to fetch books and their related authors
books = Book.objects.select_related('author')
for book in books:
    print(book.author.name)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Avoid N+1 Query Problem:
&lt;/h3&gt;

&lt;p&gt;The N+1 problem occurs when work that could be done in one query is done repeatedly: after fetching a list of items, one extra query is executed per item to load its related entities.&lt;/p&gt;

&lt;h4&gt;
  
  
  N+1 problem example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Inefficient: Queries executed inside a loop
books = Book.objects.all()
for book in books:
    reviews = book.review_set.all()  # Separate query for each book's reviews
    print(reviews)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Solution
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Optimized: Prefetch all reviews in a single query
books = Book.objects.prefetch_related('review_set')
for book in books:
    print(book.review_set.all())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Filter Early, Fetch Less Data:
&lt;/h3&gt;

&lt;p&gt;This principle is about querying only the data you need rather than everything. Performance improves when filtering happens in the database instead of loading all rows and filtering in Python.&lt;/p&gt;

&lt;h4&gt;
  
  
  Without optimization
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;books = Book.objects.all()  # Loads all records into memory
filtered_books = [b for b in books if b.published_year &amp;gt;= 2020]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  With optimization
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;filtered_books = Book.objects.filter(published_year__gte=2020)  # Query fetches only relevant data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Use Queryset Defer and Only:
&lt;/h3&gt;

&lt;p&gt;Using defer() and only() helps us load only the necessary fields from the database into our application.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;defer&lt;/strong&gt;: Excludes the named fields from the initial query; they are loaded lazily only if accessed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;only&lt;/strong&gt;: Retrieves only the named fields while deferring the rest. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Without optimization
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Fetches all fields, including a large 'content' field
articles = Article.objects.all()
for article in articles:
    print(article.title)  # Only the 'title' is used, but all fields are fetched
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  With optimization
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Excludes the 'content' field from the query
articles = Article.objects.defer('content')
for article in articles:
    print(article.title)  # Fetches only the 'title' field
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
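&lt;p&gt;Conversely, only() names the fields to fetch up front and defers everything else:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Fetches just 'title' up front; any other field loads lazily on access
articles = Article.objects.only('title')
for article in articles:
    print(article.title)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;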



&lt;h3&gt;
  
  
  6. Paginate Large Datasets:
&lt;/h3&gt;

&lt;p&gt;Fetching and processing large datasets at once increases memory usage, thereby limiting performance. Use pagination to break results into smaller chunks; this reduces memory usage and speeds up response time.&lt;/p&gt;

&lt;h4&gt;
  
  
  Without Pagination
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;books = Book.objects.all()  # Loads all records at once
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  With pagination
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.core.paginator import Paginator
paginator = Paginator(Book.objects.all(), 10)  # 10 items per page
page1 = paginator.get_page(1)  # Fetches only the first 10 records
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Cache Frequent Queries:
&lt;/h3&gt;

&lt;p&gt;Cache queries that are used frequently. This prevents recurring queries and reduces database load.&lt;/p&gt;

&lt;h4&gt;
  
  
  Without cache
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;books = Book.objects.all()  # Query hits the database each time
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  With cache
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.core.cache import cache
books = cache.get_or_set('all_books', Book.objects.all(), timeout=3600)  # Cached for 1 hour
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  8. Optimize Aggregations:
&lt;/h3&gt;

&lt;p&gt;Django provides powerful aggregation functions for computing aggregated data directly in the database. Database computations are faster than doing them in Python, which improves speed.&lt;/p&gt;

&lt;h4&gt;
  
  
  Without Aggregations
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;products = Product.objects.all()
total_price = sum(product.price for product in products)  # Aggregation in Python
print(f"Total Price: {total_price}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  With Aggregations
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db.models import Sum

total_price = Product.objects.aggregate(total=Sum('price'))
print(f"Total Price: {total_price['total']}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
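
&lt;p&gt;Beyond &lt;code&gt;aggregate&lt;/code&gt;, which returns a single value, &lt;code&gt;annotate&lt;/code&gt; computes an aggregate per row. A minimal sketch, assuming a hypothetical Category model that Product points to with a foreign key:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db.models import Avg

# One query; the average price is computed in the database per category
categories = Category.objects.annotate(avg_price=Avg('product__price'))
for category in categories:
    print(category.name, category.avg_price)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;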



&lt;h3&gt;
  
  
  9. Monitor and Profile Queries:
&lt;/h3&gt;

&lt;p&gt;To optimize database queries, you first need to see which queries actually run. Django exposes the executed SQL through connection.queries (populated when DEBUG=True), which helps you identify what slows the database down and resolve it.&lt;/p&gt;

&lt;h4&gt;
  
  
  Unmonitored query
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Blind execution without monitoring
books = Book.objects.all()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Monitored query
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db import connection
books = Book.objects.all()
print(connection.queries)  # Shows executed queries
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
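
&lt;p&gt;When profiling a single block of code, it helps to clear the query log first with Django's reset_queries:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db import connection, reset_queries

reset_queries()                      # Clear previously logged queries
books = list(Book.objects.all())     # The code you want to profile
print(len(connection.queries))       # Number of queries this block executed
print(connection.queries[0]['sql'])  # The raw SQL of the first query
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;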



&lt;h3&gt;
  
  
  10. Use Q objects for complex queries:
&lt;/h3&gt;

&lt;p&gt;Q objects let you express complex lookups that plain keyword filtering cannot, and they often read better than long filter chains. Note that chaining filter() calls combines conditions with AND, while Q objects can be combined with OR (|) and negated with NOT (~), so the two examples below are not equivalent: the first requires both conditions, the second matches either.&lt;/p&gt;

&lt;h4&gt;
  
  
  Without Q
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;books = Book.objects.filter(title__icontains='Django').filter(author__name__icontains='Smith')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  With Q
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db.models import Q
books = Book.objects.filter(Q(title__icontains='Django') | Q(author__name__icontains='Smith'))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
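
&lt;p&gt;Q objects can also be negated with ~ and combined; passing several Q objects to filter ANDs them together. A short sketch:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from django.db.models import Q

# Title mentions Django AND the author is not named Smith
books = Book.objects.filter(Q(title__icontains='Django'), ~Q(author__name__icontains='Smith'))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;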



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Optimizing database queries is vital for keeping your Django application running smoothly as it scales. Key optimization techniques, including indexing, caching, avoiding the N+1 problem, and regularly monitoring queries with tools like connection.queries or django-debug-toolbar, help ensure a faster and more efficient web application.&lt;/p&gt;

</description>
      <category>python</category>
      <category>database</category>
      <category>django</category>
      <category>sql</category>
    </item>
    <item>
      <title>Implementing Linear Regression Algorithm from scratch.</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 08 Jul 2023 13:26:41 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/implementing-linear-regression-algorithm-from-scratch-52co</link>
      <guid>https://dev.to/ezeanamichael/implementing-linear-regression-algorithm-from-scratch-52co</guid>
      <description>&lt;p&gt;The basis of many machine learning models used to make predictions today is statistics; statistical analysis is what gets implemented in the form of algorithms. In this article, we will look into one of sklearn’s supervised linear algorithms for predicting regression values, Linear Regression, and implement it from scratch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Linear Regression
&lt;/h2&gt;

&lt;p&gt;Linear regression is based on the equation:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Y=C+M*X&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Where C is the intercept, which determines where the line touches the y-axis; Y is the target or predicted value; and X represents the feature. Often a dataset has more than one feature, which leads to the equation:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Y=C+M1*X1+M2*X2+...+Mn*Xn&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;In machine learning, C is regarded as the bias, because it is an offset added to every prediction that is made, while the M values are the coefficients (weights) of the features.&lt;/p&gt;

&lt;p&gt;To get the values of M.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo2zm8tgx8oqlugns06x5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo2zm8tgx8oqlugns06x5.png" alt="Statistics of coefficients" width="800" height="154"&gt;&lt;/a&gt;&lt;br&gt;
Where the Xi are the individual input values of X used to predict Y, and the means are the averages of the X and Y values respectively.&lt;/p&gt;

&lt;p&gt;And to derive the value of the intercept c.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftll8t7531t25wnah2dpo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftll8t7531t25wnah2dpo.png" alt="Statistics of coefficients" width="644" height="114"&gt;&lt;/a&gt;    &lt;/p&gt;
&lt;h2&gt;
  
  
  Creating the Model from Scratch
&lt;/h2&gt;

&lt;p&gt;So now that we have a brief review of how the model works, let's implement it.&lt;br&gt;
We start by importing the numpy library, as we will make use of it for the numerical calculations. We create the class LinearRegression and set some initial parameters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
import numpy as np
class LinearRegression:
    def __init__(self, fit_intercept=True):
        self.fit_intercept = fit_intercept
        self.coef_ = None
        self.intercept_ = None


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The class takes in the fit_intercept parameter, which determines whether an intercept term is included in the model; its default value is True. The coefficients and the intercept are initialized to None and are set later, when the model is trained with data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def _add_intercept(self, X):
        return np.concatenate((np.ones((X.shape[0], 1)), X), axis=1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we create a helper function ‘_add_intercept’. It takes the values of X, creates an array of ones with the shape (X.shape[0], 1) to represent the intercept term, and concatenates it with X along the column axis (axis=1).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; def fit(self, X, y):
        if self.fit_intercept:
            X = self._add_intercept(X)
        self.coef_ = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
        if self.fit_intercept:
            self.intercept_ = self.coef_[0]
            self.coef_ = self.coef_[1:]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we create a method in the class that fits the data to the model and performs the calculations. It uses a closed-form solution (the normal equation) built on matrix operations. It first takes the parameters X and y and, if fit_intercept is set to True, passes X through the _add_intercept function to add an intercept column. It then computes the coefficients as the inverse of (X transpose dot X), dotted with X transpose, dotted with y. Finally, if fit_intercept is True, it takes the first value in self.coef_ as the intercept, and the rest are the coefficients.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def predict(self, X):
        if self.fit_intercept:
            X = self._add_intercept(X)
        return X.dot(np.concatenate(([self.intercept_], self.coef_)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we create the predict function, which takes in X. If ‘fit_intercept’ is True, it passes X through the _add_intercept function and returns the dot product of X with the concatenated array of the intercept and coefficients; otherwise it returns the dot product of X with the coefficients alone. (Note that the original version always concatenated the intercept, which breaks when fit_intercept is False and the intercept is still None.)&lt;/p&gt;

&lt;p&gt;Here’s the full code below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np
 class LinearRegression:
    def __init__(self, fit_intercept=True):
        self.fit_intercept = fit_intercept
        self.coef_ = None
        self.intercept_ = None
    def _add_intercept(self, X):
        return np.concatenate((np.ones((X.shape[0], 1)), X), axis=1)
    def fit(self, X, y):
        if self.fit_intercept:
            X = self._add_intercept(X)
        self.coef_ = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
        if self.fit_intercept:
            self.intercept_ = self.coef_[0]
            self.coef_ = self.coef_[1:]
    def predict(self, X):
        if self.fit_intercept:
            X = self._add_intercept(X)
        return X.dot(np.concatenate(([self.intercept_], self.coef_))) 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
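
&lt;p&gt;As a quick sanity check, here is a minimal sketch that fits the class on small made-up data and compares it against a known line:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])  # y = 1 + 2x

model = LinearRegression()
model.fit(X, y)
print(model.intercept_, model.coef_)     # approximately 1.0 and [2.0]
print(model.predict(np.array([[5.0]])))  # approximately [11.0]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;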



&lt;p&gt;Thanks for reading. Like, and comment your thoughts below.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>ai</category>
    </item>
    <item>
      <title>How to build a simple Machine Learning Clustering Model.</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 01 Jul 2023 05:42:31 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/how-to-build-a-simple-machine-learning-clustering-model-1bf6</link>
      <guid>https://dev.to/ezeanamichael/how-to-build-a-simple-machine-learning-clustering-model-1bf6</guid>
      <description>&lt;p&gt;Clustering is an unsupervised machine learning concept which groups rows of data based on their features or properties in a dataset.&lt;br&gt;
In unsupervised machine learning there is no target feature; unlike supervised machine learning models, which predict a value, clustering models don't. Instead, you only have the dataset, which is to be grouped into separate clusters. Clustering techniques are applied to customer segmentation, social network analysis, image segmentation, and so on. In this article, we’ll explore one of sklearn's unsupervised machine learning algorithms, KMeans. The link for the dataset is &lt;a href="https://www.kaggle.com/datasets/themrityunjaypathak/covid-cases-and-deaths-worldwide" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Training the model.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Import useful libraries.
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;import numpy as np&lt;br&gt;
import pandas as pd&lt;br&gt;
import matplotlib.pyplot as plt&lt;br&gt;
import seaborn as sns&lt;br&gt;
from sklearn.cluster import KMeans&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We start by importing useful libraries:&lt;br&gt;
&lt;strong&gt;Numpy:&lt;/strong&gt; A Python numerical library for statistical calculations.&lt;br&gt;
&lt;strong&gt;Pandas:&lt;/strong&gt; A Python library for reading and manipulating the dataset in csv format.&lt;br&gt;
&lt;strong&gt;Matplotlib&lt;/strong&gt; and &lt;strong&gt;Seaborn:&lt;/strong&gt; Python visualization libraries.&lt;br&gt;
&lt;strong&gt;Sklearn:&lt;/strong&gt; A Python library for data manipulation and for supervised and unsupervised machine learning. In this article, we’ll use the KMeans algorithm to segment a group of data into different clusters.&lt;br&gt;
&lt;strong&gt;Reading the data.&lt;/strong&gt;&lt;br&gt;
After importing the necessary modules, we read the data.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;data=pd.read_csv('/Users/user/Downloads/covid_worldwide.csv')&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We can check the contents of our data by using the pandas head command.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;data.head()&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F923azis2lomgzhjr68mp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F923azis2lomgzhjr68mp.png" alt="Pandas head cluster" width="800" height="189"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Exploratory Data Analysis
&lt;/h2&gt;

&lt;p&gt;We find out more about the data here. To understand the columns, we use the pandas info method, which shows the datatype and the number of non-null values in each column.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;data.info()&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03vwnjed399z5zc7ee8x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03vwnjed399z5zc7ee8x.png" alt="pandas info cluster" width="800" height="331"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the output, we notice some significant issues with the dataset.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;There are missing entries or values in some columns.&lt;/li&gt;
&lt;li&gt;The columns are of datatype ‘object’, so we need to convert them to numeric types.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Data Cleaning And Preparation
&lt;/h2&gt;

&lt;p&gt;In order to make our data more usable and to train a clustering model with it, we need to solve the issues stated above.&lt;br&gt;
To handle the missing values, we can drop the rows that contain them (this method is not advisable in some cases, as it results in data loss; in future articles, we’ll see other ways of handling missing values).&lt;br&gt;
To drop the rows with missing values, execute:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;data.dropna(inplace=True)&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
Next, we want to remove the commas(,) present in the numbers in order to convert them to numeric types. We do so with a simple replace that strips all commas from the dataset.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;data.replace(",", "", regex=True, inplace=True)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We then collect the useful numerical columns into a variable named X.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;X=data.drop(['Serial Number','Country'], axis=1)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;So we convert the remaining columns’ datatypes to floats using the code below.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;def tofloat(X,col):&lt;br&gt;
    X[col]=X[col].astype(float)&lt;br&gt;
    return X[col]&lt;br&gt;
for col in X.columns:&lt;br&gt;
    X[col]=tofloat(X,col)&lt;/code&gt;&lt;br&gt;
Then we use the pandas info() method to check our data again.&lt;br&gt;
&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ketdwt1ol9x5uiei9k1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ketdwt1ol9x5uiei9k1.png" alt="Pandas info cluster" width="800" height="219"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can also use the pandas describe function to see the statistical inferences of the data.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;X.describe()&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw1rnl2f37zwmjcrkq58d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw1rnl2f37zwmjcrkq58d.png" alt="Pandas statistical values" width="800" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Building
&lt;/h2&gt;

&lt;p&gt;To train the model, we first load the model.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;kmeans=KMeans(n_clusters=2,n_init=10)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The n_clusters parameter specifies the number of clusters or groups you want to create from the dataset. The n_init parameter states the number of times KMeans runs with different starting centroids; the best result of those runs is kept.&lt;/p&gt;

&lt;p&gt;In the next few lines of code, we use the fit_predict function to both fit the dataset to the model and collect the predictions as a new column in X with a categorical datatype. We use the total cases and total deaths of covid to cluster the countries.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;X["Cluster"] = kmeans.fit_predict(X[['Total Cases','Total Deaths']])&lt;br&gt;
X["Cluster"] = X["Cluster"].astype("category")&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We can visualize the clusters in a simple way with the code below.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sns.relplot(&lt;br&gt;
    x="Total Cases", y="Total Deaths", hue="Cluster", data=X, height=6,&lt;br&gt;
);&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcy1uu7r70oe2tf43mn6m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcy1uu7r70oe2tf43mn6m.png" alt="seaborn cluster" width="800" height="574"&gt;&lt;/a&gt;&lt;br&gt;
Hope you enjoyed reading this article as much as I did writing it; like it and feel free to comment your thoughts below.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>python</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How to build a simple Machine Learning Classification Model.</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 24 Jun 2023 06:08:52 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/how-to-build-a-simple-machine-learning-classification-model-1l95</link>
      <guid>https://dev.to/ezeanamichael/how-to-build-a-simple-machine-learning-classification-model-1l95</guid>
      <description>&lt;p&gt;When you hear the word classify, what comes to mind is grouping things based on their differences. The same is so in machine learning: based on the features and data gathered, a machine learning model can learn to distinguish between different classes through the patterns it finds. In this article, you’ll learn how to build a supervised classification machine-learning model.&lt;br&gt;
The dataset used is a loan dataset obtained from Kaggle. The model we train in this article will be able to predict the likelihood of a person repaying his/her loan based on their circumstances: Y is for loan approved and N is for loan declined.&lt;/p&gt;

&lt;h2&gt;
  
  
  Training the model.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Import useful libraries.
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;&lt;br&gt;
import numpy as np&lt;br&gt;
import pandas as pd&lt;br&gt;
import matplotlib as plt&lt;br&gt;
from sklearn.model_selection import train_test_split&lt;br&gt;
from sklearn.ensemble import RandomForestClassifier&lt;br&gt;
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix, classification_report&lt;br&gt;
import seaborn as sns&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
We import &lt;strong&gt;Numpy&lt;/strong&gt;, a Python-based library used for numerical calculations.&lt;br&gt;
&lt;strong&gt;Pandas&lt;/strong&gt;: To read the dataset and store it in dataframe format in order to perform analysis, cleaning, and some calculations on the dataframe.&lt;br&gt;
&lt;strong&gt;Sklearn&lt;/strong&gt;: Also known as Scikit-learn, one of the most popular libraries, containing several classes and functions for analysis and machine learning models. The one we’ll use in this article is the Random Forest Classifier, a very powerful classification model which can achieve high accuracy with little or no hyperparameter tuning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reading the data.
&lt;/h2&gt;

&lt;p&gt;Using the pandas library, which has been imported as ‘pd’, we import the dataset using the dataset's file path.&lt;br&gt;
&lt;code&gt;df=pd.read_csv('/Users/user/Documents/loan_sanction_train.csv')&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
To check that you’ve imported the file correctly, type:&lt;br&gt;
&lt;code&gt;df.head()&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftfhw8s7ocddgodtsrzj9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftfhw8s7ocddgodtsrzj9.png" alt="Dataframe head" width="800" height="152"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Exploratory Data Analysis
&lt;/h2&gt;

&lt;p&gt;Let's find out more about our data before we start training the model.&lt;br&gt;
&lt;code&gt;df.info()&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frzg6h6691onrubytba7v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frzg6h6691onrubytba7v.png" alt="dataframe info" width="800" height="376"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the above, we deduce the presence of 614 rows and 13 columns. We can also see that some values are missing; to see the number of missing values in each column, use&lt;br&gt;
&lt;code&gt;df.isnull().sum()&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1fk9gvpdbhfyfcp07ck.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1fk9gvpdbhfyfcp07ck.png" alt="pandas null values check" width="800" height="243"&gt;&lt;/a&gt;&lt;br&gt;
We’ll discuss how to handle missing values in later articles; for now, let's drop all rows with missing values.&lt;br&gt;
&lt;code&gt;df=df.dropna()&lt;/code&gt;&lt;br&gt;
We can get a summary of statistical values, which includes the mean, minimum value, maximum value, standard deviation, 25th percentile, 50th percentile (median), 75th percentile, and count for each numerical column in the distribution.&lt;br&gt;
&lt;code&gt;df.describe()&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzh87x8o7yyhsh5adxny8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzh87x8o7yyhsh5adxny8.png" alt="dataset statistics" width="800" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Cleaning And Preparation
&lt;/h2&gt;

&lt;p&gt;Machine learning models only accept numerical and boolean (true or false) values; therefore, in our features or columns, we’ll need to convert all strings or objects to integers, floats, or boolean values.&lt;br&gt;
We can check the various categories in a feature with the code below.&lt;br&gt;
Example:&lt;br&gt;
&lt;code&gt;df['Property_Area'].value_counts()&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fojyoq6nw3e3vlrdfo4jr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fojyoq6nw3e3vlrdfo4jr.png" alt="dataframe column unique values" width="800" height="108"&gt;&lt;/a&gt;&lt;br&gt;
We can then convert all the features that aren’t numerical or boolean by creating a function that calls df.astype("category").cat.codes and passing the dataframe and columns through it.&lt;br&gt;
&lt;code&gt;def category_val(df,col):&lt;br&gt;
    df[col]=df[col].astype('category')&lt;br&gt;
    df[col]=df[col].cat.codes&lt;br&gt;
    return df[col]&lt;br&gt;
df['Gender']=category_val(df,'Gender')&lt;br&gt;
df['Married']=category_val(df,'Married') &lt;br&gt;
df['Education']=category_val(df,'Education')&lt;br&gt;
df['Property_Area']=category_val(df,'Property_Area')&lt;br&gt;
df['Self_Employed']=category_val(df,'Self_Employed')&lt;br&gt;
df['Dependents']=category_val(df,'Dependents')&lt;/code&gt;&lt;br&gt;
You can confirm with the value_counts function to see your new categorical values.&lt;br&gt;
&lt;code&gt;df['Property_Area'].value_counts()&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3o50gtb3r407bp3wd5dj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3o50gtb3r407bp3wd5dj.png" alt="value checker" width="800" height="89"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Building
&lt;/h2&gt;

&lt;p&gt;Now that we’ve cleaned our data a bit, let’s split the features and target variable; let's say x holds our features and y is our target.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;x= df.drop(['Loan_ID','Loan_Status'], axis =1)&lt;br&gt;
y=df['Loan_Status']&lt;/code&gt;&lt;br&gt;
We use the dataframe.drop function to remove the feature that won’t be used and the target feature, then we collect the target feature into y.&lt;/p&gt;

&lt;p&gt;Next, we split our data into a training set and a test set. To check how our model would perform on real-world data, we hold out a test set: we train our model with the training set and evaluate how it performs on the test set.&lt;br&gt;
To split our data, we use the train_test_split function, setting our test size to 30% (generally it's advisable to use 20%-30% of the dataset for the test set, so as not to lose too much data when training the model).&lt;/p&gt;

&lt;p&gt;&lt;code&gt;x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.3,random_state=42)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Next, we load our classification model, which in this case is the random forest classifier.&lt;br&gt;
The random forest classifier is built from multiple decision trees; a decision tree is a model that branches to a particular result based on certain parameters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhrgjucaqasiylbbpjbch.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhrgjucaqasiylbbpjbch.png" alt=" " width="800" height="536"&gt;&lt;/a&gt;&lt;br&gt;
The random forest model is then made up of many such decision trees, each making branches on the features, and it combines their outcomes into a single prediction.&lt;br&gt;
&lt;code&gt;rf=RandomForestClassifier()&lt;/code&gt;&lt;br&gt;
After loading our model, we fit the training set into our model.&lt;br&gt;
&lt;code&gt;rf.fit(x_train,y_train)&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Evaluation
&lt;/h2&gt;

&lt;p&gt;There are several ways to evaluate a machine learning model; in this article, we’ll consider one.&lt;br&gt;
&lt;strong&gt;Accuracy&lt;/strong&gt;&lt;br&gt;
Accuracy measures how well your model performed by comparing its predictions on the test data with the actual target results.&lt;br&gt;
We can check our model's accuracy after fitting with this simple line of code.&lt;br&gt;
&lt;code&gt;rf.score(x_test,y_test)&lt;/code&gt;&lt;br&gt;
This gives a score of 0.743, which corresponds to an accuracy of 74.3%.&lt;/p&gt;
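
&lt;p&gt;Since we already imported classification_report and the confusion matrix helpers, a sketch of a more detailed evaluation (precision, recall, and f1-score per class) could look like this:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;y_pred=rf.predict(x_test)&lt;br&gt;
print(classification_report(y_test,y_pred))&lt;br&gt;
ConfusionMatrixDisplay(confusion_matrix(y_test,y_pred)).plot()&lt;/code&gt;&lt;/p&gt;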

&lt;p&gt;That's it, you’ve successfully trained your first machine-learning model. In future articles, we’ll consider how to train unsupervised machine learning models, how to deploy them, and other ways of evaluating machine learning models.&lt;br&gt;
Give it a like and feel free to comment your thoughts down below.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>How to build a simple Machine Learning Regression Model.</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 17 Jun 2023 06:02:14 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/how-to-build-a-simple-machine-learning-regression-model-1dog</link>
      <guid>https://dev.to/ezeanamichael/how-to-build-a-simple-machine-learning-regression-model-1dog</guid>
      <description>&lt;p&gt;In this article, I'll provide a step-by-step method of building a regression model using sklearn's linear regression.&lt;br&gt;
A regression model is a supervised machine learning model which predicts numerical values based on the numeric or boolean inputs and data provided, for example house price prediction.&lt;br&gt;
In this article, we'll be using the dataset obtained from &lt;a href="https://www.kaggle.com/datasets/mirichoi0218/insurance" rel="noopener noreferrer"&gt;kaggle&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Training the model.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Import useful libraries.
&lt;/h2&gt;

&lt;p&gt;First, you import the useful functions we'll make use of and read the CSV file using pandas.&lt;br&gt;
&lt;code&gt;&lt;br&gt;
import numpy as np&lt;br&gt;
import pandas as pd&lt;br&gt;
import matplotlib.pyplot as plt&lt;br&gt;
import seaborn as sns&lt;br&gt;
from sklearn.model_selection import train_test_split&lt;br&gt;
from sklearn.linear_model import LinearRegression&lt;br&gt;
from sklearn.metrics import mean_squared_error&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Numpy&lt;/strong&gt; is a numerical library in Python which provides fast implementations of statistical calculations.&lt;br&gt;
&lt;strong&gt;Pandas&lt;/strong&gt; is a Python library used for accessing and manipulating the dataset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Matplotlib&lt;/strong&gt; and &lt;strong&gt;seaborn&lt;/strong&gt; are visualization libraries which we would use in later articles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sklearn&lt;/strong&gt; is a python library that contains several machine learning models and tools for model evaluation. Today we’ll be using one of the libraries which is the linear regression model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reading the data.
&lt;/h2&gt;

&lt;p&gt;First you can read the data by using the pandas read_csv method.&lt;br&gt;
&lt;code&gt;data=pd.read_csv('/Users/user/insurance.csv')&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;You can find out more about the data by checking the first 5 rows using the data.head() method.&lt;br&gt;
&lt;code&gt;data.head()&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faf9hwc0td97kyu26y0ts.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faf9hwc0td97kyu26y0ts.png" alt="pandas dataframe head" width="800" height="236"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Exploratory Data Analysis
&lt;/h2&gt;

&lt;p&gt;To get more information about the data we can check more by using the pandas info method. &lt;/p&gt;

&lt;p&gt;&lt;code&gt;data.info()&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2nxuwqbdw5n6gibhcs92.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2nxuwqbdw5n6gibhcs92.png" alt="pandas dataframe information" width="800" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From there, we can see the number of columns and rows present in the dataset as well as their respective datatypes.&lt;br&gt;
To get statistical inferences from the dataset, we use the pandas describe method.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;data.describe()&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8w7yvgiqfquttco7fyg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8w7yvgiqfquttco7fyg.png" alt="pandas dataframe describe statistics" width="800" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the output, we can see the overall statistics of the dataset, which include the mean, median, 25th percentile, 75th percentile, standard deviation, and minimum and maximum values.&lt;br&gt;
I'll provide more explanation of these, as well as useful visualizations, in a future article on EDA (exploratory data analysis).&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Cleaning and Preparation
&lt;/h2&gt;

&lt;p&gt;To train your model you need to convert the categorical values into numerical variables. You can first inspect each column using pandas value_counts to see the unique values in that column.&lt;br&gt;
Example:&lt;br&gt;
&lt;code&gt;data['sex'].value_counts()&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdlesqr65x07ba86snvym.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdlesqr65x07ba86snvym.png" alt="pandas dataframe valuecounts" width="800" height="77"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can create a function that takes the data and a column, assigns a number to each category, and so converts all categorical variables into numeric datatypes.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;def category_val(df,col):&lt;br&gt;
    df[col]=df[col].astype('category')&lt;br&gt;
    df[col]=df[col].cat.codes&lt;br&gt;
    return df[col]&lt;br&gt;
data['sex']=category_val(data,'sex')&lt;br&gt;
data['smoker']=category_val(data,'smoker') &lt;br&gt;
data['region']=category_val(data,'region')&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Building.
&lt;/h2&gt;

&lt;p&gt;Next, you separate your data into x and y, where y is the target variable.&lt;br&gt;
x takes the rest of the features, since we're using all of them to predict the target variable y, which contains only the target column.&lt;br&gt;
&lt;code&gt;x=data.drop('charges', axis=1)&lt;br&gt;
y=data['charges']&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Next, we split our values of x and y into a training set and a test set; the training set is used to train the model, while the test set is used to evaluate the model on what it has learnt. It is advisable to use between 20% and 30% of the data for the test set and the remaining 70% to 80% for the training set. In this article, we’ll use 30% for our test set.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.3, random_state=42)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We then load the linear regression model and fit our training set data into it&lt;/p&gt;

&lt;p&gt;&lt;code&gt;model=LinearRegression()&lt;br&gt;
model.fit(x_train,y_train)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We’ve successfully trained our machine learning model. Now, how do we evaluate it on the test set? There are several tools we can use to do this (more will be covered in future articles); in this article we’ll use the mean squared error.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Evaluation.
&lt;/h2&gt;

&lt;p&gt;To evaluate the model, we’ll need to compare what the model predicts from the x_test values with the actual y_test results.&lt;br&gt;
We create a variable named y_pred and use the model to predict on the x_test values.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;y_pred=model.predict(x_test)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Finally, we evaluate the values predicted using mean squared error.&lt;br&gt;
&lt;code&gt;mean_squared_error(y_test,y_pred)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Which gives a result of&lt;br&gt;
33805466.898688614&lt;/p&gt;
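
&lt;p&gt;Because the mean squared error is in squared units of the target, its square root (the root mean squared error) is often easier to interpret:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;np.sqrt(mean_squared_error(y_test,y_pred))&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This comes out to roughly 5814, meaning a typical prediction is off by about 5,814 in units of charges.&lt;/p&gt;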

&lt;p&gt;There you have it, you just trained your first regression model. Feels great?&lt;br&gt;
Here's what happens underneath.&lt;/p&gt;

&lt;p&gt;Linear regression uses an algorithm similar to the equation of a line:&lt;br&gt;
y=mx+c&lt;br&gt;
where c is the intercept, x is a feature, m is the gradient, and y is the target.&lt;/p&gt;

&lt;p&gt;We can find how these relate using model.coef_ and model.intercept_&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fboayis0zitmhx1qdbarw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fboayis0zitmhx1qdbarw.png" alt="machine learning regression coeffecient and intercept" width="800" height="129"&gt;&lt;/a&gt;&lt;br&gt;
The intercept is given as -12364.39,&lt;br&gt;
and the coefficients are given in the form of an array.&lt;br&gt;
It's easier to then represent the generated equation like this:&lt;/p&gt;

&lt;p&gt;charges=-12364.39+(age*261.63)+(sex*109.65)+(bmi*344.54)+(children*424.37)+(smoker*23620.80)+(region*-326.46)&lt;/p&gt;
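
&lt;p&gt;As a quick worked example with hypothetical encoded inputs (age=30, sex=1, bmi=25, children=0, smoker=0, region=2), plugging into the equation gives:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;charges = -12364.39 + (30*261.63) + (1*109.65) + (25*344.54) + (0*424.37) + (0*23620.80) + (2*-326.46)  # approximately 3554.74&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This is exactly the weighted sum that model.predict computes for such a row.&lt;/p&gt;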

&lt;p&gt;Hope you enjoyed this article, like, share, and comment your thoughts thanks.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>Machine learning overview</title>
      <dc:creator>Ezeana Micheal</dc:creator>
      <pubDate>Sat, 10 Jun 2023 05:06:05 +0000</pubDate>
      <link>https://dev.to/ezeanamichael/machine-learning-overview-4d51</link>
      <guid>https://dev.to/ezeanamichael/machine-learning-overview-4d51</guid>
      <description>&lt;p&gt;After the gathering, cleaning, and preprocessing of data to understand and recover insight from it, one way of implementing these is through the development of machine learning models.&lt;/p&gt;

&lt;p&gt;Machine learning can be said to be a subfield of data science as well as of AI (artificial intelligence); as the name implies, a computer can learn and make predictions based on the data it has learned from. This is important, as it has led to the creation of important models like weather forecasts, fraud detection, and so on. &lt;br&gt;
There are three types of machine learning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Supervised machine learning&lt;/li&gt;
&lt;li&gt;Unsupervised machine learning &lt;/li&gt;
&lt;li&gt;Reinforcement machine learning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Supervised machine learning:&lt;/strong&gt; This is the type of machine learning that has a target value. It consists of data used to make the prediction (independent variables) and a target value to be predicted (dependent variable).&lt;br&gt;
There are two types of supervised machine learning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Regression&lt;/li&gt;
&lt;li&gt;Classification&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Regression:&lt;/strong&gt; Regression is the aspect of supervised machine learning that deals with the prediction of quantitative or numerical data. In this type of supervised machine learning all values are numeric; the computer plots the features of the data on a graph and then draws a line over them, which can be called the line of best fit. The prediction algorithm can be in the form y=mx+c (the equation of a line), where x is the feature (provided only one feature was used for the prediction), and m and c are constants found by the computer. For example, y=26x+7 takes the feature value of new data, x=15, and applies the formula: y=26*15+7, so y=397.&lt;br&gt;
When the number of features increases, it can be written as y=c+m1*x1+m2*x2+m3*x3...&lt;br&gt;
This can be used when predicting house prices or other numerical quantities. A model that does this is sklearn's Linear Regression model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Classification:&lt;/strong&gt; Unlike regression, classification uses qualitative values as target values. The data consists of values that are numerical or boolean, while the target values are the names of the classes to be predicted. Each feature is evaluated and separated by a (sometimes complex) algorithm, so that when the computer evaluates the features of test data, it fits them into one of the listed categories. This is done in various real-life scenarios like fraud detection models or image classification models. There are several models for classification, like the Decision tree classifier, logistic regression (it says regression, but it is used to build classification models), the K neighbors classifier, the Random forest classifier, and so on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unsupervised machine learning:&lt;/strong&gt; This is a type of machine learning algorithm where there is no target value; instead the computer, finding matches between the values, classifies or groups the data into clusters. As mentioned above there is no target value, and the aim is to find connections between the values. This is used in algorithms like movie recommendation systems: based on the movies you watch, it suggests others you might like. This modelling technique is called clustering. One of sklearn's clustering algorithms is KMeans clustering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reinforcement machine learning:&lt;/strong&gt; This is an aspect of machine learning where the computer learns by interacting with an environment. Examples are games like chess, in which the computer doesn't know the game at first but learns how to play it over time according to the rules of the game; another example is self-driving cars, whose motion sensors help the computer learn how to move under certain scenarios.&lt;/p&gt;

&lt;p&gt;The four main steps in building a machine learning model, after the visualisation and preprocessing, are listed below (a short sketch in code follows the list):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Separate the processed data into the train set and test set.&lt;/li&gt;
&lt;li&gt;Load or import the model to be used.&lt;/li&gt;
&lt;li&gt;Use the training set to train the model.&lt;/li&gt;
&lt;li&gt;Evaluate the model using the appropriate method with the test set.&lt;/li&gt;
&lt;/ol&gt;
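
&lt;p&gt;As a minimal sketch of those four steps with sklearn (assuming a prepared feature table x and target y):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;from sklearn.model_selection import train_test_split&lt;br&gt;
from sklearn.linear_model import LinearRegression&lt;br&gt;
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.3)  # 1. split&lt;br&gt;
model=LinearRegression()  # 2. load the model&lt;br&gt;
model.fit(x_train,y_train)  # 3. train&lt;br&gt;
print(model.score(x_test,y_test))  # 4. evaluate&lt;/code&gt;&lt;/p&gt;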

&lt;p&gt;From simple models like the prediction of house prices to complex ones like image recognition, machine learning has automated a lot of tasks and improved Information Technology in a lot of ways.&lt;br&gt;
Thanks for reading this article; like, share, and comment on what you think.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>beginners</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
