<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Navas Herbert</title>
    <description>The latest articles on DEV Community by Navas Herbert (@navashub).</description>
    <link>https://dev.to/navashub</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2996501%2F64ad8d7a-7a10-48b4-8d1f-59aefcd93ae0.jpeg</url>
      <title>DEV Community: Navas Herbert</title>
      <link>https://dev.to/navashub</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/navashub"/>
    <language>en</language>
    <item>
      <title>My First Week with SQL: A Beginner's Guide to Building, Filling, and Querying a Real Database</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Sat, 11 Apr 2026 17:52:05 +0000</pubDate>
      <link>https://dev.to/navashub/my-first-week-with-sql-a-beginners-guide-to-building-filling-and-querying-a-real-database-294p</link>
      <guid>https://dev.to/navashub/my-first-week-with-sql-a-beginners-guide-to-building-filling-and-querying-a-real-database-294p</guid>
      <description>&lt;p&gt;&lt;em&gt;From CREATE TABLE to CASE WHEN - everything I learned this week, explained simply&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction - Why SQL?
&lt;/h2&gt;

&lt;p&gt;If you have ever wondered how apps like M-Pesa know your balance, how a hospital tracks your records, or how a school manages thousands of student results - the answer is almost always a &lt;strong&gt;database&lt;/strong&gt;. And the language used to talk to those databases is &lt;strong&gt;SQL&lt;/strong&gt; - Structured Query Language.&lt;/p&gt;

&lt;p&gt;This week I started learning SQL from scratch. By the end of the week, I could build a database from nothing, fill it with real data, search through it, filter results, and even create custom labels for every row. This article walks through everything I learned - written in plain English so that any complete beginner can follow along.&lt;/p&gt;

&lt;p&gt;We built our practice database around &lt;strong&gt;Nairobi Academy&lt;/strong&gt; - a fictional secondary school in Nairobi. Three tables: students, subjects, and exam results. Let's go through everything step by step.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 1 - Building the Database (DDL)
&lt;/h2&gt;

&lt;p&gt;Before you can store any data, you need to &lt;strong&gt;create the structure&lt;/strong&gt; that will hold it. This is called &lt;strong&gt;DDL - Data Definition Language&lt;/strong&gt;. Think of it like building the shelves before you put any books on them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1 - Create a Schema
&lt;/h3&gt;

&lt;p&gt;A schema is a container - it groups all your tables together in one named space. Think of it like a folder on your computer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="n"&gt;nairobi_academy&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;set&lt;/span&gt; &lt;span class="n"&gt;search_path&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;nairobi_academy&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;CREATE SCHEMA makes the folder. set search_path tells SQL to go inside it.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2 - Create Tables
&lt;/h3&gt;

&lt;p&gt;A table is where data actually lives - like a spreadsheet with rows and columns. When creating a table you define every column, what type of data it holds, and what rules it must follow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;student_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;gender&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;date_of_birth&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key words to know:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PRIMARY KEY&lt;/strong&gt; - a unique ID for every row. No two students can share the same student_id.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NOT NULL&lt;/strong&gt; - this field is required. You cannot add a student without a first_name.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VARCHAR(n)&lt;/strong&gt; - text up to n characters long.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;INT&lt;/strong&gt; - a whole number. Perfect for IDs and counts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DATE&lt;/strong&gt; - a calendar date stored as YYYY-MM-DD.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 3 - Modify Tables with ALTER TABLE
&lt;/h3&gt;

&lt;p&gt;Sometimes after creating a table you realise something needs to change. &lt;strong&gt;ALTER TABLE&lt;/strong&gt; lets you add, rename, or remove columns without deleting the whole table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Add a column&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;phone_number&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Rename a column&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;subjects&lt;/span&gt; &lt;span class="k"&gt;RENAME&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;credits&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;credit_hours&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Remove a column completely&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;phone_number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;** Warning:** DROP COLUMN permanently removes the column and all its data. Always double-check before running it - there is no undo.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Part 2 - Filling the Database (DML)
&lt;/h2&gt;

&lt;p&gt;Once the tables exist, we fill them with data. This is called &lt;strong&gt;DML - Data Manipulation Language&lt;/strong&gt;. Think of it like finally putting the books on the shelves.&lt;/p&gt;

&lt;h3&gt;
  
  
  INSERT INTO - Adding Rows
&lt;/h3&gt;

&lt;p&gt;This is how you add data into a table. You list the columns you are filling, then provide the values in the same order.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;date_of_birth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Amina'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Wanjiku'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-03-12'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Brian'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Ochieng'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'M'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2007-07-25'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mombasa'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Cynthia'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mutua'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-11-05'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kisumu'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;** Tips:** Text values always go in single quotes: 'Nairobi'. Numbers do not need quotes: 1, 2, 3. You can insert multiple rows at once by separating each set with a comma.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  UPDATE - Changing Existing Data
&lt;/h3&gt;

&lt;p&gt;When data needs to change - like a student moving to a different city - you use &lt;strong&gt;UPDATE&lt;/strong&gt;. Always include &lt;strong&gt;WHERE&lt;/strong&gt; to target only the specific row you want to change.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;** Golden rule:** NEVER run UPDATE without WHERE. Without it, SQL updates every single row in the table. Always target specific rows using WHERE.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  DELETE - Removing Rows
&lt;/h3&gt;

&lt;p&gt;To remove a specific row from a table, use &lt;strong&gt;DELETE FROM&lt;/strong&gt;. Again - always use WHERE.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;result_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Part 3 - Finding What You Need (Filtering with WHERE)
&lt;/h2&gt;

&lt;p&gt;Just pulling all the data with SELECT * is rarely useful in real life. You almost always need to &lt;strong&gt;filter&lt;/strong&gt; - to tell SQL: give me only the rows that match my conditions. All filtering uses the &lt;strong&gt;WHERE clause&lt;/strong&gt; followed by an operator.&lt;/p&gt;

&lt;h3&gt;
  
  
  Comparison Operators - The Basics
&lt;/h3&gt;

&lt;p&gt;These compare a column value against something specific:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find all Form 4 students&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Find exam results above 70&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;70&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Find students NOT from Nairobi&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  AND / OR / NOT - Combining Conditions
&lt;/h3&gt;

&lt;p&gt;Sometimes one condition is not enough. &lt;strong&gt;AND&lt;/strong&gt; requires both conditions to be true. &lt;strong&gt;OR&lt;/strong&gt; requires at least one to be true. &lt;strong&gt;NOT&lt;/strong&gt; flips a condition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- AND: Form 3 students from Nairobi only&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- OR: Form 2 or Form 4 students&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 2'&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  BETWEEN - Checking a Range
&lt;/h3&gt;

&lt;p&gt;Instead of writing &amp;gt;= and &amp;lt;= separately, BETWEEN is a clean shortcut. It is &lt;strong&gt;inclusive&lt;/strong&gt; - both the lower and upper values are included.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Marks between 50 and 80 (includes 50 and 80)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Exams in a date range&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;exam_date&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="s1"&gt;'2024-03-15'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="s1"&gt;'2024-03-18'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  IN and NOT IN - Matching a List
&lt;/h3&gt;

&lt;p&gt;When you want to match any value from a list, &lt;strong&gt;IN&lt;/strong&gt; is much cleaner than writing many OR conditions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Instead of: WHERE city = 'Nairobi' OR city = 'Mombasa' OR city = 'Kisumu'&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mombasa'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kisumu'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- NOT IN: exclude Form 2 and Form 3&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Form 2'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  LIKE - Searching for Patterns
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;LIKE&lt;/strong&gt; lets you search for patterns inside text. The &lt;strong&gt;%&lt;/strong&gt; symbol means 'any number of characters'.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Starts with 'A'&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'A%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Contains the word 'Studies'&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subjects&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;subject_name&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'%Studies%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Ends with 'i'&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'%i'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  COUNT - Counting Rows
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;COUNT(*)&lt;/strong&gt; tells you how many rows match your condition. Very useful for quick summaries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- How many students are in Form 3?&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;form3_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Part 4 - Smart Labels with CASE WHEN
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;CASE WHEN&lt;/strong&gt; is SQL's way of saying &lt;strong&gt;'if this, then that'&lt;/strong&gt;. It lets you create a new column in your results with a label or category based on conditions you define. The original table is never changed - you are just adding a label when you SELECT the data.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Basic Structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;condition1&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'result1'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;condition2&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'result2'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'default_result'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;new_column_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;your_table&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Real Example - Labelling Exam Results
&lt;/h3&gt;

&lt;p&gt;Instead of showing just the number, let's label each result as Distinction, Merit, Pass, or Fail:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;result_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Distinction'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Merit'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Pass'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Fail'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;performance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;** Important - Order matters!** SQL checks conditions from top to bottom and stops at the first one that is TRUE. Always put the most specific (highest) condition first. If you put marks &amp;gt;= 40 first, every result above 40 would get 'Pass' and never reach Merit or Distinction.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Labelling Students as Senior or Junior
&lt;/h3&gt;

&lt;p&gt;We can use IN inside CASE WHEN to check multiple values at once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Senior'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Junior'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;student_level&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Part 5 - What I Learned This Week
&lt;/h2&gt;

&lt;p&gt;Looking back at the week, here are the most important things that stuck with me:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Golden Rules I Will Never Forget
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Always use WHERE with UPDATE and DELETE&lt;/strong&gt; - without it you change or delete every single row in the table.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text values need single quotes&lt;/strong&gt; - 'Nairobi' &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Numbers never need quotes&lt;/strong&gt; - WHERE marks &amp;gt; 70 &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Every SQL statement ends with a semicolon ( ; )&lt;/strong&gt; - think of it as a full stop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BETWEEN is inclusive&lt;/strong&gt; - BETWEEN 50 AND 80 includes 50 and 80 themselves.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CASE WHEN checks top to bottom&lt;/strong&gt; - put the most specific condition first, not last.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LIKE with % is flexible&lt;/strong&gt; - 'A%' starts with A, '%A' ends with A, '%A%' contains A.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IN is cleaner than many ORs&lt;/strong&gt; - IN ('A','B','C') vs city='A' OR city='B' OR city='C'.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Moments That Made It Click
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;librarian analogy&lt;/strong&gt; was what made SQL make sense to me. You do not understand everything inside a library - but you can walk up to the librarian and say 'I need books about cooking published after 2020, arranged by title'. SQL is exactly that. You describe what you want and the database finds it.&lt;/p&gt;

&lt;p&gt;The moment I ran my first &lt;strong&gt;CASE WHEN&lt;/strong&gt; and saw 'Distinction', 'Merit', 'Pass', 'Fail' appear next to marks - instead of just numbers  it felt like the data was finally speaking in human language. That was genuinely exciting.&lt;/p&gt;

&lt;p&gt;And the first time I made the mistake of running UPDATE without WHERE - and saw all the cities change to the same value - I understood immediately why that golden rule exists.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Is Coming Next
&lt;/h3&gt;

&lt;p&gt;Next week we go deeper:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Row-level functions&lt;/strong&gt; - UPPER, LENGTH, ROUND, DATE_FORMAT and more&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CAST and formatting&lt;/strong&gt; - converting between data types&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JOINs&lt;/strong&gt; - combining data from multiple tables at once (this is where SQL gets really powerful)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;If you are a complete beginner reading this - SQL is not as scary as it looks. The commands are written in almost plain English. &lt;strong&gt;SELECT&lt;/strong&gt; means 'get'. &lt;strong&gt;FROM&lt;/strong&gt; means 'from'. &lt;strong&gt;WHERE&lt;/strong&gt; means 'but only where'. &lt;strong&gt;INSERT INTO&lt;/strong&gt; means 'add this to'. Once you see the pattern, it feels natural very quickly.&lt;/p&gt;

&lt;p&gt;The best advice I can give from one week of learning: &lt;strong&gt;type every query yourself&lt;/strong&gt;. Do not just read examples. Open your SQL tool, build the table, insert the rows, run the filters. The mistakes you make while doing it are worth more than a hundred examples you only read.&lt;/p&gt;

&lt;p&gt;See you in the next article - where I will cover JOINs.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>database</category>
      <category>sql</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Day 1 Internship Report</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Mon, 06 Oct 2025 19:44:21 +0000</pubDate>
      <link>https://dev.to/navashub/day-1-internship-report-jci</link>
      <guid>https://dev.to/navashub/day-1-internship-report-jci</guid>
      <description>&lt;h2&gt;
  
  
  Africa Energy Portal Data Extraction and MongoDB Integration
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Intern:&lt;/strong&gt; Navas Herbert&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Date:&lt;/strong&gt; October 6, 2025&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Project:&lt;/strong&gt; Energy Data Collection and Storage System&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/Navashub/lux-internship/tree/main/energytest1" rel="noopener noreferrer"&gt;https://github.com/Navashub/lux-internship/tree/main/energytest1&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Executive Summary
&lt;/h2&gt;

&lt;p&gt;Successfully developed aa complete ETL (Extract, Transform, Load) pipeline to collect energy-related data from the Africa Energy Portal for all 54 African countries. The data has been successfully stored in MongoDB Atlas (database: &lt;code&gt;energyd2&lt;/code&gt;, collection: &lt;code&gt;test&lt;/code&gt;) and is fully queryable with appropriate indexes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Achievements:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Web scraping system for 54 African countries&lt;/li&gt;
&lt;li&gt;✅ Complete data transformation pipeline (wide to long format)&lt;/li&gt;
&lt;li&gt;✅ Successful MongoDB integration with 6 documents loaded&lt;/li&gt;
&lt;li&gt;✅ Database query functionality confirmed (see attached screenshot)&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Project Objective
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Goal:&lt;/strong&gt; Extract energy-related data from the Africa Energy Portal (&lt;a href="https://africa-energy-portal.org/" rel="noopener noreferrer"&gt;https://africa-energy-portal.org/&lt;/a&gt;) for all African countries spanning 2000–2024 and store it in a MongoDB collection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Required Schema:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;["country", "country_serial", "metric", "unit", "sector", "sub_sector", 
 "sub_sub_sector", "source_link", "source", "2000", "2001", ..., "2024"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Technical Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Data Extraction (&lt;code&gt;scraper_complete.py&lt;/code&gt;)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Technology Stack:&lt;/strong&gt; Python, Selenium WebDriver, Pandas&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automated browser navigation using Selenium Chrome WebDriver&lt;/li&gt;
&lt;li&gt;Visited all 54 African country pages systematically&lt;/li&gt;
&lt;li&gt;Extracted data from HTML tables and page content&lt;/li&gt;
&lt;li&gt;Implemented 2-second rate limiting to respect server resources&lt;/li&gt;
&lt;li&gt;Captured metadata: country names, sectors, source links&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dynamic content loading with 8-second wait times&lt;/li&gt;
&lt;li&gt;Regex pattern matching for electricity access rates&lt;/li&gt;
&lt;li&gt;Comprehensive error handling and logging&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; &lt;code&gt;africa_energy_complete_{timestamp}.csv&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Countries Covered:&lt;/strong&gt; All 54 African nations from Algeria to Zimbabwe&lt;/p&gt;




&lt;h3&gt;
  
  
  2. &lt;strong&gt;Data Transformation (&lt;code&gt;transformer.py&lt;/code&gt; + &lt;code&gt;transform_to_long_format.py&lt;/code&gt;)&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Phase 1: Schema Standardization&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Created country serial mapping (1-54, alphabetical order)&lt;/li&gt;
&lt;li&gt;Standardized column names to match required schema&lt;/li&gt;
&lt;li&gt;Mapped raw data fields to structured format:

&lt;ul&gt;
&lt;li&gt;Title → metric&lt;/li&gt;
&lt;li&gt;Commitment in UA → unit&lt;/li&gt;
&lt;li&gt;Sector → sector&lt;/li&gt;
&lt;li&gt;Sovereign/Non-Sovereign → sub_sector&lt;/li&gt;
&lt;li&gt;Status → sub_sub_sector&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Generated year columns (2000-2024)&lt;/li&gt;

&lt;li&gt;Removed duplicate records&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; &lt;code&gt;africa_energy_transformed_{timestamp}.csv&lt;/code&gt; (wide format)&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Phase 2: Long Format Conversion&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Rationale:&lt;/strong&gt; Optimize for MongoDB time-series queries and storage efficiency&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Converted wide format (1 row × 25 year columns) to long format (multiple rows)&lt;/li&gt;
&lt;li&gt;Used &lt;code&gt;pd.melt()&lt;/code&gt; to unpivot year columns into individual records&lt;/li&gt;
&lt;li&gt;Removed null values to eliminate empty year entries&lt;/li&gt;
&lt;li&gt;Sorted data by country → metric → year&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduced storage overhead (no empty year columns)&lt;/li&gt;
&lt;li&gt;Improved query performance for time-range filters&lt;/li&gt;
&lt;li&gt;Better scalability for future data additions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; &lt;code&gt;africa_energy_long_format_{timestamp}.csv&lt;/code&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  3. &lt;strong&gt;Database Loading (&lt;code&gt;load_to_mongodb.py&lt;/code&gt;)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Database Configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Platform:&lt;/strong&gt; MongoDB Atlas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database:&lt;/strong&gt; &lt;code&gt;energyd2&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collection:&lt;/strong&gt; &lt;code&gt;test&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection:&lt;/strong&gt; Secure connection via environment variables (.env)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Loading Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Established secure MongoDB connection&lt;/li&gt;
&lt;li&gt;Cleared existing collection data to prevent duplicates&lt;/li&gt;
&lt;li&gt;Converted CSV records to MongoDB documents (BSON format)&lt;/li&gt;
&lt;li&gt;Bulk inserted all documents efficiently&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Indexes Created:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;country &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ascending&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;year &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ascending&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;country&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;year &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;compound&lt;/span&gt; &lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;sector &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ascending&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Data Verification:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Total documents loaded: 6&lt;/li&gt;
&lt;li&gt;Unique countries: Zimbabwe (sample shown)&lt;/li&gt;
&lt;li&gt;Query functionality: ✅ Confirmed operational&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  - Sample query tested: &lt;code&gt;{country_serial: 54}&lt;/code&gt;
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Results and Verification
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Database Status: ✅ Operational&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Screenshot Evidence:&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3i9ygpywo8vbb3ypvyn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3i9ygpywo8vbb3ypvyn.png" alt=" " width="800" height="372"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Successfully queried Zimbabwe (country_serial: 54)&lt;/li&gt;
&lt;li&gt;Retrieved document showing:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Country:&lt;/strong&gt; Zimbabwe&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metric:&lt;/strong&gt; "Djibouti - Geothermal Exploration Project in the Lake Assal Region"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unit:&lt;/strong&gt; 10740000&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sector:&lt;/strong&gt; Power&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sub-sector:&lt;/strong&gt; Sovereign&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sub-sub-sector:&lt;/strong&gt; Implementation&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Query Performance:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Filter capability: Confirmed on country_serial field&lt;/li&gt;
&lt;li&gt;Data integrity: All fields populated correctly&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Data Schema Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Document Structure in MongoDB:&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ObjectId(&lt;/span&gt;&lt;span class="s2"&gt;"68e405f0a5eca175ab909e1c"&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"country"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Zimbabwe"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"country_serial"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;54&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metric"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Djibouti - Geothermal Exploration Project in the Lake Assal Region"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"unit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10740000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sector"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Power"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sub_sector"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sovereign"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sub_sub_sector"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Implementation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source_link"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://africa-energy-portal.org/aep/country/zimbabwe"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Africa Energy Portal"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Data Types:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Strings: country, metric, sector, sub_sector, source&lt;/li&gt;
&lt;li&gt;Integer: country_serial, unit (financial values)&lt;/li&gt;
&lt;li&gt;ObjectId: MongoDB auto-generated _id&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Challenges and Solutions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Challenge 1: Format Optimization&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Issue:&lt;/strong&gt; Initial wide format (25 year columns) inefficient for sparse data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Implemented two-phase transformation&lt;/li&gt;
&lt;li&gt;Converted to long format for MongoDB best practices&lt;/li&gt;
&lt;li&gt;Eliminated null values for storage optimization&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Challenge 2: Dynamic Content Loading&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Issue:&lt;/strong&gt; Portal uses JavaScript for content rendering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Implemented Selenium WebDriver for browser automation&lt;/li&gt;
&lt;li&gt;Added 8-second wait times for complete page loads&lt;/li&gt;
&lt;li&gt;Used BeautifulSoup for post-render HTML parsing&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Technical Specifications
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Development Environment:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Language:&lt;/strong&gt; Python 3.x&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Scraping:&lt;/strong&gt; Selenium WebDriver 4.x, BeautifulSoup4&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Processing:&lt;/strong&gt; Pandas, NumPy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database:&lt;/strong&gt; MongoDB Atlas (cloud-hosted)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version Control:&lt;/strong&gt; Git/GitHub&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Project Structure:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;energytest1/
├── extract/
│   └── scraper_complete.py
├── transform/
│   ├── transformer.py
│   └── transform_to_long_format.py
├── load/
│   └── load_to_mongodb.py
│   └── mongodb_loader.py
└── .env (MongoDB credentials)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Deliverables Completed
&lt;/h2&gt;

&lt;p&gt;✅ &lt;strong&gt;1. Web Scraper&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extracts data from 54 African countries&lt;/li&gt;
&lt;li&gt;Comprehensive error handling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✅ &lt;strong&gt;2. Data Transformation Pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Standardizes to required schema&lt;/li&gt;
&lt;li&gt;Converts to database-optimized format&lt;/li&gt;
&lt;li&gt;Removes duplicates and null values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✅ &lt;strong&gt;3. MongoDB Integration&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Secure Atlas connection&lt;/li&gt;
&lt;li&gt;Indexed collection for performance&lt;/li&gt;
&lt;li&gt;Query-ready data structure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✅ &lt;strong&gt;4. Documentation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Well-commented code&lt;/li&gt;
&lt;li&gt;GitHub repository with all files&lt;/li&gt;
&lt;li&gt;This comprehensive report&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Successfully completed Day 1 objectives by building a production-ready ETL pipeline that extracts Africa energy data and stores it in MongoDB. The system is automated, scalable, and follows best practices for web scraping and database design.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Metrics:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Countries Covered:&lt;/strong&gt; 54/54 (100%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Sources:&lt;/strong&gt; Africa Energy Portal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database Status:&lt;/strong&gt; ✅ Operational with 6 documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query Performance:&lt;/strong&gt; ✅ Optimized with indexes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code Quality:&lt;/strong&gt; ✅ Documented and version-controlled&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The foundation is now in place for ongoing data collection and analysis. The MongoDB collection is query-ready.&lt;/p&gt;




&lt;h2&gt;
  
  
  Repository
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/Navashub/lux-internship/tree/main/energytest1" rel="noopener noreferrer"&gt;https://github.com/Navashub/lux-internship/tree/main/energytest1&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Prepared by:&lt;/strong&gt; Navas Herbert&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Submitted to:&lt;/strong&gt; LuxDevHQ&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Date:&lt;/strong&gt; October 6, 2025&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>mongodb</category>
      <category>showdev</category>
    </item>
    <item>
      <title>🤖 AI Web Scraper &amp; Q&amp;A</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Tue, 26 Aug 2025 10:40:01 +0000</pubDate>
      <link>https://dev.to/navashub/ai-web-scraper-qa-5hhn</link>
      <guid>https://dev.to/navashub/ai-web-scraper-qa-5hhn</guid>
      <description>&lt;p&gt;A powerful web scraping tool that combines intelligent content extraction with AI-powered question answering. Built with Streamlit, LangChain, and Ollama for local AI processing.&lt;/p&gt;

&lt;h2&gt;
  
  
  🚀 Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Smart Web Scraping&lt;/strong&gt;: Automatically extracts content from any URL using multiple fallback methods&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI-Powered Q&amp;amp;A&lt;/strong&gt;: Ask questions about scraped content and get intelligent responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local AI Processing&lt;/strong&gt;: Uses Ollama for privacy-focused, offline AI processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple Scraping Methods&lt;/strong&gt;: 

&lt;ul&gt;
&lt;li&gt;Selenium WebDriver for JavaScript-heavy sites&lt;/li&gt;
&lt;li&gt;Simple HTTP requests for basic HTML pages&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Interactive Chat Interface&lt;/strong&gt;: Real-time conversation with the scraped content&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Content Chunking&lt;/strong&gt;: Intelligent text splitting for better context retrieval&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Source Citations&lt;/strong&gt;: See exactly which parts of the content were used to answer your questions&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Error Recovery&lt;/strong&gt;: Robust error handling with graceful fallbacks&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  🛠 Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Streamlit&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI/LLM&lt;/strong&gt;: Ollama (llama3.2)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Scraping&lt;/strong&gt;: Selenium WebDriver, BeautifulSoup&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text Processing&lt;/strong&gt;: LangChain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector Store&lt;/strong&gt;: In-memory vector storage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embeddings&lt;/strong&gt;: Ollama embeddings for semantic search&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📋 Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before running this application, make sure you have:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Python 3.8+&lt;/strong&gt; installed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; installed and running&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chrome browser&lt;/strong&gt; installed (for Selenium)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  🔧 Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Clone the Repository
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone &amp;lt;your-repo-url&amp;gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;ai-scraper
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Install Python Dependencies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Install and Setup Ollama
&lt;/h3&gt;

&lt;h4&gt;
  
  
  On Windows/Mac/Linux:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Ollama from https://ollama.ai&lt;/span&gt;
&lt;span class="c"&gt;# Then pull the required model&lt;/span&gt;
ollama pull llama3.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Start Ollama Service:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Verify Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check if Ollama is running&lt;/span&gt;
curl http://localhost:11434/api/tags

&lt;span class="c"&gt;# Check if llama3.2 model is installed&lt;/span&gt;
ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🚀 Usage
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Starting the Application
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;streamlit run ai_scraper.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The app will open in your browser at &lt;code&gt;http://localhost:8501&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  How to Use
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Enter a URL&lt;/strong&gt; in the input field (e.g., &lt;code&gt;https://example.com&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Click "Load &amp;amp; Process URL"&lt;/strong&gt; to scrape and index the content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wait for processing&lt;/strong&gt; - you'll see progress indicators&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ask questions&lt;/strong&gt; in the chat interface about the scraped content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;View sources&lt;/strong&gt; - expand the sources section to see which content was used&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Example Workflows
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Scraping a News Article
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Enter: https://example-news-site.com/article
2. Wait for "Documents indexed successfully!"
3. Ask: "What is the main topic of this article?"
4. Ask: "Who are the key people mentioned?"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Analyzing Documentation
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Enter: https://docs.example.com/api-guide
2. Wait for processing
3. Ask: "How do I authenticate with this API?"
4. Ask: "What are the rate limits?"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  ⚙️ Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Environment Variables (Optional)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Set custom Ollama host&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OLLAMA_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:11434

&lt;span class="c"&gt;# Set custom model&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OLLAMA_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;llama3.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Customizing the AI Model
&lt;/h3&gt;

&lt;p&gt;You can use different Ollama models by changing the model name in the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# In ai_scraper.py, change:
&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OllamaEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3.2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OllamaLLM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3.2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# To:
&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OllamaEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# or another model
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OllamaLLM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Available models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llama3.2&lt;/code&gt; (recommended)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;llama2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mistral&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;codellama&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🔍 Troubleshooting
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Common Issues
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Segmentation Fault
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cause&lt;/strong&gt;: Chrome/Selenium driver issues&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution&lt;/strong&gt;: The app automatically handles this with fallback methods&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  "Ollama not found"
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check if Ollama is running&lt;/span&gt;
ollama serve

&lt;span class="c"&gt;# Check if model is installed&lt;/span&gt;
ollama pull llama3.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Chrome Driver Issues
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# The app automatically downloads Chrome driver&lt;/span&gt;
&lt;span class="c"&gt;# If issues persist, manually install:&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--upgrade&lt;/span&gt; webdriver-manager
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Empty Content
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cause&lt;/strong&gt;: Website blocks automated scraping&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution&lt;/strong&gt;: Try different URLs or check the website's robots.txt&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Slow Processing
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cause&lt;/strong&gt;: Large pages or complex content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solutions&lt;/strong&gt;: 

&lt;ul&gt;
&lt;li&gt;Use more specific URLs&lt;/li&gt;
&lt;li&gt;Wait for processing to complete&lt;/li&gt;
&lt;li&gt;Consider using a more powerful model&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Performance Tips
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Use specific URLs&lt;/strong&gt; rather than homepages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Close unused browser tabs&lt;/strong&gt; to free memory&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use headless mode&lt;/strong&gt; (already enabled)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clear chat history&lt;/strong&gt; regularly for better performance&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  🔒 Privacy &amp;amp; Security
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Local Processing&lt;/strong&gt;: All AI processing happens locally with Ollama&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Data Sent to Cloud&lt;/strong&gt;: Your scraped content stays on your machine&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secure Scraping&lt;/strong&gt;: Respects robots.txt and rate limits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Persistent Storage&lt;/strong&gt;: Data is only stored in memory during the session&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🤝 Contributing
&lt;/h2&gt;

&lt;p&gt;Contributions are welcome! Here's how to contribute:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fork the repository&lt;/li&gt;
&lt;li&gt;Create a feature branch (&lt;code&gt;git checkout -b feature/amazing-feature&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Commit your changes (&lt;code&gt;git commit -m 'Add amazing feature'&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Push to the branch (&lt;code&gt;git push origin feature/amazing-feature&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Open a Pull Request&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Development Setup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone and setup development environment&lt;/span&gt;
git clone https://github.com/Navashub/AI-Agents/tree/main/ai-scraper
&lt;span class="nb"&gt;cd &lt;/span&gt;ai-scraper
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt



&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  📈 Roadmap
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] &lt;strong&gt;Multi-language Support&lt;/strong&gt; - Support for more Ollama models&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;PDF Scraping&lt;/strong&gt; - Add PDF document processing&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Batch Processing&lt;/strong&gt; - Process multiple URLs at once&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Export Functionality&lt;/strong&gt; - Export Q&amp;amp;A sessions&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Advanced Filtering&lt;/strong&gt; - Content filtering and preprocessing&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;API Mode&lt;/strong&gt; - REST API for programmatic access&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Docker Support&lt;/strong&gt; - Containerized deployment&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Cloud Deployment&lt;/strong&gt; - Deploy to cloud platforms&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📄 License
&lt;/h2&gt;

&lt;p&gt;This project is licensed under the MIT License - see the &lt;a href="https://dev.toLICENSE"&gt;LICENSE&lt;/a&gt; file for details.&lt;/p&gt;

&lt;h2&gt;
  
  
  🙏 Acknowledgments
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; - For providing excellent local AI capabilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangChain&lt;/strong&gt; - For the powerful document processing framework&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streamlit&lt;/strong&gt; - For the amazing web app framework&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Selenium&lt;/strong&gt; - For robust web scraping capabilities&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📞 Support
&lt;/h2&gt;

&lt;p&gt;If you encounter any issues or have questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check the Troubleshooting section&lt;/li&gt;
&lt;li&gt;Search existing &lt;a href="https://dev.toissues"&gt;GitHub Issues&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Create a new issue with:

&lt;ul&gt;
&lt;li&gt;Your operating system&lt;/li&gt;
&lt;li&gt;Python version&lt;/li&gt;
&lt;li&gt;Error message (if any)&lt;/li&gt;
&lt;li&gt;Steps to reproduce&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  🌟 Show Your Support
&lt;/h2&gt;

&lt;p&gt;If this project helped you, please consider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⭐ Starring the repository&lt;/li&gt;
&lt;li&gt;🔄 Sharing it with others&lt;/li&gt;
&lt;li&gt;🐛 Reporting bugs&lt;/li&gt;
&lt;li&gt;💡 Suggesting new features&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Happy Scraping! 🎉&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Built with using Python, Streamlit, and Ollama.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>rag</category>
      <category>langchain</category>
    </item>
    <item>
      <title>Trying out the Openai new open-source models</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Mon, 11 Aug 2025 13:32:57 +0000</pubDate>
      <link>https://dev.to/navashub/trying-out-the-openai-new-open-source-models-3fap</link>
      <guid>https://dev.to/navashub/trying-out-the-openai-new-open-source-models-3fap</guid>
      <description>&lt;h1&gt;
  
  
  voicegptoss
&lt;/h1&gt;

&lt;h1&gt;
  
  
  🎤 Voice Agent with gpt-oss-120b - Openai open source model
&lt;/h1&gt;

&lt;p&gt;A lightning-fast voice AI agent powered by OpenAI's new gpt-oss-120b model, running locally with Cerebras AI acceleration and Vapi integration. Experience blazing-fast Time To First Token (TTFT) of &lt;strong&gt;0.3-0.7 seconds&lt;/strong&gt; for real-time conversational AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  ✨ Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ultra-Low Latency&lt;/strong&gt;: TTFT of 0.3-0.7s using OpenAI's gpt-oss-120b model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local Deployment&lt;/strong&gt;: Run your voice agent locally with public tunnel access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cerebras AI Acceleration&lt;/strong&gt;: Leverages Cerebras AI's inference infrastructure for optimal performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vapi Integration&lt;/strong&gt;: Seamless voice interface through Vapi's telephony platform&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Processing&lt;/strong&gt;: True real-time voice conversations with minimal delay&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🚀 Performance
&lt;/h2&gt;

&lt;p&gt;This implementation achieves exceptional performance metrics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Time To First Token (TTFT)&lt;/strong&gt;: 0.3-0.7 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model&lt;/strong&gt;: OpenAI GPT-4o Realtime (OSS)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure&lt;/strong&gt;: Cerebras AI + Local deployment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt;: Optimized for real-time voice interactions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🛠️ Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI Model&lt;/strong&gt;: OpenAI gpt-oss-120b&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inference&lt;/strong&gt;: Cerebras AI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Voice Platform&lt;/strong&gt;: Vapi&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tunneling&lt;/strong&gt;: ngrok&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: Python&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment&lt;/strong&gt;: Local with public exposure&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📋 Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.8+&lt;/li&gt;
&lt;li&gt;Git&lt;/li&gt;
&lt;li&gt;ngrok account and installation&lt;/li&gt;
&lt;li&gt;Cerebras AI API key&lt;/li&gt;
&lt;li&gt;Vapi account&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🚀 Quick Start
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Clone the Repository
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone git@github.com:Navashub/voicegptoss.git
&lt;span class="nb"&gt;cd &lt;/span&gt;voicegptoss
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Set Up Environment
&lt;/h3&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file in the project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;touch&lt;/span&gt; .env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add your Cerebras AI API key to the &lt;code&gt;.env&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CEREBRAS_API_KEY=your_cerebras_api_key_here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Get Cerebras AI API Key
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Visit &lt;a href="https://www.cerebras.ai/" rel="noopener noreferrer"&gt;Cerebras AI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Sign up for an account&lt;/li&gt;
&lt;li&gt;Navigate to API keys section&lt;/li&gt;
&lt;li&gt;Generate a new API key&lt;/li&gt;
&lt;li&gt;Copy the key to your &lt;code&gt;.env&lt;/code&gt; file&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  4. Set Up ngrok
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create an account at &lt;a href="https://ngrok.com/" rel="noopener noreferrer"&gt;ngrok.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Install ngrok on your system:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="c"&gt;# Windows (using chocolatey)&lt;/span&gt;
   choco &lt;span class="nb"&gt;install &lt;/span&gt;ngrok

   &lt;span class="c"&gt;# macOS (using homebrew)&lt;/span&gt;
   brew &lt;span class="nb"&gt;install &lt;/span&gt;ngrok/ngrok/ngrok

   &lt;span class="c"&gt;# Linux&lt;/span&gt;
   curl &lt;span class="nt"&gt;-s&lt;/span&gt; https://ngrok-agent.s3.amazonaws.com/ngrok.asc | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/apt/trusted.gpg.d/ngrok.asc &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null
   &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"deb https://ngrok-agent.s3.amazonaws.com buster main"&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/apt/sources.list.d/ngrok.list
   &lt;span class="nb"&gt;sudo &lt;/span&gt;apt update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;ngrok
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Authenticate ngrok with your token:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   ngrok config add-authtoken YOUR_NGROK_AUTHTOKEN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Install Dependencies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Run the Application
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python main.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The application will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start the local server&lt;/li&gt;
&lt;li&gt;Create an ngrok tunnel&lt;/li&gt;
&lt;li&gt;Display the public URL in the console&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  7. Configure Vapi
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Copy the public ngrok URL from your console output&lt;/li&gt;
&lt;li&gt;Go to your Vapi dashboard&lt;/li&gt;
&lt;li&gt;Add the public URL as your webhook endpoint&lt;/li&gt;
&lt;li&gt;Configure your voice agent settings&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  8. Test Your Voice Agent
&lt;/h3&gt;

&lt;p&gt;Your voice agent is now live and ready to handle calls through Vapi!&lt;/p&gt;

&lt;h2&gt;
  
  
  🔧 Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Environment Variables
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CEREBRAS_API_KEY&lt;/code&gt;: Your Cerebras AI API key for model inference&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;NGROK_AUTHTOKEN&lt;/code&gt;: Your ngrok authentication token (optional, can be set via ngrok config)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Customization
&lt;/h3&gt;

&lt;p&gt;You can modify the voice agent behavior by editing the configuration in &lt;code&gt;main.py&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adjust model parameters&lt;/li&gt;
&lt;li&gt;Modify response formatting&lt;/li&gt;
&lt;li&gt;Configure webhook endpoints&lt;/li&gt;
&lt;li&gt;Set custom voice settings&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📊 Performance Optimization
&lt;/h2&gt;

&lt;p&gt;This setup is optimized for minimal latency:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cerebras AI&lt;/strong&gt;: Provides fast inference for the GPT-4o model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local Deployment&lt;/strong&gt;: Eliminates additional network hops&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ngrok Tunneling&lt;/strong&gt;: Secure public access without complex networking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimized Code&lt;/strong&gt;: Streamlined request/response handling&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🐛 Troubleshooting
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Common Issues
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;ngrok Authentication Error&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Make sure you're using the tunnel authtoken, not API key&lt;/span&gt;
ngrok config add-authtoken YOUR_TUNNEL_AUTHTOKEN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cerebras API Key Issues&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verify your API key is correctly added to &lt;code&gt;.env&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Check your Cerebras AI account has sufficient credits&lt;/li&gt;
&lt;li&gt;Ensure API key has proper permissions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Connection Issues&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check firewall settings&lt;/li&gt;
&lt;li&gt;Verify ngrok tunnel is active&lt;/li&gt;
&lt;li&gt;Confirm webhook URL in Vapi matches ngrok public URL&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📈 Monitoring
&lt;/h2&gt;

&lt;p&gt;Monitor your voice agent performance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check console logs for TTFT metrics&lt;/li&gt;
&lt;li&gt;Monitor Cerebras AI usage in their dashboard&lt;/li&gt;
&lt;li&gt;Track call quality in Vapi analytics&lt;/li&gt;
&lt;li&gt;Use ngrok dashboard for tunnel statistics&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🤝 Contributing
&lt;/h2&gt;

&lt;p&gt;Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.&lt;/p&gt;

&lt;h2&gt;
  
  
  📄 License
&lt;/h2&gt;

&lt;p&gt;This project is licensed under the MIT License - see the LICENSE file for details.&lt;/p&gt;

&lt;h2&gt;
  
  
  🙏 Acknowledgments
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI for GPT-4o Realtime model&lt;/li&gt;
&lt;li&gt;Cerebras AI for high-performance inference&lt;/li&gt;
&lt;li&gt;Vapi for voice interface platform&lt;/li&gt;
&lt;li&gt;ngrok for secure tunneling solution&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📞 Support
&lt;/h2&gt;

&lt;p&gt;If you encounter any issues or have questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check the troubleshooting section above&lt;/li&gt;
&lt;li&gt;Open an issue on GitHub&lt;/li&gt;
&lt;li&gt;Review the logs for error details&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;strong&gt;⚡ Ready to build the future of voice AI? Get started now!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Content Generator from youtube video id</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Mon, 23 Jun 2025 20:26:05 +0000</pubDate>
      <link>https://dev.to/navashub/content-generator-from-youtube-video-id-3j6j</link>
      <guid>https://dev.to/navashub/content-generator-from-youtube-video-id-3j6j</guid>
      <description>&lt;h2&gt;
  
  
  🚗 Audispot Content Writer
&lt;/h2&gt;

&lt;p&gt;An AI-powered content generation tool that creates platform-specific social media content for automotive enthusiasts. Built specifically for audispot254, this tool generates engaging posts for LinkedIn, Instagram, and Twitter from YouTube automotive video transcripts.&lt;/p&gt;

&lt;h2&gt;
  
  
  🌟 Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;YouTube Integration&lt;/strong&gt;: Automatically extracts transcripts from YouTube automotive videos&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform-Specific Content&lt;/strong&gt;: Creates tailored content for different social media platforms:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LinkedIn&lt;/strong&gt;: Professional, analytical content for automotive industry professionals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instagram&lt;/strong&gt;: Casual, enthusiast-focused content with emojis and hashtags&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Twitter&lt;/strong&gt;: Concise, opinionated takes designed to spark engagement&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Modern UI&lt;/strong&gt;: Clean, dark-mode Streamlit interface&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;AI-Powered&lt;/strong&gt;: Uses OpenAI GPT-4 for intelligent content generation&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Real-time Generation&lt;/strong&gt;: Live content creation with loading indicators&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  🎯 Target Platforms
&lt;/h2&gt;

&lt;h3&gt;
  
  
  LinkedIn
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Audience&lt;/strong&gt;: Automotive professionals, engineers, business leaders&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tone&lt;/strong&gt;: Professional, analytical, thought-provoking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Focus&lt;/strong&gt;: Technical insights, industry trends, business implications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Length&lt;/strong&gt;: 180-220 words with professional hashtags&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Instagram
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Audience&lt;/strong&gt;: Car enthusiasts, Gen Z/Millennial car lovers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tone&lt;/strong&gt;: Excited, casual, community-driven&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Focus&lt;/strong&gt;: Cool features, performance specs, visual appeal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Length&lt;/strong&gt;: 100-130 words with emojis and trendy hashtags&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Twitter
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Audience&lt;/strong&gt;: Quick scrollers, debate starters, car Twitter community&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tone&lt;/strong&gt;: Sharp, opinionated, conversation-starter&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Focus&lt;/strong&gt;: Hot takes, surprising facts, debate points&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Length&lt;/strong&gt;: Under 250 characters with strategic hashtags&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🛠️ Technology Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Python 3.11+&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streamlit&lt;/strong&gt; - Web interface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI GPT-4&lt;/strong&gt; - Content generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;YouTube Transcript API&lt;/strong&gt; - Video transcript extraction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;python-dotenv&lt;/strong&gt; - Environment variable management&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;openai-agents&lt;/strong&gt; - Agent orchestration&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📋 Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.11 or higher&lt;/li&gt;
&lt;li&gt;OpenAI API key&lt;/li&gt;
&lt;li&gt;UV package manager (recommended)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🚀 Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Clone the Repository
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Navashub/AI-Agents.git
&lt;span class="nb"&gt;cd &lt;/span&gt;AI-Agents/audispot_content_writer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Set Up Virtual Environment with UV
&lt;/h3&gt;

&lt;p&gt;This project uses UV for dependency management. Install UV if you haven't already:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install UV (if not already installed)&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;uv

&lt;span class="c"&gt;# Initialize virtual environment&lt;/span&gt;
uv venv

&lt;span class="c"&gt;# Activate virtual environment&lt;/span&gt;
&lt;span class="c"&gt;# On Windows:&lt;/span&gt;
.venv&lt;span class="se"&gt;\S&lt;/span&gt;cripts&lt;span class="se"&gt;\a&lt;/span&gt;ctivate
&lt;span class="c"&gt;# On macOS/Linux:&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Install Dependencies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install all dependencies from requirements.txt&lt;/span&gt;
uv add &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Environment Configuration
&lt;/h3&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file in the project root and add your API keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENAI_API_KEY=your_openai_api_key_here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get your OpenAI API key from &lt;a href="https://platform.openai.com/api-keys" rel="noopener noreferrer"&gt;OpenAI Platform&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Never commit your &lt;code&gt;.env&lt;/code&gt; file to version control&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;.env&lt;/code&gt; file should be added to your &lt;code&gt;.gitignore&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🎮 Usage
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Running the Application
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;streamlit run app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The application will open in your default web browser at &lt;code&gt;http://localhost:8501&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Using the Interface
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Enter YouTube Video ID&lt;/strong&gt;: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extract the ID from a YouTube URL (e.g., for &lt;code&gt;https://www.youtube.com/watch?v=6hr6wZr1N_8&lt;/code&gt;, the ID is &lt;code&gt;6hr6wZr1N_8&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Customize Your Query&lt;/strong&gt;: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Modify the default query or add specific instructions&lt;/li&gt;
&lt;li&gt;The tool works best with automotive content&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Select Platforms&lt;/strong&gt;: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose which social media platforms you want content for&lt;/li&gt;
&lt;li&gt;Each platform generates unique, tailored content&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generate Content&lt;/strong&gt;: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click "Generate Content" and wait for the AI to process&lt;/li&gt;
&lt;li&gt;Content will be displayed in separate cards for each platform&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Example Usage
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Direct API usage example
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;audispot_content_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_transcript&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content_creator_agent&lt;/span&gt;

&lt;span class="c1"&gt;# Get transcript
&lt;/span&gt;&lt;span class="n"&gt;video_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;6hr6wZr1N_8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;transcript&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_transcript&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;video_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Generate content (async)
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;Runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content_creator_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_items&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  📁 Project Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;audispot_content_writer/
├── app.py                      # Streamlit web interface
├── audispot_content_agent.py   # Core AI agent and content generation logic
├── requirements.txt            # Python dependencies
├── pyproject.toml             # Project configuration
├── uv.lock                    # UV lock file
├── transcript_errors.log      # Error logging
├── .env                       # Environment variables (create this)
└── README.md                  # Project documentation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🔧 Key Components
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Content Generation Agent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Platform-specific tools&lt;/strong&gt;: Separate functions for LinkedIn, Instagram, and Twitter&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intelligent prompting&lt;/strong&gt;: Tailored prompts for each platform's audience&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error handling&lt;/strong&gt;: Robust transcript fetching with logging&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Web Interface
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dark mode design&lt;/strong&gt;: Modern, professional appearance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Responsive layout&lt;/strong&gt;: Works on desktop and mobile&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time processing&lt;/strong&gt;: Live updates and loading states&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content extraction&lt;/strong&gt;: Smart parsing of AI-generated content&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🎨 Content Strategy
&lt;/h2&gt;

&lt;p&gt;The tool follows audispot254's content strategy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Perspective&lt;/strong&gt;: Content written from the viewpoint of someone who watched the video&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authenticity&lt;/strong&gt;: Natural, genuine reactions to automotive content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform Optimization&lt;/strong&gt;: Each platform receives content optimized for its audience&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engagement Focus&lt;/strong&gt;: Content designed to drive comments, shares, and interactions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🐛 Troubleshooting
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Common Issues
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;"Could not fetch transcript" error&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verify the YouTube video ID is correct&lt;/li&gt;
&lt;li&gt;Check if the video has available transcripts/captions&lt;/li&gt;
&lt;li&gt;Some videos may have transcripts disabled&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;OpenAI API errors&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verify your API key is correct in the &lt;code&gt;.env&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;Check your OpenAI account has available credits&lt;/li&gt;
&lt;li&gt;Ensure the &lt;code&gt;.env&lt;/code&gt; file is in the project root&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;UV dependency issues&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Make sure UV is properly installed: &lt;code&gt;pip install uv&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Try recreating the virtual environment: &lt;code&gt;uv venv --force&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Reinstall dependencies: &lt;code&gt;uv add -r requirements.txt&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Error Logging
&lt;/h3&gt;

&lt;p&gt;Check &lt;code&gt;transcript_errors.log&lt;/code&gt; for detailed error information when transcript fetching fails.&lt;/p&gt;

&lt;h2&gt;
  
  
  🤝 Contributing
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Fork the repository&lt;/li&gt;
&lt;li&gt;Create a feature branch (&lt;code&gt;git checkout -b feature/new-feature&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Commit your changes (&lt;code&gt;git commit -am 'Add new feature'&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Push to the branch (&lt;code&gt;git push origin feature/new-feature&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Create a Pull Request&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  📄 License
&lt;/h2&gt;

&lt;p&gt;This project is part of the &lt;a href="https://github.com/Navashub/AI-Agents/tree/main/audispot_content_writer" rel="noopener noreferrer"&gt;AI-Agents repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔗 Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Project Repository&lt;/strong&gt;: &lt;a href="https://github.com/Navashub/AI-Agents/tree/main/audispot_content_writer" rel="noopener noreferrer"&gt;GitHub - Audispot Content Writer&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;YouTube Transcript API&lt;/strong&gt;: &lt;a href="https://pypi.org/project/youtube-transcript-api/" rel="noopener noreferrer"&gt;PyPI - youtube-transcript-api&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI Platform&lt;/strong&gt;: &lt;a href="https://platform.openai.com/" rel="noopener noreferrer"&gt;OpenAI API&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streamlit Documentation&lt;/strong&gt;: &lt;a href="https://docs.streamlit.io/" rel="noopener noreferrer"&gt;Streamlit Docs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🙏 Acknowledgments
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://pypi.org/project/youtube-transcript-api/" rel="noopener noreferrer"&gt;YouTube Transcript API&lt;/a&gt; by Jonas Depoix&lt;/li&gt;
&lt;li&gt;OpenAI for GPT-4 API&lt;/li&gt;
&lt;li&gt;Streamlit for the web framework&lt;/li&gt;
&lt;li&gt;UV for modern Python dependency management&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Made with for automotive content creators&lt;/strong&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Simple MCP with LangGraph</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Mon, 23 Jun 2025 17:04:16 +0000</pubDate>
      <link>https://dev.to/navashub/simple-mcp-with-langgraph-35mj</link>
      <guid>https://dev.to/navashub/simple-mcp-with-langgraph-35mj</guid>
      <description>&lt;p&gt;It's wild how far we have come - not too long ago, many of us were still figuring out how to scrape public data or call basic APIs . &lt;/p&gt;

&lt;p&gt;But now we are entering a whole new level. Building MCP servers and clients - basically custom tools that LLMs cam talk to directly.&lt;/p&gt;

&lt;p&gt;MCP is a protocol that lets you expose any custom logic or service (like a weather API, Calculator, or even your own DB) and plug it I to an AI agent - it's clean , fast.&lt;/p&gt;

&lt;p&gt;You can build your own MCP server with Python ( like using FastMCP) then connect it to an LLM via a client. The LLM can then ask your tool for answers in real time. &lt;/p&gt;

&lt;p&gt;Real protocol. Real structure.&lt;/p&gt;

&lt;p&gt;MCP is actually the official spec - &lt;a href="https://modelcontextprotocol.io/introduction" rel="noopener noreferrer"&gt;model context protocol&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While we have the official spec live ☝️☝️&lt;/p&gt;

&lt;p&gt;You can explore and share your MCP - compatible tools on &lt;a href="https://smithery.ai/" rel="noopener noreferrer"&gt;smithery&lt;/a&gt;- think of it like the Hugging Face for MCPs.&lt;/p&gt;

&lt;p&gt;**&lt;/p&gt;

&lt;h2&gt;
  
  
  🚨 Security Note 🚨
&lt;/h2&gt;

&lt;p&gt;**&lt;/p&gt;

&lt;p&gt;Ofcourse the MCP setups involve some subprocessss, API calls and server logic - means a code is running.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Be cautious when connecting them with your Gitub , or accounts.&lt;/li&gt;
&lt;li&gt;if you are testing stuff, I would recommend a fresh email, new GitHub and maybe even an isolated virtual environment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✍️ - not every example out there is Hardened for safety. &lt;/p&gt;

&lt;p&gt;Especially the GitHub MCP, I have used for a while and it has all access to everything in GitHub. &lt;/p&gt;

&lt;p&gt;📌 want to see a working example? Here is a repo Where I have been experimenting building an MCP weather server , and a maths server (I just have addition and multiplication in there,you can add more and more maths and even more server. ) and connecting then to an AI agent using LangGraph. &lt;/p&gt;

&lt;p&gt;here is the link to the project in GitHub - &lt;a href="https://github.com/Navashub/AI-Agents/tree/main/mcplangchain" rel="noopener noreferrer"&gt;mcp-langchain&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  You will find:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;a weather server &lt;/li&gt;
&lt;li&gt;math server &lt;/li&gt;
&lt;li&gt;a client setup that lets an LLM Use those tools intelligently &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You will go through readme for setup. &lt;/p&gt;

&lt;p&gt;A simple one , but can give you a roadmap and more insights.&lt;/p&gt;

&lt;p&gt;Explore it, fork it, run it. 🤝&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Complete Beginner's Guide: Building a Weather ETL Pipeline with PySpark</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Fri, 06 Jun 2025 09:45:23 +0000</pubDate>
      <link>https://dev.to/navashub/complete-beginners-guide-building-a-weather-etl-pipeline-with-pyspark-2op3</link>
      <guid>https://dev.to/navashub/complete-beginners-guide-building-a-weather-etl-pipeline-with-pyspark-2op3</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Welcome to the exciting world of data engineering! In this comprehensive tutorial, you'll learn how to build your first ETL (Extract, Transform, Load) pipeline using PySpark to fetch weather data from the OpenWeatherMap API and store it in a PostgreSQL database.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is ETL?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extract&lt;/strong&gt;: Get data from a source (in our case, OpenWeatherMap API)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transform&lt;/strong&gt;: Clean, process, and structure the data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load&lt;/strong&gt;: Store the processed data in a destination (PostgreSQL database)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the end of this tutorial, you'll have hands-on experience with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Server management and SSH connections&lt;/li&gt;
&lt;li&gt;Python virtual environments&lt;/li&gt;
&lt;li&gt;PySpark for data processing&lt;/li&gt;
&lt;li&gt;API integration&lt;/li&gt;
&lt;li&gt;Database connections&lt;/li&gt;
&lt;li&gt;Project organization best practices&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;Before we begin, make sure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Access to a Linux server (cloud instance or local machine)&lt;/li&gt;
&lt;li&gt;Basic knowledge of command line operations&lt;/li&gt;
&lt;li&gt;A free OpenWeatherMap API account (&lt;a href="https://openweathermap.org" rel="noopener noreferrer"&gt;OpenWeatherMap &lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;PostgreSQL installed on your server&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Connecting to Your Server
&lt;/h2&gt;

&lt;p&gt;First, we need to establish a secure connection to our server using SSH (Secure Shell).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh user@your_server_ip_address
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What's happening here?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ssh&lt;/code&gt; is the command to establish a secure connection&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;user&lt;/code&gt; is your username on the server&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;your_server_ip_address&lt;/code&gt; is the IP address of your server&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9s2ef5f9tgrpzxbtermc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9s2ef5f9tgrpzxbtermc.png" alt="Image description" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After entering this command, you'll be prompted to enter your password. Once authenticated, you'll see your server's command prompt, indicating you're now connected.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Furcjl1ivst7jwru2sx3l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Furcjl1ivst7jwru2sx3l.png" alt="Image description" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Setting Up Your Project Directory
&lt;/h2&gt;

&lt;p&gt;Now that we're connected to the server, let's create a dedicated folder for our weather ETL project.&lt;/p&gt;

&lt;p&gt;example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;navas_weather_etl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why create a separate folder?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Keeps your project organized&lt;/li&gt;
&lt;li&gt;Prevents conflicts with other projects&lt;/li&gt;
&lt;li&gt;Makes it easier to manage dependencies&lt;/li&gt;
&lt;li&gt;Follows professional development practices&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmp6tmg3tawgbddlaxykd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmp6tmg3tawgbddlaxykd.png" alt="Image description" width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, navigate into your newly created directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd navas_weather_etl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F52hy1y223sxjsahur0sf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F52hy1y223sxjsahur0sf.png" alt="Image description" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Creating a Python Virtual Environment
&lt;/h2&gt;

&lt;p&gt;Virtual environments are crucial in Python development. Let's create one for our project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 -m venv myvenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why Use Virtual Environments?
&lt;/h3&gt;

&lt;p&gt;Virtual environments are isolated Python environments that allow you to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;- &lt;strong&gt;Dependency Isolation&lt;/strong&gt;: Each project can have its own set of packages without conflicts&lt;/li&gt;
&lt;li&gt;- &lt;strong&gt;Version Control&lt;/strong&gt;: Different projects can use different versions of the same package&lt;/li&gt;
&lt;li&gt;- &lt;strong&gt;Clean Development&lt;/strong&gt;: Prevents system-wide package installations that could break other projects&lt;/li&gt;
&lt;li&gt;- &lt;strong&gt;Reproducibility&lt;/strong&gt;: Makes it easier to replicate your environment on other machines&lt;/li&gt;
&lt;li&gt;- &lt;strong&gt;Professional Standard&lt;/strong&gt;: Industry best practice for Python development&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Think of a virtual environment as a separate &lt;em&gt;"workspace"&lt;/em&gt; for each project, ensuring that what you install for one project doesn't interfere with another.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3tyd0lc6hc0bpibdi88b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3tyd0lc6hc0bpibdi88b.png" alt="Image description" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, let's activate our virtual environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;source myvenv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll notice your command prompt changes to show &lt;code&gt;(myvenv)&lt;/code&gt; at the beginning, indicating the virtual environment is active.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42tfpu5unny37jyw2g64.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42tfpu5unny37jyw2g64.png" alt="Image description" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Creating Project Files
&lt;/h2&gt;

&lt;p&gt;Let's create the essential files for our project using the &lt;code&gt;touch&lt;/code&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;touch weather_etl.py .env requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  File Breakdown:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;weather_etl.py&lt;/code&gt;: Contains our main ETL code&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.env&lt;/code&gt;: Stores sensitive information like API keys (never commit to version control!)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;requirements.txt&lt;/code&gt;: Lists all Python packages our project needs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsx2hcolnaf0w813gogje.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsx2hcolnaf0w813gogje.png" alt="Image description" width="479" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Setting Up Dependencies
&lt;/h2&gt;

&lt;p&gt;Let's populate our &lt;code&gt;requirements.txt&lt;/code&gt; file with the necessary packages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;certifi==2025.4.26
charset-normalizer==3.4.2
idna==3.10
psycopg2-binary==2.9.10
py4j==0.10.9.9
pyspark==4.0.0
requests==2.32.3
urllib3==2.4.0
python-dotenv==1.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Package Explanations:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;pyspark&lt;/code&gt;: Apache Spark's Python API for big data processing&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;requests&lt;/code&gt;: For making HTTP requests to the OpenWeatherMap API&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;psycopg2-binary&lt;/code&gt;: PostgreSQL adapter for Python&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;python-dotenv&lt;/code&gt;: Loads environment variables from .env file&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 6: Database Setup
&lt;/h2&gt;

&lt;p&gt;Ensure you have PostgreSQL set up with a database and user for this project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 7: The Complete ETL Code
&lt;/h2&gt;

&lt;p&gt;Now, let's create our main ETL script. Edit the &lt;code&gt;weather_etl.py&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests
import os
from dotenv import load_dotenv
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, IntegerType

# Load environment variables from .env file
load_dotenv()

# Get API key from environment variable
API_KEY = os.getenv("API_KEY")
CITIES = ["Nairobi", "Mombasa", "Kisumu"]

def fetch_weather(city):
    """Fetch weather data for a specific city from OpenWeatherMap API"""
    url = f"https://api.openweathermap.org/data/2.5/weather?q={city}&amp;amp;appid={API_KEY}&amp;amp;units=metric"
    return requests.get(url).json()

def extract_data():
    """Extract weather data for all cities"""
    return [fetch_weather(city) for city in CITIES]

def transform(spark, data):
    """Transform raw weather data into structured DataFrame"""
    schema = StructType([
        StructField("city", StringType()),
        StructField("temp", DoubleType()),
        StructField("feels_like", DoubleType()),
        StructField("humidity", IntegerType()),
        StructField("pressure", IntegerType()),
        StructField("wind_speed", DoubleType()),
        StructField("weather_main", StringType()),
        StructField("weather_desc", StringType())
    ])

    rows = [(d["name"], d["main"]["temp"], d["main"]["feels_like"],
             d["main"]["humidity"], d["main"]["pressure"], d["wind"]["speed"],
             d["weather"][0]["main"], d["weather"][0]["description"])
            for d in data]

    return spark.createDataFrame(rows, schema)

def load(df):
    """Load DataFrame to PostgreSQL database"""
    df.write \
        .format("jdbc") \
        .option("url", "jdbc:postgresql://localhost:5432/weather_db") \
        .option("dbtable", "public.navas_weather_data") \
        .option("user", "postgres") \
        .option("password", "12345") \
        .option("driver", "org.postgresql.Driver") \
        .mode("append") \
        .save()

def main():
    """Main ETL pipeline execution"""
    # Check if API key is loaded
    if not API_KEY:
        raise ValueError("API_KEY not found in environment variables. Please check your .env file.")

    # Create Spark session
    spark = SparkSession.builder \
        .appName("WeatherETL") \
        .config("spark.jars.packages", "org.postgresql:postgresql:42.6.0") \
        .getOrCreate()

    try:
        # Execute ETL pipeline
        data = extract_data()
        df = transform(spark, data)
        df.show()
        load(df)
        print("ETL pipeline completed successfully!")

    except Exception as e:
        print(f"Error in ETL pipeline: {str(e)}")

    finally:
        # Stop Spark session
        spark.stop()

if __name__ == "__main__":
    main()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 9: Installing Dependencies
&lt;/h2&gt;

&lt;p&gt;Before running our code, we need to install all the required packages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install -r requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 10: Running the ETL Pipeline
&lt;/h2&gt;

&lt;p&gt;Now for the exciting part - running our ETL pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python weather_etl.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If everything is set up correctly, you should see output showing the extraction, transformation, and loading process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vsttswe6wt3quf2n5p1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vsttswe6wt3quf2n5p1.png" alt="Image description" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 10: Version Control Best Practices
&lt;/h2&gt;

&lt;p&gt;Before pushing your code to GitHub, create a &lt;code&gt;.gitignore&lt;/code&gt; file to exclude sensitive files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;touch .gitignore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the following content to .gitignore:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Environment variables
.env

# Virtual environment
myvenv/
venv/
env/

# Python cache
__pycache__/
*.pyc
*.pyo

# IDE files
.vscode/
.idea/

# OS files
.DS_Store
Thumbs.db
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Security Note&lt;/strong&gt;: Never commit &lt;code&gt;.env&lt;/code&gt; files to version control. They contain sensitive information!&lt;/p&gt;

&lt;h3&gt;
  
  
  Why use .gitignore?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Prevents sensitive information (like API keys) from being committed&lt;/li&gt;
&lt;li&gt;Keeps repository clean by excluding temporary files&lt;/li&gt;
&lt;li&gt;Prevents virtual environment files from being tracked&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You've successfully created a complete ETL pipeline that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extracts real-time weather data from an API&lt;/li&gt;
&lt;li&gt;Transforms it with PySpark for analysis&lt;/li&gt;
&lt;li&gt;Loads it into a PostgreSQL database for storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This project demonstrates fundamental data engineering concepts and provides a solid foundation for more complex data pipelines. Remember to always follow best practices like using virtual environments, keeping secrets secure, and maintaining clean code structure.&lt;/p&gt;

&lt;p&gt;Happy data engineering! 🚀&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;This tutorial was created to help beginners start their data engineering journey with practical, hands-on experience using industry-standard tools and practices.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>dataengineering</category>
      <category>spark</category>
      <category>etl</category>
    </item>
    <item>
      <title>Building a Crypto ETL Pipeline with Apache Airflow and Astro CLI</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Tue, 03 Jun 2025 16:17:57 +0000</pubDate>
      <link>https://dev.to/navashub/building-a-crypto-etl-pipeline-with-apache-airflow-and-astro-cli-5a62</link>
      <guid>https://dev.to/navashub/building-a-crypto-etl-pipeline-with-apache-airflow-and-astro-cli-5a62</guid>
      <description>&lt;p&gt;In this comprehensive guide, we'll walk through building a complete cryptocurrency ETL (Extract, Transform, Load) pipeline using Apache Airflow orchestrated through Astronomer's Astro CLI. This project demonstrates how to create a robust data pipeline that extracts cryptocurrency data from APIs, transforms it, and loads it into a PostgreSQL database, all while leveraging containerization for consistent development and deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Apache Airflow?
&lt;/h2&gt;

&lt;p&gt;Apache Airflow is an open-source platform designed to programmatically author, schedule, and monitor workflows. It allows you to define workflows as Directed Acyclic Graphs (DAGs) of tasks, making it perfect for ETL processes where data flows through multiple stages of processing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Docker?
&lt;/h2&gt;

&lt;p&gt;Docker is a containerization platform that packages applications and their dependencies into lightweight, portable containers. Think of it as a virtual box that contains everything your application needs to run - the code, runtime, system tools, libraries, and settings&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before starting, ensure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Windows machine with administrative privileges&lt;/li&gt;
&lt;li&gt;Docker installed and running&lt;/li&gt;
&lt;li&gt;Visual Studio Code&lt;/li&gt;
&lt;li&gt;Basic understanding of Python and SQL&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Project Setup
&lt;/h1&gt;

&lt;h3&gt;
  
  
  Step 1: Project Initialization
&lt;/h3&gt;

&lt;p&gt;First, create a dedicated folder for your project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;CryptoETL
&lt;span class="nb"&gt;cd &lt;/span&gt;CryptoETL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open the folder in Visual Studio Code by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;code &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Installing Astro CLI
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcywyosxf7qph0b4l2ktw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcywyosxf7qph0b4l2ktw.png" alt="Image description" width="800" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Astro CLI is Astronomer's command-line tool that makes it easy to develop and deploy Airflow projects locally. Since we're on Windows, we'll use the Windows Package Manager (winget) for installation.&lt;br&gt;
In your VS Code terminal, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;winget &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nt"&gt;--id&lt;/span&gt; Astronomer.Astro
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: After installation, restart Visual Studio Code to ensure the Astro CLI is properly loaded and available in your terminal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Initializing the Astro Project
&lt;/h3&gt;

&lt;p&gt;With the Astro CLI installed, initialize your Airflow project&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;astro dev init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command creates a complete Airflow development environment by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pulling the latest Astro Runtime (which includes Apache Airflow)&lt;/li&gt;
&lt;li&gt;Creating necessary project structure and configuration files&lt;/li&gt;
&lt;li&gt;Setting up Docker containers for local development&lt;/li&gt;
&lt;li&gt;Initializing an empty Astro project in your current directory&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Project Architecture
&lt;/h2&gt;

&lt;p&gt;Our ETL pipeline consists of three main components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extract&lt;/strong&gt;: Fetch cryptocurrency data from external APIs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transform:&lt;/strong&gt; Process and clean the raw data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load:&lt;/strong&gt; Store the processed data in PostgreSQL database&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Building the DAG
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Understanding DAGs
&lt;/h3&gt;

&lt;p&gt;A Directed Acyclic Graph (DAG) in Airflow represents a workflow where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Directed:&lt;/strong&gt; Tasks have a specific order and direction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Acyclic:&lt;/strong&gt; No circular dependencies (tasks can't loop back)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph:&lt;/strong&gt; Visual representation of task relationships&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The complete code for this Astro Airflow ETL pipeline is available on &lt;a href="https://github.com/Navashub/lux-projects/tree/main/Crypto-ETL" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Docker Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Docker Compose Setup
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fef144mgq38y0sksb65wx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fef144mgq38y0sksb65wx.png" alt="Image description" width="800" height="396"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This configuration sets up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PostgreSQL database for storing cryptocurrency data&lt;/li&gt;
&lt;li&gt;Environment variables for database connection&lt;/li&gt;
&lt;li&gt;Port mapping for external access&lt;/li&gt;
&lt;li&gt;Persistent volume for data storage&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Running the Project
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 4: Starting the Development Environment
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;astro dev start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Builds Docker containers based on your project configuration&lt;/li&gt;
&lt;li&gt;Starts all necessary services (Airflow webserver, scheduler, database)&lt;/li&gt;
&lt;li&gt;Makes the Airflow UI available at &lt;code&gt;http://localhost:8080&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 5: Accessing the Airflow UI
&lt;/h3&gt;

&lt;p&gt;Once the containers are running, open your web browser and navigate to:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;http://localhost:8080&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Default credentials:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Username: admin&lt;/li&gt;
&lt;li&gt;Password: admin&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Configuration and Connections
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Setting Up Airflow Connections
&lt;/h3&gt;

&lt;p&gt;For your ETL pipeline to work properly, you need to configure connections in Airflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Navigate to Admin &amp;gt; Connections&lt;/strong&gt; in the Airflow UI&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Add PostgreSQL Connection:&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Connection Id: &lt;code&gt;postgres_default&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Connection Type: &lt;code&gt;Postgres&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Host: &lt;code&gt;postgres&lt;/code&gt; (Docker service name)&lt;/li&gt;
&lt;li&gt;Schema: &lt;code&gt;crypto_db&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Login: &lt;code&gt;airflow&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Password: &lt;code&gt;airflow&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Port: &lt;code&gt;5432&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;3.&lt;strong&gt;Add API Connections&lt;/strong&gt; (if using authenticated APIs):&lt;/p&gt;

&lt;p&gt;Configure HTTP connections for your cryptocurrency APIs&lt;br&gt;
Store API keys securely using Airflow Variables or Connections&lt;/p&gt;

&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;Consider these enhancements for your pipeline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Implement data quality monitoring&lt;/li&gt;
&lt;li&gt;Add email notifications for task failures&lt;/li&gt;
&lt;li&gt;Create data visualization dashboards&lt;/li&gt;
&lt;li&gt;Implement automated testing for DAG logic&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Building a Multilingual Business Assistant for Kenya</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Sat, 26 Apr 2025 06:14:20 +0000</pubDate>
      <link>https://dev.to/navashub/building-a-multilingual-business-assistant-for-kenya-231</link>
      <guid>https://dev.to/navashub/building-a-multilingual-business-assistant-for-kenya-231</guid>
      <description>&lt;h2&gt;
  
  
  How AI Can Bridge Language Gaps
&lt;/h2&gt;

&lt;p&gt;In Kenya's diverse linguistic landscape, providing business guidance across language barriers represents both a challenge and an opportunity. &lt;br&gt;
When I first demoed our AI business assistant,  I was asked if our agent could understand a question in Kiswahili or Sheng.&lt;br&gt;
At that moment, the system couldn't. But I immediately knew this was critical — especially because the majority of the client's audience are &lt;strong&gt;"waseh wa mtaa"&lt;/strong&gt; (neighborhood community members and hustlers), who often mix Kiswahili and Sheng in daily conversation.&lt;/p&gt;

&lt;p&gt;I quickly realized something important was missing - the rich linguistic diversity of Kenya's business community. The initial version could handle English queries well, but when asked questions in Swahili or Sheng, it fell short. That's when I knew we needed to go deeper into local language support including Kiswahili and Sheng, the popular urban slang.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Problem Worth Solving
&lt;/h2&gt;

&lt;p&gt;Kenya's business landscape is vibrant but fragmented by language. Many "waseh wa mtaa" who could benefit from business advice are more comfortable communicating in Kiswahili or Sheng rather than English. Traditional business resources often fail to reach these entrepreneurs because of this language gap.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Solution: A Multilingual AI Assistant
&lt;/h2&gt;

&lt;p&gt;For now a simple, powerful tool powered by OpenAI's GPT technology, wrapped in a lightweight Python + Streamlit application.&lt;br&gt;
The goal? Help anyone ask questions about starting or running a business in Kenya, whether in Kiswahili, Sheng, or English.&lt;/p&gt;
&lt;h2&gt;
  
  
  Here's a snippet of what that looked like:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkm2hbr092n8jleto92e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkm2hbr092n8jleto92e.png" alt="Response in Sheng!" width="775" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, the assistant can detect if someone is speaking in Sheng or Kiswahili and respond appropriately, staying authentic to how people actually communicate.&lt;/p&gt;

&lt;p&gt;When I tested it again — asking a question fully in Sheng:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;rada mse ni biz gani inaweza nipaea pesa mzuri&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;the assistant responded &lt;em&gt;perfectly&lt;/em&gt; in a casual, streetwise tone:&lt;/p&gt;


&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;This demo is simple — it's running on &lt;strong&gt;very limited context&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
I don't even have a proper API to pull structured business information dynamically yet.&lt;br&gt;&lt;br&gt;
I'm relying purely on basic keyword matching and GPT's ability to infer and generate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Imagine the possibilities&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If they expose a structured API or database with real business opportunities.&lt;/li&gt;
&lt;li&gt;If we feed it updated, hyper-local information (even specific to different parts of Nairobi or Kenya).&lt;/li&gt;
&lt;li&gt;If we continuously fine-tune or add memory.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We could deliver &lt;strong&gt;super-accurate&lt;/strong&gt;, &lt;strong&gt;culturally fluent&lt;/strong&gt; AI support to thousands of hustlers, shop owners, and entrepreneurs — in the exact language they use every day.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Language Detection Breakthrough
&lt;/h2&gt;

&lt;p&gt;Here's a peek at how we implemented multilingual support:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_language&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;text_lower&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;swahili_clues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;biashara&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jinsi&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kampuni&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nchini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kodi&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shirika&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;huduma&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nitaanzaje&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;sheng_clues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;msee&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;biz&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shugli&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kuomoka&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hustle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ngeta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mbogi&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keja&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;naanzaje&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nduthi&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mpesa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text_lower&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sheng_clues&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sheng&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text_lower&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;swahili_clues&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;swahili&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;english&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This simple yet effective language detection system allows our assistant to identify whether a user is speaking Sheng, Swahili, or English based on keyword clues. For our target audience, this Sheng support can be a gamechanger in making the technology feel familiar and accessible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tailoring Responses to Local Context
&lt;/h3&gt;

&lt;p&gt;Once we detect the language, we customize the assistant's persona accordingly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_system_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;swahili&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Wewe ni msaidizi wa biashara unayetoa ushauri kuhusu kuanzisha au kuendesha biashara nchini Kenya kwa Kiswahili fasaha.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sheng&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Wewe ni msee wa biashara Kenya. Toa maelezo kwa lugha ya mtaa (Sheng) kuhusu mambo ya biashara hapa mtaani.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant focused only on giving advice related to starting and running businesses in Kenya.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Potential Impact
&lt;/h2&gt;

&lt;p&gt;This simple demonstration shows just the tip of the iceberg. Even with limited context and without access to structured API data, the assistant provides helpful responses. Imagine what would be possible with:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Full API access to comprehensive business information in Kenya&lt;/li&gt;
&lt;li&gt;More extensive training on Sheng vocabulary and expressions&lt;/li&gt;
&lt;li&gt;Integration with local business registration resources&lt;/li&gt;
&lt;li&gt;Personalization based on location within Kenya&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Data + AI + Local Context = 🔥 Massive potential.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This small demo captures what happens when we stop thinking of AI as a "global" one-size-fits-all tool, and start &lt;strong&gt;aligning it with real people, real communities, real language&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is just the beginning.  &lt;/p&gt;

</description>
      <category>data</category>
      <category>ai</category>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>How to Connect to PostgreSQL and Create a Database, User, and Tables</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Fri, 25 Apr 2025 09:00:30 +0000</pubDate>
      <link>https://dev.to/navashub/how-to-connect-to-postgresql-and-create-a-database-user-and-tables-51n1</link>
      <guid>https://dev.to/navashub/how-to-connect-to-postgresql-and-create-a-database-user-and-tables-51n1</guid>
      <description>&lt;p&gt;PostgreSQL is a powerful open-source relational database system that's popular for web applications, data analytics, and more. In this guide, I'll walk you through connecting to a PostgreSQL server, creating a database and user, setting up tables, and connecting via DBeaver.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Access to a Linux server with PostgreSQL installed&lt;/li&gt;
&lt;li&gt;SSH client on your local machine&lt;/li&gt;
&lt;li&gt;Basic command line knowledge&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Connect to Your Server via SSH
&lt;/h2&gt;

&lt;p&gt;First, connect to your remote server using SSH:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh navas@172.184.XXX.XXX
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enter your password when prompted. Once logged in, you'll need to access the PostgreSQL command line interface.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Access the PostgreSQL CLI
&lt;/h2&gt;

&lt;p&gt;PostgreSQL creates a default &lt;strong&gt;postgres&lt;/strong&gt; user during installation. Switch to this user:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; postgres
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now access the PostgreSQL interactive terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;psql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should now see the PostgreSQL prompt: &lt;strong&gt;postgres=#&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Create a New User
&lt;/h2&gt;

&lt;p&gt;Let's create a dedicated user for your database operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;USER&lt;/span&gt; &lt;span class="n"&gt;navas&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;PASSWORD&lt;/span&gt; &lt;span class="s1"&gt;'your_secure_password'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For development purposes, you might want to grant superuser privileges:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;USER&lt;/span&gt; &lt;span class="n"&gt;navas&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;SUPERUSER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Note:&lt;/strong&gt; In production, grant only the necessary privileges following the principle of least privilege.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 4: Create a Database
&lt;/h2&gt;

&lt;p&gt;Create a new database owned by your user:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;DATABASE&lt;/span&gt; &lt;span class="n"&gt;navasdb&lt;/span&gt; &lt;span class="k"&gt;OWNER&lt;/span&gt; &lt;span class="n"&gt;navas&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 5: Connect to Your New Database
&lt;/h2&gt;

&lt;p&gt;Connect to your newly created database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="n"&gt;navasdb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 6: Create Tables
&lt;/h2&gt;

&lt;p&gt;Now let's create a sample table. Here's an example users table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;username&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 7: Verify Your Setup
&lt;/h2&gt;

&lt;p&gt;Check the existing tables in your database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;View the schemas:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 8: Exit PostgreSQL
&lt;/h2&gt;

&lt;p&gt;When you're done, exit the PostgreSQL CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then exit the postgres user session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;exit&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 9: Restart PostgreSQL (If Needed)
&lt;/h2&gt;

&lt;p&gt;If you've made configuration changes that require a restart:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart postgresql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Connecting with DBeaver
&lt;/h2&gt;

&lt;p&gt;DBeaver is a popular database GUI tool. Here's how to connect to your PostgreSQL database:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Install DBeaver&lt;/strong&gt; if you haven't already (available at &lt;a href="https://dbeaver.io/" rel="noopener noreferrer"&gt;dbeaver.io&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open DBeaver&lt;/strong&gt; and click on "New Database Connection"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Select PostgreSQL&lt;/strong&gt; from the database list&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enter connection details&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Host: &lt;code&gt;172.184.XXX.XXX&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Port: &lt;code&gt;5432&lt;/code&gt; (default PostgreSQL port)&lt;/li&gt;
&lt;li&gt;Database: &lt;code&gt;navasdb&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Username: &lt;code&gt;navas&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Password: &lt;code&gt;your_secure_password&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test Connection&lt;/strong&gt; to verify everything works&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Click Finish&lt;/strong&gt; to save the connection&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Troubleshooting Tips
&lt;/h2&gt;

&lt;p&gt;If you encounter connection issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verify PostgreSQL is running:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  &lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status postgresql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;You've now successfully:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connected to your PostgreSQL server
&lt;/li&gt;
&lt;li&gt;Created a new database user
&lt;/li&gt;
&lt;li&gt;Established a new database
&lt;/li&gt;
&lt;li&gt;Created tables
&lt;/li&gt;
&lt;li&gt;Connected via DBeaver for graphical management
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This setup gives you a solid foundation for developing applications with PostgreSQL.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remember to always:&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use secure passwords
&lt;/li&gt;
&lt;li&gt;Follow proper privilege management in production environments
&lt;/li&gt;
&lt;li&gt;Regularly backup your databases
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Happy databasing! &lt;/p&gt;

</description>
    </item>
    <item>
      <title>Build an AI Instagram Caption Generator for Car Enthusiasts Using OpenAI and Streamlit 🚗🔥</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Sat, 19 Apr 2025 01:24:02 +0000</pubDate>
      <link>https://dev.to/navashub/build-an-ai-instagram-caption-generator-for-car-enthusiasts-using-openai-and-streamlit-4007</link>
      <guid>https://dev.to/navashub/build-an-ai-instagram-caption-generator-for-car-enthusiasts-using-openai-and-streamlit-4007</guid>
      <description>&lt;p&gt;If you post car content on Instagram, you know the pain of writing fresh captions every time. I built an AI-powered caption generator that helps you auto-generate captions, hashtags, and even TikTok sound ideas — just from a car photo.&lt;/p&gt;

&lt;p&gt;In this article, I’ll show you how I built it using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🧠 OpenAI GPT-4 Vision&lt;/li&gt;
&lt;li&gt;🖼️ Image analysis&lt;/li&gt;
&lt;li&gt;🌐 Streamlit app interface&lt;/li&gt;
&lt;li&gt;🛠️ My personal caption history as training context&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔧 What We'll Build
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftb9log88be0b7wfuduie.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftb9log88be0b7wfuduie.png" alt="Image description" width="660" height="536"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://streamlit.io/" rel="noopener noreferrer"&gt;Streamlit&lt;/a&gt; – for the UI&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://platform.openai.com/docs/guides/vision" rel="noopener noreferrer"&gt;OpenAI GPT-4 Vision&lt;/a&gt; – to understand car photos&lt;/li&gt;
&lt;li&gt;Python (with &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;streamlit&lt;/code&gt;, &lt;code&gt;dotenv&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📁 Project Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;car-caption-generator-ai/
├── app.py                     &lt;span class="c"&gt;# Streamlit app logic&lt;/span&gt;
├── utils/
│   ├── vision.py              &lt;span class="c"&gt;# GPT-4 Vision logic&lt;/span&gt;
│   ├── captions.py            &lt;span class="c"&gt;# Captions and hashtags generator&lt;/span&gt;
│   └── prompts.py             &lt;span class="c"&gt;# Stores the prompt template&lt;/span&gt;
├── requirements.txt           &lt;span class="c"&gt;# Dependencies&lt;/span&gt;
└── README.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setup Instructions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1 .Clone the repo:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/Navashub/caption_generator_ai.git
cd car-caption-generator-ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2.Create and activate a virtual environment:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python -m venv myvenv
source myvenv/bin/activate  # Windows: myvenv\Scripts\activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.Install dependencies:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install -r requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.Add your OpenAI API key in a .env file
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENAI_API_KEY=""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5.Run the app:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;streamlit run app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🧠 How It Works
&lt;/h2&gt;

&lt;p&gt;1.The app uses GPT-4 Vision to describe your uploaded car image.&lt;br&gt;
2.That description is passed into a prompt template (along with your past captions).&lt;br&gt;
3.The model returns:&lt;br&gt;
    - An Instagram caption&lt;br&gt;
    - Hashtags&lt;br&gt;
    - TikTok sound vibes&lt;/p&gt;

&lt;p&gt;Here’s a sample output for an Audi RS5:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Brutal beauty in carbon black. The RS5 doesn’t speak — it growls. Welcome to the autobahn attitude. 💨🔥
#RS5Power #FavouriteFourRings #AudiLife"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
    </item>
    <item>
      <title>Week 1 at LuxDev: Kicking off my Data Engineering Journey</title>
      <dc:creator>Navas Herbert</dc:creator>
      <pubDate>Tue, 08 Apr 2025 06:00:06 +0000</pubDate>
      <link>https://dev.to/navashub/week-1-at-luxdev-kicking-off-my-data-engineering-journey-48fm</link>
      <guid>https://dev.to/navashub/week-1-at-luxdev-kicking-off-my-data-engineering-journey-48fm</guid>
      <description>&lt;p&gt;I recently started a new chapter in my tech journey by joining LuxDev, an institution focused on practical, in-depth training in data analysis, data science, and data engineering.&lt;/p&gt;

&lt;p&gt;We kicked off our classes on March 31st, and after just one week, I’m already feeling the momentum. If you're curious about what diving into data engineering looks like — especially from day one — here’s a recap of what we covered in our Week 1.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Getting Oriented: What is Data Engineering?&lt;/strong&gt; &lt;br&gt;
Before jumping into the heavy tools and tech, we took time to understand what data engineering really is. From data pipelines to ETL processes, we discussed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The role of a data engineer in the modern data stack&lt;/li&gt;
&lt;li&gt;How data engineering connects to data science and analytics&lt;/li&gt;
&lt;li&gt;Real-world use cases where solid data infrastructure is a game-changer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Tooling up&lt;/strong&gt;&lt;br&gt;
We then moved straight into setting up our working environments. Here's what we installed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Python — Our go-to language for scripting and automations&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;PostgreSQL — A robust relational database&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;DBeaver — A universal database tool that makes it easy to interact with PostgreSQL (and others)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AWS CLI — To interface with Amazon Web Services directly from the terminal&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Aiven.io — For managed cloud data infrastructure&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Git Bash — Our preferred terminal on Windows systems&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Connecting to Servers, Cloud &amp;amp; Terminal&lt;/strong&gt;&lt;br&gt;
Things got real-world quickly when we started connecting to actual remote and cloud-based servers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;We used Linux systems and command-line tools to SSH into servers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Connected to a LuxDev-hosted cloud server — this involved working in a real Linux environment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Established remote connections from our terminal to AWS and Aiven instances.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Set up secure, terminal-only connections between a local machine and cloud-hosted PostgreSQL databases.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of this was done without a GUI — just pure terminal power 💪.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔜 Up Next&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Looking forward to next week, we'll be diving deeper into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Data modeling&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Schema design&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;ETL pipelines&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;and probably... some Python scripting magic!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>ai</category>
      <category>python</category>
    </item>
  </channel>
</rss>
