<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Njeri Kimaru</title>
    <description>The latest articles on DEV Community by Njeri Kimaru (@njeri_kimaru).</description>
    <link>https://dev.to/njeri_kimaru</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3402741%2F93188859-acfe-4fd3-8462-d9762fcdcc6d.jpeg</url>
      <title>DEV Community: Njeri Kimaru</title>
      <link>https://dev.to/njeri_kimaru</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/njeri_kimaru"/>
    <language>en</language>
    <item>
      <title>Does ramalama make AI boring?? Running AI models with Ramalama.</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Mon, 30 Mar 2026 16:48:27 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/does-ramalama-make-ai-boring-running-ai-models-with-ramalama-2kml</link>
      <guid>https://dev.to/njeri_kimaru/does-ramalama-make-ai-boring-running-ai-models-with-ramalama-2kml</guid>
      <description>&lt;h1&gt;
  
  
  What is RamaLama?
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://ramalama.ai/" rel="noopener noreferrer"&gt;Ramalama&lt;/a&gt; is an open source command line tool that makes running AI models locally simple by treating them like containers.&lt;br&gt;
Ramalama runs models with podman/docker and there's no config needed.&lt;br&gt;
It is GPU optimizedand accelerates performance.&lt;br&gt;
It is compatible with llama.cpp, openvino, vLLM, whisper.cpp and manymore.&lt;/p&gt;
&lt;h2&gt;
  
  
  Installing ramalama
&lt;/h2&gt;

&lt;p&gt;RamaLama is easy to install.&lt;br&gt;
After installing, check the version you are using.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf pip &lt;span class="nb"&gt;install &lt;/span&gt;python3-ramalama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ramalama version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftfalaol8rn6zay0lpkwr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftfalaol8rn6zay0lpkwr.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;RamaLama supports multiple model registries (transports):&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Ollama
&lt;/h2&gt;

&lt;p&gt;It is the quickest and easiest registry to use.&lt;br&gt;
Here are a few AI models I ran using Ollama.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ramalama run granite moe3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkodanaa77q5ei0cm70nb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkodanaa77q5ei0cm70nb.png" alt=" "&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ramalama run ollama://llama4:scout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyz2d215ai9bkc9d7p4hy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyz2d215ai9bkc9d7p4hy.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Hugging Face
&lt;/h2&gt;

&lt;p&gt;Some Hugging Face models require you to log in.&lt;br&gt;
Here are some that don't require logging in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ramalama run huggingface://instructlab/granite-7b-lab-Q4_K_M.gguf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ulbhr6gpzj1lj79e8l8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ulbhr6gpzj1lj79e8l8.png" alt=" "&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ramalama run huggingface://microsoft/Phi-3-mini-4k-instruct-q4.gguf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcascyyedeihsgbmb6jjd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcascyyedeihsgbmb6jjd.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  3. ModelScope
&lt;/h2&gt;

&lt;p&gt;ModelScope worked quite well too,&lt;br&gt;
but I had to upgrade RamaLama to a newer version first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf upgrade ramalama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8oz9h8y8onpduc1ax8rp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8oz9h8y8onpduc1ax8rp.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is one of the ModelScope models I used:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ramalama run modelscope://Qwen/Qwen2.5-7B-Instruct-GGUF/qwen2.5-7b-instruct-q4_k_m.gguf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8htkg8xp4pzn9031mp1f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8htkg8xp4pzn9031mp1f.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  4. OCI registries
&lt;/h2&gt;

&lt;p&gt;Let's start with what OCI is.&lt;br&gt;
The Open Container Initiative (OCI) is an open standards project whose specifications define how container images are packaged, distributed, and run.&lt;br&gt;
There are several OCI registries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;quay.io&lt;/li&gt;
&lt;li&gt;docker.io&lt;/li&gt;
&lt;li&gt;GitHub Container Registry (ghcr.io)&lt;br&gt;
On GitHub I had to log in first and then create an authentication token.
Afterwards, I pushed a model and then ran it from ghcr.io.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ramalama convert ollama://mistral oci://ghcr.io/njeri-kimaru/mistral:gguf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ramalama run oci://ghcr.io/njeri-kimaru/mistral:gguf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fom5vi2ea2tvkkburyrig.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fom5vi2ea2tvkkburyrig.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Google Container Registry (gcr.io)&lt;/li&gt;
&lt;li&gt;Amazon Elastic Container Registry (public.ecr.aws)&lt;/li&gt;
&lt;li&gt;RamaLama Container Registry (rlcr.io)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  5. URL-based sources
&lt;/h2&gt;

&lt;p&gt;RamaLama also supports loading models directly from URLs instead of registries.&lt;br&gt;
The supported schemes include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;https://&lt;/code&gt; → download from the internet
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;file://&lt;/code&gt; → load from your local machine
&lt;/li&gt;
&lt;/ul&gt;
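&lt;p&gt;As a minimal sketch of the &lt;code&gt;file://&lt;/code&gt; case: the path below is hypothetical, so point it at a GGUF file you have already downloaded. The snippet builds the URI from an absolute path and only starts the model if the file is really there.&lt;/p&gt;

```shell
# Hypothetical location of a local model file; change it to your own.
model_path="$HOME/models/example.gguf"

# file:// is followed by the absolute path, giving file:///home/... in practice.
model_uri="file://${model_path}"
echo "Model URI: ${model_uri}"

# Only start the model if the file actually exists.
if [ -f "$model_path" ]; then
  ramalama run "$model_uri"
fi
```

&lt;p&gt;The &lt;code&gt;https://&lt;/code&gt; case works the same way, with the URI pointing at a direct download link for the model file.&lt;/p&gt;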

&lt;h2&gt;
  
  
  6. Hosted API
&lt;/h2&gt;

&lt;p&gt;To run a hosted model such as one from OpenAI, you need a secret key, which you get from &lt;a href="https://platform.openai.com/api-keys" rel="noopener noreferrer"&gt;OpenAI API keys&lt;/a&gt;. The API is paid, so you also have to pay for usage before the model will run successfully.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb62e5nfx0aazdy5ddoxb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb62e5nfx0aazdy5ddoxb.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ramalama</category>
      <category>ai</category>
      <category>programming</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Docling CLI to parse PDFs and export them to multiple formats</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Sat, 28 Mar 2026 03:22:25 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/docling-cli-to-parse-pdfs-and-export-it-to-multiple-formats-3cgc</link>
      <guid>https://dev.to/njeri_kimaru/docling-cli-to-parse-pdfs-and-export-it-to-multiple-formats-3cgc</guid>
      <description>&lt;h2&gt;
  
  
  What is Docling?
&lt;/h2&gt;

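&lt;p&gt;Docling is an open source document processing library that converts various document formats into structured outputs.&lt;br&gt;
Docling plays an important part in RAG (retrieval-augmented generation) pipelines, where documents have to be parsed into structured text before they can be chunked and indexed.&lt;/p&gt;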
&lt;h2&gt;
  
  
  I'll be taking you through the process of parsing PDFs into structured formats.
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Step 1: Set up
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Create the project folder in your terminal:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;docling_cli
&lt;span class="nb"&gt;cd &lt;/span&gt;docling_cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Create your virtual environment and activate it.
Fedora&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcknjzi2wulm9mu25uxqo.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcknjzi2wulm9mu25uxqo.jpeg" alt=" " width="800" height="87"&gt;&lt;/a&gt;&lt;br&gt;
Windows &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftnz0t11essq3r85h25cj.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftnz0t11essq3r85h25cj.jpeg" alt=" " width="602" height="266"&gt;&lt;/a&gt;&lt;/p&gt;
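&lt;p&gt;For readers who cannot see the screenshots, here is a minimal sketch of the usual commands on Fedora or any other Linux system. The folder name &lt;code&gt;.venv&lt;/code&gt; is an assumption (a common convention), not necessarily the name used in the screenshots.&lt;/p&gt;

```shell
# Create a virtual environment in a folder named .venv (assumed name).
python3 -m venv .venv

# Activate it for the current shell session.
. .venv/bin/activate

# Confirm which environment is active.
echo "Active environment: $VIRTUAL_ENV"
```

&lt;p&gt;On Windows PowerShell, the activation script is &lt;code&gt;.venv\Scripts\Activate.ps1&lt;/code&gt; instead.&lt;/p&gt;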
&lt;h3&gt;
  
  
  Step 2: Installing docling
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;docling
docling &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Fedora&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feo6017k3vs4q1wu9z37v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feo6017k3vs4q1wu9z37v.png" alt=" " width="800" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Windows&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi5svweubh26mvz8fx7tj.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi5svweubh26mvz8fx7tj.jpeg" alt=" " width="800" height="302"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check Docling's version:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6hppa0i1j03xet2e86hp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6hppa0i1j03xet2e86hp.png" alt=" " width="800" height="139"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 3: Creating input and output folders
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Create a folder called data where you will store the PDFs you want to convert.&lt;/li&gt;
&lt;li&gt;Create a new folder named outputs,
then inside it create new folders called markdown_outputs, html_outputs and json_outputs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvp27bsfk0h0ttorb2jz0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvp27bsfk0h0ttorb2jz0.png" alt=" " width="800" height="139"&gt;&lt;/a&gt;&lt;/p&gt;
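&lt;p&gt;The same layout can be created from the terminal. This is a small sketch that assumes you run it from inside the project folder and that the output folders use the underscore names that appear in the later commands.&lt;/p&gt;

```shell
# Create the input folder plus one output folder per export format.
# -p creates parent folders as needed and does not fail if they already exist.
mkdir -p data \
  outputs/markdown_outputs \
  outputs/html_outputs \
  outputs/json_outputs

# List the resulting folders to confirm the layout.
find data outputs -type d | sort
```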
&lt;h3&gt;
  
  
  Step 4: Default options
&lt;/h3&gt;

&lt;p&gt;Start by running Docling with its default options.&lt;br&gt;
With no extra flags, the command below&lt;br&gt;
converts the PDF into Markdown format.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docling your-pdf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhlc53h8htczj4lt1wcv0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhlc53h8htczj4lt1wcv0.png" alt=" " width="800" height="102"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmdg4754gdk8ngghcrvyt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmdg4754gdk8ngghcrvyt.png" alt=" " width="800" height="155"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fotjr3hiis9wydpimsjhd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fotjr3hiis9wydpimsjhd.png" alt=" " width="800" height="155"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Changing the pdfs into html format
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docling &lt;span class="nt"&gt;--to&lt;/span&gt; html &lt;span class="k"&gt;*&lt;/span&gt;.pdf &lt;span class="nt"&gt;--output&lt;/span&gt; ~Documents/docling_cli/outputs/html_outputs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fawuw0owsm19mlihrna2f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fawuw0owsm19mlihrna2f.png" alt=" " width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Changing the pdfs into other formats
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Markdown
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docling --to md *.pdf --output ~Documents/docling_cli/outputs/markdown_outputs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3cnza6xffqqdv95omjjx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3cnza6xffqqdv95omjjx.png" alt=" " width="800" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  2. JSON
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docling --to json *.pdf --output ~Documents/docling_cli/outputs/json_outputs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcuufwel6js9kkbczotrc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcuufwel6js9kkbczotrc.png" alt=" " width="800" height="162"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  3. Plain text
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docling --to text *.pdf --output ~Documents/docling_cli/outputs/plaintext_outputs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3ev3jvnot3i3npct86t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3ev3jvnot3i3npct86t.png" alt=" " width="800" height="160"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  4. YAML
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docling --to yaml *.pdf --output ~Documents/docling_cli/outputs/yaml_outputs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ktwkgb0fux76efv5qoq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ktwkgb0fux76efv5qoq.png" alt=" " width="800" height="160"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  5. html_split_page
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docling --to html_split_page *.pdf --output ~Documents/docling_cli/outputs/html_split_page_outputs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzpdlazql3iyhglcapu23.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzpdlazql3iyhglcapu23.png" alt=" " width="800" height="160"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  6. DocTags
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docling --to doctags *.pdf --output ~Documents/docling_cli/outputs/doctags_outputs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw7nxa84d44bc91d4fuzm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw7nxa84d44bc91d4fuzm.png" alt=" " width="800" height="160"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  7. VTT
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docling --to vtt *.pdf --output ~Documents/docling_cli/outputs/vtt_outputs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugs093q1pyftsyvrp4q8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugs093q1pyftsyvrp4q8.png" alt=" " width="800" height="160"&gt;&lt;/a&gt;&lt;/p&gt;
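&lt;p&gt;Since the commands above differ only in the format name, they can be wrapped in a loop. The sketch below is one way to do it, under a few assumptions: the PDFs live in &lt;code&gt;data/&lt;/code&gt;, each output folder is named after the format flag (so &lt;code&gt;md_outputs&lt;/code&gt; rather than &lt;code&gt;markdown_outputs&lt;/code&gt;), and &lt;code&gt;convert_all&lt;/code&gt; is a helper name made up for this example. Passing &lt;code&gt;echo&lt;/code&gt; prints the commands without running them.&lt;/p&gt;

```shell
# convert_all is a hypothetical helper, not part of Docling.
# Usage: convert_all        runs docling once per format
#        convert_all echo   only prints the commands (dry run)
convert_all() {
  runner="${1-}"
  for fmt in md html json text yaml html_split_page doctags vtt; do
    out_dir="outputs/${fmt}_outputs"   # assumed naming scheme
    mkdir -p "$out_dir"
    # With an empty runner this executes docling; with "echo" it prints the line.
    $runner docling --to "$fmt" data/*.pdf --output "$out_dir"
  done
}

# Preview the commands first.
convert_all echo
```

&lt;p&gt;Once the preview looks right, run &lt;code&gt;convert_all&lt;/code&gt; with no argument to do the actual conversion.&lt;/p&gt;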

&lt;h3&gt;
  
  
  Step 6: Analyzing the results
&lt;/h3&gt;

&lt;p&gt;I used three types of PDFs:&lt;br&gt;
one with tables, one with text and images, and one with tables and paragraphs. Here are my key findings:&lt;/p&gt;

&lt;h4&gt;
  
  
  1. PDF with tables
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;In HTML, the rows and columns came out cleaner than they were in the original PDF.&lt;/li&gt;
&lt;li&gt;Markdown output was good too: the tables were written in Markdown table syntax without losing anything.&lt;/li&gt;
&lt;li&gt;JSON broke everything down into nested objects.&lt;/li&gt;
&lt;li&gt;Plain text was good too, but not as good as Markdown.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  2. PDF with text and images
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;HTML lost the color of the images.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  3. PDF with tables and paragraphs
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Paragraphs came out cleanly as text in all formats.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>fedora</category>
      <category>libravatar</category>
      <category>linux</category>
      <category>programmers</category>
    </item>
    <item>
      <title>Dealing with unstructured, scanned multilingual pdfs?? Here's how to parse them using OCR engines with docling CLI.</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Wed, 25 Mar 2026 22:23:04 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/dealing-with-unstructured-scanned-multilingual-pdfs-heres-how-to-parse-them-using-ocrs-with-4e9j</link>
      <guid>https://dev.to/njeri_kimaru/dealing-with-unstructured-scanned-multilingual-pdfs-heres-how-to-parse-them-using-ocrs-with-4e9j</guid>
      <description>&lt;h1&gt;
  
  
  &lt;strong&gt;WHAT IS DOCLING OCR?&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Docling is an open-source document processing library developed by IBM Research, and it can run OCR on scanned documents. It is designed to parse and convert complex multilingual documents into JSON or Markdown. Most scanned documents need to be parsed before they can be translated.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is OCR?
&lt;/h2&gt;

&lt;p&gt;OCR stands for Optical Character Recognition. &lt;br&gt;
It allows you to extract text from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;images&lt;/li&gt;
&lt;li&gt;scanned pdfs&lt;/li&gt;
&lt;li&gt;multilingual documents.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  I'll be taking you through the steps of parsing scanned multilingual documents.
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Creating a folder and a virtual environment
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;your_foldername
&lt;span class="nb"&gt;cd &lt;/span&gt;your_foldername
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The first step is always to create a folder where you'll store your files. Then cd into that folder. Finally, create a virtual environment and activate it.&lt;br&gt;
Fedora &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0dd2jyt5tohbfvdfvxy.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0dd2jyt5tohbfvdfvxy.jpeg" alt=" " width="800" height="87"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Windows&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hj3pb47r0hyvqaaifh7.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hj3pb47r0hyvqaaifh7.jpeg" alt=" " width="602" height="266"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Install Docling and EasyOCR using pip, the Python package manager
&lt;/h4&gt;

&lt;p&gt;NB: Docling takes some time to install.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;docling
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fedora&lt;br&gt;
docling install&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbrumzq82vk1b6ngj9imy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbrumzq82vk1b6ngj9imy.png" alt=" " width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;EasyOCR install&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;easyocr
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fne60zd1l9etv58rf4bb3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fne60zd1l9etv58rf4bb3.png" alt=" " width="800" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Windows&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5f45c9xjk6zi6hbhk4px.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5f45c9xjk6zi6hbhk4px.jpeg" alt=" " width="617" height="248"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Check the versions of both Docling and EasyOCR
&lt;/h4&gt;

&lt;p&gt;Fedora&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docling --version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip show easyocr
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwofmei9f51pais3mzmc0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwofmei9f51pais3mzmc0.png" alt=" " width="800" height="178"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Windows&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzks7nbt169d9kmumufez.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzks7nbt169d9kmumufez.jpeg" alt=" " width="600" height="330"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Find a scanned multilingual or non-English PDF.
&lt;/h4&gt;

&lt;p&gt;I would recommend the &lt;a href="https://archive.org/" rel="noopener noreferrer"&gt;Internet Archive&lt;/a&gt; for your scanned multilingual documents.&lt;br&gt;
Create an account, log in and search for the documents you would like to use.&lt;br&gt;
Finally, download them in PDF format and save them into a folder.&lt;/p&gt;
&lt;h4&gt;
  
  
  Converting a document into HTML or Markdown using Docling.
&lt;/h4&gt;

&lt;p&gt;Start by creating a folder where you will save your output files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;your_output_folder
&lt;span class="nb"&gt;cd &lt;/span&gt;your_output_folder
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Then run the following command using the Docling CLI:&lt;/span&gt;
&lt;span class="c"&gt;#   --ocr         enables OCR, since this is a scanned document&lt;/span&gt;
&lt;span class="c"&gt;#   --ocr-engine  specifies the OCR engine&lt;/span&gt;
&lt;span class="c"&gt;#   --ocr-lang    specifies the language, in my case Hindi (hi)&lt;/span&gt;
&lt;span class="c"&gt;#   --to          specifies the output format (md = Markdown)&lt;/span&gt;
&lt;span class="c"&gt;#   --output      where the output will be saved&lt;/span&gt;
docling original_scanned_pdfs/hindu_scanned.pdf \
  &lt;span class="nt"&gt;--ocr&lt;/span&gt; \
  &lt;span class="nt"&gt;--ocr-engine&lt;/span&gt; easyocr \
  &lt;span class="nt"&gt;--ocr-lang&lt;/span&gt; hi \
  &lt;span class="nt"&gt;--to&lt;/span&gt; md \
  &lt;span class="nt"&gt;--output&lt;/span&gt; ./markdown_output/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Here's the output
Fedora&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4l5ho5kn0uyg8azn1xp6.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4l5ho5kn0uyg8azn1xp6.jpeg" alt=" " width="800" height="68"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h5&gt;
  
  
  Here are some of the language codes you can pass to &lt;strong&gt;--ocr-lang&lt;/strong&gt; when using easyocr:
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;English: en&lt;/li&gt;
&lt;li&gt;Hindi: hi&lt;/li&gt;
&lt;li&gt;French: fr&lt;/li&gt;
&lt;li&gt;German: de&lt;/li&gt;
&lt;li&gt;Spanish: es&lt;/li&gt;
&lt;li&gt;Portuguese: pt&lt;/li&gt;
&lt;li&gt;Italian: it&lt;/li&gt;
&lt;li&gt;Dutch: nl&lt;/li&gt;
&lt;li&gt;Russian: ru&lt;/li&gt;
&lt;li&gt;Chinese (Simplified): ch_sim&lt;/li&gt;
&lt;li&gt;Chinese (Traditional): ch_tra&lt;/li&gt;
&lt;li&gt;Japanese: ja&lt;/li&gt;
&lt;li&gt;Korean: ko&lt;/li&gt;
&lt;li&gt;Arabic: ar&lt;/li&gt;
&lt;/ul&gt;
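&lt;p&gt;If you convert many files or switch between languages, the same command can be assembled from Python. This is only a sketch: &lt;code&gt;build_docling_command&lt;/code&gt; is a hypothetical helper (it is not part of Docling), and it uses only the CLI flags that appear in this post.&lt;/p&gt;

```python
import shlex


def build_docling_command(pdf_path, engine, output_dir, lang=None, to="md", force_ocr=False):
    """Return the docling CLI call as an argument list (hypothetical helper)."""
    cmd = ["docling", pdf_path, "--ocr", "--ocr-engine", engine]
    if force_ocr:
        cmd.append("--force-ocr")
    if lang:
        # Language code understood by the chosen engine, e.g. "hi" for easyocr.
        cmd += ["--ocr-lang", lang]
    cmd += ["--to", to, "--output", output_dir]
    return cmd


cmd = build_docling_command(
    "original_scanned_pdfs/hindu_scanned.pdf",
    engine="easyocr",
    output_dir="./markdown_output/",
    lang="hi",
)
# Print the command so it can be checked before it is run.
print(shlex.join(cmd))
# To run it from Python: subprocess.run(cmd, check=True)
```

&lt;p&gt;Keeping the command as a list means a file name containing spaces is passed to docling as a single argument.&lt;/p&gt;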

&lt;h3&gt;
  
  
  Now let's try other OCR engines
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. RapidOCR
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;As the name suggests, it is very fast.&lt;/li&gt;
&lt;li&gt;It converts documents into HTML or Markdown quickly.&lt;/li&gt;
&lt;li&gt;To use it, first install rapidocr using pip:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;rapidocr
pip show rapidocr &lt;span class="c"&gt;#to get the version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyg3jn3ibr1r4mo9tle6.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyg3jn3ibr1r4mo9tle6.jpeg" alt=" " width="800" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;But, as seen below, you must also install onnxruntime:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;onnxruntime
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxd5xjlo1v9es8p50q5qr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxd5xjlo1v9es8p50q5qr.png" alt=" " width="800" height="289"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Then run the command to get your output, e.g. for my Arabic PDF:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="c"&gt;# Docling CLI: force OCR with the rapidocr engine and save Markdown in the output folder&lt;/span&gt;
docling \
  &lt;span class="nt"&gt;--ocr&lt;/span&gt; \
  &lt;span class="nt"&gt;--force-ocr&lt;/span&gt; \
  &lt;span class="nt"&gt;--ocr-engine&lt;/span&gt; rapidocr \
  &lt;span class="nt"&gt;--to&lt;/span&gt; md \
  &lt;span class="nt"&gt;--output&lt;/span&gt; ./markdown_outputs_rapidocr \
  ./original_scanned_pdfs/arabic_scanned.pdf  &lt;span class="c"&gt;# Arabic PDF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Outputs&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xbxfve1pibaqqylmzoi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xbxfve1pibaqqylmzoi.png" alt=" " width="800" height="412"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  2. Tesseract
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Install the OCR engine:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
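&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf &lt;span class="nb"&gt;install &lt;/span&gt;tesseract  &lt;span class="c"&gt;# on Fedora; Tesseract is a system program, not a pip package&lt;/span&gt;
tesseract &lt;span class="nt"&gt;--version&lt;/span&gt;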
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpiumpzxzw88oo9rbthn8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpiumpzxzw88oo9rbthn8.png" alt=" " width="800" height="113"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tesseract requires a trained-data file for every language, e.g. for Arabic:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-L&lt;/span&gt; https://github.com/tesseract-ocr/tessdata/raw/main/ara.traineddata &lt;span class="nt"&gt;-o&lt;/span&gt; ~/Documents/docling_ocr/tessdata/ara.traineddata
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/oitfh2zzqi90stuwa0mo.png" alt=" "&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Then run the parsing command and save the output in a folder:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Docling CLI: force OCR with the tesseract engine and save Markdown in the output folder&lt;/span&gt;
docling \
  &lt;span class="nt"&gt;--ocr&lt;/span&gt; \
  &lt;span class="nt"&gt;--force-ocr&lt;/span&gt; \
  &lt;span class="nt"&gt;--ocr-engine&lt;/span&gt; tesseract \
  &lt;span class="nt"&gt;--to&lt;/span&gt; md \
  &lt;span class="nt"&gt;--output&lt;/span&gt; ./markdown_outputs_tesserocr \
  ./original_scanned_pdfs/arabic_scanned.pdf  &lt;span class="c"&gt;# Arabic PDF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here are the outputs;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmsm9ixunr2rkap1hq0f0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmsm9ixunr2rkap1hq0f0.png" alt=" " width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  3. Tesserocr OCR
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Install tesserocr and check the version:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;tesserocr
pip show tesserocr  &lt;span class="c"&gt;# to get the version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgqjonbsbiowiysdtmfr0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgqjonbsbiowiysdtmfr0.png" alt=" " width="800" height="165"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It requires a trained-data file for each language:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget -P /directory-to-where-you-want-to-store/tessdata \
  https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata \
  https://github.com/tesseract-ocr/tessdata/raw/main/swa.traineddata
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F254ccx6x0ijcsnuwaf5j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F254ccx6x0ijcsnuwaf5j.png" alt=" " width="800" height="393"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parse the documents and save the output in a tesserocr folder:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Docling CLI: force OCR with the tesserocr engine and save Markdown in the output folder&lt;/span&gt;
docling \
  &lt;span class="nt"&gt;--ocr&lt;/span&gt; \
  &lt;span class="nt"&gt;--force-ocr&lt;/span&gt; \
  &lt;span class="nt"&gt;--ocr-engine&lt;/span&gt; tesserocr \
  &lt;span class="nt"&gt;--to&lt;/span&gt; md \
  &lt;span class="nt"&gt;--output&lt;/span&gt; ./markdown_outputs_tesserocr \
  ./original_scanned_pdfs/arabic_scanned.pdf  &lt;span class="c"&gt;# Arabic PDF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Here's the output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3k69j6f7e37mwcpnotmm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3k69j6f7e37mwcpnotmm.png" alt=" " width="800" height="257"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Let's analyse these different OCR-engines.
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Easyocr
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;It is easy to use: you just install easyocr, run your parsing command, and that's it.&lt;/li&gt;
&lt;li&gt;However, it takes a lot of time to parse your document; it is really slow.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  2. Rapidocr
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;You have to install onnxruntime.&lt;/li&gt;
&lt;li&gt;As the name suggests, it is really fast, and installation is not complex at all.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  3. Tesseract
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;You have to install more packages, so the installation is more complex.&lt;/li&gt;
&lt;li&gt;For every language you must download its own trained-data file.&lt;/li&gt;
&lt;li&gt;It gives really good results.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  4. Tesserocr
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Uses tesseract internally.&lt;/li&gt;
&lt;li&gt;A bit complex when installing.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  5. Ocrmac
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Requires a Mac (macOS) to install.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fej4afzq4pwd21m77q915.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fej4afzq4pwd21m77q915.png" alt=" " width="800" height="409"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Result Findings&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Apart from some differing characters in the French document outputs, the OCR engines gave very similar results for the scanned PDFs.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;EasyOCR combines only the listed languages&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5asizbb2sd6d5te84bpg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5asizbb2sd6d5te84bpg.png" alt=" " width="800" height="172"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Using HTML output:&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F638dj9uf85wfhxxs91nn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F638dj9uf85wfhxxs91nn.png" alt=" " width="800" height="619"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>programming</category>
      <category>contributions</category>
      <category>fedora</category>
    </item>
    <item>
      <title>Wondering how to join outreachy? Look no further. A clear onboarding guide for outreachy application process.</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Tue, 24 Mar 2026 19:35:53 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/wondering-how-to-join-outreachy-look-no-further-a-clear-onboarding-guide-for-outreachy-internship-1775</link>
      <guid>https://dev.to/njeri_kimaru/wondering-how-to-join-outreachy-look-no-further-a-clear-onboarding-guide-for-outreachy-internship-1775</guid>
      <description>&lt;p&gt;&lt;strong&gt;What is outreachy&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://www.outreachy.org/" rel="noopener noreferrer"&gt;Outreachy&lt;/a&gt; is a community which provides internships in open source. &lt;br&gt;
It is a &lt;strong&gt;paid remote&lt;/strong&gt; internship.&lt;br&gt;
Outreachy provides internships to anyone from any background who faces underrepresentation, systemic bias, or discrimination in the technical industry where they are living.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is open source&lt;/strong&gt;&lt;br&gt;
Open source refers to software whose source code is publicly available for anyone to view, use, modify, and share.&lt;/p&gt;

&lt;h2&gt;
  
  
  Outreachy application process
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The initial process
&lt;/h3&gt;

&lt;p&gt;This is the first step in the Outreachy application process.&lt;br&gt;
Please check the initial application stage guidelines &lt;a href="https://www.outreachy.org/docs/applicant/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;br&gt;
Here are some tips &lt;strong&gt;I would give applicants for the December 2026 Outreachy round:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Apply early. There are a lot of applicants, so applying early increases your chances of getting your initial application approved.&lt;/li&gt;
&lt;li&gt;Have your &lt;a href="https://www.outreachy.org/docs/applicant/" rel="noopener noreferrer"&gt;four essays&lt;/a&gt; ready before the initial application stage opens. I wrote mine two weeks before the portal opened, so once it opened I just copied, pasted and submitted.&lt;/li&gt;
&lt;li&gt;When answering the essay questions, be clear and do not use AI. Be authentic and share your personal stories.&lt;/li&gt;
&lt;li&gt;Watch for the date the Outreachy initial stage opens. Outreachy usually has a &lt;a href="https://www.outreachy.org/docs/applicant/" rel="noopener noreferrer"&gt;timeline&lt;/a&gt; for when the initial application stage opens.
For the specific date, follow their social media accounts to get the information as soon as applications open.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. The contribution phase
&lt;/h3&gt;

&lt;p&gt;I am currently in my contribution phase (I am really hoping to join this amazing community), but so far I have the following tips:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start contributing early.&lt;/li&gt;
&lt;li&gt;Carefully look at the projects and choose a project that you are passionate about.&lt;/li&gt;
&lt;li&gt;There's a final application form, which you fill in only if you've made a contribution.&lt;/li&gt;
&lt;li&gt;Contribute as much as you can and also be active and help others.&lt;/li&gt;
&lt;li&gt;You can check out this &lt;a href="https://discussion.fedoraproject.org/t/why-i-got-selected-for-fedora-outreachy-and-what-might-help-you-too/155348" rel="noopener noreferrer"&gt;blog&lt;/a&gt; from a past outreachy intern.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. The intern selection phase.
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;This is the last stage, where interns are selected. Afterwards they start their three-month internship.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Have you been underrepresented in the tech industry?? Here's your chance to get involved with a community which supports diversity.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I'll also be posting more blogs about Outreachy.&lt;br&gt;
Look out for my next blog on frequently asked questions about Outreachy, and another one on how to stand out during the contribution phase.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>outreachy</category>
      <category>programming</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Fedora linux not fedora hats, a beginner's guide to fedora.</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Tue, 24 Mar 2026 08:29:24 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/fedora-linux-not-fedora-hats-a-beginners-guide-to-fedora-14nj</link>
      <guid>https://dev.to/njeri_kimaru/fedora-linux-not-fedora-hats-a-beginners-guide-to-fedora-14nj</guid>
      <description>&lt;h1&gt;
  
  
  What is fedora?
&lt;/h1&gt;

&lt;p&gt;When I mention Fedora, some might think I am referring to fedora hats 😂.&lt;br&gt;
Let me introduce you to Fedora Linux. This term might be new to you, and that's okay; initially, it was new to me too. Fedora is a free and open source operating system based on Linux. However, Fedora is more than just software: it's a community project which is open to everyone. &lt;br&gt;
In this article I'll give an introduction to the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Fedora project&lt;/li&gt;
&lt;li&gt;Fedora linux&lt;/li&gt;
&lt;li&gt;The Fedora community&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The fedora project.
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://fedoraproject.org/" rel="noopener noreferrer"&gt;The Fedora project&lt;/a&gt; is a global community of users and developers who collaborate to build Fedora Linux, an open-source operating system. What makes Fedora stand out? For one, it’s completely free and open-source, with new releases every six months and updates for 13 months. The project places strong emphasis on detailed documentation, ensuring users have clear guides on installation and usage. Unlike many Linux distributions, Fedora follows a liberal updates policy—balancing frequent improvements with minimal disruption. Backed by an active and diverse community, it evolves rapidly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fedora Linux.
&lt;/h2&gt;

&lt;p&gt;Fedora Linux is a free and open-source Linux distribution developed by the Fedora Project. It was originally developed in 2003 as a continuation of the Red Hat Linux project, and it aims to be on the leading edge of open-source technologies. It is now the upstream source for CentOS Stream and Red Hat Enterprise Linux.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fedora community
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://fedoraproject.org/workstation/community/" rel="noopener noreferrer"&gt;The Fedora community&lt;/a&gt; is an online community aimed at improving people's lives through free software. It was formed in 2003 as a partnership between Red Hat and volunteers from around the world, and has grown to tens of thousands of project members.&lt;br&gt;
Some of the Fedora community initiatives include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.fedoraproject.org/en-US/dei/events/fwd/about-fwd/" rel="noopener noreferrer"&gt;Fedora Week of Diversity (FWD)&lt;/a&gt;: An annual event celebrating the diverse individuals within the Fedora Community.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://gitlab.com/fedora/dei/stories" rel="noopener noreferrer"&gt;Contributor Stories&lt;/a&gt;: A recognition initiative that highlights individual contributors who have positively impacted others during their time in Fedora.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;If you want to get involved in Fedora, begin at the &lt;a href="https://fedoraproject.org/workstation/community/" rel="noopener noreferrer"&gt;Fedora Project website&lt;/a&gt;, which has resources for new contributors, including a &lt;a href="https://lists.fedoraproject.org/archives/list/fedora-join@lists.fedoraproject.org/" rel="noopener noreferrer"&gt;mailing list&lt;/a&gt;, forums, and chat channels for connecting with other Fedora enthusiasts.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Also look out for my next blog, where I'll be writing a guide on how to install Fedora and its packages using dnf.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>fedora</category>
      <category>linux</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>BAYESIAN AND FREQUENTISTS</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Thu, 16 Oct 2025 13:09:02 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/bayesian-and-frequentists-o98</link>
      <guid>https://dev.to/njeri_kimaru/bayesian-and-frequentists-o98</guid>
      <description>&lt;p&gt;Bayesian and frequentist are two different approaches to statistical inference, differing primarily in how they define and use probability to interpret uncertainty. The frequentist approach considers probability as the long-run frequency of an event and views population parameters as fixed but unknown. In contrast, the Bayesian approach treats probability as a degree of belief and considers parameters to be random variables that can be updated with new evidence using prior beliefs and observed data. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F88oh6ry61tprocdmtv2m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F88oh6ry61tprocdmtv2m.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Frequentist approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Probability:&lt;/strong&gt; Views probability as the long-run frequency of an event if an experiment were repeated many times.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Parameters:&lt;/strong&gt; Treats parameters of a model as fixed, but unknown, values.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Key output:&lt;/strong&gt; Focuses on estimating parameters based on the observed data, often using methods like maximum likelihood estimation. It provides a single best estimate for the parameter.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Uncertainty:&lt;/strong&gt; Quantifies uncertainty through confidence intervals, which describe the range that would contain the true parameter in a high percentage of repeated experiments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Example:&lt;/strong&gt; When testing a coin, the frequentist approach would ask, "What is the probability of getting this result, given a fair coin?" The probability is a property of the data, not the hypothesis itself.&lt;/li&gt;
&lt;/ul&gt;
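&lt;p&gt;The coin example can be worked through with the standard library alone. This is a minimal sketch on made-up data (60 heads in 100 flips is an assumption chosen for illustration): it computes the single best estimate, an approximate 95% confidence interval, and the probability of a result at least this extreme if the coin were fair.&lt;/p&gt;

```python
from math import comb, sqrt

# Hypothetical experiment: 60 heads in 100 flips.
n, heads = 100, 60

# Maximum likelihood estimate: the single best value for P(heads).
p_hat = heads / n

# Approximate 95% confidence interval (normal approximation).
margin = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
ci_low, ci_high = p_hat - margin, p_hat + margin

# Exact binomial test of "the coin is fair": probability of a result at least
# this extreme in either direction, assuming P(heads) = 0.5. Doubling the upper
# tail is valid here because the fair-coin distribution is symmetric and the
# observed count is above n / 2.
upper_tail = sum(comb(n, k) for k in range(heads, n + 1)) / 2 ** n
p_value = min(1.0, 2 * upper_tail)

print(f"estimate = {p_hat:.2f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
print(f"two-sided p-value under a fair coin = {p_value:.4f}")
```

&lt;p&gt;Note that the p-value is a statement about the data given the hypothesis; it is not the probability that the coin is fair.&lt;/p&gt;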

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0okzm1ueuyr5ujn0qkp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0okzm1ueuyr5ujn0qkp.png" alt=" " width="307" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Bayesian approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Probability:&lt;/strong&gt; Views probability as a degree of belief or certainty about an unknown event or parameter.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Parameters:&lt;/strong&gt; Treats parameters as random variables with their own probability distributions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Key output:&lt;/strong&gt; Updates the probability distribution of a parameter based on new evidence, combining prior beliefs with observed data through Bayes' theorem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Uncertainty:&lt;/strong&gt; Quantifies uncertainty through a posterior distribution, which is a probability distribution of the parameter after considering the data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Example:&lt;/strong&gt; When testing a coin, the Bayesian approach would ask, "What is the probability that the coin is biased, given the results of my experiment?" It starts with a prior belief about the coin and updates it with each flip.&lt;/li&gt;
&lt;/ul&gt;
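&lt;p&gt;Here is a small sketch of that updating process, again using only the standard library. The data (60 heads, 40 tails) and the uniform Beta(1, 1) prior are assumptions made for illustration. With a Beta prior, each flip updates the belief by simply adding one to a counter, and a grid approximation then gives the probability that the coin favours heads.&lt;/p&gt;

```python
# Hypothetical data: 1 = heads, 0 = tails (60 heads, 40 tails).
flips = [1] * 60 + [0] * 40

# Prior belief Beta(1, 1): every bias value is equally plausible (an assumption).
a, b = 1, 1
for flip in flips:
    # Conjugate update: each head adds 1 to a, each tail adds 1 to b.
    a += flip
    b += 1 - flip

posterior_mean = a / (a + b)

# Grid approximation of P(theta > 0.5 | data) under the Beta(a, b) posterior.
steps = 10000
grid = [(i + 0.5) / steps for i in range(steps)]
weights = [t ** (a - 1) * (1 - t) ** (b - 1) for t in grid]
prob_favours_heads = sum(w for t, w in zip(grid, weights) if t > 0.5) / sum(weights)

print(f"posterior: Beta({a}, {b}), mean = {posterior_mean:.3f}")
print(f"P(theta > 0.5 | data) = {prob_favours_heads:.3f}")
```

&lt;p&gt;Unlike a p-value, this result is a direct probability statement about the parameter, and it would change if a different prior were chosen.&lt;/p&gt;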

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkllq00atax7fzfv8jqim.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkllq00atax7fzfv8jqim.png" alt=" " width="696" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>machinelearning</category>
      <category>api</category>
    </item>
    <item>
      <title>ANOVA</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Thu, 16 Oct 2025 12:37:45 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/anova-4jlj</link>
      <guid>https://dev.to/njeri_kimaru/anova-4jlj</guid>
      <description>&lt;h3&gt;
  
  
  Types of ANOVA
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;There are 3 main types of ANOVA, depending on the number of independent variables and interactions involved:&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  One-Way ANOVA (with only two groups it is equivalent to a two-sample t-test)
&lt;/h4&gt;

&lt;p&gt;What it compares: One independent variable (factor) with 2 or more groups.&lt;/p&gt;

&lt;p&gt;Example: Comparing test scores between 3 teaching methods.&lt;/p&gt;

&lt;p&gt;Assumption: Groups are independent and data is normally distributed.&lt;/p&gt;

&lt;p&gt;Python function: scipy.stats.f_oneway()&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from scipy.stats import f_oneway

# One list of observations per group (three campaigns)
campaign_A = [12, 15, 14, 10, 13, 15, 11, 14, 13, 16]
campaign_B = [18, 17, 16, 15, 20, 19, 18, 16, 17, 19]
campaign_C = [10, 9, 11, 10, 12, 9, 11, 8, 10, 9]

# Null hypothesis: all three campaigns have the same mean
f_stats, p_value = f_oneway(campaign_A, campaign_B, campaign_C)
print(f_stats, p_value)
alpha = 0.05  # significance level

if p_value &amp;lt; alpha:
    print("reject the null hypothesis")
else:
    print("fail to reject null hypothesis")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs1c069dbxgba10tw055u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs1c069dbxgba10tw055u.png" alt=" " width="782" height="687"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Two-Way ANOVA
&lt;/h4&gt;

&lt;p&gt;What it compares: Two independent variables, possibly with interaction.&lt;/p&gt;

&lt;p&gt;Example: Test scores by teaching method and gender (2 factors).&lt;/p&gt;

&lt;p&gt;You can also test: Interaction effect — whether the effect of one factor depends on the other.&lt;/p&gt;

&lt;p&gt;📍 Usually implemented using statsmodels with a formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
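&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from statsmodels.formula.api import ols
import statsmodels.api as sm

# df is assumed to be a pandas DataFrame with columns Score, Method and Gender
# C(...) marks a categorical factor; C(Method):C(Gender) is the interaction term
model = ols('Score ~ C(Method) + C(Gender) + C(Method):C(Gender)', data=df).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)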
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wxgvc8hvi10jenwktsr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wxgvc8hvi10jenwktsr.png" alt=" " width="224" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Repeated Measures ANOVA
&lt;/h4&gt;

&lt;p&gt;What it compares: The same subjects measured under different conditions or at different times.&lt;/p&gt;

&lt;p&gt;Example: Blood pressure before, during, and after treatment in the same patients.&lt;/p&gt;

&lt;p&gt;Use when: The observations are not independent, i.e., there are repeated measures from the same subjects.&lt;/p&gt;

&lt;p&gt;Python: statsmodels (&lt;code&gt;AnovaRM&lt;/code&gt;) or the pingouin library (&lt;code&gt;rm_anova&lt;/code&gt;).&lt;/p&gt;
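
&lt;p&gt;To see why the same-subjects design matters, here is a minimal by-hand sketch of the calculation. The blood pressure readings are made-up illustration values, and the variable names are my own. In practice you would let statsmodels or pingouin do this and also report a p-value.&lt;/p&gt;

```python
# Hypothetical blood pressure readings: one row per patient,
# columns are before, during and after treatment (made-up values).
readings = [
    [140, 135, 130],
    [150, 142, 138],
    [135, 132, 128],
    [145, 140, 133],
]

n_subjects = len(readings)
n_conditions = len(readings[0])
grand_mean = sum(sum(row) for row in readings) / (n_subjects * n_conditions)

# Mean of each condition (column) and of each subject (row)
condition_means = [sum(col) / n_subjects for col in zip(*readings)]
subject_means = [sum(row) / n_conditions for row in readings]

# Partition the total variation
ss_total = sum((x - grand_mean) ** 2 for row in readings for x in row)
ss_conditions = n_subjects * sum((m - grand_mean) ** 2 for m in condition_means)
ss_subjects = n_conditions * sum((m - grand_mean) ** 2 for m in subject_means)
# Differences between subjects are removed from the error term
ss_error = ss_total - ss_conditions - ss_subjects

df_conditions = n_conditions - 1
df_error = (n_conditions - 1) * (n_subjects - 1)
f_stat = (ss_conditions / df_conditions) / (ss_error / df_error)

print("F =", round(f_stat, 2), "with df =", (df_conditions, df_error))
```

&lt;p&gt;Because each patient acts as their own control, the patient-to-patient differences are subtracted out before the F ratio is formed, which is what separates this test from a one-way ANOVA on the same numbers.&lt;/p&gt;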

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3v7gh00bmc4g934905jk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3v7gh00bmc4g934905jk.png" alt=" " width="319" height="158"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>anova</category>
      <category>statistics</category>
    </item>
    <item>
      <title>PARAMETRIC AND NON-PARAMETRIC TESTS</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Thu, 09 Oct 2025 12:43:34 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/parametric-and-non-parametric-tests-4k1a</link>
      <guid>https://dev.to/njeri_kimaru/parametric-and-non-parametric-tests-4k1a</guid>
      <description>&lt;h2&gt;
  
  
  Parametric Tests
&lt;/h2&gt;

&lt;p&gt;Parametric tests assume that your data follows a specific distribution, usually a normal distribution (bell-shaped curve).&lt;/p&gt;

&lt;h4&gt;
  
  
  Key assumptions:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The data is normally distributed&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The sample size is large enough (a common rule of thumb is at least 30 observations per group)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data is measured on interval or ratio scale&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Homogeneity of variance (similar spread in groups)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Examples:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;t-test -- Compare means between 2 groups&lt;/li&gt;
&lt;li&gt;ANOVA -- Compare means across 3+ groups&lt;/li&gt;
&lt;li&gt;Pearson correlation -- Linear relationship between two continuous variables&lt;/li&gt;
&lt;li&gt;Linear regression -- Predict an outcome from one or more predictors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj0d2zsyn0cwuebudzthl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj0d2zsyn0cwuebudzthl.png" alt=" " width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Non-Parametric Tests
&lt;/h2&gt;

&lt;p&gt;Non-parametric tests don’t assume any specific distribution of the data. They are more flexible, especially for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Skewed data&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ordinal data&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Small sample sizes&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Examples:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Mann–Whitney U test -- Non-parametric alternative to t-test&lt;/li&gt;
&lt;li&gt;Kruskal–Wallis test -- Alternative to ANOVA&lt;/li&gt;
&lt;li&gt;Wilcoxon signed-rank -- Paired samples (like paired t-test)&lt;/li&gt;
&lt;li&gt;Spearman correlation -- Non-parametric correlation&lt;/li&gt;
&lt;li&gt;Chi-square test -- Categorical data (e.g., frequencies)&lt;/li&gt;
&lt;/ul&gt;
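
&lt;p&gt;One way to see what "no distribution assumption" means in practice: rank-based statistics only look at the ordering of the values. The sketch below computes the Mann–Whitney U statistic by hand on made-up numbers (the function name and the data are my own, and it does not compute a p-value; &lt;code&gt;scipy.stats.mannwhitneyu&lt;/code&gt; does that for you).&lt;/p&gt;

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: pairs where a beats b, ties count as half."""
    wins = sum(1 for a in sample_a for b in sample_b if a > b)
    ties = sum(1 for a in sample_a for b in sample_b if a == b)
    return wins + 0.5 * ties

# Made-up measurements for two groups
group_a = [12, 15, 14, 10, 18]
group_b = [22, 25, 17, 24, 30]
# Same as group_b, but the largest value is replaced by an extreme outlier
group_b_outlier = [22, 25, 17, 24, 3000]

u_original = mann_whitney_u(group_a, group_b)
u_with_outlier = mann_whitney_u(group_a, group_b_outlier)

print("U without outlier:", u_original)
print("U with outlier:   ", u_with_outlier)
print("Mean of group_b:  ", sum(group_b) / len(group_b))
print("Mean with outlier:", sum(group_b_outlier) / len(group_b_outlier))
```

&lt;p&gt;The outlier changes the group mean dramatically but leaves U untouched, which is why rank-based tests are a safer choice for skewed data.&lt;/p&gt;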

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxjyvolg3o3csancxyfs6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxjyvolg3o3csancxyfs6.png" alt=" " width="782" height="687"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjhwni4tctdcz7zalsgg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjhwni4tctdcz7zalsgg.png" alt=" " width="800" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>datascience</category>
      <category>statistics</category>
    </item>
    <item>
      <title>Degrees of Freedom and their role in Statistics</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Thu, 02 Oct 2025 11:26:56 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/degrees-of-freedom-and-their-role-in-statistics-1e9e</link>
      <guid>https://dev.to/njeri_kimaru/degrees-of-freedom-and-their-role-in-statistics-1e9e</guid>
      <description>&lt;p&gt;You’ve seen it in t-tests, chi-square, and regression: Degrees of Freedom (DoF). But what does it really mean?&lt;/p&gt;

&lt;h3&gt;
  
  
  🎓 A Simple Definition
&lt;/h3&gt;

&lt;p&gt;Degrees of Freedom refers to the number of independent values in a calculation that are free to vary.&lt;/p&gt;

&lt;p&gt;Imagine you have 3 numbers that must add up to 100. If you choose two freely, the third is fixed. So you have 2 degrees of freedom.&lt;/p&gt;
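
&lt;p&gt;The same idea as a tiny sketch. The two freely chosen numbers are arbitrary picks of mine; any other pair works the same way.&lt;/p&gt;

```python
total = 100
free_values = [30, 45]                    # any two numbers we like (arbitrary picks)
forced_value = total - sum(free_values)   # the third value has no freedom left

values = free_values + [forced_value]
degrees_of_freedom = len(values) - 1      # one constraint (the fixed sum) costs one DoF

print(values, "sum to", sum(values))
print("Degrees of freedom:", degrees_of_freedom)
```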

&lt;h5&gt;
  
  
  🧮 Why Do They Matter?
&lt;/h5&gt;

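&lt;p&gt;In statistics, DoF account for the fact that we estimate parameters (like the mean) from the same sample data. Each estimated parameter uses up one degree of freedom, which is why the sample variance divides by N - 1 instead of N.&lt;/p&gt;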

&lt;h4&gt;
  
  
  Examples:
&lt;/h4&gt;

&lt;h4&gt;
  
  
  1. Sample Variance
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# population variance (DoF = N)
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ddof&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# sample variance (DoF = N-1)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
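
&lt;p&gt;To make the division by N versus N - 1 visible, here is the same calculation written out by hand, as a sketch using only the standard library. The variable names are my own.&lt;/p&gt;

```python
import math
import statistics

data = [4, 7, 9]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the sample mean
squared_deviations = sum((x - mean) ** 2 for x in data)

population_variance = squared_deviations / n        # divide by N
sample_variance = squared_deviations / (n - 1)      # divide by N - 1

# Cross-check against the standard library implementations
print(math.isclose(population_variance, statistics.pvariance(data)))
print(math.isclose(sample_variance, statistics.variance(data)))
```

&lt;p&gt;The mean was estimated from the same three numbers, so only two deviations are free to vary. Dividing by N - 1 compensates for that.&lt;/p&gt;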



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faeyz09bnl641rcvxqphg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faeyz09bnl641rcvxqphg.png" alt=" " width="800" height="556"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>python</category>
      <category>datascience</category>
      <category>statistics</category>
    </item>
    <item>
      <title>Turning Two Lists into a Dictionary</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Thu, 02 Oct 2025 11:25:03 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/turning-two-lists-into-a-dictionary-using-comprehension-32a6</link>
      <guid>https://dev.to/njeri_kimaru/turning-two-lists-into-a-dictionary-using-comprehension-32a6</guid>
      <description>&lt;p&gt;Ever had two lists — one with keys and one with values — and wondered how to merge them into a dictionary? &lt;/p&gt;

&lt;h3&gt;
  
  
  Method 1: Dictionary comprehension
&lt;/h3&gt;

&lt;p&gt;Let’s say you have:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;people = [&lt;br&gt;
    "Alice", "Bob", "Charlie", "Diana", "Ethan",&lt;br&gt;
    "Fiona", "George", "Hannah", "Isaac", "Julia",&lt;br&gt;
    "Kevin", "Laura", "Michael", "Nina", "Oscar"&lt;br&gt;
]&lt;br&gt;
heights = [&lt;br&gt;
    165, 178, 172, 160, 185,&lt;br&gt;
    170, 182, 158, 174, 169,&lt;br&gt;
    180, 162, 176, 168, 181&lt;br&gt;
]&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;Now turn the two lists into a dictionary using a dictionary comprehension:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;people_heights = {people[i]: heights[i] for i in range(len(people))}&lt;br&gt;
people_heights&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;Outcome:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;{'Alice': 165,&lt;br&gt;
 'Bob': 178,&lt;br&gt;
 'Charlie': 172,&lt;br&gt;
 'Diana': 160,&lt;br&gt;
 'Ethan': 185,&lt;br&gt;
 'Fiona': 170,&lt;br&gt;
 'George': 182,&lt;br&gt;
 'Hannah': 158,&lt;br&gt;
 'Isaac': 174,&lt;br&gt;
 'Julia': 169,&lt;br&gt;
 'Kevin': 180,&lt;br&gt;
 'Laura': 162,&lt;br&gt;
 'Michael': 176,&lt;br&gt;
 'Nina': 168,&lt;br&gt;
 'Oscar': 181}&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  Method 2: Using zip
&lt;/h3&gt;

&lt;p&gt;Let's say you have:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pumpkin = ["a","b","c","d","e","f"]&lt;br&gt;
weights = [19,14,15,9,10,17]&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;Now combine the lists into a dictionary with &lt;code&gt;zip&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pumpkin_dict = dict(zip(pumpkin,weights))&lt;br&gt;
pumpkin_dict&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;Outcome:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;{'a': 19, 'b': 14, 'c': 15, 'd': 9, 'e': 10, 'f': 17}&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;
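
&lt;p&gt;One thing to watch for: &lt;code&gt;zip&lt;/code&gt; stops at the shorter list, so a missing value is dropped silently. The sketch below shows that behaviour and one possible guard. The helper name &lt;code&gt;dict_from_lists&lt;/code&gt; is hypothetical, and the shortened lists are made up for the example. On Python 3.10 and later, &lt;code&gt;zip(keys, values, strict=True)&lt;/code&gt; gives a similar check.&lt;/p&gt;

```python
def dict_from_lists(keys, values):
    """Hypothetical helper: refuse to pair lists of different lengths."""
    if len(keys) != len(values):
        raise ValueError(f"{len(keys)} keys but {len(values)} values")
    return dict(zip(keys, values))

pumpkin = ["a", "b", "c"]
weights = [19, 14]  # one weight is missing

# Plain zip pairs what it can and silently drops "c"
truncated = dict(zip(pumpkin, weights))
print(truncated)

# The guarded version reports the mismatch instead
try:
    dict_from_lists(pumpkin, weights)
except ValueError as error:
    print("Refused:", error)
```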

</description>
      <category>python</category>
      <category>statistics</category>
      <category>datascience</category>
      <category>programming</category>
    </item>
    <item>
      <title>List Comprehension vs. Dictionary Comprehension</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Thu, 02 Oct 2025 11:23:28 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/list-comprehension-vs-dictionary-comprehension-in-python-56dj</link>
      <guid>https://dev.to/njeri_kimaru/list-comprehension-vs-dictionary-comprehension-in-python-56dj</guid>
      <description>&lt;h3&gt;
  
  
  Differences between list and dictionary comprehensions in Python
&lt;/h3&gt;

&lt;p&gt;Python makes it easy to write clean and compact code using comprehensions. But what's the difference between list comprehension and dictionary comprehension?&lt;/p&gt;

&lt;h3&gt;
  
  
  📝 List Comprehension
&lt;/h3&gt;

&lt;p&gt;Used to build lists from iterables.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;squares = [x**2 for x in range(5)]&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;print(squares)  # [0, 1, 4, 9, 16]&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cb501iztv4dk33h9j9e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cb501iztv4dk33h9j9e.png" alt=" " width="324" height="156"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Dictionary Comprehension
&lt;/h3&gt;

&lt;p&gt;Used to create a dictionary by applying expressions to generate keys and values.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;squares_dict = {x: x**2 for x in range(5)}&lt;br&gt;
print(squares_dict)&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;With a condition:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;even_squares_dict = {x: x**2 for x in range(10) if x % 2 == 0}&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;{0: 0, 2: 4, 4: 16, 6: 36, 8: 64}&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;
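
&lt;p&gt;To put the two side by side, here is a small sketch that feeds the same input to both kinds of comprehension. The word list is made up for illustration.&lt;/p&gt;

```python
words = ["apple", "kiwi", "banana", "kiwi"]

# List comprehension: one result per input item, order and duplicates are kept
lengths_list = [len(word) for word in words]

# Dictionary comprehension: one entry per unique key, a repeated key overwrites the earlier one
lengths_dict = {word: len(word) for word in words}

print(lengths_list)
print(lengths_dict)
print(lengths_list[1], lengths_dict["kiwi"])  # look up by position vs. by key
```

&lt;p&gt;Reach for a list when position matters, and for a dictionary when you want to look results up by name.&lt;/p&gt;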

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flaljs4uyhstqe8zyaljm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flaljs4uyhstqe8zyaljm.png" alt=" " width="301" height="167"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
      <category>beginners</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Understanding Skewness and Kurtosis: A Friendly Guide for Data Enthusiasts</title>
      <dc:creator>Njeri Kimaru</dc:creator>
      <pubDate>Thu, 02 Oct 2025 11:21:38 +0000</pubDate>
      <link>https://dev.to/njeri_kimaru/understanding-skewness-and-kurtosis-a-friendly-guide-for-data-enthusiasts-5foe</link>
      <guid>https://dev.to/njeri_kimaru/understanding-skewness-and-kurtosis-a-friendly-guide-for-data-enthusiasts-5foe</guid>
      <description>&lt;p&gt;If you've ever looked at a histogram and thought, "Hmm... this looks weirdly stretched or tilted," you're not alone. What you're noticing might be &lt;strong&gt;skewness&lt;/strong&gt; or &lt;strong&gt;kurtosis&lt;/strong&gt; — two important concepts in statistics that describe the shape of a distribution.&lt;/p&gt;

&lt;h2&gt;
  
  
  📈 What is Skewness?
&lt;/h2&gt;

&lt;p&gt;Skewness tells us about the asymmetry of a distribution.&lt;/p&gt;

&lt;h4&gt;
  
  
  Positively skewed:
&lt;/h4&gt;

&lt;p&gt;The tail stretches further to the right. The mean is usually greater than the median (Mean &amp;gt; Median).&lt;/p&gt;

&lt;h4&gt;
  
  
  Negatively skewed:
&lt;/h4&gt;

&lt;p&gt;The tail stretches further to the left. The mean is usually less than the median (Mean &amp;lt; Median).&lt;/p&gt;

&lt;h4&gt;
  
  
  Zero skewness:
&lt;/h4&gt;

&lt;p&gt;Perfectly symmetrical (like a normal distribution).&lt;/p&gt;

&lt;h4&gt;
  
  
  💡 Example in Python:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import scipy.stats as stats
import numpy as np
data = np.random.exponential(scale=2, size=1000)
print("Skewness:", stats.skew(data))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
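
&lt;p&gt;The link between skew direction and the mean/median ordering can also be checked on a tiny dataset without any libraries beyond the standard one. This is a sketch with made-up numbers, using the simple moment-based definition of skewness; the function name is my own.&lt;/p&gt;

```python
import statistics

def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]       # one large value pulls the tail right
left_skewed = [-x for x in right_skewed]       # mirror image, tail on the left

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(
        name,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "skewness:", round(skewness(data), 2),
    )
```

&lt;p&gt;The single extreme value drags the mean toward the tail while the median barely moves, which is where the Mean vs. Median rule comes from.&lt;/p&gt;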



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwt6c963arw3ynntcul0m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwt6c963arw3ynntcul0m.png" alt=" " width="600" height="227"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📊 What is Kurtosis?
&lt;/h2&gt;

&lt;p&gt;Kurtosis is a statistical measure that describes the “tailedness” of a distribution: how heavy or light the tails of your data are compared to a normal distribution.&lt;/p&gt;

&lt;h3&gt;
  
  
  Types of Kurtosis:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Mesokurtic
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Normal distribution (reference standard)&lt;/li&gt;
&lt;li&gt;Kurtosis is equal to 3&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Leptokurtic
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Heavy tails (more extreme outliers); often drawn with a sharper peak&lt;/li&gt;
&lt;li&gt;Kurtosis is greater than 3&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Platykurtic
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Light tails (fewer extreme outliers); often drawn with a flatter, wider peak&lt;/li&gt;
&lt;li&gt;Kurtosis is less than 3&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn1d6m5xoxnpnr1qycdi.png" alt=" " width="270" height="148"&gt;&lt;/p&gt;

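&lt;p&gt;🔹 In Python, &lt;code&gt;scipy.stats.kurtosis()&lt;/code&gt; subtracts 3 by default, so a normal distribution scores 0. This is called excess kurtosis. Pass &lt;code&gt;fisher=False&lt;/code&gt; to get the unadjusted value, where a normal distribution scores 3.&lt;/p&gt;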
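
&lt;p&gt;Here is a minimal sketch of both numbers computed by hand, so the "subtract 3" step is explicit. The two datasets are made up to show a light-tailed and a heavy-tailed case, and the function name is my own. It uses plain population moments, which is my understanding of what scipy does under its default settings.&lt;/p&gt;

```python
def kurtosis(values):
    """Raw (Pearson) kurtosis: fourth central moment over squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2

flat = list(range(1, 11))        # evenly spread values, light tails
heavy = [0] * 18 + [-10, 10]     # mostly identical values plus two extremes

for name, data in [("flat", flat), ("heavy", heavy)]:
    raw = kurtosis(data)
    # Excess kurtosis is the raw value minus 3, the normal distribution's kurtosis
    print(name, "kurtosis:", round(raw, 2), "excess:", round(raw - 3, 2))
```

&lt;p&gt;The flat data comes out below 3 (platykurtic) and the heavy-tailed data well above 3 (leptokurtic).&lt;/p&gt;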

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgla0werj7xu3pixo64rn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgla0werj7xu3pixo64rn.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>statistics</category>
      <category>datascience</category>
      <category>python</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
