<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Amadou Wolfgang Cisse</title>
    <description>The latest articles on DEV Community by Amadou Wolfgang Cisse (@amadou6e).</description>
    <link>https://dev.to/amadou6e</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2955539%2Fbb2a9df1-7af9-4658-8aea-14b7e3e97edd.png</url>
      <title>DEV Community: Amadou Wolfgang Cisse</title>
      <link>https://dev.to/amadou6e</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/amadou6e"/>
    <language>en</language>
    <item>
      <title>Simplifying Programmatic Database Handling</title>
      <dc:creator>Amadou Wolfgang Cisse</dc:creator>
      <pubDate>Sat, 31 May 2025 11:06:34 +0000</pubDate>
      <link>https://dev.to/amadou6e/simplifying-programmatic-database-handling-p7l</link>
      <guid>https://dev.to/amadou6e/simplifying-programmatic-database-handling-p7l</guid>
      <description>&lt;p&gt;If you’ve ever worked on a backend or data-centric project in a small team, chances are you’ve hit the same wall I did: setting up local databases consistently, reliably, and without friction.&lt;/p&gt;

&lt;p&gt;It’s a deceptively simple task. And yet, it’s where many projects start to feel fragile.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: The Local Database Setup Spiral
&lt;/h2&gt;

&lt;p&gt;Picture this: you're building a microservice architecture that talks to a PostgreSQL database. You’re working with two other developers. You write a quick &lt;code&gt;README&lt;/code&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Make sure you have Postgres installed. Create a user, a password, a database. Import this SQL script. Use port 5432 unless it’s taken.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You think it’s fine. Then the pull requests start rolling in with bugs that don’t make sense. Someone’s DB is misconfigured. Someone else forgot to run the schema script. Another person installed the wrong version of Postgres on Windows and it won’t even start.&lt;/p&gt;

&lt;p&gt;And when you try to onboard a new teammate? If it's been a while since anyone performed the setup, it can quickly turn into a full afternoon of troubleshooting.&lt;/p&gt;

&lt;p&gt;Local database setup is deceptively expensive. It introduces variance into your dev environments and bakes hidden assumptions into your codebase.&lt;/p&gt;

&lt;p&gt;Even with Docker, it's rarely elegant. You might end up with a mess of &lt;code&gt;docker-compose&lt;/code&gt; files, environment variables, half-broken shell scripts, and manual volume mounts that no one dares touch.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Real Example: Automating WG-Gesucht Notifications
&lt;/h2&gt;

&lt;p&gt;In my case, this hit home during development of a bot that scrapes listings from &lt;strong&gt;WG-Gesucht&lt;/strong&gt; (a German apartment-sharing site) and automatically alerts users based on their preferences. It’s built as a collection of microservices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A scraper service (pulls data and stores in Postgres),&lt;/li&gt;
&lt;li&gt;A vector search module using pgvector (to recommend listings),&lt;/li&gt;
&lt;li&gt;A notification dispatcher (integrates with email/Telegram).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each service uses a local DB during development and testing. I needed to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quickly spin up Postgres and MySQL instances with test data,&lt;/li&gt;
&lt;li&gt;Run init scripts and seed content from Python,&lt;/li&gt;
&lt;li&gt;Share configurations with collaborators using Jupyter notebooks,&lt;/li&gt;
&lt;li&gt;Avoid any OS-specific setup pain.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Docker is an obvious choice—but I didn’t want my dev flow to depend on Docker CLI commands buried in scripts. I wanted everything runnable in Python, so it could live side-by-side with my logic and be testable, restartable, and explicit.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Right Solution Should Look Like
&lt;/h2&gt;

&lt;p&gt;At this point, I had some clear goals in mind for a better approach to local databases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Python-first interface&lt;/strong&gt;: no shell scripts, no &lt;code&gt;docker-compose.yml&lt;/code&gt;, no Makefiles.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimal dependencies&lt;/strong&gt;: no installing client tools or external setup scripts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-platform&lt;/strong&gt;: works on macOS, Linux, and Windows (even WSL).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supports init scripts and volumes&lt;/strong&gt;: I want to seed data or persist it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clear lifecycle control&lt;/strong&gt;: I want to &lt;code&gt;.start_db()&lt;/code&gt;, &lt;code&gt;.stop_db()&lt;/code&gt;, &lt;code&gt;.delete_db()&lt;/code&gt; like any Python object.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the tool I wish existed from the start. So I built it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing &lt;code&gt;py-dockerdb&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://pypi.org/project/py-dockerdb" rel="noopener noreferrer"&gt;&lt;code&gt;py-dockerdb&lt;/code&gt;&lt;/a&gt; is a Python library that lets you manage real Dockerized databases like native Python objects. Visit the project on &lt;a href="https://github.com/amadou-6e/py-dockerdb/tree/main" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; for more info and usage examples!&lt;/p&gt;

&lt;p&gt;With just a few lines of code, you can spin up Postgres, MongoDB, MySQL, or SQL Server, inject init scripts, connect with familiar Python drivers—and tear them down cleanly when done.&lt;/p&gt;

&lt;p&gt;No shell, no YAML, no guesswork.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docker_db.postgres_db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PostgresConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PostgresDB&lt;/span&gt;

&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PostgresConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;botuser&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;botpass&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wggesucht_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;container_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;local-postgres&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PostgresDB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;
&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT COUNT(*) FROM listings;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchone&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs an actual PostgreSQL instance in Docker, ready for interaction and controlled entirely from Python.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;Every database type has two classes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;code&gt;Config&lt;/code&gt; class that defines connection settings and init behavior.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;DB&lt;/code&gt; class that manages lifecycle: &lt;code&gt;create_db()&lt;/code&gt;, &lt;code&gt;stop_db()&lt;/code&gt;, &lt;code&gt;delete_db()&lt;/code&gt;, &lt;code&gt;restart_db()&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It supports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Init scripts&lt;/strong&gt;: SQL, JS, or SH, depending on the DB engine.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Volume persistence&lt;/strong&gt;: so you can reuse data across runs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment injection&lt;/strong&gt;: useful for script templating.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native drivers&lt;/strong&gt;: &lt;code&gt;psycopg2&lt;/code&gt;, &lt;code&gt;pymongo&lt;/code&gt;, &lt;code&gt;pyodbc&lt;/code&gt;, &lt;code&gt;mysql-connector&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All you need is Python 3.7+ and a running Docker daemon on the host machine.&lt;/p&gt;
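&lt;p&gt;Because the lifecycle is just method calls (&lt;code&gt;create_db()&lt;/code&gt;, &lt;code&gt;delete_db()&lt;/code&gt;), it composes naturally with Python's context-manager protocol. Here's a small wrapper sketch of my own (not part of &lt;code&gt;py-dockerdb&lt;/code&gt;) that guarantees teardown even when tests fail; the demo uses a stand-in object, but with the library you'd pass a &lt;code&gt;PostgresDB(config)&lt;/code&gt; instance:&lt;/p&gt;

```python
from contextlib import contextmanager

@contextmanager
def temporary_db(db):
    """Create a database on entry and always delete it on exit.
    `db` is any object exposing create_db()/delete_db(), such as a
    py-dockerdb PostgresDB instance (illustrative sketch, not library code)."""
    db.create_db()
    try:
        yield db
    finally:
        db.delete_db()

# Demo with a stand-in object so the pattern is visible without Docker.
class FakeDB:
    def __init__(self):
        self.events = []
    def create_db(self):
        self.events.append("created")
    def delete_db(self):
        self.events.append("deleted")

fake = FakeDB()
with temporary_db(fake):
    fake.events.append("used")
print(fake.events)  # ['created', 'used', 'deleted']
```

&lt;p&gt;The &lt;code&gt;finally&lt;/code&gt; block is the point: the container is removed even if the code inside the &lt;code&gt;with&lt;/code&gt; raises, which keeps CI runs from accumulating orphaned containers.&lt;/p&gt;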

&lt;h2&gt;
  
  
  What You Can Use It For
&lt;/h2&gt;

&lt;p&gt;Here are some use cases I’ve explored or seen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data science notebooks&lt;/strong&gt; with SQL backends that boot on demand.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI testing environments&lt;/strong&gt; that require disposable database containers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Teaching&lt;/strong&gt; SQL or NoSQL without asking students to install anything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Microservice development&lt;/strong&gt; with predictable, isolated DB instances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rapid prototyping&lt;/strong&gt; for apps that need seeded data on day one.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Philosophy
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;py-dockerdb&lt;/code&gt; is intentionally &lt;strong&gt;minimal&lt;/strong&gt;. It’s not trying to replace &lt;code&gt;docker-compose&lt;/code&gt; for full-stack orchestration. It doesn’t scaffold services or guess your intentions.&lt;/p&gt;

&lt;p&gt;Instead, it focuses on one thing: &lt;strong&gt;let you control local databases, entirely from Python&lt;/strong&gt;, using real Docker containers.&lt;/p&gt;

&lt;p&gt;No DSLs. No hidden automagic. Just code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Supported Databases
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;PostgreSQL&lt;/li&gt;
&lt;li&gt;MySQL&lt;/li&gt;
&lt;li&gt;MongoDB&lt;/li&gt;
&lt;li&gt;Microsoft SQL Server&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And Cassandra is on the roadmap.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;Just install it from PyPI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;py-dockerdb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And you're ready to go.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Working with databases locally shouldn’t be an afterthought. It’s one of the most repeated steps in any backend, data, or devops workflow. Yet we still hand-wave it away with vague instructions and flaky setup scripts.&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;py-dockerdb&lt;/code&gt;, you can keep database setup alongside your logic—reproducible, isolated, and inspectable.&lt;/p&gt;

&lt;p&gt;Your teammates (and future self) will thank you.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>database</category>
      <category>vectordatabase</category>
      <category>python</category>
    </item>
    <item>
      <title>Minifying Tables with pymtd2json: Boosting Efficiency in RAG Systems</title>
      <dc:creator>Amadou Wolfgang Cisse</dc:creator>
      <pubDate>Sun, 27 Apr 2025 13:43:38 +0000</pubDate>
      <link>https://dev.to/amadou6e/minifying-tables-with-pymtd2json-boosting-efficiency-in-rag-systems-150k</link>
      <guid>https://dev.to/amadou6e/minifying-tables-with-pymtd2json-boosting-efficiency-in-rag-systems-150k</guid>
      <description>&lt;p&gt;In retrieval-augmented generation (RAG) pipelines, input efficiency is paramount, not just in terms of tokens, but also character limits&lt;/p&gt;

&lt;p&gt;When building a multilingual embedding pipeline, I faced a real challenge:&lt;br&gt;&lt;br&gt;
the Cohere multilingual model imposes a maximum of 2048 characters per input, not a token limit.&lt;/p&gt;

&lt;p&gt;This article walks you through a clever solution:&lt;br&gt;&lt;br&gt;
preprocessing Markdown tables into dense JSON blocks using &lt;a href="https://github.com/amadou-6e/pymdt2json" rel="noopener noreferrer"&gt;&lt;code&gt;pymtd2json&lt;/code&gt;&lt;/a&gt;, to ensure smooth, efficient embeddings without errors.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Challenge: Character Limits vs Token Limits
&lt;/h2&gt;

&lt;p&gt;Classical chunking methods, like &lt;code&gt;SentenceSplitter&lt;/code&gt; from LlamaIndex, are token-focused:&lt;br&gt;&lt;br&gt;
you set a maximum number of tokens per chunk, but not characters.&lt;/p&gt;
&lt;h3&gt;
  
  
  Why This Matters:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Markdown (especially GitHub-Flavored Markdown, GFM) wastes space with formatting.&lt;/li&gt;
&lt;li&gt;A Markdown chunk might have only 170 tokens but still exceed 2048 characters.&lt;/li&gt;
&lt;li&gt;This results in rejected API requests or inefficient extra splitting.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Important Note&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
Markdown tables are up to &lt;strong&gt;3x less token-efficient&lt;/strong&gt; than other formats, further compounding the problem.&lt;br&gt;&lt;br&gt;
👉 &lt;a href="https://medium.com/singapore-gds/cutting-cost-and-enhancing-performance-minifying-markdown-tables-to-improve-token-efficiency-in-af488a784fd5" rel="noopener noreferrer"&gt;Read more on token inefficiency of Markdown tables here.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  A Real-World Example: Measuring the Problem
&lt;/h2&gt;

&lt;p&gt;Let's dive into a simple simulation:&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Create a Large Markdown Table
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="c1"&gt;# Build data
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Person&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Age&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;City&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;City&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Create DataFrame
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A very long row content, which leads to a lot of white spaces&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Age&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;City&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Convert to Markdown
&lt;/span&gt;&lt;span class="n"&gt;table_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;table_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This generates a verbose table with 30 rows and a very long header.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 2: Analyze Token and Character Counts
&lt;/h3&gt;

&lt;p&gt;Using Cohere’s tokenizer (available via Hugging Face):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;

&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cohere/Cohere-embed-multilingual-v3.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;encoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;table_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;add_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;num_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encoded&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;num_chars&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;table_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Characters: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;num_chars&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tokens: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;num_tokens&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Characters: &lt;strong&gt;2719&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Tokens: &lt;strong&gt;432&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;⚡ &lt;strong&gt;Problem&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
While token count is fine, character count exceeds 2048, causing API errors like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cohere.error.CohereAPIError: input text exceeds maximum allowed size of 2048 characters
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Solution: Minifying Tables into JSON
&lt;/h2&gt;

&lt;p&gt;Instead of traditional Markdown, why not store the data in a &lt;strong&gt;dense JSON block&lt;/strong&gt;?&lt;/p&gt;

&lt;h3&gt;
  
  
  Benefits of Minifying Tables:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Remove pipes, dashes, and whitespace: all the formatting overhead.&lt;/li&gt;
&lt;li&gt;Preserve semantic meaning.&lt;/li&gt;
&lt;li&gt;Shrink text to meet character limits safely.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example of the compact JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"Person0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"Person1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"Person2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Age"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"20"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"21"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"22"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"City"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"City0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"City1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"City2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;New Stats:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Characters: &lt;strong&gt;1027&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Tokens: &lt;strong&gt;461&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✅ Now well within Cohere’s input limit!&lt;/p&gt;
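&lt;p&gt;To make the transformation concrete, here is a minimal stdlib-only sketch of the idea: parse the pipe-delimited rows and re-emit them as a compact, column-oriented JSON string. This is my own simplified illustration of the technique, not the actual &lt;code&gt;pymtd2json&lt;/code&gt; implementation (which handles edge cases like escaped pipes and multiple tables per document):&lt;/p&gt;

```python
import json

def minify_markdown_table(md: str) -> str:
    """Collapse a simple GitHub-flavored Markdown table into a compact
    column-oriented JSON string. Simplified sketch of the minification
    idea -- not the real pymtd2json implementation."""
    lines = [l.strip() for l in md.strip().splitlines() if l.strip()]
    # Split each row on pipes and drop the empty edge cells
    rows = [[c.strip() for c in line.strip("|").split("|")] for line in lines]
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator
    columns = {h: [r[i] for r in body] for i, h in enumerate(header)}
    # separators=(",", ":") removes all spaces from the JSON output
    return json.dumps(columns, separators=(",", ":"))

table = """
| Name    | Age | City  |
|---------|-----|-------|
| Person0 | 20  | City0 |
| Person1 | 21  | City1 |
"""
compact = minify_markdown_table(table)
print(compact)
print(f"Before: {len(table)} chars, after: {len(compact)} chars")
```

&lt;p&gt;Every character of alignment padding, pipes, and separator dashes disappears, while the column names and cell values survive intact, which is exactly why the character count drops so much faster than the token count.&lt;/p&gt;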

&lt;h2&gt;
  
  
  Applying Minification in Practice
&lt;/h2&gt;

&lt;p&gt;Want to prepare documents before chunking?&lt;br&gt;&lt;br&gt;
Here's how you can automatically process all Markdown files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SimpleDirectoryReader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;your_minifier&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MinifyMDT&lt;/span&gt;

&lt;span class="n"&gt;source_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;example_dir&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;markdown&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SimpleDirectoryReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required_exts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;recursive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;load_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;doc_texts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;doc_texts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MinifyMDT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text_resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;👉 And voilà: &lt;strong&gt;Your data is compact, clean, and embedding-ready!&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Working with multilingual RAG systems means optimizing every byte.&lt;br&gt;&lt;br&gt;
Whitespace-heavy Markdown tables might look nice for humans, but they’re expensive for machine understanding.&lt;/p&gt;

&lt;p&gt;By minifying your tables with &lt;code&gt;pymtd2json&lt;/code&gt;, you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cut down API errors.&lt;/li&gt;
&lt;li&gt;Reduce token overhead.&lt;/li&gt;
&lt;li&gt;Boost overall performance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Efficiency isn't optional; it's a superpower.&lt;/strong&gt; 🚀&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Supercharge Your Jupyter Notebook: SQL Command Magic for IPython</title>
      <dc:creator>Amadou Wolfgang Cisse</dc:creator>
      <pubDate>Tue, 18 Mar 2025 22:02:25 +0000</pubDate>
      <link>https://dev.to/amadou6e/supercharge-your-jupyter-notebook-sql-command-magic-for-ipython-1ahf</link>
      <guid>https://dev.to/amadou6e/supercharge-your-jupyter-notebook-sql-command-magic-for-ipython-1ahf</guid>
      <description>&lt;p&gt;Find the executable notebook &lt;a href="https://github.com/amadou-6e/ipython-sqlcmd/blob/main/usage/general_example.ipynb" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Jupyter Notebooks are widely used for data analysis and scientific computing, but working with databases inside them has always been somewhat cumbersome. While libraries like &lt;code&gt;sqlite3&lt;/code&gt; or &lt;code&gt;pymssql&lt;/code&gt; provide connectivity, they require &lt;strong&gt;extra Python boilerplate&lt;/strong&gt; for managing connections, executing queries, and formatting results.  &lt;/p&gt;
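&lt;p&gt;To see what that boilerplate looks like in practice, here is a typical driver-level round trip, shown with the stdlib &lt;code&gt;sqlite3&lt;/code&gt; module (the table and data are made up for illustration; the pattern is the same with &lt;code&gt;pymssql&lt;/code&gt;):&lt;/p&gt;

```python
import sqlite3

# Connection, cursor, execution, commit, fetch, and formatting:
# all handled by hand before a single result is visible.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (name TEXT, age INTEGER)")
cur.execute("INSERT INTO users VALUES (?, ?)", ("Ada", 36))
conn.commit()

cur.execute("SELECT name, age FROM users")
rows = cur.fetchall()
for name, age in rows:
    print(f"{name}: {age}")  # prints "Ada: 36"
conn.close()
```

&lt;p&gt;Six lines of ceremony around one line of SQL, repeated in every cell that touches the database.&lt;/p&gt;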

&lt;p&gt;Wouldn’t it be better if we could &lt;strong&gt;directly run SQL queries&lt;/strong&gt; inside a Jupyter Notebook, just like in SQL Server Management Studio (SSMS)?  &lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: SQL in Jupyter Notebooks
&lt;/h2&gt;

&lt;p&gt;Many data professionals need to execute SQL queries within a Jupyter Notebook. However, the existing approaches often come with drawbacks:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complex Setup:&lt;/strong&gt; Managing database connections, cursors, and transactions manually.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verbosity:&lt;/strong&gt; Writing additional Python code to fetch and display query results.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Limited Integration:&lt;/strong&gt; Difficult to run &lt;strong&gt;multi-statement SQL batches&lt;/strong&gt; using &lt;code&gt;GO&lt;/code&gt; commands.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of spending time writing extra Python code, what if we could just &lt;strong&gt;run SQL commands directly inside a cell&lt;/strong&gt;, as if we were in SSMS?  &lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: SQL Command Magic
&lt;/h2&gt;

&lt;p&gt;SQL Command Magic for IPython is an &lt;strong&gt;IPython extension&lt;/strong&gt; that integrates Microsoft’s &lt;code&gt;sqlcmd&lt;/code&gt; utility into Jupyter Notebooks. It allows users to execute &lt;strong&gt;native SQL queries inside Jupyter&lt;/strong&gt;, without any extra Python code.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;p&gt;✅ &lt;strong&gt;Seamless SQL Execution&lt;/strong&gt; - Write SQL directly in notebook cells without additional Python code.&lt;br&gt;
✅ &lt;strong&gt;Built-in Connection Management&lt;/strong&gt; - Connect to Microsoft SQL Server dynamically.&lt;br&gt;
✅ &lt;strong&gt;Multi-Statement Execution&lt;/strong&gt; - Supports &lt;code&gt;GO&lt;/code&gt; statements for executing multiple queries at once.&lt;br&gt;
✅ &lt;strong&gt;Variable Substitution&lt;/strong&gt; - Pass Python variables directly into SQL queries.&lt;br&gt;
✅ &lt;strong&gt;Debugging Support&lt;/strong&gt; - Use &lt;code&gt;--debug&lt;/code&gt; to analyze query execution details.  &lt;/p&gt;
&lt;h2&gt;
  
  
  Installation and Setup
&lt;/h2&gt;

&lt;p&gt;Find the executable notebook &lt;a href="https://github.com/amadou-6e/ipython-sqlcmd/blob/main/usage/general_example.ipynb" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Install the Extension
&lt;/h3&gt;

&lt;p&gt;First, install the required packages (&lt;code&gt;python-dotenv&lt;/code&gt; is used later to load credentials from a &lt;code&gt;.env&lt;/code&gt; file):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;ipython-sqlcmd python-dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Load the Extension
&lt;/h3&gt;

&lt;p&gt;In your Jupyter Notebook, load the extension using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;load_ext&lt;/span&gt; &lt;span class="n"&gt;sqlcmd&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This enables the &lt;code&gt;%sqlcmd&lt;/code&gt; magic command inside Jupyter.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Connect to SQL Server
&lt;/h3&gt;

&lt;p&gt;To connect to a &lt;strong&gt;SQL Server instance&lt;/strong&gt;, use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;sqlcmd&lt;/span&gt; &lt;span class="n"&gt;master&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;localhost&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sa&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;SSMS_PASSWORD&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;encrypt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;trust&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;certificate&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can replace &lt;code&gt;localhost&lt;/code&gt; and the credentials with your own connection details. Note that the password is read from the &lt;code&gt;SSMS_PASSWORD&lt;/code&gt; environment variable rather than hard-coded in the notebook.  &lt;/p&gt;
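&lt;p&gt;The connection line above assumes &lt;code&gt;os&lt;/code&gt; has been imported and &lt;code&gt;SSMS_PASSWORD&lt;/code&gt; is set. A minimal, standard-library-only way to wire that up in a cell before connecting (&lt;code&gt;python-dotenv&lt;/code&gt;’s &lt;code&gt;load_dotenv()&lt;/code&gt; can populate the environment from a &lt;code&gt;.env&lt;/code&gt; file first; the &lt;code&gt;"placeholder"&lt;/code&gt; fallback below is purely for illustration):&lt;/p&gt;

```python
import os

# Avoid hard-coding secrets in the notebook: read the password from
# the environment. Set SSMS_PASSWORD in your shell, or load it from a
# .env file with python-dotenv's load_dotenv() before this cell runs.
os.environ.setdefault("SSMS_PASSWORD", "placeholder")  # demo-only fallback
password = os.environ["SSMS_PASSWORD"]
assert password, "SSMS_PASSWORD is not set"  # fail fast if missing
```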

&lt;h2&gt;
  
  
  Running SQL Queries in Jupyter
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Simple Query
&lt;/h3&gt;

&lt;p&gt;Once connected, you can execute &lt;strong&gt;SQL commands&lt;/strong&gt; inside a notebook cell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;%%&lt;/span&gt;&lt;span class="n"&gt;sqlcmd&lt;/span&gt;
&lt;span class="n"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;TOP&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; 
&lt;span class="n"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tables&lt;/span&gt; 
&lt;span class="n"&gt;ORDER&lt;/span&gt; &lt;span class="n"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This returns the first &lt;strong&gt;10 rows&lt;/strong&gt; of the &lt;code&gt;sys.tables&lt;/code&gt; catalog view, ordered by name, just like in SSMS.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Creating and Populating Tables
&lt;/h3&gt;

&lt;p&gt;Creating tables and inserting data is just as straightforward. Let’s create a table and insert a few rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;%%&lt;/span&gt;&lt;span class="n"&gt;sqlcmd&lt;/span&gt;
&lt;span class="n"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;TABLE&lt;/span&gt; &lt;span class="nc"&gt;TestSpaces &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ID&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Description&lt;/span&gt; &lt;span class="nf"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;Code&lt;/span&gt; &lt;span class="nf"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;INSERT&lt;/span&gt; &lt;span class="n"&gt;INTO&lt;/span&gt; &lt;span class="nc"&gt;TestSpaces &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;VALUES&lt;/span&gt; 
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;This has spaces&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;A1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Another spaced value&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;B2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;No spaces&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;C3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;TestSpaces&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will &lt;strong&gt;create the table&lt;/strong&gt;, insert some values, and return the data in a single execution.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Using Python Variables Inside Queries
&lt;/h3&gt;

&lt;p&gt;You can use &lt;strong&gt;Python variables&lt;/strong&gt; to dynamically modify your SQL queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;table_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sys.tables&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;%%&lt;/span&gt;&lt;span class="n"&gt;sqlcmd&lt;/span&gt;
&lt;span class="n"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;TOP&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; 
&lt;span class="n"&gt;FROM&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="n"&gt;table_name&lt;/span&gt; 
&lt;span class="n"&gt;ORDER&lt;/span&gt; &lt;span class="n"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;$limit&lt;/code&gt; and &lt;code&gt;$table_name&lt;/code&gt; placeholders are automatically replaced with the Python variables before execution.  &lt;/p&gt;
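&lt;p&gt;To see the mechanism in miniature: this style of &lt;code&gt;$name&lt;/code&gt; substitution can be done with Python’s &lt;code&gt;string.Template&lt;/code&gt;. A rough sketch of the idea, not necessarily the extension’s actual implementation:&lt;/p&gt;

```python
from string import Template

def substitute(sql: str, namespace: dict) -> str:
    """Replace $name placeholders in a SQL string with values from the
    given namespace (e.g. the notebook's global variables)."""
    return Template(sql).safe_substitute(namespace)

query = "SELECT TOP $limit * FROM $table_name ORDER BY name"
rendered = substitute(query, {"limit": 5, "table_name": "sys.tables"})
print(rendered)  # SELECT TOP 5 * FROM sys.tables ORDER BY name
```

Note that this is plain text substitution, not parameter binding, so it should only be used with trusted values.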

&lt;h2&gt;
  
  
  Executing External SQL Scripts
&lt;/h2&gt;

&lt;p&gt;SQL Command Magic also supports executing &lt;strong&gt;external SQL files&lt;/strong&gt;, making it useful for &lt;strong&gt;database migrations&lt;/strong&gt; or &lt;strong&gt;schema setup&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;%%&lt;/span&gt;&lt;span class="n"&gt;sqlcmd&lt;/span&gt;
&lt;span class="n"&gt;EXECUTE_SQL_FILE&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;../src/tests/empty.sql&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will run all SQL commands inside &lt;code&gt;empty.sql&lt;/code&gt;.  &lt;/p&gt;
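&lt;p&gt;&lt;code&gt;sqlcmd&lt;/code&gt; handles script files natively; the underlying read-and-execute pattern is the same in most drivers. A minimal sketch with the standard library’s &lt;code&gt;sqlite3&lt;/code&gt; (the file name and contents below are invented for the demo):&lt;/p&gt;

```python
import os
import sqlite3
import tempfile

# Write a small SQL script to disk (stands in for your migration file).
script = """
CREATE TABLE users (id INTEGER, name TEXT);
INSERT INTO users VALUES (1, 'ada'), (2, 'linus');
"""
path = os.path.join(tempfile.mkdtemp(), "setup.sql")
with open(path, "w") as f:
    f.write(script)

# Read the file back and run every statement in it.
conn = sqlite3.connect(":memory:")
with open(path) as f:
    conn.executescript(f.read())

count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(count)  # 2
```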

&lt;h2&gt;
  
  
  Debugging Queries
&lt;/h2&gt;

&lt;p&gt;To troubleshoot execution issues, enable &lt;strong&gt;debug mode&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;%%&lt;/span&gt;&lt;span class="n"&gt;sqlcmd&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;debug&lt;/span&gt;
&lt;span class="n"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;@VERSION&lt;/span&gt; &lt;span class="n"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;SQLServerVersion&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This outputs &lt;strong&gt;detailed execution logs&lt;/strong&gt;, showing how the query was processed.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Running Multiple SQL Batches
&lt;/h2&gt;

&lt;p&gt;Unlike standard SQL execution in Jupyter, &lt;strong&gt;SQL Command Magic&lt;/strong&gt; fully supports &lt;strong&gt;multi-statement execution&lt;/strong&gt; using &lt;code&gt;GO&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;%%&lt;/span&gt;&lt;span class="n"&gt;sqlcmd&lt;/span&gt;
&lt;span class="n"&gt;SELECT&lt;/span&gt; &lt;span class="nc"&gt;DB_NAME&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;CurrentDatabase&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;
&lt;span class="n"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;@SERVERNAME&lt;/span&gt; &lt;span class="n"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;ServerName&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each query batch executes separately, just like in SSMS.  &lt;/p&gt;
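&lt;p&gt;Conceptually, batch handling just means splitting the cell on lines that contain only the &lt;code&gt;GO&lt;/code&gt; keyword and sending each piece to the server in turn. A rough sketch of the splitting step (the extension delegates the real work to &lt;code&gt;sqlcmd&lt;/code&gt;, which does this for you):&lt;/p&gt;

```python
import re

def split_batches(script: str) -> list:
    """Split a T-SQL script into batches on lines that contain only
    the GO keyword (case-insensitive), as sqlcmd does."""
    batches = re.split(r"(?im)^\s*GO\s*$", script)
    return [b.strip() for b in batches if b.strip()]

script = """SELECT DB_NAME() AS CurrentDatabase
GO
SELECT @@SERVERNAME AS ServerName"""
for batch in split_batches(script):
    print(batch)
```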

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;SQL Command Magic for IPython is a &lt;strong&gt;simple yet powerful&lt;/strong&gt; tool for running SQL queries inside Jupyter Notebooks. It removes unnecessary Python boilerplate, enables &lt;strong&gt;multi-statement execution&lt;/strong&gt;, and integrates seamlessly with &lt;strong&gt;Microsoft SQL Server&lt;/strong&gt;.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Key Benefits
&lt;/h3&gt;

&lt;p&gt;✅ &lt;strong&gt;Reduces Boilerplate&lt;/strong&gt; – No need to write extra Python code for database connections.&lt;br&gt;
✅ &lt;strong&gt;More Natural SQL Workflow&lt;/strong&gt; – Execute queries just like in SSMS.&lt;br&gt;
✅ &lt;strong&gt;Advanced Features&lt;/strong&gt; – Supports &lt;code&gt;GO&lt;/code&gt; statements, variable substitution, and script execution.  &lt;/p&gt;

&lt;p&gt;If you frequently run &lt;strong&gt;SQL queries in Jupyter&lt;/strong&gt;, this extension is a &lt;strong&gt;game changer&lt;/strong&gt;.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Get Started
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;ipython-sqlcmd python-dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Try it out and let me know your thoughts!&lt;/p&gt;

</description>
      <category>sqlserver</category>
      <category>jupyter</category>
      <category>python</category>
      <category>sql</category>
    </item>
  </channel>
</rss>
