<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alessandra Bilardi</title>
    <description>The latest articles on DEV Community by Alessandra Bilardi (@bilardi).</description>
    <link>https://dev.to/bilardi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3872509%2Ff4e33b54-08ad-4b6d-a6ba-ff25409f3dee.jpg</url>
      <title>DEV Community: Alessandra Bilardi</title>
      <link>https://dev.to/bilardi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bilardi"/>
    <language>en</language>
    <item>
      <title>The lazy developer's code quality</title>
      <dc:creator>Alessandra Bilardi</dc:creator>
      <pubDate>Thu, 30 Apr 2026 09:25:08 +0000</pubDate>
      <link>https://dev.to/bilardi/the-lazy-developers-code-quality-3a34</link>
      <guid>https://dev.to/bilardi/the-lazy-developers-code-quality-3a34</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2998kczeig3a7o5plwj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2998kczeig3a7o5plwj.png" alt="Flow" width="800" height="1159"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;A repo to refresh, several rabbit holes to dive into&lt;/h2&gt;

&lt;p&gt;A while ago, at PyCon IT, I attended a talk that opened my eyes on &lt;a href="https://pypi.org/project/pytest/" rel="noopener noreferrer"&gt;pytest&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;simpler test management, especially for mocks&lt;/li&gt;
&lt;li&gt;parametrizable fixtures instead of the &lt;code&gt;setUp&lt;/code&gt; / &lt;code&gt;tearDown&lt;/code&gt; ritual&lt;/li&gt;
&lt;li&gt;bare &lt;code&gt;assert&lt;/code&gt; instead of a thousand &lt;code&gt;self.assertEqual&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
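&lt;p&gt;A minimal sketch of that style, with hypothetical names: a parametrized fixture replaces the &lt;code&gt;setUp&lt;/code&gt; / &lt;code&gt;tearDown&lt;/code&gt; ritual, and a bare &lt;code&gt;assert&lt;/code&gt; replaces &lt;code&gt;self.assertEqual&lt;/code&gt;:&lt;/p&gt;

```python
# Hypothetical illustration of the pytest style described above.
import pytest

@pytest.fixture(params=[0, 1, 5])
def number(request):
    # each test that uses this fixture runs once per param
    return request.param

def test_double(number):
    # bare assert, no self.assertEqual
    assert number * 2 == number + number
```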

&lt;p&gt;I'd like my repo &lt;a href="https://github.com/bilardi/python-prototype/" rel="noopener noreferrer"&gt;python-prototype&lt;/a&gt;, born for educational purposes, to also be a bit of a template I can pull off the shelf for the next projects.&lt;/p&gt;

&lt;p&gt;So, with the excuse of refreshing the testing system with pytest and the packaging with &lt;a href="https://packaging.python.org/en/latest/guides/writing-pyproject-toml/" rel="noopener noreferrer"&gt;pyproject&lt;/a&gt;, I started thinking about adding more.&lt;/p&gt;

&lt;p&gt;I had been using &lt;a href="https://pypi.org/project/black/" rel="noopener noreferrer"&gt;black&lt;/a&gt; and &lt;a href="https://pypi.org/project/pylint/" rel="noopener noreferrer"&gt;pylint&lt;/a&gt; for a long time, so my first thought was: ok, let's bring in formatting and linting too. But I asked myself: isn't there something better that maintains style (&lt;a href="https://peps.python.org/pep-0008/" rel="noopener noreferrer"&gt;PEP 8&lt;/a&gt;), docstrings (&lt;a href="https://peps.python.org/pep-0257/" rel="noopener noreferrer"&gt;PEP 257&lt;/a&gt;) and type hints (&lt;a href="https://peps.python.org/pep-0484/" rel="noopener noreferrer"&gt;PEP 484&lt;/a&gt;) automatically?&lt;/p&gt;

&lt;p&gt;And the environment, can it be modernized too? With what? Well, just as there are two schools, emacs and vi, there are two schools here as well: &lt;a href="https://python-poetry.org/" rel="noopener noreferrer"&gt;poetry&lt;/a&gt; and &lt;a href="https://docs.astral.sh/uv/" rel="noopener noreferrer"&gt;uv&lt;/a&gt; ... without even mentioning all the others.&lt;/p&gt;

&lt;p&gt;What I needed was something to cover code quality, formatting, packaging and beyond: fewer tasks left to memory or to reading the holy README, more chances they actually get done.&lt;/p&gt;

&lt;p&gt;Since there's no "all-inclusive package", the plan was to test what was maintained and maintainable, and find the one most suited to my needs.&lt;/p&gt;

&lt;h2&gt;Today's chosen stack&lt;/h2&gt;

&lt;p&gt;Four tools, not ten:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;uv&lt;/strong&gt;: the env manager. One Rust binary in place of &lt;code&gt;pip&lt;/code&gt;, &lt;code&gt;venv&lt;/code&gt;, &lt;code&gt;pyenv&lt;/code&gt; and &lt;code&gt;pipx&lt;/code&gt;. With poetry, the last two aren't covered and need to be installed separately: fewer satellite tools around.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://docs.astral.sh/ruff/" rel="noopener noreferrer"&gt;ruff&lt;/a&gt;&lt;/strong&gt;: formatting and linting. Replaces &lt;code&gt;black&lt;/code&gt;, &lt;code&gt;isort&lt;/code&gt;, &lt;code&gt;flake8&lt;/code&gt; and most of &lt;code&gt;pylint&lt;/code&gt;. Another Rust binary.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://microsoft.github.io/pyright/" rel="noopener noreferrer"&gt;pyright&lt;/a&gt;&lt;/strong&gt;: the type checker. Skipping &lt;a href="https://mypy-lang.org/" rel="noopener noreferrer"&gt;mypy&lt;/a&gt;, &lt;a href="https://pyrefly.org/" rel="noopener noreferrer"&gt;pyrefly&lt;/a&gt; and &lt;a href="https://github.com/astral-sh/ty" rel="noopener noreferrer"&gt;ty&lt;/a&gt;. For now.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://pre-commit.com/" rel="noopener noreferrer"&gt;pre-commit&lt;/a&gt;&lt;/strong&gt;: a git-hook that runs ruff and pytest automatically before every commit. Just .. remember to set it up at the start of the project !&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The single criterion that drove all these choices is &lt;strong&gt;least total effort&lt;/strong&gt;. Fewer tools = less config = less maintenance. The lazy developer wants the toolchain to break before the commit, in case some step gets forgotten. But without overdoing it: just enough to produce quality code.&lt;/p&gt;

&lt;h2&gt;Stories from the field&lt;/h2&gt;

&lt;h3&gt;Pylint and the 4.35/10 grade&lt;/h3&gt;

&lt;p&gt;The first run of pylint on simple-sample stings: 4.35/10. A high school grade, not a teaching repo's. I sit down to fix my JavaScript hangover: &lt;code&gt;myClass&lt;/code&gt; becomes &lt;code&gt;my_class&lt;/code&gt; (PEP 8 naming), &lt;code&gt;foo&lt;/code&gt; and &lt;code&gt;bar&lt;/code&gt; and &lt;code&gt;foobar&lt;/code&gt; become &lt;code&gt;get_param_processing&lt;/code&gt;, &lt;code&gt;get_boolean&lt;/code&gt;, &lt;code&gt;get_reverse_protected_param&lt;/code&gt; (names that say what they do). Up to 9.41/10.&lt;/p&gt;

&lt;p&gt;But before claiming victory, three warnings need a decision:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;W0223&lt;/strong&gt;: abstract method not implemented in a subclass. Pylint flags it as a bug to fix. In my case it MUST fail: it's part of the educational example. I keep it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;C0301&lt;/strong&gt;: line too long. I look: it's an HTTP link in a docstring that can't be broken. I ignore it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;C0104&lt;/strong&gt;: names like "foo" and "bar" are disallowed. I could disable the rule globally, but here I prefer having spent the hour of restructuring: variables and methods should be expressive.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these decisions is a case of "the tool is right about the code, but not about the context". And here pylint's limit shows up: it tells you what it found, not whether it really needs fixing. The case-by-case judgement stays with you: the tool changes nothing by itself.&lt;/p&gt;
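&lt;p&gt;When the verdict is "keep the code, silence the rule here", pylint supports targeted suppression with an inline comment, so the rule stays on everywhere else. A hypothetical sketch for the C0301 case:&lt;/p&gt;

```python
# Hypothetical example: one line opts out of line-too-long while the rule
# stays enabled globally. The URL is made up for illustration.
DOCS_URL = "https://example.com/a/very/long/documentation/path/that/cannot/be/wrapped"  # pylint: disable=line-too-long

def get_docs_url():
    """Return the documentation URL."""
    return DOCS_URL
```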

&lt;h3&gt;Pylint doesn't understand pytest&lt;/h3&gt;

&lt;p&gt;I go looking for trouble and run pylint on the test suite: a new warning shows up, W0621 &lt;code&gt;redefining-outer-name&lt;/code&gt;, on the fixtures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mci&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;MyClassInterface&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_mci_creation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mci&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mci&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MyClassInterface&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pylint says "you're redefining &lt;code&gt;mci&lt;/code&gt; from the outer scope". But this pattern is the way fixtures work: it's not redefinition, it's parameter injection. Pylint reads the code as if it were running it, but it doesn't know how pytest runs it.&lt;/p&gt;

&lt;p&gt;False positive. The workaround exists:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mci&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mci_fixture&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;MyClassInterface&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_mci_creation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mci&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mci&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MyClassInterface&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But it's there to silence pylint, not to improve the code. I don't add it. And here I start thinking that pylint is showing its age with pytest, and it's time to switch tools.&lt;/p&gt;

&lt;h3&gt;Ruff arrives and takes black's place&lt;/h3&gt;

&lt;p&gt;I try &lt;code&gt;ruff check&lt;/code&gt; and &lt;code&gt;ruff format&lt;/code&gt;. It covers practically everything black did for formatting, and a good chunk of what pylint did for linting. One binary. Config in &lt;code&gt;pyproject.toml&lt;/code&gt;: a single section instead of two. Execution time: milliseconds.&lt;/p&gt;

&lt;p&gt;Ruff openly states the trade-off: it's AST-based and works on a single file at a time; it doesn't "read" the class hierarchy across files. So the abstract method not overridden, which I do need to see, doesn't get flagged. Ruff is a fast surface linter, not a deep analyst.&lt;/p&gt;
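&lt;p&gt;What a single-file linter misses here is visible at runtime: with &lt;code&gt;abc&lt;/code&gt;, a subclass that leaves the abstract method unimplemented cannot even be instantiated. A sketch with illustrative names echoing the repo's example:&lt;/p&gt;

```python
# Sketch: the cross-file case ruff does not flag, reproduced in one file.
from abc import ABC, abstractmethod

class MyClassInterface(ABC):
    @abstractmethod
    def get_boolean(self, param: bool) -> bool:
        """Return param as a boolean."""

class MyClass(MyClassInterface):
    # abstract method deliberately not implemented
    pass

try:
    MyClass()
except TypeError as error:
    # instantiation fails because get_boolean has no implementation
    print(type(error).__name__)  # TypeError
```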

&lt;p&gt;Ok. Ruff takes black's place and covers most of pylint. For what's missing (abstract method, type consistency across files) I need another tool: a type checker.&lt;/p&gt;

&lt;h3&gt;The type checker tour&lt;/h3&gt;

&lt;p&gt;Pylint flagged both typing and scoping errors (W0621 is a style check, not a type one). Choosing a type checker, I focus on the typing front: the scoping front stays out of this tour.&lt;/p&gt;

&lt;p&gt;I add type hints everywhere, otherwise the type checkers would throw a sea of red (with nothing to check): the signature &lt;code&gt;def get_param_processing(self, param):&lt;/code&gt; becomes &lt;code&gt;def get_param_processing(self, param: bool) -&amp;gt; bool:&lt;/code&gt;.&lt;/p&gt;
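&lt;p&gt;A hypothetical reconstruction of the error the tour looks for: the hint promises &lt;code&gt;bool&lt;/code&gt;, but one branch falls through and returns &lt;code&gt;None&lt;/code&gt;. Python runs it without complaint; a type checker flags the inconsistent return.&lt;/p&gt;

```python
# Sketch of "return None where the type hint says bool"; names are illustrative.
def get_boolean(param: bool) -> bool:
    if param:
        return True
    # missing return: this branch implicitly returns None despite the hint

print(get_boolean(True), get_boolean(False))  # True None
```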

&lt;p&gt;Then I run mypy, pyrefly, ty, pyright on the same code to see who flags what.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Abstract method not implemented&lt;/th&gt;
&lt;th&gt;Return None where type hint says bool&lt;/th&gt;
&lt;th&gt;Other&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;mypy&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;historical, slow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pyrefly&lt;/td&gt;
&lt;td&gt;in a different form&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;lightning fast, young&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ty&lt;/td&gt;
&lt;td&gt;yes (interface only)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;lightning fast, young&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pyright&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;also flags a third error: the method is used in MyClass&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Pyright finds more and has a mature ecosystem: Microsoft maintains it actively, and Pylance (the language server behind VS Code's Python extension) is built on top of pyright. Pyright wins. Pyrefly and ty are under active development: I'll come back to them later.&lt;/p&gt;

&lt;h3&gt;The workflow breaking at the first &lt;code&gt;make patch&lt;/code&gt;&lt;/h3&gt;

&lt;p&gt;Setup done. Ruff passes clean. Pyright passes clean. Pre-commit stops me if I forget something. I run &lt;code&gt;make patch&lt;/code&gt; for the first "real" release ... and:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;make[1]: bump-my-version: No such file or directory
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Makefile was calling &lt;code&gt;bump-my-version&lt;/code&gt; directly, and the project's dev-deps were in &lt;code&gt;tests/requirements-test.txt&lt;/code&gt;, not in &lt;code&gt;pyproject.toml&lt;/code&gt;. So whoever cloned the repo had to know to do a &lt;code&gt;pip install -r tests/requirements-test.txt&lt;/code&gt; on top of &lt;code&gt;uv sync&lt;/code&gt;, and the release workflow assumed the venv was activated. Too much implicit knowledge, too much hassle.&lt;/p&gt;

&lt;p&gt;I'm so used to using &lt;code&gt;uv run&lt;/code&gt; that I don't run &lt;code&gt;source .venv/bin/activate&lt;/code&gt; anymore, so I tripped over something that would never have happened "the old-fashioned way".&lt;/p&gt;

&lt;p&gt;What did it take to truly hand the environment over to uv? Well, all I needed was to add every dependency to &lt;code&gt;pyproject.toml&lt;/code&gt; with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv add &lt;span class="nt"&gt;--dev&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; tests/requirements-test.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single command. uv reads the requirements file, writes everything in &lt;code&gt;[dependency-groups].dev&lt;/code&gt; of &lt;code&gt;pyproject.toml&lt;/code&gt; (the standard introduced by &lt;a href="https://peps.python.org/pep-0735/" rel="noopener noreferrer"&gt;PEP 735&lt;/a&gt; for dev-deps), updates &lt;code&gt;uv.lock&lt;/code&gt;, and installs. The &lt;code&gt;tests/requirements-test.txt&lt;/code&gt; file becomes redundant: one less file to handle.&lt;/p&gt;
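&lt;p&gt;The resulting section looks roughly like this (package names and version pins are illustrative, not the repo's actual list):&lt;/p&gt;

```toml
[dependency-groups]
dev = [
    "pytest>=8.0",
    "bump-my-version>=0.20",
    "pre-commit>=3.0",
]
```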

&lt;p&gt;And then in the Makefile I added &lt;code&gt;uv run&lt;/code&gt; in front of every Python command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nl"&gt;release&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    uv run bump-my-version bump &lt;span class="p"&gt;$(&lt;/span&gt;PART&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;$(&lt;/span&gt;MAKE&lt;span class="p"&gt;)&lt;/span&gt; changelog
    git tag &lt;span class="nt"&gt;-f&lt;/span&gt; v&lt;span class="p"&gt;$$(&lt;/span&gt;uv run python &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"from simple_sample import __version__; print(__version__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    git push &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git push &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nt"&gt;--force&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now &lt;code&gt;make patch&lt;/code&gt; works even from a fresh shell, no activation needed. The venv is no longer tribal knowledge, it's implicit in every command.&lt;/p&gt;

&lt;h3&gt;Seven sections in &lt;code&gt;pyproject.toml&lt;/code&gt;, one per tool&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;pyproject.toml&lt;/code&gt; was born for packaging, and from there it picked up the config sections of the project's tools: seven in total.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ruff&lt;/strong&gt; starts from &lt;code&gt;select = ["ALL"]&lt;/code&gt;: I enable every available rule and use &lt;code&gt;ignore&lt;/code&gt; for the ones I find too much. The philosophy is "everything on by default, exclude by name": as ruff adds new rules, I get them automatically. And the "ALL" bundle isn't just style + lint: it includes naming (PEP 8), docstrings (PEP 257), type annotations (PEP 484, with &lt;code&gt;flake8-annotations&lt;/code&gt;), cyclomatic complexity (&lt;code&gt;mccabe&lt;/code&gt;), basic security (&lt;code&gt;flake8-bandit&lt;/code&gt;), import order (&lt;code&gt;isort&lt;/code&gt;). Ruff isn't "just" a formatter + linter, it's the umbrella under which black + isort + flake8 + parts of pylint, pydocstyle and bandit live.&lt;/p&gt;
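&lt;p&gt;A sketch of what that section can look like; the ignored codes are hypothetical examples, not the repo's actual list:&lt;/p&gt;

```toml
[tool.ruff.lint]
select = ["ALL"]
ignore = [
    "E501",  # line too long: HTTP links in docstrings
    "D203",  # conflicts with D211: pick one blank-line convention
]
```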

&lt;p&gt;&lt;strong&gt;pyright&lt;/strong&gt; in &lt;code&gt;typeCheckingMode = "strict"&lt;/code&gt;: the default &lt;code&gt;basic&lt;/code&gt; lets a lot slide, &lt;code&gt;strict&lt;/code&gt; requires complete type hints and explicit returns. It's the mode that surfaces those errors the type checker tour had revealed (and that mypy / pyrefly / ty in default config would have missed).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;pytest&lt;/strong&gt;: minimal config, &lt;code&gt;asyncio_mode = "auto"&lt;/code&gt; and &lt;code&gt;testpaths = ["tests"]&lt;/code&gt;. The rest lives in the tests themselves.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;[dependency-groups].dev&lt;/strong&gt;: the list of dev-deps with version constraints (PEP 735). uv reads this section for &lt;code&gt;uv sync --group dev&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;packaging&lt;/strong&gt; (&lt;code&gt;[build-system]&lt;/code&gt;, &lt;code&gt;[project]&lt;/code&gt;, &lt;code&gt;[tool.setuptools]&lt;/code&gt;), &lt;strong&gt;bumpversion&lt;/strong&gt;, &lt;strong&gt;git-cliff&lt;/strong&gt;: handle the release pipeline (metadata + runtime dependencies + wheel and sdist build + versioning + CHANGELOG from conventional commits). A different topic from code quality, but necessary for the modernization and automation goal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;pre-commit&lt;/strong&gt; lives in &lt;code&gt;.pre-commit-config.yaml&lt;/code&gt; (outside &lt;code&gt;pyproject.toml&lt;/code&gt;): it points to the official &lt;code&gt;astral-sh/ruff-pre-commit&lt;/code&gt; repo for the two ruff hooks (check + format) and keeps a local hook running &lt;code&gt;uv run pytest&lt;/code&gt; for the tests. So pre-commit also leans on uv to access the project's venv, just like the Makefile targets.&lt;/p&gt;
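&lt;p&gt;A sketch of such a &lt;code&gt;.pre-commit-config.yaml&lt;/code&gt;; the pinned &lt;code&gt;rev&lt;/code&gt; is hypothetical:&lt;/p&gt;

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0  # hypothetical pin, use the latest tag
    hooks:
      - id: ruff          # lint (check)
      - id: ruff-format   # format
  - repo: local
    hooks:
      - id: pytest
        name: pytest
        entry: uv run pytest
        language: system
        pass_filenames: false
```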

&lt;h2&gt;Plus&lt;/h2&gt;

&lt;p&gt;The lazy developer adds tools when they're really needed, when it's time to handle some other aspect automatically.&lt;/p&gt;

&lt;p&gt;Still on the code quality front, what could be added, and when?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://pypi.org/project/vulture/" rel="noopener noreferrer"&gt;vulture&lt;/a&gt; and &lt;a href="https://pypi.org/project/radon/" rel="noopener noreferrer"&gt;radon&lt;/a&gt;&lt;/strong&gt;: project-level dead code and complexity reports. When a map of the codebase is needed, for instance before a major refactor: ruff sees the single file, vulture and radon see the whole.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://pypi.org/project/bandit/" rel="noopener noreferrer"&gt;bandit&lt;/a&gt; (SAST), &lt;a href="https://pypi.org/project/pip-audit/" rel="noopener noreferrer"&gt;pip-audit&lt;/a&gt; (SCA) and &lt;a href="https://pypi.org/project/detect-secrets/" rel="noopener noreferrer"&gt;detect-secrets&lt;/a&gt;&lt;/strong&gt;: if the package becomes an API or handles sensitive data, but here a whole new world opens up ..&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;mypy in strict mode&lt;/strong&gt;: a second pass on top of pyright. Today I don't have an example that would push me to add it: pyright in strict mode already covers enough.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pyrefly and ty&lt;/strong&gt;: worth re-evaluating especially for projects with many files. They're fast but young.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://pre-commit.ci/" rel="noopener noreferrer"&gt;pre-commit.ci&lt;/a&gt;&lt;/strong&gt;: a hook that runs in CI on every PR too. For a personal one-maintainer project it's overhead, for a shared repo it would make sense.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>pytest</category>
      <category>ruff</category>
      <category>pyright</category>
      <category>uv</category>
    </item>
    <item>
      <title>Realtime transcription: choices and stories for PyCon IT</title>
      <dc:creator>Alessandra Bilardi</dc:creator>
      <pubDate>Mon, 20 Apr 2026 21:23:48 +0000</pubDate>
      <link>https://dev.to/bilardi/realtime-transcription-choices-and-stories-for-pycon-it-4ehd</link>
      <guid>https://dev.to/bilardi/realtime-transcription-choices-and-stories-for-pycon-it-4ehd</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs4m0zy96pe583q0v8bcl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs4m0zy96pe583q0v8bcl.png" alt="Architecture" width="800" height="957"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Why all this interest in realtime transcription&lt;/h2&gt;

&lt;p&gt;It all started with the collaboration with PyCon IT. At PyCon IT 2025 they set up live transcription with local Whisper on a Graphics Processing Unit (GPU), based on the repo &lt;a href="https://github.com/sofdog-gh/realtime-transcription-fastrtc" rel="noopener noreferrer"&gt;&lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt;&lt;/a&gt;. With the YouTube videos used as tests, all good. With the real audio of a conference room, Whisper started hallucinating: a generative model, if you give it a signal it doesn't recognize, doesn't leave a blank, it writes something anyway.&lt;/p&gt;

&lt;p&gt;For PyCon IT 2026 a different path was needed, on a non-negotiable anchor: no hallucinations. If the model doesn't hear, ok, skip a word. If it hears badly, ok, transcribe badly. But it must not write sentences I didn't say.&lt;/p&gt;

&lt;p&gt;Fixing Whisper's hallucinations directly (Voice Activity Detection, tuning decoding parameters, logprob filters, fine-tuning, ...) would have been a separate effort: I didn't have the time, with everything else to build. I haven't tested a bigger Whisper, nor other paid generative Speech To Text (STT) services: they stay in the same category, a model that produces text token after token, so the structural risk of invention remains. To get out of the category, a managed service based on acoustic decoding was needed. And since it's PyCon, let's also grab the bonus of decoupling the pieces and writing it in a testable way.&lt;/p&gt;

&lt;h2&gt;A model that gets it wrong but doesn't make it up&lt;/h2&gt;

&lt;p&gt;Let's start with the engine. Then with what's around it.&lt;/p&gt;

&lt;h3&gt;STT: who gets it wrong, who makes it up&lt;/h3&gt;

&lt;p&gt;I didn't run empirical benchmarks on the three. The choice played out on two axes: &lt;strong&gt;model structure&lt;/strong&gt; (generative or not) and &lt;strong&gt;delivery&lt;/strong&gt; (self-hosted or managed). The properties in the table come from product documentation and from direct observation of Whisper at PyCon IT 2025, not from A/B tests.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;Whisper local&lt;/th&gt;
&lt;th&gt;Amazon Transcribe Streaming&lt;/th&gt;
&lt;th&gt;Paid generative STT&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Architecture&lt;/td&gt;
&lt;td&gt;generative (autoregressive)&lt;/td&gt;
&lt;td&gt;non-generative (acoustic decoding)&lt;/td&gt;
&lt;td&gt;generative&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hallucinations structurally possible&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delivery&lt;/td&gt;
&lt;td&gt;self-hosted&lt;/td&gt;
&lt;td&gt;managed&lt;/td&gt;
&lt;td&gt;managed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Setup&lt;/td&gt;
&lt;td&gt;GPU + model&lt;/td&gt;
&lt;td&gt;AWS credentials&lt;/td&gt;
&lt;td&gt;credentials&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network dependency&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;on-site hardware&lt;/td&gt;
&lt;td&gt;$0.024/min&lt;/td&gt;
&lt;td&gt;variable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Declared latency&lt;/td&gt;
&lt;td&gt;1-15s end of segment&lt;/td&gt;
&lt;td&gt;~300ms partial&lt;/td&gt;
&lt;td&gt;depends&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The most important criterion is architecture. A non-generative model cannot, by construction, add words it didn't hear: at worst it skips or gets it wrong. A generative model can. The other criteria (network, cost, latency) are secondary trade-offs, all acceptable for a conference context: there's internet, a 30-minute talk costs ~$0.72, partial results arrive in ~300ms.&lt;/p&gt;
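&lt;p&gt;The back-of-the-envelope numbers, with a hypothetical conference-day schedule on top of the declared $0.024/min rate:&lt;/p&gt;

```python
# Sanity check of the cost claim; the talks-per-day figure is hypothetical.
rate_per_min = 0.024
talk_minutes = 30
print(round(talk_minutes * rate_per_min, 2))  # 0.72 dollars per talk

talks_per_day = 8  # hypothetical schedule
print(round(talks_per_day * talk_minutes * rate_per_min, 2))  # 5.76 dollars per day
```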

&lt;p&gt;Choice: Amazon Transcribe Streaming. Not because it's "the best" in absolute terms, but because it sits in the category that rules out at the root the problem we're here for. I wrote the repo &lt;a href="https://github.com/bilardi/video-to-text" rel="noopener noreferrer"&gt;&lt;code&gt;video-to-text&lt;/code&gt;&lt;/a&gt; precisely to test Transcribe as an alternative to Whisper.&lt;/p&gt;

&lt;h3&gt;New repo or fork of the old one?&lt;/h3&gt;

&lt;p&gt;The other big choice: a fork of &lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt; (the one already used at PyCon IT 2025), or a new repo that takes only the good pieces from the two predecessors (&lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt; and &lt;code&gt;video-to-text&lt;/code&gt;)?&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;Fork&lt;/th&gt;
&lt;th&gt;New repo&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial effort&lt;/td&gt;
&lt;td&gt;low&lt;/td&gt;
&lt;td&gt;medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fragile dependencies inherited&lt;/td&gt;
&lt;td&gt;FastRTC v0.0.26&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Architecture&lt;/td&gt;
&lt;td&gt;monolithic to dismantle&lt;/td&gt;
&lt;td&gt;designed for the use case&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testability&lt;/td&gt;
&lt;td&gt;inherits the existing scope&lt;/td&gt;
&lt;td&gt;every component in isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choice: new repo. As a lazy developer one would be tempted to fork, but when a dependency is fragile (FastRTC v0.0.26 isn't a stable standard), a fork could cost more than a targeted rewrite.&lt;/p&gt;

&lt;p&gt;From &lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt; I keep the &lt;code&gt;screen&lt;/code&gt; layout (black background, large text) and the auto-scroll logic of the frontend. From &lt;code&gt;video-to-text&lt;/code&gt; I take the &lt;code&gt;transcribe_service.py&lt;/code&gt; module and the async pattern with &lt;code&gt;asyncio.Queue&lt;/code&gt; + &lt;code&gt;asyncio.gather()&lt;/code&gt;. The rest gets dropped.&lt;/p&gt;
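&lt;p&gt;The &lt;code&gt;asyncio.Queue&lt;/code&gt; + &lt;code&gt;asyncio.gather()&lt;/code&gt; pattern, sketched here with fake audio chunks instead of a real stream:&lt;/p&gt;

```python
# Minimal sketch of the pattern kept from video-to-text: a producer feeds
# chunks into a queue, a consumer drains it, gather runs both concurrently.
import asyncio

async def produce(queue):
    for chunk in (b"chunk-1", b"chunk-2", b"chunk-3"):  # fake audio frames
        await queue.put(chunk)
    await queue.put(None)  # sentinel: end of stream

async def consume(queue):
    received = []
    while True:
        chunk = await queue.get()
        if chunk is None:
            break
        received.append(chunk)
    return received

async def main():
    queue = asyncio.Queue()
    _, received = await asyncio.gather(produce(queue), consume(queue))
    return received

print(asyncio.run(main()))  # [b'chunk-1', b'chunk-2', b'chunk-3']
```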

&lt;h3&gt;Architecture: monolithic or decoupled?&lt;/h3&gt;

&lt;p&gt;As a lazy developer, I don't want to redo everything moving from Proof of Concept (PoC) to Minimum Viable Product (MVP). The two predecessors already have pieces that work (the &lt;code&gt;screen&lt;/code&gt; layout of &lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt;, the &lt;code&gt;transcribe_service&lt;/code&gt; of &lt;code&gt;video-to-text&lt;/code&gt;), but they're pieces from different repos, made for different purposes. To recycle them, the modules need clear boundaries.&lt;/p&gt;

&lt;p&gt;A decoupled architecture here means having three components as three separate processes that talk to each other over the network:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the audio client, which captures audio from the system device and sends it to the server&lt;/li&gt;
&lt;li&gt;the server, which receives audio, manages the stream toward Amazon Transcribe, and publishes the text&lt;/li&gt;
&lt;li&gt;the display client, which receives the text from the server and shows it on the dedicated monitor&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The alternative architecture is a single process (a single running program) that captures, transcribes, displays.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;Monolithic&lt;/th&gt;
&lt;th&gt;Decoupled&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Deploy&lt;/td&gt;
&lt;td&gt;a single binary&lt;/td&gt;
&lt;td&gt;three components&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distribution across multiple computers&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;yes (native)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testability&lt;/td&gt;
&lt;td&gt;internal dependencies&lt;/td&gt;
&lt;td&gt;each component in isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Communication overhead&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;network calls&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choice: decoupled. It works both in development, with everything on one computer (localhost), and at the conference, with three separate computers: audio client in the control room near the mixer, server on any computer connected to the network, and display client on the computer that drives the monitor. The monolith instead locks everything onto a single computer, and the code couples the components: tests and replacements require more work. With more rooms the bill gets worse: you'd need a full copy of the system per room (audio, server, display for each). The decoupled version shares a single server across all rooms, and each room only adds an audio-and-display client on the same computer, or, to avoid running a long cable across the room, a second display client near the monitor.&lt;/p&gt;

&lt;h3&gt;Audio client: browser or standalone?&lt;/h3&gt;

&lt;p&gt;The audio to transcribe has different sources depending on the context: laptop microphone in local tests, Universal Serial Bus (USB) or analog mixer in the room, browser loopback for live apps like StreamYard. Who picks up this flow and sends it to the server?&lt;/p&gt;

&lt;p&gt;Two candidates: the browser app with &lt;code&gt;getUserMedia&lt;/code&gt; (&lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt;'s path), or a standalone Python script launched from the audio computer.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;In the browser&lt;/th&gt;
&lt;th&gt;Standalone Python script&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;System devices (mixer)&lt;/td&gt;
&lt;td&gt;limited&lt;/td&gt;
&lt;td&gt;full access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Browser dependency&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testability&lt;/td&gt;
&lt;td&gt;medium&lt;/td&gt;
&lt;td&gt;high&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choice: standalone Python with &lt;code&gt;sounddevice&lt;/code&gt;. At a conference, audio doesn't come from the speaker's laptop microphone, but from a room mixer or a dedicated microphone connected via USB. The browser's Web Audio APIs don't expose virtual sinks and USB mixers as separate devices. Instead, a Python script with &lt;code&gt;sounddevice&lt;/code&gt; sees all the devices the operating system exposes, loopback and mixer included.&lt;/p&gt;

&lt;h3&gt;
  
  
  Protocol between audio client and server
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt; used Web Real-Time Communication (WebRTC); &lt;code&gt;video-to-text&lt;/code&gt; used WebSocket (WS) instead. Which makes sense here?&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;WebRTC&lt;/th&gt;
&lt;th&gt;WS&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Bidirectionality&lt;/td&gt;
&lt;td&gt;required&lt;/td&gt;
&lt;td&gt;not needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network setup&lt;/td&gt;
&lt;td&gt;Network Address Translation (NAT), Traversal Using Relays around NAT (TURN), Interactive Connectivity Establishment (ICE)&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reliability&lt;/td&gt;
&lt;td&gt;path-dependent&lt;/td&gt;
&lt;td&gt;persistent connection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complexity&lt;/td&gt;
&lt;td&gt;high&lt;/td&gt;
&lt;td&gt;low&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choice: WS. The audio client sends, the server receives. Bidirectionality isn't needed, so WebRTC is overkill. Persistence, on the other hand, is: a talk lasts tens of minutes, audio goes out in chunks every 100 ms, and on the server the same pipe keeps the Amazon Transcribe stream open for the whole session. WS covers both without the WebRTC layers.&lt;/p&gt;
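&lt;p&gt;To get a feel for what travels on that pipe, a quick back-of-the-envelope in Python. The capture format is an assumption (16 kHz, 16-bit mono PCM is typical for Amazon Transcribe streaming, but the project's actual settings may differ):&lt;/p&gt;

```python
# Rough sizing of the audio WS channel, assuming 16 kHz 16-bit mono
# PCM -- hypothetical values, adjust to the real capture settings.
SAMPLE_RATE = 16_000   # samples per second
BYTES_PER_SAMPLE = 2   # 16-bit PCM
CHUNK_MS = 100         # one chunk every 100 ms, as described above

def chunk_bytes():
    """Size of one audio chunk on the wire."""
    return SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_MS // 1000

def session_megabytes(minutes):
    """Total audio payload for a talk of the given length."""
    chunks_per_second = 1000 // CHUNK_MS
    return chunk_bytes() * chunks_per_second * 60 * minutes / 1_000_000

print(chunk_bytes())          # 3200 bytes per chunk
print(session_megabytes(40))  # 76.8 MB for a 40-minute talk
```

&lt;p&gt;At roughly 32 KB/s per room, even a modest network carries several rooms in parallel without strain.&lt;/p&gt;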

&lt;h3&gt;
  
  
  Transcript channel between server and display
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt; used Server-Sent Events (SSE); &lt;code&gt;video-to-text&lt;/code&gt; used WS. Which one here?&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;SSE&lt;/th&gt;
&lt;th&gt;WS&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fits the case&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tech already in use&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;yes (for audio)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Duplicate code&lt;/td&gt;
&lt;td&gt;a second handler&lt;/td&gt;
&lt;td&gt;same stack&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choice: WS. SSE would technically be enough (unidirectional server -&amp;gt; client, fine for the transcript). But WS is already in the house for the audio channel: keeping a single technology means a single stack of handlers server-side and a single client-side library, instead of two.&lt;/p&gt;

&lt;h3&gt;
  
  
  Partial results vs final
&lt;/h3&gt;

&lt;p&gt;Amazon Transcribe sends both partials (text that keeps changing until the segment is stable) and finals (stable). To compare the two delivery modes in the field, the display supports both via the &lt;code&gt;?partial=true|false&lt;/code&gt; flag: chosen at runtime, not at build time.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;Partial on by default&lt;/th&gt;
&lt;th&gt;Partial off by default&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Readability on the monitor&lt;/td&gt;
&lt;td&gt;low (changing text)&lt;/td&gt;
&lt;td&gt;high&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Perceived latency&lt;/td&gt;
&lt;td&gt;good&lt;/td&gt;
&lt;td&gt;medium&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choice: off by default. A dedicated monitor whose text writes, erases, and rewrites itself is unpleasant to look at. Partials can be turned on via &lt;code&gt;?partial=true&lt;/code&gt; on the display if, in a specific room, the delay of finals turns out to be a problem.&lt;/p&gt;
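&lt;p&gt;The runtime flag boils down to a small piece of display logic. A minimal sketch (illustrative names, not the project's actual code): with partials on, the last line is rewritten in place until the segment is stable; with partials off, only finals are appended.&lt;/p&gt;

```python
# The two display modes as pure logic. Each shown line is a
# (kind, text) pair, where kind is "partial" or "final".
def apply_event(shown, text, is_partial, partial_enabled):
    """Return the new list of displayed lines after one event."""
    if is_partial and not partial_enabled:
        return shown                          # default: ignore changing text
    if shown and shown[-1][0] == "partial":
        kind = "partial" if is_partial else "final"
        return shown[:-1] + [(kind, text)]    # rewrite the unstable line
    return shown + [("partial" if is_partial else "final", text)]

# partial=false (default): nothing flickers, only stable text appears
lines = apply_event([], "hello wor", is_partial=True, partial_enabled=False)
lines = apply_event(lines, "hello world", is_partial=False, partial_enabled=False)
print(lines)  # [('final', 'hello world')]
```

&lt;p&gt;The default path never shows text that will later be erased, which is exactly the readability criterion of the table.&lt;/p&gt;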

&lt;h3&gt;
  
  
  Language: zero restart between talks
&lt;/h3&gt;

&lt;p&gt;Amazon Transcribe wants the language when opening the stream (&lt;code&gt;language_code="it-IT"&lt;/code&gt; or &lt;code&gt;"en-US"&lt;/code&gt;). At PyCon, rooms host consecutive talks in different languages: Italian, English. Two paths: the language as a global server configuration, or as a per-connection parameter of the audio client.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;Global in the server&lt;/th&gt;
&lt;th&gt;Per-room parameter&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Language change between talks&lt;/td&gt;
&lt;td&gt;server restart&lt;/td&gt;
&lt;td&gt;zero restart&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scalability to multiple rooms in parallel&lt;/td&gt;
&lt;td&gt;all same language&lt;/td&gt;
&lt;td&gt;each room its own&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choice: per-room parameter. With the global version, a restart would be needed at every language change (or a proxy that discriminates per path, complicating things). With the per-room parameter, the server stays up for the whole day, and the audio client reopens at the next talk with the right language (&lt;code&gt;?lang=it-IT&lt;/code&gt; or &lt;code&gt;?lang=en-US&lt;/code&gt;). And it also works with multiple rooms in parallel: each room has its own language, independent of the others.&lt;/p&gt;

&lt;p&gt;Concretely: every WS connection is an independent handler in FastAPI, and each one opens its own Amazon Transcribe stream with its own language. There's no shared state between streams, so the language of one room cannot affect another.&lt;/p&gt;
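&lt;p&gt;The per-connection parameter can be sketched with the standard library alone (the real server reads it inside its FastAPI WS handler; the endpoint and parameter names here are illustrative):&lt;/p&gt;

```python
# Reading the per-connection language from the WS URL. Each
# connection computes its own value and passes it to its own
# Transcribe stream, so two rooms never share it.
from urllib.parse import parse_qs, urlparse

SUPPORTED = {"it-IT", "en-US"}

def language_for(ws_url, default="en-US"):
    """Extract ?lang=... from a connection URL, falling back safely."""
    lang = parse_qs(urlparse(ws_url).query).get("lang", [default])[0]
    return lang if lang in SUPPORTED else default

print(language_for("ws://server:8000/ws/audio/room-a?lang=it-IT"))  # it-IT
print(language_for("ws://server:8000/ws/audio/room-b"))             # en-US
```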

&lt;h3&gt;
  
  
  Display: dynamic app or static HTML ?
&lt;/h3&gt;

&lt;p&gt;In this case, the display is what the audience looks at: a dedicated monitor with text scrolling in as it arrives. It must update in real time, receiving messages from the server, but does nothing else: no forms, no interaction.&lt;/p&gt;

&lt;p&gt;Two paths: a dynamic app (React, Vue or similar, with build and state management), or a static HTML page with a bit of JS that opens a WS and appends text.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;Dynamic app&lt;/th&gt;
&lt;th&gt;Static HTML + JS&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Client-side state&lt;/td&gt;
&lt;td&gt;possible&lt;/td&gt;
&lt;td&gt;only via WS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deploy&lt;/td&gt;
&lt;td&gt;requires build&lt;/td&gt;
&lt;td&gt;file served by the server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reuse from &lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;yes (CSS + JS)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choice: static HTML. No client-side state needed: the browser opens the page, receives text via WS, shows it. No build. And the CSS of &lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt;'s &lt;code&gt;screen&lt;/code&gt; mode gets reused as is.&lt;/p&gt;

&lt;h3&gt;
  
  
  Choices at a glance
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;realtime-transcription&lt;/code&gt; choices don't come out of nowhere: some are new decisions for the live use case, others are pieces lifted from the two predecessors. Here they are in a row, with the source of inspiration. For the sequence diagram with WS endpoints and message flow, see the &lt;a href="https://github.com/bilardi/realtime-transcription#architecture" rel="noopener noreferrer"&gt;README of the repo&lt;/a&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;th&gt;Winning option&lt;/th&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;STT&lt;/td&gt;
&lt;td&gt;Amazon Transcribe Streaming&lt;/td&gt;
&lt;td&gt;no hallucinations&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;video-to-text&lt;/code&gt; (transcribe_service)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Repo&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;td&gt;less tech debt&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Architecture&lt;/td&gt;
&lt;td&gt;decoupled (3 components)&lt;/td&gt;
&lt;td&gt;reuse from predecessors, deploy flexibility&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audio client&lt;/td&gt;
&lt;td&gt;standalone Python&lt;/td&gt;
&lt;td&gt;full access to system devices&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audio protocol&lt;/td&gt;
&lt;td&gt;WS&lt;/td&gt;
&lt;td&gt;persistent connection, minimal network setup&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transcript channel&lt;/td&gt;
&lt;td&gt;WS&lt;/td&gt;
&lt;td&gt;single stack server + client&lt;/td&gt;
&lt;td&gt;&lt;code&gt;video-to-text&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Partial vs final&lt;/td&gt;
&lt;td&gt;flag &lt;code&gt;?partial=true|false&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;readability on the monitor&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;per room&lt;/td&gt;
&lt;td&gt;zero restart between talks, scales to more rooms&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Display&lt;/td&gt;
&lt;td&gt;static HTML&lt;/td&gt;
&lt;td&gt;no build, reuse of existing work&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;realtime-transcription-fastrtc&lt;/code&gt; (CSS + JS &lt;code&gt;screen&lt;/code&gt; mode)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The stories you only find when you plug things in
&lt;/h2&gt;

&lt;p&gt;The real fun starts when you stop drawing and turn on the machines.&lt;/p&gt;

&lt;h3&gt;
  
  
  The device number on Fedora
&lt;/h3&gt;

&lt;p&gt;The first time I ran &lt;code&gt;uv run python -m audio_client --list-devices&lt;/code&gt; I found myself facing a long list where the same hardware (my headphones in the docking station jack) showed up multiple times, with similar names and different IDs. On Linux several audio layers coexist (ALSA at the kernel level, JACK for pro audio, PipeWire as the modern sound server) and &lt;code&gt;sounddevice&lt;/code&gt; lists them all: each exposes the same device, each is a candidate on paper.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;Device ID&lt;/th&gt;
&lt;th&gt;Outcome&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ALSA&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;doesn't work as one might expect&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JACK&lt;/td&gt;
&lt;td&gt;25&lt;/td&gt;
&lt;td&gt;doesn't work as one might expect&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PipeWire (system default)&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;works (it's the active routing of the system)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;There's no rule that helps you pick a priori: it depends on what the system uses as its default routing. On Fedora 41 that's PipeWire, so the "right" ID was 20. I tried all three before figuring out the logic.&lt;/p&gt;

&lt;p&gt;Rule of thumb: if the audio doesn't get where it should, try all the candidates before touching the code.&lt;/p&gt;
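&lt;p&gt;The triage can be automated a little. A sketch over plain dicts shaped like &lt;code&gt;sounddevice.query_devices()&lt;/code&gt; entries (the indexes and names below are illustrative, mirroring the Fedora case in the table; the real script queries the library):&lt;/p&gt;

```python
# Shortlist the input-capable candidates from a device listing.
devices = [
    {"index": 1,  "name": "HD-Audio Generic: ALSA (hw:1,0)", "max_input_channels": 2},
    {"index": 20, "name": "pipewire",                        "max_input_channels": 64},
    {"index": 25, "name": "JACK Audio Connection Kit",       "max_input_channels": 2},
]

def input_candidates(devices, name_filter=""):
    """Indexes of input-capable devices whose name matches the filter."""
    return [d["index"] for d in devices
            if d["max_input_channels"] > 0
            and name_filter.lower() in d["name"].lower()]

print(input_candidates(devices))              # [1, 20, 25]: try them all
print(input_candidates(devices, "pipewire"))  # [20]: the system default here
```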

&lt;h3&gt;
  
  
  The browser loopback
&lt;/h3&gt;

&lt;p&gt;One of the audio sources to transcribe is StreamYard, which is a browser app: the speaker's audio goes out of the browser to the system's default sink. &lt;code&gt;audio_client&lt;/code&gt; with &lt;code&gt;sounddevice&lt;/code&gt; can capture from system devices (microphone, USB mixer), but can't read directly from an app's output. A bridge is needed: a virtual sink the browser writes to, and whose monitor &lt;code&gt;audio_client&lt;/code&gt; reads from.&lt;/p&gt;

&lt;p&gt;On Linux with PipeWire (or PulseAudio) the bridge is &lt;code&gt;module-null-sink&lt;/code&gt;. You load a sink called &lt;code&gt;loopback&lt;/code&gt;, you move the browser's stream onto it, and you point &lt;code&gt;audio_client&lt;/code&gt; at the null-sink's monitor. It works on the first try, but there's a side effect: while the browser's stream is on the null-sink, I can't hear it in my headphones anymore. In the room that's not a problem (audio comes from the physical mixer, not from the laptop browser). In development it is: I can't verify what I'm transcribing.&lt;/p&gt;

&lt;p&gt;I tried three paths: two deaf, one hearing clearly.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;audio_client hears&lt;/th&gt;
&lt;th&gt;Headphones hear&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;module-null-sink&lt;/code&gt; + move browser&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;base setup, muted on the laptop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;module-combine-sink&lt;/code&gt; with slaves&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;failed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;module-null-sink&lt;/code&gt; + &lt;code&gt;module-loopback&lt;/code&gt; as a parallel branch&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes (+~50ms)&lt;/td&gt;
&lt;td&gt;adopted solution&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The path that works is &lt;code&gt;module-loopback&lt;/code&gt; as a parallel branch. The null-sink &lt;code&gt;loopback&lt;/code&gt; stays source for &lt;code&gt;audio_client&lt;/code&gt;; on top you load a &lt;code&gt;module-loopback&lt;/code&gt; that reads from the null-sink's monitor and writes to the default sink. Two independent consumers on the same monitor, neither blocks the other.&lt;/p&gt;

&lt;p&gt;The ~50ms is &lt;code&gt;module-loopback&lt;/code&gt;'s buffer. For the transcription nothing changes: the &lt;code&gt;audio_client&lt;/code&gt; branch is unaffected. The 50ms is only the delay between what leaves the browser and what I hear in headphones.&lt;/p&gt;

&lt;p&gt;Everything is wrapped in two &lt;code&gt;make&lt;/code&gt; commands: &lt;code&gt;make loopback_redirect APP=firefox&lt;/code&gt; (which also accepts &lt;code&gt;MONITOR=1&lt;/code&gt; for the listening branch to the headphones) and &lt;code&gt;make loopback_clean&lt;/code&gt;, which cleans everything up.&lt;/p&gt;

&lt;p&gt;Practical choice: default &lt;code&gt;MONITOR=0&lt;/code&gt;. At the conference audio comes from the mixer, not the laptop, so hearing it locally isn't needed. &lt;code&gt;MONITOR=1&lt;/code&gt; is a development luxury.&lt;/p&gt;

&lt;h2&gt;
  
  
  How much hardware do you need?
&lt;/h2&gt;

&lt;p&gt;I haven't benchmarked the system on specific hardware yet, so I'm basing this on typical sizes of similar Python applications. Better to oversize than to pick the bare minimum: on a real deploy you want margin, not to crash on the first spike.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;RAM/CPU&lt;/th&gt;
&lt;th&gt;Recommended example&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Audio client&lt;/td&gt;
&lt;td&gt;~50-100MB&lt;/td&gt;
&lt;td&gt;Pi 4 2GB with USB mic&lt;/td&gt;
&lt;td&gt;Pi 3 technically enough but tight&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server&lt;/td&gt;
&lt;td&gt;~100-200MB base + ~30-50MB per room&lt;/td&gt;
&lt;td&gt;EC2 t4g.small (2GB, ARM) or Pi 4 4-8GB&lt;/td&gt;
&lt;td&gt;Pi 4 handles 1-2 rooms; EC2 for more&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Display client&lt;/td&gt;
&lt;td&gt;~200-300MB for Chromium&lt;/td&gt;
&lt;td&gt;Pi 4 4GB&lt;/td&gt;
&lt;td&gt;Pi 4 2GB technically enough but tight&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
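&lt;p&gt;The server row of the table, as a formula: worst-case resident memory as rooms grow, using the upper ends of the estimates above (estimates, not measurements):&lt;/p&gt;

```python
# Worst-case server memory from the sizing table's estimates.
BASE_MB = 200      # FastAPI server at rest (upper estimate)
PER_ROOM_MB = 50   # per open room: audio WS + Transcribe stream + display WS

def server_mb(rooms):
    """Estimated resident memory for the given number of rooms."""
    return BASE_MB + PER_ROOM_MB * rooms

for rooms in (1, 2, 8):
    print(rooms, server_mb(rooms), "MB")  # 250, 300, 600 MB
```

&lt;p&gt;Even eight rooms stay well inside the 2GB of a t4g.small, which is why the margin matters more than the baseline.&lt;/p&gt;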

&lt;p&gt;Three deploy scenarios:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Recommended device&lt;/th&gt;
&lt;th&gt;When and why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;All separate&lt;/td&gt;
&lt;td&gt;Pi 4 2GB (audio) + EC2 t4g.small (server) + Pi 4 4GB (display)&lt;/td&gt;
&lt;td&gt;Multi-room conference; server in cloud for sharing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All together&lt;/td&gt;
&lt;td&gt;A laptop with 8GB, or a Pi 4 8GB with USB mic&lt;/td&gt;
&lt;td&gt;Development, local demo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audio + server together, display separate&lt;/td&gt;
&lt;td&gt;Pi 4 8GB (audio+server) + Pi 4 4GB (display)&lt;/td&gt;
&lt;td&gt;A single room, zero cloud; the audio Pi also hosts the server&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For one room, two Pis are enough. With a Pi 5 (server) you can push to 2-3 rooms; beyond that, EC2 is the way. EC2 or a more powerful laptop are natural upgrades anywhere, if you want more margin.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anything else to add?
&lt;/h2&gt;

&lt;p&gt;What's there today is good enough for one room, with any computer connected to the network. But the design holds beyond, when it's worth it.&lt;/p&gt;

&lt;h3&gt;
  
  
  More rooms, same setup
&lt;/h3&gt;

&lt;p&gt;If many rooms in parallel are needed, the infrastructure can be handled with &lt;a href="https://github.com/bilardi/aws-docker-host" rel="noopener noreferrer"&gt;aws-docker-host&lt;/a&gt;, which spins up an Elastic Compute Cloud (EC2) instance with Docker ready to use. The &lt;code&gt;realtime-transcription&lt;/code&gt; server already ships with docker compose, and the opening image describes exactly this scenario.&lt;/p&gt;

&lt;h3&gt;
  
  
  When one EC2 isn't enough: ECS Fargate
&lt;/h3&gt;

&lt;p&gt;If there are many rooms and the load varies, a single static EC2 becomes tight. Fargate (part of Elastic Container Service, ECS) spins up tasks on demand and shuts them down when the load drops. But live transcription lives on long-lived WS connections, and from the AWS documentation there are some points to configure with care (I haven't tested them on this project):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sticky sessions&lt;/strong&gt;: a one-hour WS connection must stay on the same Fargate task. The Application Load Balancer (ALB) supports WS, but the session must be routed with affinity. No per-packet round-robin.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idle timeout&lt;/strong&gt;: the ALB target group default is 60 seconds of inactivity. A 20-second pause between sentences isn't inactivity (the client sends silence every 100ms), but it's worth raising the timeout to a few minutes for safety.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graceful shutdown&lt;/strong&gt;: during a deploy or a scale-in, the task that's closing must let open Transcribe streams finish, not cut off mid-talk. The container must handle &lt;code&gt;SIGTERM&lt;/code&gt; and close the WSs gracefully, giving the client time to reconnect to a different task.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Authentication on the WebSockets
&lt;/h3&gt;

&lt;p&gt;Today the WSs are open: anyone who knows &lt;code&gt;/ws/audio/{sala}&lt;/code&gt; can inject audio, anyone who knows &lt;code&gt;/ws/transcript/{sala}&lt;/code&gt; can listen in. For a deploy on a Local Area Network (LAN) or a private cloud behind a Virtual Private Network (VPN) that's perfectly fine. On the public internet you need at least:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a token in the path or query (e.g. &lt;code&gt;?token=...&lt;/code&gt;), validated at connect&lt;/li&gt;
&lt;li&gt;rate limit per Internet Protocol (IP) on the audio channel&lt;/li&gt;
&lt;li&gt;permission separation: whoever can write on room X may not necessarily be allowed to read it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are the minimum requirements to expose the endpoints on the public internet.&lt;/p&gt;
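&lt;p&gt;The first bullet, sketched: a constant-time token check at connect (hedged: the token store, its format, and the room names are deployment choices, not the project's code; &lt;code&gt;hmac.compare_digest&lt;/code&gt; avoids timing leaks):&lt;/p&gt;

```python
# Validate the ?token=... a client presents when opening its WS.
import hmac

ROOM_TOKENS = {"room-a": "s3cret-token"}  # illustrative; load from config, never hardcode

def authorized(room, token):
    """True only if the token matches the one issued for that room."""
    expected = ROOM_TOKENS.get(room)
    return expected is not None and hmac.compare_digest(expected, token)

print(authorized("room-a", "s3cret-token"))  # True
print(authorized("room-a", "wrong"))         # False
print(authorized("room-x", "s3cret-token"))  # False: unknown room
```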

</description>
      <category>aws</category>
      <category>transcribe</category>
      <category>docker</category>
      <category>fastapi</category>
    </item>
    <item>
      <title>Docker on EC2 with Terraform</title>
      <dc:creator>Alessandra Bilardi</dc:creator>
      <pubDate>Fri, 10 Apr 2026 22:25:12 +0000</pubDate>
      <link>https://dev.to/bilardi/docker-on-ec2-with-terraform-41lp</link>
      <guid>https://dev.to/bilardi/docker-on-ec2-with-terraform-41lp</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff629k4bxool0cbahalah.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff629k4bxool0cbahalah.png" alt="Architecture" width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this project
&lt;/h2&gt;

&lt;p&gt;I was preparing a &lt;a href="https://github.com/bilardi/n8n-workshop" rel="noopener noreferrer"&gt;workshop&lt;/a&gt; and needed to expose a URL with a specific interface, sparing participants from installing Docker or anything else on their machines.&lt;/p&gt;

&lt;p&gt;I built the workshop locally with docker compose, which is one of the ways to develop and test locally: it works, it's fast, it's reproducible. And then?&lt;/p&gt;

&lt;p&gt;Then you need to move everything to the cloud. And as a lazy developer, why not use that same docker compose?&lt;/p&gt;

&lt;p&gt;The point isn't running Docker in the cloud - it's everything around it: HTTPS, custom domain, machine access, data backups, and the ability to rebuild or tear it all down with one command.&lt;/p&gt;

&lt;p&gt;With IaC you can manage HTTPS, custom domain, backups, access and cleanup smoothly: everything in one place, versioned, reproducible. Without IaC, you start from scratch every time.&lt;/p&gt;

&lt;p&gt;The usual options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Manual EC2 setup&lt;/strong&gt;: SSH in, install Docker, configure nginx, certbot, and pray. Slow, fragile, and hard to reproduce.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ECS/Fargate&lt;/strong&gt;: task definition, service discovery, cluster… for what? Using Fargate for a single container is like hiring a moving truck to carry your groceries home.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker on EC2 with Terraform&lt;/strong&gt;: one &lt;code&gt;terraform apply&lt;/code&gt; to spin up, one &lt;code&gt;bash scripts/destroy.sh&lt;/code&gt; to tear down. Backups included.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The third option is what I chose because it has the simplest architecture… and the most complex part depends on your user data!&lt;/p&gt;

&lt;p&gt;The architecture in the image above is generated directly from the Terraform code (spoiler) in the &lt;a href="https://github.com/bilardi/aws-docker-host" rel="noopener noreferrer"&gt;repo&lt;/a&gt;, where you can find the README.md and all the details to use it.&lt;/p&gt;

&lt;p&gt;But let's take it step by step. The third option can be implemented in 1024 different ways: which IaC tool? How do you handle HTTPS? How do you access the machine? Where do you store backups? How do you manage DNS? Which AMI? It depends. The point is asking the right questions.&lt;/p&gt;

&lt;p&gt;As a lazy developer, every choice follows one criterion: less effort, in terms of time, cost, or both. And when less effort isn't enough to decide, the cleanest path is a minimal system: you know what's there, you know what's missing, no surprises.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Terraform and not CDK
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Terraform&lt;/th&gt;
&lt;th&gt;CDK&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;HCL: declarative, simple&lt;/td&gt;
&lt;td&gt;TypeScript/Python: powerful but verbose for simple infra&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;State&lt;/td&gt;
&lt;td&gt;Local file, zero dependencies&lt;/td&gt;
&lt;td&gt;Requires CloudFormation stack, S3 bucket for assets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bootstrap&lt;/td&gt;
&lt;td&gt;&lt;code&gt;terraform init&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cdk bootstrap&lt;/code&gt; already creates resources in your AWS account&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Learning curve&lt;/td&gt;
&lt;td&gt;Low for simple infra&lt;/td&gt;
&lt;td&gt;Need to know both CDK and CloudFormation… and their quirks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Destruction&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;terraform destroy&lt;/code&gt;: clean, predictable&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cdk destroy&lt;/code&gt;, which sometimes leaves orphaned resources&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For an ephemeral workshop run by one person, Terraform with local state is the minimum effort. CDK makes sense when the infra grows, you need complex logic, or there's a team involved.&lt;/p&gt;

&lt;h2&gt;
  
  
  The choices and why
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;th&gt;Why (less effort)&lt;/th&gt;
&lt;th&gt;The discarded alternative (more effort)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ALB + ACM&lt;/td&gt;
&lt;td&gt;Free HTTPS certificate, auto-renewal, no certbot/nginx&lt;/td&gt;
&lt;td&gt;Let's Encrypt on EC2: port 80 open, cron for renewal, more moving parts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSM instead of SSH&lt;/td&gt;
&lt;td&gt;No keys, no port 22, audit trail on CloudTrail&lt;/td&gt;
&lt;td&gt;SSH key pair, SG rules, bastion if private subnet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 for backups&lt;/td&gt;
&lt;td&gt;Costs nothing, survives the EC2, simple CLI&lt;/td&gt;
&lt;td&gt;EBS snapshot: tied to instance lifecycle, harder to restore&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Route 53 hosted zone&lt;/td&gt;
&lt;td&gt;DNS validation for ACM, alias record for ALB, all managed by Terraform&lt;/td&gt;
&lt;td&gt;External DNS only: manual certificate validation or HTTP challenge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Amazon Linux 2023 minimal&lt;/td&gt;
&lt;td&gt;Clean AMI, you install only what you need&lt;/td&gt;
&lt;td&gt;AL2023 standard: doesn't have Docker anyway, but has hundreds of extra packages you don't need&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker compose up --build&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Works with both &lt;code&gt;build&lt;/code&gt; and &lt;code&gt;image&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Separate logic for build vs pull: pointless complexity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local state&lt;/td&gt;
&lt;td&gt;The workshop is ephemeral, one operator, no team&lt;/td&gt;
&lt;td&gt;Remote state (S3 + DynamoDB): cost and setup for zero benefit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conditional VPC&lt;/td&gt;
&lt;td&gt;Three modes: use an existing VPC, find the default, or create a new one&lt;/td&gt;
&lt;td&gt;Always new VPC: waste for a workshop running in the default VPC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conditional S3 bucket&lt;/td&gt;
&lt;td&gt;Pass one and it uses it. Don't, and it creates one named after the domain&lt;/td&gt;
&lt;td&gt;Always new bucket: waste for someone running many workshops and just managing backups&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  What I learned (the hard way)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The right AMI and how much disk
&lt;/h3&gt;

&lt;p&gt;As a lazy developer, instead of reading the documentation, one command to see what's out there:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 describe-images &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="s2"&gt;"Name=name,Values=al2023-ami-*-x86_64"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--owners&lt;/span&gt; amazon &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'reverse(sort_by(Images, &amp;amp;CreationDate))[:10].[Name, BlockDeviceMappings[0].Ebs.VolumeSize]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three variants: &lt;strong&gt;minimal&lt;/strong&gt; (2 GB), &lt;strong&gt;standard&lt;/strong&gt; (8 GB), &lt;strong&gt;ECS-optimized&lt;/strong&gt; (30 GB). The ECS one comes with Docker but is meant to run in an ECS cluster, not on a standalone EC2. Standard and minimal don't have Docker: you need to install it either way.&lt;/p&gt;

&lt;p&gt;At that point, what does the standard have that minimal doesn't? The SSM agent and a few hundred packages you don't need. The &lt;a href="https://docs.aws.amazon.com/linux/al2023/ug/image-comparison.html" rel="noopener noreferrer"&gt;package comparison page&lt;/a&gt; confirms it: no Docker, no buildx, nothing that changes the picture.&lt;/p&gt;

&lt;p&gt;Minimal is the cleanest choice: install Docker, the SSM agent, and buildx in the user data, and you know exactly what's on the machine. One thing to watch: the 2 GB disk isn't enough; set &lt;code&gt;volume_size = 20&lt;/code&gt; and move on.&lt;/p&gt;

&lt;h3&gt;
  
  
  ssm-user is not root
&lt;/h3&gt;

&lt;p&gt;When you connect with &lt;code&gt;aws ssm start-session&lt;/code&gt;, you're &lt;code&gt;ssm-user&lt;/code&gt;. You don't have access to the Docker socket. Everything needs &lt;code&gt;sudo&lt;/code&gt;. Commands sent with &lt;code&gt;aws ssm send-command&lt;/code&gt; run as &lt;code&gt;root&lt;/code&gt; though, so sudo is built in.&lt;/p&gt;

&lt;h3&gt;
  
  
  buildx: no buildx, no build
&lt;/h3&gt;

&lt;p&gt;From Docker Compose v2.17+ the &lt;code&gt;--build&lt;/code&gt; flag requires buildx &amp;gt;= 0.17.0. The minimal AMI doesn't have it. Without buildx, &lt;code&gt;docker compose up --build&lt;/code&gt; fails even if no service uses &lt;code&gt;build&lt;/code&gt;: install it in the user data and forget about it.&lt;/p&gt;

&lt;h3&gt;
  
  
  That damn cache
&lt;/h3&gt;

&lt;p&gt;After a destroy + redeploy, the new Route 53 hosted zone gets different nameservers. You update the NS records on the DNS provider, everything looks fine. But the browser says no.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;dig @8.8.8.8&lt;/code&gt; tells you it's all good. But your local resolver disagrees.&lt;/p&gt;

&lt;p&gt;What happens: your ISP's resolver has cached the old SERVFAIL, and until it expires, that domain doesn't exist as far as the resolver is concerned.&lt;/p&gt;

&lt;p&gt;The fix: temporarily switch your local DNS to Google's (&lt;code&gt;8.8.8.8&lt;/code&gt;) and wait for your provider's cache to expire: they say 5-10 minutes, but sometimes it's (way) longer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anything else to add?
&lt;/h2&gt;

&lt;p&gt;When it's not a workshop of a few hours but something that lasts weeks or months, it's worth investing extra effort to make the system hold up over time. But remember: it's always a temporary solution!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;More subdomains&lt;/strong&gt;: more applications on the same ALB, with routing rules, separate target groups, and potentially more containers on the same EC2 or, if needed, dedicated EC2s per service&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tactical scheduling&lt;/strong&gt;: start/stop the EC2 to save money off-hours, periodic backups with EventBridge + SSM, not just at destroy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CloudWatch alarms&lt;/strong&gt;: basic monitoring (CPU, disk, health check) with SNS notifications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-recovery&lt;/strong&gt;: ASG with min=max=1 to replace dying instances (user data restores everything from S3)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spot instances&lt;/strong&gt;: for workshops that tolerate interruptions, ~70% cost reduction&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>terraform</category>
      <category>docker</category>
      <category>aws</category>
      <category>ec2</category>
    </item>
  </channel>
</rss>
