<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Antonio Feregrino</title>
    <description>The latest articles on DEV Community by Antonio Feregrino (@feregri_no).</description>
    <link>https://dev.to/feregri_no</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3891%2Fa214417e-d7a0-4031-ae66-e45fa309b76a.png</url>
      <title>DEV Community: Antonio Feregrino</title>
      <link>https://dev.to/feregri_no</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/feregri_no"/>
    <language>en</language>
    <item>
      <title>Understanding (a bit of) the Gradle Kotlin DSL</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Sun, 12 Jan 2025 18:22:09 +0000</pubDate>
      <link>https://dev.to/feregri_no/understanding-a-bit-of-the-gradle-kotlin-dsl-57kg</link>
      <guid>https://dev.to/feregri_no/understanding-a-bit-of-the-gradle-kotlin-dsl-57kg</guid>
      <description>&lt;p&gt;Ever since I started learning Kotlin I have been intrigued by what is going on behind the scenes in the &lt;code&gt;build.gradle.kts&lt;/code&gt; file. At first glance, the file looks confusing, and for a newbie like me, it is hard to comprehend how the snippet below even has Kotlin-valid syntax.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nf"&gt;plugins&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;kotlin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"jvm"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"1.9.25"&lt;/span&gt;
  &lt;span class="nf"&gt;kotlin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"plugin.spring"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"1.9.25"&lt;/span&gt;
  &lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"org.springframework.boot"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"3.4.1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To fully understand how is this Kotlin there are a couple of mechanisms at play here that you must understand.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lambdas with receivers&lt;/li&gt;
&lt;li&gt;Infix functions&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What is &lt;code&gt;plugins&lt;/code&gt;?
&lt;/h3&gt;

&lt;p&gt;Internally, &lt;code&gt;plugins&lt;/code&gt; is a function whose signature looks somewhat like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;plugins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;PluginDependenciesSpecScope&lt;/span&gt;&lt;span class="p"&gt;.()&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Unit&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;Unit&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells us that &lt;code&gt;plugins&lt;/code&gt; is a function that takes a single argument named &lt;code&gt;block&lt;/code&gt; which known as a lambda with receiver.&lt;/p&gt;

&lt;p&gt;A lambda with receiver is a special kind of lambda that allows you to call methods on a specific object (the receiver) within the lambda's body without explicitly referencing it.&lt;/p&gt;

&lt;p&gt;For example, in our function signature above, we can see that the &lt;em&gt;receiving&lt;/em&gt; type is &lt;code&gt;PluginDependenciesSpecScope&lt;/code&gt; from the package &lt;code&gt;org.gradle.kotlin.dsl&lt;/code&gt;. This lambda function does not receive any arguments, as indicated by the empty parentheses after the receiver type specification &lt;code&gt;.()&lt;/code&gt;, and it does not return anything meaningful, as indicated by the &lt;code&gt;-&amp;gt; Unit&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;A potential implementation of the &lt;code&gt;plugins&lt;/code&gt; function could be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;plugins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;PluginDependenciesSpecScope&lt;/span&gt;&lt;span class="p"&gt;.()&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Unit&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;Unit&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;scope&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PluginDependenciesSpecScope&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;block&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While Gradle is way more complex, this is just a very simple example of how the function could be implemented. The key part is that behind the scenes, an instance of &lt;code&gt;PluginDependenciesSpecScope&lt;/code&gt; is created for you, and &lt;code&gt;block&lt;/code&gt;, being an extension method, can be called directly on this instance.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is &lt;code&gt;kotlin&lt;/code&gt; and &lt;code&gt;id&lt;/code&gt;?
&lt;/h3&gt;

&lt;p&gt;Both &lt;code&gt;kotlin&lt;/code&gt; and &lt;code&gt;id&lt;/code&gt; are extension methods of &lt;code&gt;PluginDependenciesSpecScope&lt;/code&gt;, the reason behind one being able to call these methods directly without specifying the &lt;em&gt;receiver&lt;/em&gt; is because this receiver is implicit, but if you wanted to, you reference it directly using the &lt;code&gt;this&lt;/code&gt; keyword:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nf"&gt;plugins&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kotlin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"jvm"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"1.9.25"&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kotlin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"plugin.spring"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"1.9.25"&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"org.springframework.boot"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"3.4.1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both these methods follow the fluent pattern, meaning they return the same object they acted on. For example, the signature of the &lt;code&gt;kotlin&lt;/code&gt; method looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nc"&gt;PluginDependenciesSpec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kotlin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;module&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;PluginDependencySpec&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The difference between &lt;code&gt;kotlin&lt;/code&gt; and &lt;code&gt;id&lt;/code&gt;&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;Behind the scenes, &lt;code&gt;kotlin&lt;/code&gt; is just a shorthand for &lt;code&gt;id&lt;/code&gt;, when you call &lt;code&gt;kotlin("jvm")&lt;/code&gt;, this methods calls &lt;code&gt;id&lt;/code&gt; but appends the prefix &lt;code&gt;org.jetbrains.kotlin.&lt;/code&gt; to whatever string you passed, so &lt;code&gt;kotlin("jvm")&lt;/code&gt; is equal to &lt;code&gt;id("org.jetbrains.kotlin.jvm")&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  So, what is &lt;code&gt;version&lt;/code&gt;?
&lt;/h3&gt;

&lt;p&gt;The missing piece in my understanding of the Kotlin Gradle DSL is the &lt;code&gt;version&lt;/code&gt; in the plugin definition, at first I thought this was a keyword, however, it turns out is an extension method, but not any kind of method but an &lt;em&gt;infix&lt;/em&gt; one.&lt;/p&gt;

&lt;p&gt;The signature of the &lt;code&gt;version&lt;/code&gt; method looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="k"&gt;infix&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nc"&gt;PluginDependencySpec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;PluginDependencySpec&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, it takes operates on a &lt;code&gt;PluginDependencySpec&lt;/code&gt; but takes in a &lt;code&gt;String&lt;/code&gt; and returns a &lt;code&gt;PluginDependencySpec&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;An infix function can be called without using dot notation and parentheses. It's declared using the &lt;code&gt;infix&lt;/code&gt; keyword and must have a single parameter. This allows for more readable, natural language-like expressions in code.&lt;/p&gt;

&lt;p&gt;So we could further change our plugin block definition to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nf"&gt;plugins&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kotlin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"jvm"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"1.9.25"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kotlin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"plugin.spring"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"1.9.25"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"org.springframework.boot"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"3.4.1"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;h3&gt;
  
  
  On Lambdas with receivers
&lt;/h3&gt;

&lt;p&gt;While the approach of using lambdas with receivers is powerful for creating DSLs, it can initially be confusing for those new to Kotlin or Gradle. This technique allows for a more natural, declarative syntax, enabling code that looks like special language constructs when it's actually just method calls on an implicit receiver.&lt;/p&gt;

&lt;p&gt;Once grasped, this results in more readable and expressive code that closely resembles the domain it's describing - in this case, Gradle build configurations. However, the implicit nature of the receiver can be a source of confusion for beginners trying to understand what's happening "behind the scenes.”&lt;/p&gt;

&lt;h3&gt;
  
  
  On fluent interfaces and infix methods
&lt;/h3&gt;

&lt;p&gt;The fluent pattern and infix methods aim to create a chain of method calls that read almost like natural language. In theory, this should enhance readability and make the build script more intuitive. However, for those not deeply familiar with Gradle, Kotlin, or the concept of DSLs, this syntax can initially seem magical or confusing.&lt;/p&gt;

&lt;p&gt;It's not immediately obvious that &lt;code&gt;version&lt;/code&gt; is a method, for instance, or why we can omit parentheses in some places but not others. While it allows for a clear separation of concerns, each method in the chain responsible for a specific aspect of plugin configuration requires some learning and adjustment to fully appreciate and utilise effectively.&lt;/p&gt;

&lt;h3&gt;
  
  
  As for me…
&lt;/h3&gt;

&lt;p&gt;While I'm still getting accustomed to the Gradle Kotlin DSL and occasionally find it confusing, I've come to appreciate its power and elegance. Building DSLs is indeed one of Kotlin's strengths, and understanding these concepts opens up exciting possibilities for creating expressive and maintainable code.&lt;/p&gt;

&lt;p&gt;By exploring the mechanics behind Gradle's Kotlin DSL, I feel I have demystified a bit of its syntax and gained insights into advanced Kotlin features that can be applied in various contexts. When mastered, these patterns and techniques can significantly enhance our ability to write clean, readable, and powerful code.&lt;/p&gt;

&lt;p&gt;I hope this dive into the Gradle Kotlin DSL has been as enlightening for you as it has been for me. Whether you're a seasoned Kotlin developer or just starting out, understanding these concepts can greatly improve your grasp of Gradle and Kotlin's capabilities.&lt;/p&gt;

</description>
      <category>gradle</category>
      <category>kotlin</category>
      <category>dsl</category>
      <category>androiddev</category>
    </item>
    <item>
      <title>Creando un notebook con Jupyter y Kotlin</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Mon, 06 Jan 2025 21:48:34 +0000</pubDate>
      <link>https://dev.to/feregri_no/creando-un-notebook-con-jupyter-y-kotlin-5g8d</link>
      <guid>https://dev.to/feregri_no/creando-un-notebook-con-jupyter-y-kotlin-5g8d</guid>
      <description>&lt;h2&gt;
  
  
  Introducción
&lt;/h2&gt;

&lt;p&gt;Recientemente, comencé a sumergirme en el mundo de Kotlin, un lenguaje de programación moderno y versátil que ha captado mi atención. Sin embargo, como alguien acostumbrado al entorno interactivo de Jupyter, que permite iteraciones rápidas y una exploración fluida del código, me preguntaba si existía algo similar para Kotlin.&lt;/p&gt;

&lt;p&gt;Para mi agradable sorpresa, descubrí que existe un &lt;a href="https://github.com/Kotlin/kotlin-jupyter" rel="noopener noreferrer"&gt;kernel de Jupyter para Kotlin&lt;/a&gt;. Esta herramienta combina la potencia y elegancia de Kotlin con la interactividad y facilidad de uso de Jupyter, creando un ambiente de desarrollo ideal para aprender y experimentar con el lenguaje.&lt;/p&gt;

&lt;p&gt;En este post, compartiré mi experiencia configurando un entorno de Jupyter con soporte para Kotlin, e incluso iré un paso más allá, creando un notebook que permite trabajar con múltiples lenguajes simultáneamente.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creando un contenedor con Kotlin
&lt;/h2&gt;

&lt;p&gt;La instalación del kernel de Kotlin para Jupyter es relativamente sencilla, especialmente si utilizamos Docker para crear un entorno controlado y reproducible. Veamos el Dockerfile que he creado para este propósito – revisa los comentarios para entender cada paso:&lt;/p&gt;

&lt;h3&gt;
  
  
  Dockerfile
&lt;/h3&gt;

&lt;p&gt;Comenzamos con una imagen oficial de Jupyter descargada de quay.io. Usamos una versión específica para asegurar la reproducibilidad y etiquetamos la imagen como &lt;code&gt;kotlin-kernel&lt;/code&gt; para identificarla fácilmente.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;quay.io/jupyter/base-notebook:2024-12-31&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;kotlin-kernel&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instalamos OpenJDK 21, necesario para ejecutar Kotlin, la instalación se realiza como root para evitar problemas de permisos y luego cambiamos al usuario no-root para asegurar la seguridad de la imagen.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; root&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nb"&gt;install &lt;/span&gt;openjdk-21-jdk

&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; jovyan&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instalamos el kernel de Kotlin para Jupyter, esto nos permitirá ejecutar código Kotlin en nuestro notebook.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--user&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    kotlin-jupyter-kernel&lt;span class="o"&gt;==&lt;/span&gt;0.12.0.322
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Creamos un directorio para almacenar los notebooks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /home/jovyan/notebooks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Por último, establecemos la variable de entorno &lt;code&gt;NOTEBOOK_ARGS&lt;/code&gt; que permite configurar el notebook con las opciones que necesitemos, en este caso, no queremos que se abra un navegador automáticamente y queremos que el directorio de notebooks sea &lt;code&gt;/home/jovyan/notebooks&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NOTEBOOK_ARGS="--no-browser --notebook-dir=/home/jovyan/notebooks"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Para construir la imagen Docker, ejecutamos:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;--target&lt;/span&gt; kotlin-kernel &lt;span class="nt"&gt;-t&lt;/span&gt; kotlin-kernel &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Este comando construye la imagen Docker y la etiqueta como &lt;code&gt;kotlin-kernel&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Para ejecutar el contenedor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-p&lt;/span&gt; 8888:8888 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/notebooks:/home/jovyan/notebooks &lt;span class="se"&gt;\&lt;/span&gt;
    kotlin-kernel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Este comando:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ejecuta el contenedor en modo interactivo (&lt;code&gt;-it&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Mapea el puerto 8888 del contenedor al puerto 8888 del host (&lt;code&gt;-p 8888:8888&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Monta el directorio local &lt;code&gt;notebooks&lt;/code&gt; en el directorio &lt;code&gt;:/home/jovyan/notebooks&lt;/code&gt; del contenedor (&lt;code&gt;-v $(pwd)/notebooks::/home/jovyan/notebooks&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Una vez ejecutado, podrás acceder a JupyterLab en tu navegador y verás que el Launcher ya tiene dos kernels disponibles: Python y Kotlin.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Ftwo-kernels%3FupdatedAt%3D1735657648064" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Ftwo-kernels%3FupdatedAt%3D1735657648064" alt="Kernels disponibles" width="468" height="260"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Y de hecho, ya podemos crear notebooks con Kotlin!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Frunning-kotlin%3FupdatedAt%3D1735657858774" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Frunning-kotlin%3FupdatedAt%3D1735657858774" alt="Notebook con Kotlin" width="765" height="250"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  El siguiente paso en interactividad
&lt;/h2&gt;

&lt;p&gt;Al profundizar en Kotlin, noté algunas similitudes interesantes con Python. Esto me llevó a querer visualizar estas similitudes de manera más detallada, creando comparaciones directas entre los dos lenguajes. Me pregunté si sería posible ejecutar código Python y Kotlin en el mismo notebook, y resulta que sí es posible.&lt;/p&gt;

&lt;p&gt;Descubrí una extensión (y kernel de Jupyter) llamada SoS (&lt;a href="https://vatlab.github.io/sos-docs/" rel="noopener noreferrer"&gt;Script of Scripts&lt;/a&gt;) que permite esta funcionalidad. Decidí agregarla a mi contenedor con el kernel de Kotlin. Aquí están las adiciones al Dockerfile:&lt;/p&gt;

&lt;h3&gt;
  
  
  Actualización del Dockerfile
&lt;/h3&gt;

&lt;p&gt;Instalamos SoS, que nos permitirá ejecutar código Python y Kotlin en el mismo notebook.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--user&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    sos-notebook&lt;span class="o"&gt;==&lt;/span&gt;0.24.4 &lt;span class="se"&gt;\
&lt;/span&gt;    jupyterlab-sos&lt;span class="o"&gt;==&lt;/span&gt;0.11.0 &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nv"&gt;sos&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;0.25.1 &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    python &lt;span class="nt"&gt;-m&lt;/span&gt; sos_notebook.install
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Con estas adiciones, ahora podemos construir y ejecutar nuestro contenedor mejorado:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;-t&lt;/span&gt; jupyter-kotlin &lt;span class="nb"&gt;.&lt;/span&gt;

docker run &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-p&lt;/span&gt; 8888:8888 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/notebooks:/home/jovyan/notebooks &lt;span class="se"&gt;\&lt;/span&gt;
    jupyter-kotlin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Al acceder a JupyterLab ahora, verás tres kernels disponibles: Python, Kotlin y SoS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Fthree-kernels%3FupdatedAt%3D1735658830005" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Fthree-kernels%3FupdatedAt%3D1735658830005" alt="Kernels disponibles" width="534" height="254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Y ahora podemos ejecutar código Python y Kotlin en el mismo notebook:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Fjupyter-kotlin-indicators%3FupdatedAt%3D1735659227991" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Fjupyter-kotlin-indicators%3FupdatedAt%3D1735659227991" alt="Notebook con Python y Kotlin" width="497" height="230"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Personalización extra
&lt;/h2&gt;

&lt;p&gt;Para mejorar la experiencia visual y distinguir fácilmente entre las celdas de diferentes lenguajes, decidí personalizar la apariencia de las celdas.&lt;/p&gt;

&lt;p&gt;Jupyter Notebook permite agregar CSS personalizado, lo que nos permite añadir gradientes a la izquierda de cada celda, dependiendo del lenguaje.&lt;/p&gt;

&lt;p&gt;Aquí está el CSS que utilicé:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight css"&gt;&lt;code&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nt"&gt;class&lt;/span&gt;&lt;span class="o"&gt;*=&lt;/span&gt;&lt;span class="s1"&gt;"sos_lan__python"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; 
    &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;linear-gradient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;90deg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;222&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;87&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="m"&gt;10px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;69&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;132&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;182&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="m"&gt;10px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;69&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;132&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;182&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="m"&gt;20px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;254&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;254&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;254&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="m"&gt;20px&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nt"&gt;class&lt;/span&gt;&lt;span class="o"&gt;*=&lt;/span&gt;&lt;span class="s1"&gt;"sos_lan__kotlin"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;linear-gradient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;90deg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;180&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;140&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;252&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="m"&gt;0px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;196&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;22&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;224&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="m"&gt;6px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;223&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;73&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;107&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="m"&gt;16px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;223&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;73&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;107&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="m"&gt;20px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="m"&gt;20px&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Para implementar esta personalización, guardé el CSS en un archivo llamado &lt;code&gt;custom.css&lt;/code&gt; y lo agregué al Dockerfile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Copy the custom.css file&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; custom.css ${HOME}/.jupyter/custom/custom.css&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Además, es necesario especificar al comando &lt;code&gt;jupyter lab&lt;/code&gt; que queremos usar este CSS personalizado, añadiendo la bandera &lt;code&gt;--custom-css&lt;/code&gt; al comando de ejecución.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NOTEBOOK_ARGS="${NOTEBOOK_ARGS} --custom-css"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Fjupyter-kotlin%3FupdatedAt%3D1735659227973" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Fjupyter-kotlin%3FupdatedAt%3D1735659227973" alt="Notebook con CSS personalizado" width="588" height="243"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Errores y cómo esconderlos
&lt;/h2&gt;

&lt;p&gt;Durante el uso del kernel de múltiples lenguajes, ocasionalmente aparece un error cuando se ejecuta una celda de Kotlin. Este error se muestra de forma aleatoria y, aunque aún no he logrado identificar su origen ni cómo resolverlo de manera definitiva, he encontrado una solución temporal para mejorar la experiencia del usuario.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Fkotlin-with-error%3FupdatedAt%3D1735659310180" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fik.imagekit.io%2Fthatcsharpguy%2Fposts%2Fdocker%2Fkotlin-kernel%2Fkotlin-with-error%3FupdatedAt%3D1735659310180" alt="Error de Kotlin" width="800" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Para ocultar este error molesto, decidí utilizar CSS. Agregué la siguiente línea al archivo &lt;code&gt;custom.css&lt;/code&gt; mencionado anteriormente:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight css"&gt;&lt;code&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nt"&gt;class&lt;/span&gt;&lt;span class="o"&gt;*=&lt;/span&gt;&lt;span class="s1"&gt;"sos_lan__kotlin"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nt"&gt;data-mime-type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;"application/vnd.jupyter.stderr"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; 
    &lt;span class="nl"&gt;display&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;none&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; 
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Esta línea de CSS oculta los mensajes de error específicos de Kotlin en el notebook. Aunque no es una solución ideal, ya que podría ocultar errores importantes, mejora significativamente la experiencia visual al trabajar con notebooks de Kotlin, especialmente cuando se trata de este error recurrente y aparentemente inofensivo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusión
&lt;/h2&gt;

&lt;p&gt;En este post, hemos explorado cómo crear un entorno de desarrollo interactivo para Kotlin utilizando Jupyter Notebooks.&lt;/p&gt;

&lt;p&gt;Comenzamos con la configuración básica de un contenedor Docker con soporte para Kotlin, luego avanzamos hacia un entorno más sofisticado que permite la ejecución de código en múltiples lenguajes dentro del mismo notebook.&lt;/p&gt;

&lt;p&gt;Además, hemos visto cómo personalizar la apariencia de nuestros notebooks para mejorar la experiencia visual y la legibilidad, y cómo &lt;em&gt;"esconder"&lt;/em&gt; algunos errores comunes que pueden surgir durante el uso de estos notebooks.&lt;/p&gt;

&lt;p&gt;Esto no solo facilita el aprendizaje de Kotlin, sino que también permite realizar comparaciones directas con otros lenguajes como Python, lo cual puede ser extremadamente útil para desarrolladores que están haciendo la transición a Kotlin o que trabajan regularmente con múltiples lenguajes de programación.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recursos adicionales
&lt;/h2&gt;

&lt;p&gt;Para aquellos interesados en explorar más a fondo o replicar este entorno, he puesto a disposición todo el código utilizado en &lt;a href="https://github.com/fferegrino/jupyter-kotlin" rel="noopener noreferrer"&gt;este proyecto en mi repositorio de GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Espero que esta guía te sea útil en tu viaje de aprendizaje con Kotlin y Jupyter.&lt;/p&gt;

</description>
      <category>jupyter</category>
      <category>python</category>
      <category>kotlin</category>
      <category>docker</category>
    </item>
    <item>
      <title>Más allá del pickle: el verdadero resultado de un equipo de aprendizaje automático</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Mon, 16 Sep 2024 19:06:46 +0000</pubDate>
      <link>https://dev.to/feregri_no/mas-alla-del-pickle-el-verdadero-resultado-de-un-equipo-de-aprendizaje-automatico-4k35</link>
      <guid>https://dev.to/feregri_no/mas-alla-del-pickle-el-verdadero-resultado-de-un-equipo-de-aprendizaje-automatico-4k35</guid>
      <description>&lt;p&gt;Imagínate esto: Un científico de datos recibe un problema, desaparece en las profundidades del &lt;em&gt;data warehouse&lt;/em&gt; durante meses y emerge triunfante con un archivo &lt;em&gt;pickle&lt;/em&gt;. ¿Qué contiene ese archivo? Un modelo que presume de una precisión impresionante y que está listo para que los ingenieros de software lo pongan en producción.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;¡Lo hemos conseguido! ¡hicimos machine learning!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Er... no, la verdad es que no.&lt;/p&gt;

&lt;p&gt;En los primeros días del ML -y seamos sinceros, esto sigue siendo así en muchas empresas hoy en día- todo giraba en torno a esta criatura mítica conocida como «El Científico de Datos». Se esperaba que lo hicieran todo, desde la gestión de los datos hasta la creación de modelos y el despliegue en producción. Esta es sin duda una idea seductora, pero llena de problemas.&lt;/p&gt;

&lt;h2&gt;
  
  
  La dura realidad
&lt;/h2&gt;

&lt;p&gt;Algunos de los problemas que surgen de este enfoque y otros similares son los siguientes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Mala asignación de competencias: Los científicos de datos son brillantes, pero sus habilidades a menudo se utilizan mejor en otros ámbitos. Sus habilidades de codificación pueden no ser las mejores cuando se trata de código optimizado, y estoy bastante seguro de que para la mayoría de ellos, escribir miles de líneas de SQL no es algo que les fascine.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Desconexión de la producción: Este enfoque mantiene a los científicos de datos en la oscuridad sobre las realidades de los despliegues de producción, y aunque esto suena como una buena idea para algunos de ellos, en realidad les está quitando oportunidades de aprendizaje que podrían informar sus futuras oportunidades de trabajo.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Problemas de produccionalización: Cuando llega el momento del despliegue, los ingenieros de MLOps, ML y software a menudo se encuentran rascándose la cabeza, tratando de descifrar el código del científico de datos y entender cómo se supone que debe resolver el problema. Es como recibir un rompecabezas al que le faltan la mitad de las piezas. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Discrepancia en &lt;em&gt;features&lt;/em&gt;: Dado que en un data &lt;em&gt;warehouse&lt;/em&gt;, es fácil usar todas las que encontremos. El modelo puede construirse con características que ni siquiera están disponibles cuando llega el momento de ejecutar la inferencia en el mundo real. Imagínese entrenarse para un maratón en casa en una caminadora y luego sorprenderse al descubrir que la carrera es en el Himalaya. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Falta de escalabilidad: Cuando cada proyecto depende de una sola persona o de un pequeño equipo que se encarga de todo de principio a fin, resulta casi imposible ampliar las operaciones o asumir varios proyectos simultáneamente.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dificultad de mantenimiento: Una vez implantado el modelo, ¿quién supervisa su rendimiento? ¿Quién lo actualiza cuando inevitablemente empieza a degradarse? A menudo, cuando hay que reentrenar el modelo el científico de datos ya está trabajando en el siguiente proyecto.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;¿El resultado de todo esto? Interesados frustrados que se dan golpecitos en los pies, preguntándose por qué todo tarda tanto o se rompe en pedazos, a veces lentamente, a veces con un fuerte estruendo. La promesa de ML se convierte en un juego de espera.&lt;/p&gt;

&lt;p&gt;No estoy aquí para señalar a nadie. Los científicos de datos son increíbles -incluso intenté ser uno de ellos-, pero no tuve su paciencia ni sus habilidades. A lo que quiero llegar es a la importancia de la colaboración, las herramientas y la cultura para que los proyectos de ML tengan éxito.&lt;/p&gt;

&lt;h2&gt;
  
  
  De las fábricas de modelos a las fábricas de fábricas
&lt;/h2&gt;

&lt;p&gt;Esta es mi propuesta: En lugar de ver el producto final como un modelo entrenado en producción, ¿qué pasaría si apuntáramos a algo más grande? ¿Y si nuestro objetivo fuera crear código que cree, reentrene y supervise modelos?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;En otras palabras, dejemos de ser una fábrica de modelos para convertirnos en una fábrica de fábricas de modelos.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fry6pxbdm4mkadhbcz19h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fry6pxbdm4mkadhbcz19h.png" alt="Image description" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No se trata sólo de una diferencia semántica: es un cambio fundamental para abordar los proyectos de aprendizaje automático.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ventajas del enfoque de fábrica de fábricas
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Reproducibilidad mejorada: No dependemos de la suerte al entrenar un modelo o de la intuición de un único científico de datos. Cada paso del proceso está documentado y es repetible. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mejor gestión de errores: Gracias a los  de supervisión y formación, a menudo podemos solucionar los problemas sin necesidad de recurrir al científico de datos cada vez que algo va mal.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Escalabilidad mejorada: Una vez que disponemos de la infraestructura necesaria para crear y gestionar modelos de forma automática, podemos gestionar varios proyectos simultáneamente sin un aumento lineal de la carga de trabajo. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tiempo de comercialización más rápido: Con canales automatizados para la preparación de datos, la formación de modelos y el despliegue, podemos pasar de la idea a la producción mucho más rápidamente.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mejor asignación de recursos: Los científicos de datos pueden centrarse en tareas de alto valor, como la ingeniería de características y la arquitectura de modelos, en lugar de estancarse en tareas operativas. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mayor colaboración: Los procesos claramente definidos y las interfaces compartidas facilitan que los miembros del equipo con diferentes especialidades trabajen juntos de forma eficaz. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mejora continua: Con el reentrenamiento y la supervisión automatizados, los modelos pueden adaptarse continuamente a los nuevos datos, manteniendo su rendimiento a lo largo del tiempo sin intervención manual. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mejora de la gobernanza y el cumplimiento: Los pipelines automatizados facilitan la implementación y el cumplimiento de las normas para el manejo de datos, la validación de modelos y el despliegue.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Hacerlo realidad: La ruta hacia el éxito
&lt;/h2&gt;

&lt;p&gt;Convertirse en una fábrica de fábricas no se consigue de la noche a la mañana. Requiere un cambio fundamental tanto en las herramientas como en la cultura.&lt;br&gt;
He aquí una lista no exhaustiva de lo que se necesita:&lt;/p&gt;

&lt;h3&gt;
  
  
  Herramientas esenciales
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Almacenes de datos bien documentados: Ya sea un &lt;em&gt;data lake&lt;/em&gt;, un &lt;em&gt;warehouse&lt;/em&gt; o una &lt;em&gt;feature store&lt;/em&gt;, los datos deben estar listos para que los científicos de datos los consuman. La documentación debe incluir el linaje y la disponibilidad de las funciones (es decir, ¿esta función está disponible en tiempo real?). Los almacenes de datos también deben ser fáciles de modificar y añadir funciones según sea necesario. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Plataformas de experimentación: Los científicos de datos necesitan entornos en los que puedan probar ideas rápidamente sin preocuparse por romper los sistemas de producción. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Infraestructura específica de ML: Ya sea una plataforma en la nube o una solución local, la infraestructura debe estar optimizada para las demandas de las cargas de trabajo de aprendizaje automático. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Seguimiento de experimentos, artefactos y modelos: Herramientas que permitan a los equipos realizar un seguimiento de los diferentes experimentos, comparar versiones de modelos y comprender qué funciona y qué no. Esto también podría servir como catálogo de modelos, un lugar donde todo el mundo pueda saber qué está en producción. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Frameworks&lt;/em&gt; de desarrollo y despliegue: Herramientas que funcionen a la perfección desde el desarrollo hasta la producción, minimizando la necesidad de reescribir el código o de complejas transferencias de código. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Soluciones de supervisión automatizadas: Sistemas que pueden realizar un seguimiento del rendimiento del modelo, la deriva de los datos y otras métricas clave sin supervisión humana constante, pero capaces de alertar cuando las cosas van mal.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Sistema de control de versiones: No basta con hacer un seguimiento de los modelos y experimentos; también es necesario hacer un seguimiento del código que los crea, que es nuestro principal producto.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Plataformas de colaboración: Herramientas que facilitan la comunicación y el intercambio de conocimientos entre los diferentes roles dentro y fuera del equipo de ML.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cambios culturales necesarios
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Científicos de datos más interesados: Necesitan sentirse cómodos con una gama más amplia de herramientas. No buscamos un científico de datos que lo sepa todo, sino uno que entienda y aproveche las herramientas que tiene a su disposición. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Inversión en infraestructura: Las empresas deben comprender que invertir en las herramientas y plataformas adecuadas reporta dividendos a largo plazo. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Planificación inclusiva: Todo el mundo, incluido el personal técnico, debe tener un sitio en la mesa de negociación, desde el principio de un proyecto y a la hora de definir su alcance y etapas. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Compromiso de las partes interesadas: Tiene que haber voluntad de probar y comprobar los resultados intermedios. El aprendizaje automático no es una solución mágica; necesita una retroalimentación temprana y constante. Los equipos deben sentirse cómodos con la mejora continua y la iteración. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Pensamiento a largo plazo: Las organizaciones deben planificar el mantenimiento y la mejora continuos de sus sistemas de ML. El software se degrada rápidamente, y el aprendizaje automático se degrada aún más porque es software MÁS datos desordenados. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Comunicación mejorada: Las reuniones regulares y los canales de comunicación claros entre los equipos de datos, los equipos de ML y otras partes de la organización son esenciales. Debido a su naturaleza, el ML tiende a estar al final de la cadena alimentaria de los datos y, como tal, a menudo es olvidado por los generadores de datos. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Equipos interfuncionales: Romper los silos entre los científicos de datos, los ingenieros de ML, los desarrolladores de software y los expertos en el dominio conduce a soluciones más sólidas y prácticas. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Nota sobre AutoML
&lt;/h4&gt;

&lt;p&gt;Algunas personas ven AutoML como una solución para adoptar el aprendizaje automático en sus organizaciones. Estas plataformas intentan automatizar muchos aspectos del proceso de ML y ofrecen ventajas como una estrecha integración con los datos, la democratización del ML, una supervisión sencilla y una comercialización más rápida.&lt;/p&gt;

&lt;p&gt;Sin embargo, las herramientas AutoML tienen posibles inconvenientes: el riesgo de dependencia del proveedor, la personalización limitada para necesidades empresariales específicas, las posibles restricciones de despliegue y la dificultad para comprender lo que ocurre bajo el capó.&lt;/p&gt;

&lt;p&gt;Aunque valiosas, las herramientas AutoML no pueden reemplazar la necesidad de canalizaciones ML personalizadas a medida que las operaciones se vuelven más complejas; son sólo una herramienta en el conjunto de herramientas ML y no niegan la necesidad de cambios. Puede considerarlas un punto de partida, pero prepárese para complementarlas o sustituirlas por soluciones personalizadas a medida que maduren sus operaciones de ML.&lt;/p&gt;

&lt;h2&gt;
  
  
  La verdadera medida del éxito
&lt;/h2&gt;

&lt;p&gt;Esto no significa que tengas que tener todas las piezas en su sitio ahora mismo para considerarse un éxito. Conseguirlo lleva tiempo y es un proceso. Cada paso que das hacia esta visión es un progreso. Empieza poco a poco, céntrate en una parte de tu flujo de trabajo y sigue avanzando a partir de ahí.&lt;/p&gt;

&lt;p&gt;Y si tienes que empezar por algún sitio, hazlo por la cultura. Las herramientas van y vienen, pero una base cultural sólida te servirá independientemente de las tecnologías concretas que utilices.&lt;/p&gt;

&lt;p&gt;A medida que esta cultura arraigue, le resultará más fácil identificar qué herramientas necesita y cómo implantarlas con eficacia. También estará mejor posicionado para utilizar estas herramientas de una manera que realmente mejore sus capacidades de ML en lugar de simplemente añadir complejidad.&lt;/p&gt;

&lt;p&gt;El verdadero objetivo de un equipo de aprendizaje automático nunca debería ser crear un único modelo perfecto o tener el &lt;em&gt;stack&lt;/em&gt; tecnológico más elegante. Se trata de crear un sistema (y un equipo) que pueda producir, mejorar y mantener modelos eficaces de forma constante. Así es como realmente aprovecharemos el poder del aprendizaje automático de una manera que sea sostenible, escalable y realmente útil.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recuerda, fábricas de fábricas.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;[Gracias a Ned Webster y Lorraine D'Almeida por sus comentarios sobre este artículo].&lt;/p&gt;

</description>
      <category>mlops</category>
      <category>machinelearning</category>
      <category>softwaredevelopment</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Face detection in movie trailers</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Thu, 29 Aug 2024 19:28:06 +0000</pubDate>
      <link>https://dev.to/feregri_no/face-detection-in-movie-trailers-1di0</link>
      <guid>https://dev.to/feregri_no/face-detection-in-movie-trailers-1di0</guid>
      <description>&lt;p&gt;I recently started &lt;a href="https://www.youtube.com/@AntonioFeregrino/videos" rel="noopener noreferrer"&gt;a new YouTube channel&lt;/a&gt; where I review movies and TV shows. To make my videos a little bit more interesting, I show scenes from the trailers and the original movies instead of just my talking head.&lt;/p&gt;

&lt;p&gt;In particular, if I speak of a particular character, I like to show the scenes where that character appears. This is a process that I started doing manually by scrolling through the video and carefully selecting the scenes where the character appears, but then I thought, "&lt;em&gt;Wait a minute, this could be a job for a computer.&lt;/em&gt;" &lt;/p&gt;

&lt;p&gt;The following is a step-by-step guide on how I did it and how you can do it too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Overall architecture
&lt;/h2&gt;

&lt;p&gt;I started by thinking of this process as a data pipeline: starting with the full-length video, detecting scene changes, and then finding the faces in those scenes. With the faces extracted, I can perform clustering to group the faces belonging to the same individual. Once I have the clusters, I can just re-stitch the video with the scenes where the faces are from the same cluster.&lt;/p&gt;

&lt;p&gt;A pipeline that looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbdub848yotm8gmic2hzx.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbdub848yotm8gmic2hzx.jpeg" alt="An initial pipeline design" width="800" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Everything starts with a video
&lt;/h2&gt;

&lt;p&gt;I will be using a movie from the Malayalam film industry – a movie I recently reviewed on my channel, the movie is called &lt;em&gt;Ullozhukku&lt;/em&gt; and this is its trailer:&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/iElmR97W024"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;I downloaded the video and saved it in my movies folder, and I will be using the path in my local machine:&lt;/p&gt;

&lt;p&gt;I downloaded the video and saved it in my movies folder, and I will be using the path in my local machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;original_video_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Ullozhukku-Trailer.mp4&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Scene detection
&lt;/h2&gt;

&lt;p&gt;Thankfully, there are libraries to help us out with scene change detection, such as &lt;code&gt;scenedetect&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;You can install it via pip:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;scenedetect[opencv]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It could not be easier to use the &lt;code&gt;detect&lt;/code&gt; function along with a &lt;code&gt;ContentDetector&lt;/code&gt; to detect the scenes in the video:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;scenedetect&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;detect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ContentDetector&lt;/span&gt;

&lt;span class="n"&gt;scenes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;detect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_video_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ContentDetector&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If needed you can further customize the scene detection by passing arguments to the &lt;code&gt;ContentDetector&lt;/code&gt; constructor.&lt;/p&gt;

&lt;p&gt;The return value of the &lt;code&gt;detect&lt;/code&gt; function is a list of tuples, where each tuple contains a scene's start and end time in the shape of a &lt;code&gt;FrameTimecode&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;&lt;code&gt;FrameTimecode&lt;/code&gt; has methods to get the exact frame number and seconds.&lt;/p&gt;

&lt;p&gt;In my case, I'll be introducing a dataclass to store the scene data more conveniently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dataclasses&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dataclass&lt;/span&gt;


&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Scene&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;end_time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;start_frame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;end_frame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;

    &lt;span class="nd"&gt;@property&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;


&lt;span class="n"&gt;detected_scenes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nc"&gt;Scene&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get_seconds&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get_seconds&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get_frames&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get_frames&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;scenes&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  First frame extraction
&lt;/h2&gt;

&lt;p&gt;We will work under the assumption that the first frame of each scene is the most representative frame of the scene – after all, the scene is defined by dramatic changes between frames.&lt;/p&gt;

&lt;p&gt;Using &lt;code&gt;opencv&lt;/code&gt; we can easily extract frames from a video, but first we need to turn the video into a &lt;code&gt;VideoCapture&lt;/code&gt; object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;

&lt;span class="n"&gt;video&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VideoCapture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_video_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can use the &lt;code&gt;read()&lt;/code&gt; method of the &lt;code&gt;VideoCapture&lt;/code&gt; object to get the current frame in the video. However, we need to set the position of the video to the start of the scene we want to extract; we can do this using the &lt;code&gt;set()&lt;/code&gt; method of the &lt;code&gt;VideoCapture&lt;/code&gt; object.&lt;/p&gt;

&lt;p&gt;We need to do this for each scene in the video:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;first_frames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;detected_scenes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;video&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CAP_PROP_POS_FRAMES&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_frame&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;video&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;first_frames&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cvtColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COLOR_BGR2RGB&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;.read()&lt;/code&gt; returns a tuple, where the first element is a boolean indicating whether the frame was successfully read, and the second element is the frame itself.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;.cvtColor()&lt;/code&gt; is used to convert the frame from BGR to RGB, which is the format expected by some image processing libraries in Python.&lt;/p&gt;

&lt;p&gt;For example, these are the first frames of six random scenes in the original video:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyj4361d45oakdhl9ezuv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyj4361d45oakdhl9ezuv.png" alt="Some scenes from the original video" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Face detection
&lt;/h2&gt;

&lt;p&gt;We will use the &lt;code&gt;ultralytics&lt;/code&gt; library to detect the faces in the video – be aware that you may need to install the &lt;code&gt;torch&lt;/code&gt; and &lt;code&gt;torchvision&lt;/code&gt; libraries too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;ultralytics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With &lt;code&gt;ultralytics&lt;/code&gt; we can use a YOLO model to detect the faces in the video, in this case we need to use a custom model that has been trained to detect faces such as &lt;code&gt;yolov5s_face_relu6.pt&lt;/code&gt; that I got from &lt;a href="https://github.com/DeGirum/ultralytics_yolov8/releases/tag/v1.0.0" rel="noopener noreferrer"&gt;this repository&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;ultralytics&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;YOLO&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;YOLO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;yolov5s_face_relu6.pt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we call the model with a frame, it will return a list of results (with a single element), this single element is a complex object we need to treat before accessing the bounding box coordinates, the confidence score, and the class ID.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_frames&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;boxes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x1: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, y1: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, x2: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, y2: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, confidence: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, class_id: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;class_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I will introduce another dataclass to store the detected faces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DetectedFace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we can iterate over the results and extract the detected faces, at this point we can also filter out the faces with low confidence, and just to be sure, we can check that the class ID is 0 (which corresponds to the &lt;code&gt;Human face&lt;/code&gt; class):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;faces_in_frame&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;detections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;boxes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;det&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;detections&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;det&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;class_id&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Human face&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;faces_in_frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;DetectedFace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And just as a sanity check, let's plot the detected faces for one frame:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh3v0alktfeohe1ha87hr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh3v0alktfeohe1ha87hr.png" alt="Detected faces in a video scene" width="515" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Face embedding
&lt;/h2&gt;

&lt;p&gt;Before clustering the faces, we need to extract the face embeddings. These embeddings are a numerical representation of the face that approximately encode facial features that we can then use to compare and cluster the faces.&lt;/p&gt;

&lt;p&gt;We will use the &lt;code&gt;face_recognition&lt;/code&gt; library to extract the face embeddings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;face-recognition
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;face_recognition&lt;/code&gt; library provides a &lt;code&gt;face_encodings&lt;/code&gt; function that can extract the face embeddings from a frame given a set of face locations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;face_recognition&lt;/span&gt;

&lt;span class="n"&gt;encodings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;face_recognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;face_encodings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;first_frames&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;face&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;face&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;face&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;face&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;face&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt;
 &lt;span class="n"&gt;faces_in_frame&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;face_encodings&lt;/code&gt; function returns a list of encodings, where each encoding is a numpy array of 128 values.&lt;/p&gt;

&lt;p&gt;We can use these encodings to compare and cluster the faces.&lt;/p&gt;

&lt;h2&gt;
  
  
  Putting it all together
&lt;/h2&gt;

&lt;p&gt;But before doing that, we need to detect the faces in each scene and extract the face embeddings, we also need to keep track of the scene id for each face to be able to retrieve the original scenes later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_faces&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;detections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;boxes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;det&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;detections&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;det&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;class_id&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;conf&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;DetectedFace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_encodings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detections&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;face_recognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;face_encodings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;detection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;detection&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;detections&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;face_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;detected_faces&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;encodings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;face_id_to_scene&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;scene_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_frames&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;face_detection_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;detect_faces&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;detection&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;face_detection_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;detected_faces&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;detection&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;face_id_to_scene&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;face_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;scene_id&lt;/span&gt;
        &lt;span class="n"&gt;face_id&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;encodings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;extract_encodings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;face_detection_results&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By now we have a list of detected faces, a list of encodings, and a dictionary that maps each scene to the detected faces in that scene.&lt;/p&gt;

&lt;h2&gt;
  
  
  Clustering
&lt;/h2&gt;

&lt;p&gt;We will use the &lt;code&gt;scikit-learn&lt;/code&gt; library to perform the clustering.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;scikit-learn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will use the &lt;code&gt;DBSCAN&lt;/code&gt; algorithm to cluster the faces. &lt;code&gt;DBSCAN&lt;/code&gt; is a clustering algorithm that groups together points that are close to each other, and separates points that are far apart.&lt;/p&gt;

&lt;p&gt;It's a powerful algorithm that can find arbitrarily shaped clusters, and it doesn't require the number of clusters to be specified beforehand.&lt;/p&gt;

&lt;p&gt;It identifies high-density regions (clusters) and low-density regions (noise). It groups together points that are close to each other and separates points that are far apart.&lt;/p&gt;

&lt;p&gt;It requires two parameters to be set:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;eps&lt;/code&gt;: the maximum distance between two samples to be considered in the same neighbourhood.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;min_samples&lt;/code&gt;: the minimum number of samples in a neighbourhood for a point to be considered a core point.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can play with the parameters to see how the clustering changes – but in my experiments, I found that these values worked well:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;eps=0.45&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;min_samples=3&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's perform the clustering using the encodings and print the results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.cluster&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DBSCAN&lt;/span&gt;

&lt;span class="n"&gt;clustering&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DBSCAN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;min_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clustering&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encodings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To get the clusters we can access the &lt;code&gt;labels_&lt;/code&gt; attribute of the &lt;code&gt;DBSCAN&lt;/code&gt; object, this will return an array of labels, one for each encoding. A label of -1 means that the encoding is a noise point (meaning that it's not part of any cluster).&lt;/p&gt;

&lt;p&gt;We can then use these labels to group the detected faces into clusters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;face_clusters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clustering&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;labels_&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# -1 is noise
&lt;/span&gt;        &lt;span class="n"&gt;face_clusters&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And just as a sanity check, let's plot some of the detected faces, now grouped by cluster.&lt;/p&gt;

&lt;p&gt;Remember that we have a dictionary that maps each face to the scene it belongs to, so we can use this to plot the detected faces in the original scenes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfmba2ccpoa1u527e1j0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfmba2ccpoa1u527e1j0.png" alt="Faces in cluster 0" width="800" height="201"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwdntgxgtmtltfga865mq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwdntgxgtmtltfga865mq.png" alt="Faces in cluster 1" width="800" height="201"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6hztefl2ok06xfatcacj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6hztefl2ok06xfatcacj.png" alt="Faces in cluster 2" width="800" height="201"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, the clustering algorithm has done a pretty good job of grouping the faces that belong to the same individual. However, we can see that some clusters contain more than one face, and some faces that belong to the same individual are in different clusters. &lt;/p&gt;

&lt;p&gt;I only care about clear face shots for my use case, so whatever the algorithm didn't cluster correctly, I will discard it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Assembling clips
&lt;/h2&gt;

&lt;p&gt;Now, we need to assemble the clips for each cluster. We will use the &lt;code&gt;moviepy&lt;/code&gt; library to do this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;moviepy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will iterate over the desired clusters and assemble the clips for each cluster.&lt;/p&gt;

&lt;p&gt;Let's start by creating a function that takes a video and a list of scenes and returns a video clip containing the scenes using the &lt;code&gt;subclip&lt;/code&gt; method of the &lt;code&gt;VideoFileClip&lt;/code&gt; object. Then we can use the &lt;code&gt;concatenate_videoclips&lt;/code&gt; function to concatenate the clips.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;moviepy.editor&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VideoFileClip&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;concatenate_videoclips&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_video&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_video&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scenes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;subclips&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;original_video&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subclip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;scenes&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;final_clip&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;concatenate_videoclips&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subclips&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;final_clip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write_videofile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;final_clip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we can use this function to assemble the clips for each cluster by iterating over the desired clusters, selecting the scenes where the faces in the cluster appear, and then creating a video clip for each cluster with the function we just created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;desired_clusters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;output_names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Parvathy&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Urvashi&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;original_video&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VideoFileClip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_video_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cluster_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;desired_clusters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_names&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;original_video&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VideoFileClip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_video_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;face_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;face_clusters&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cluster_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;scene_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;face_id_to_scene&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;fid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;face_ids&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;scenes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;detected_scenes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;scene_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;scene_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;scene_ids&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;video_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;output_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.mp4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="nf"&gt;create_video&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_video&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scenes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;video_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;original_video&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that's it! We have successfully clustered the faces in the video and assembled the clips for each cluster. Just what we wanted.&lt;/p&gt;

&lt;p&gt;Find the results below:&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/f9t-11feZqc"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;The code is far from perfect and could be optimized further, but it works well for my use case, and I hope it can be helpful for yours too, or at least it gives you some ideas on how to tackle your own problem.&lt;/p&gt;

&lt;p&gt;If you want the full code, find &lt;a href="https://colab.research.google.com/drive/1U17aMuTrlWqMV4pYUbLABZt4skfR6ftM?usp=sharing" rel="noopener noreferrer"&gt;it in this Jupyter Notebook&lt;/a&gt;. &lt;/p&gt;

</description>
      <category>python</category>
      <category>moviepy</category>
      <category>clustering</category>
      <category>facerecognition</category>
    </item>
    <item>
      <title>Documenting my pin collection with Segment Anything: Part 4</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Sat, 22 Jun 2024 17:21:30 +0000</pubDate>
      <link>https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-4-3ea2</link>
      <guid>https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-4-3ea2</guid>
      <description>&lt;p&gt;Welcome to the fourth entry in my series where I document my journey of cataloguing my enamel pin collection. If you missed the previous posts, you can catch up &lt;a href="https://dev.to/feregri_no/series/27656"&gt;&lt;strong&gt;here&lt;/strong&gt;&lt;/a&gt;. Previously, &lt;a href="https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-3-4iam"&gt;I introduced a simple app&lt;/a&gt; that segments each pin, assigning unique identifiers and names. Although I shared some future enhancements at the end of my last post, it dawned on me that I had slightly deviated from my main objective: &lt;strong&gt;effectively showcase my collection&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this update, I'll take you through the process of integrating all previous developments into a single interactive webpage. This page highlights each pin, with detailed information accessible via mouse hover, all crafted using HTML, JavaScript, and jQuery.&lt;/p&gt;

&lt;p&gt;As always, let me show you what the end product looks like:&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/2uNkWPo6XAI"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;And the live web page of my &lt;a href="https://pins.feregri.no/v2/" rel="noopener noreferrer"&gt;pin collection showcase v2 here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Improving the quality of the cutout
&lt;/h2&gt;

&lt;p&gt;Before getting into the front-end development, I wanted to try a couple of things to improve the quality of the cutout. &lt;/p&gt;

&lt;p&gt;If you remember, from a previous post, the output of the Segment Anything Model is a set of masks covering where the segmented object is, however, for my use case the edges of the masks always ended up being a bit jagged, too pointy and complex, so I created the following function in an attempt to simplify the edges of the mask:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;refine_mask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;polygons&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;Polygon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;poly&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;poly&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mask_to_polygons&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;single_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;unary_union&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;polygons&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;single_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geom_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Polygon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;selected_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;single_polygon&lt;/span&gt;

    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;single_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geom_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MultiPolygon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;selected_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;single_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geoms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unexpected geometry type: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;single_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geom_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;simplified_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;simplify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;selected_polygon&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;selected_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simplified_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;join_style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;join_style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;selected_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exterior&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xy&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;selected_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exterior&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xy&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="n"&gt;polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;new_mask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;polygon_to_mask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;selected_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exterior&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;int32&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;new_mask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;polygon&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A brief description of the function behaviour is:&lt;/p&gt;

&lt;h3&gt;
  
  
  Parameters
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;image&lt;/strong&gt;: This is the original image associated with the mask. It is used to determine the dimensions for the new mask.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;mask&lt;/strong&gt;: This is a binary mask produced by SAM where the areas of interest are marked.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Function Body
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Convert Mask to Polygons
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;polygons&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;Polygon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;poly&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;poly&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mask_to_polygons&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Converts the mask into a list of &lt;code&gt;shapely&lt;/code&gt;’s &lt;code&gt;Polygon&lt;/code&gt; objects. It is achieved by detecting contours or similar features in the mask using the &lt;code&gt;supervision&lt;/code&gt; library’s &lt;code&gt;mask_to_polygons&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Merge Polygons
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;single_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;unary_union&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;polygons&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Combines these polygons into a single polygon using &lt;code&gt;shapely.ops.unary_union&lt;/code&gt;, which efficiently merges overlapping or adjacent polygons.&lt;/p&gt;

&lt;h4&gt;
  
  
  Select Largest Polygon (if necessary)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;single_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geom_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Polygon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;selected_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;single_polygon&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;single_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geom_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MultiPolygon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;selected_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;single_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geoms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unexpected geometry type: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;single_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;geom_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Checks the geometry type of the resultant polygon. If it's a &lt;code&gt;MultiPolygon&lt;/code&gt; (which in my case happens quite often), it selects the polygon with the largest area, assuming that that largest area is one that contains the pin.&lt;/p&gt;

&lt;h4&gt;
  
  
  Simplify Polygon
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;simplified_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;simplify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;selected_polygon&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simplifies the polygon's shape to reduce the number of vertices, making the shape easier to handle and process, the &lt;code&gt;simplify&lt;/code&gt; function comes from the &lt;code&gt;shapely&lt;/code&gt; module.&lt;/p&gt;

&lt;h4&gt;
  
  
  Buffering
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;selected_polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simplified_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;join_style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;join_style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Applies a buffer of 10 units outward and then -10 units inward to smooth and regularise the edges, potentially cleaning up the polygon's boundary.&lt;/p&gt;

&lt;h4&gt;
  
  
  Extract Coordinates
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;polygon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;selected_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exterior&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xy&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;selected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;polygons&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exterior&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xy&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
    &lt;span class="n"&gt;polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Extracts the x and y coordinates from the exterior of the selected polygon and stores them in a list where the coordinates are laid like this: &lt;code&gt;[x1, y1, x2, y2, ..., xn, yn]&lt;/code&gt;, which is useful when showing the polygon in the front end as image maps.&lt;/p&gt;

&lt;h4&gt;
  
  
  Convert Polygon to Mask
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;new_mask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;polygon_to_mask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;selected_polygon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exterior&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;int32&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, converts the simplified polygon back into a mask format with the original image's dimensions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Returns
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;new_mask&lt;/strong&gt;: The refined mask derived from the largest or simplified polygon.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;polygon&lt;/strong&gt;: The coordinates of the simplified polygon.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Some results
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ynfwx56jdx1gd50mqxo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ynfwx56jdx1gd50mqxo.png" width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this image, it is possible to see how in the original cutout there was an extra bit of image that does not belong to the pin badge, with the refining function we got rid of it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Filf7bc2f67ndjbfmajw1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Filf7bc2f67ndjbfmajw1.png" width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The refining function not only helps in removing the unwanted bits of the image but also helps in removing empty spaces that should not be there. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvl3uwifcakp83ipaibau.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvl3uwifcakp83ipaibau.png" width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, the benefits of the refining function is not always visible, as shown above.&lt;/p&gt;

&lt;h2&gt;
  
  
  Front-end
&lt;/h2&gt;

&lt;p&gt;Now, on to the front-end, where most of the time was invested.&lt;/p&gt;

&lt;h3&gt;
  
  
  A new &lt;code&gt;view&lt;/code&gt; endpoint
&lt;/h3&gt;

&lt;p&gt;I added a new endpoint to my FastAPI app, this endpoint serves the existing masks rendered into an HTML that will show the original image along with an &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/map" rel="noopener noreferrer"&gt;HTML &lt;code&gt;map&lt;/code&gt; element&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/view/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;existing_cutouts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_selected_cutouts&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;templates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TemplateResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;view.html.jinja&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;imageWidth&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;og_image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;imageHeight&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;og_image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;existing_cutouts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;existing_cutouts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;turns_image_to_base64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;og_image&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The &lt;code&gt;view.html.jinja&lt;/code&gt; template:
&lt;/h3&gt;

&lt;p&gt;The template is quite simple since most of the interactivity and functionality is in the JavaScript code that I will explain later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;!DOCTYPE html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;html&lt;/span&gt; &lt;span class="na"&gt;lang=&lt;/span&gt;&lt;span class="s"&gt;"en"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;head&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- Code omitted for brevity --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/head&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;body&amp;gt;&lt;/span&gt;

  &lt;span class="nt"&gt;&amp;lt;img&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"{{image}}"&lt;/span&gt; &lt;span class="na"&gt;alt=&lt;/span&gt;&lt;span class="s"&gt;"Enamel Pins Collection"&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"canvasMapContainer"&lt;/span&gt; &lt;span class="na"&gt;usemap=&lt;/span&gt;&lt;span class="s"&gt;"#pinmap"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

  &lt;span class="nt"&gt;&amp;lt;map&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"pinmap"&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"pinmap"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
      {%- for cutout in existing_cutouts %}
      &lt;span class="nt"&gt;&amp;lt;area&lt;/span&gt; &lt;span class="na"&gt;shape=&lt;/span&gt;&lt;span class="s"&gt;"poly"&lt;/span&gt; &lt;span class="na"&gt;coords=&lt;/span&gt;&lt;span class="s"&gt;"{{cutout.polygon | join(',')}}"&lt;/span&gt; 
        &lt;span class="na"&gt;data-name=&lt;/span&gt;&lt;span class="s"&gt;"{{cutout.name}}"&lt;/span&gt;
        &lt;span class="err"&gt;{%&lt;/span&gt;&lt;span class="na"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt; &lt;span class="na"&gt;cutout.description&lt;/span&gt; &lt;span class="err"&gt;%}&lt;/span&gt;
        &lt;span class="na"&gt;data-description=&lt;/span&gt;&lt;span class="s"&gt;"{{cutout.description}}"&lt;/span&gt;
        &lt;span class="err"&gt;{%&lt;/span&gt;&lt;span class="na"&gt;-&lt;/span&gt; &lt;span class="na"&gt;endif&lt;/span&gt; &lt;span class="err"&gt;%}&lt;/span&gt;
        &lt;span class="na"&gt;alt=&lt;/span&gt;&lt;span class="s"&gt;"{{cutout.name}}"&lt;/span&gt; &lt;span class="na"&gt;data-key=&lt;/span&gt;&lt;span class="s"&gt;"{{cutout.uuid}}"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"#"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
      {%- endfor %}
  &lt;span class="nt"&gt;&amp;lt;/map&amp;gt;&lt;/span&gt;

  &lt;span class="c"&gt;&amp;lt;!-- Modal --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;dialog&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"modal"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;article&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;header&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;h3&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"infoPinNameModal"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/h3&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/header&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;p&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"infoPinDescriptionModal"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/article&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/dialog&amp;gt;&lt;/span&gt;

  &lt;span class="c"&gt;&amp;lt;!-- Tooltip --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"tooltip"&lt;/span&gt; &lt;span class="na"&gt;style=&lt;/span&gt;&lt;span class="s"&gt;"display: none;"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;article&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;header&amp;gt;&amp;lt;h3&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"infoPinNameTooltip"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/h3&amp;gt;&amp;lt;/header&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;p&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"infoPinDescriptionTooltip"&lt;/span&gt; &lt;span class="na"&gt;style=&lt;/span&gt;&lt;span class="s"&gt;"display: none;"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/article&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;

    &lt;span class="nt"&gt;&amp;lt;script&amp;gt;&lt;/span&gt;
    &lt;span class="cm"&gt;/* Functionality described below */&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are five key pieces to this app:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;img&lt;/code&gt; tag with &lt;code&gt;canvasMapContainer&lt;/code&gt; as id. This image will show the image containing all the pins. This image is the same I have been working with across these series of posts. The tag has &lt;code&gt;src="{{image}}"&lt;/code&gt; where the image is provided via the server as a base46 image. Another thing to note about this image tag is that it has the property &lt;code&gt;usemap&lt;/code&gt; set to &lt;code&gt;"#pinmap"&lt;/code&gt;, this lets the browser know that there is an image map attached to this image.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;map&lt;/code&gt; tag contains areas that correspond to different parts of the enamel pin canvas, notice how the map’s &lt;code&gt;name&lt;/code&gt; property matches the value set as &lt;code&gt;usemap&lt;/code&gt; in the image above. These values are set dynamically at render time, the loop &lt;code&gt;{%- for cutout in existing_cutouts %}&lt;/code&gt; allows us to create an &lt;code&gt;area&lt;/code&gt; element with information such as polygon coordinates, name and descriptions for each of the pins.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;dialog&lt;/code&gt; tag with &lt;code&gt;modal&lt;/code&gt; as id. This element is used to display more detailed information about a selected pin. This element is hidden by default and only shown whenever a user clicks on a pin.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;div&lt;/code&gt; that works as a floating tooltip that displays basic information about the pin over which the user is hovering the cursor. Just like the modal dialog above, this tooltip is hidden initially and shown on certain interactions defined in the script below.&lt;/li&gt;
&lt;li&gt;The fifth element is a &lt;code&gt;script&lt;/code&gt; that orchestrates the whole functionality of the app, it requires more than a simple paragraph to explain its functionality, continue reading to learn more about it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The app’s logic
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Dependencies&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;jQuery&lt;/strong&gt;: A fast, small, and feature-rich JavaScript library, some people may think it is quite outdated, however, it simplifies things like HTML document traversal and manipulation, event handling, and even animation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ImageMapster&lt;/strong&gt;: A jQuery plugin that provides interactive image maps functionality. It allows images to be used with areas that can be manipulated and interacted with in various ways.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Functionality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Everything happens after the document has been loaded, inside a &lt;code&gt;$(document).ready(function() { });&lt;/code&gt;  definition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Modal and Tooltip Interaction&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;The script initialises variables for modal and tooltip elements, as well as several configuration variables for classes and animation timing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;$modal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#modal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isOpenClass&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;modal-is-open&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openingClass&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;modal-is-opening&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;closingClass&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;modal-is-closing&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;scrollbarWidthCssVar&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;--pico-scrollbar-width&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;animationDuration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// ms&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;padding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;$tooltip&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#tooltip&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;$infoPinNameTooltip&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#infoPinNameTooltip&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;$infoPinNameModal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#infoPinNameModal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;$canvasMapContainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#canvasMapContainer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;visibleModal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It defines functions to toggle, open, and close the modal. The modal can be opened or closed either by clicking on an area of the image map or using the Escape key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;  &lt;span class="c1"&gt;// Toggle modal&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;toggleModal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;$modal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nx"&gt;$modal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;open&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nf"&gt;closeModal&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;openModal&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// Open modal&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openModal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;html&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;addClass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isOpenClass&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;addClass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;openingClass&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;visibleModal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;$modal&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;html&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;removeClass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;openingClass&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;animationDuration&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;$modal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;showModal&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// Close modal&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;closeModal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;visibleModal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;html&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;addClass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;closingClass&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;html&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;removeClass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;closingClass&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;removeClass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isOpenClass&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;html&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;css&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scrollbarWidthCssVar&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="nx"&gt;$modal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;animationDuration&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// Close with a click outside&lt;/span&gt;
  &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;click&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;visibleModal&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isClickInside&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;visibleModal&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;article&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;isClickInside&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;closeModal&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Close with Esc key&lt;/span&gt;
  &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;keydown&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Escape&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;visibleModal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nf"&gt;closeModal&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Interactive Image Map Setup&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The image map is initialized with the ImageMapster plugin, which is configured to not allow selection (highlighting) of map areas but to react to mouse events – &lt;a href="https://jamietre.github.io/ImageMapster/reference/configuration-reference/" rel="noopener noreferrer"&gt;this plugin’s documentation&lt;/a&gt; is top-notch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;$canvasMapContainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mapster&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;enableAutoResizeSupport&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;autoResize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;isSelectable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;stroke&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;strokeColor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;00FF00&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;strokeWidth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;mapKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;data-key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;fillOpacity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="c1"&gt;// ....&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On clicking an image map area, the script fetches the area's data attributes (like name), updates the modal's content, and toggles the modal's visibility.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;    &lt;span class="nx"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;$infoPinNameModal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nf"&gt;toggleModal&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On mouseover, the tooltip's content is updated based on the hovered area's data attributes, and its position is dynamically calculated to appear near the cursor but adjusted to avoid overflowing the viewport.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;    &lt;span class="nx"&gt;onMouseout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;$tooltip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hide&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="nx"&gt;onMouseover&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// ... see below for the dynamic positioning&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Dynamic Positioning&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;The tooltip's position is calculated based on the coordinates of the hovered area. The script ensures that the tooltip does not overflow the window edges by adjusting its position relative to the image map area's boundaries.&lt;/p&gt;

&lt;p&gt;Position calculations take into account the current scroll position and the tooltip's dimensions to ensure it is always visible.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;coords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;attr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;coords&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;coord&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;coord&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;xCoords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;yCoords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;x1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(...&lt;/span&gt;&lt;span class="nx"&gt;xCoords&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;y1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(...&lt;/span&gt;&lt;span class="nx"&gt;yCoords&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;x2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(...&lt;/span&gt;&lt;span class="nx"&gt;xCoords&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;y2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(...&lt;/span&gt;&lt;span class="nx"&gt;yCoords&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;centerX&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

      &lt;span class="nx"&gt;$infoPinNameTooltip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;infoWidth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;$tooltip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;width&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;infoHeight&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;$tooltip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;height&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

      &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;positionX&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;centre&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;infoWidth&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;padding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;positionX&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;left&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;infoWidth&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;padding&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;$canvasMapContainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;width&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;positionX&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;right&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;positionY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;top&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;y1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;infoHeight&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;padding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;scrollTop&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;positionY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;bottom&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;positionXmap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;left&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;x2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;centre&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;centerX&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;infoWidth&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;right&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;x1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;infoWidth&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;padding&lt;/span&gt;
      &lt;span class="p"&gt;};&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;positionYmap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;top&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;y1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;padding&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;infoHeight&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;bottom&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;y2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;padding&lt;/span&gt;
      &lt;span class="p"&gt;};&lt;/span&gt;

      &lt;span class="nx"&gt;$tooltip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;css&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
          &lt;span class="na"&gt;top&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;positionYmap&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;positionY&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
          &lt;span class="na"&gt;left&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;positionXmap&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;positionX&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In a real-world production app, this script probably should exist in its own file, however, as this is just a toy project, it is currently inlined along with the HTML code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This project has been an enriching learning experience, and although the results haven't fully met my expectations yet, I believe it's time for a pause. Juggling multiple interests and responsibilities, including learning, writing, and teaching, demands that I prioritise my commitments. &lt;/p&gt;

&lt;p&gt;In the meantime, I will keep a list of the ideas that come to my mind to improve the results of the processes I have been describing here, and, if you have ideas on how I could improve this project or want to share your experiences with similar projects, please leave a comment below or reach out to me on &lt;a&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you are looking for all the code I have written so far, &lt;a href="https://github.com/fferegrino/pin-detection-with-sam" rel="noopener noreferrer"&gt;everything is on GitHub&lt;/a&gt;, feel free to use it for your own projects!&lt;/p&gt;

</description>
      <category>jquery</category>
      <category>javascript</category>
      <category>html</category>
      <category>python</category>
    </item>
    <item>
      <title>Documenting my pin collection with Segment Anything: Part 3</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Fri, 14 Jun 2024 04:11:08 +0000</pubDate>
      <link>https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-3-4iam</link>
      <guid>https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-3-4iam</guid>
      <description>&lt;p&gt;In &lt;a href="https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-2-4pjc"&gt;my last post&lt;/a&gt;, I showed how to use the Segment Anything Model with prompts to improve the segmentation output, in it I decided that using bounding boxes to prompt the model yielded the best results for my purposes.&lt;/p&gt;

&lt;p&gt;In this post I will try to describe a tiny, but slightly complex, app I made with the help of GitHub Copilot. This app is made with vanilla JavaScript and HTML uses SAM in the backend to extract the cutouts along with the bounding polygons for further use in my ultimate collection display.&lt;/p&gt;

&lt;p&gt;Before we dive into a mess of code, have a look at the app I created:&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/ALIFCBvGnFg"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;(if you just want the code, go to the end of this post)&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;p&gt;The app I created needed to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Allow me to draw boxes on an image,&lt;/li&gt;
&lt;li&gt;perform image segmentation using the drawn box as a prompt.&lt;/li&gt;
&lt;li&gt;Once the image semgmentation is done, show the candidate cutouts and allow me to select the best and,&lt;/li&gt;
&lt;li&gt;give each one of them a unique identifier and a name.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Solution
&lt;/h2&gt;

&lt;p&gt;In the end, I created a client-server app:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyirqux354bjajnki0gap.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyirqux354bjajnki0gap.png" alt="App architecture" width="800" height="445"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the backend, the obvious decision was Python, since the Segment Anything Model is readily accessible in that language, and it is the language I know the most.&lt;/p&gt;

&lt;p&gt;The client app is done with vanilla JavaScript, CSS and HTML; Using the &lt;em&gt;canvas&lt;/em&gt; API it is effortless to draw bounding boxes over an image, and all the mouse events help us send the necessary data to extract a cutout.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Project Structure Overview&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The project consists of several interconnected components, including a FastAPI backend, HTML5 and JavaScript for the frontend, and CSS for styling. Here’s a breakdown of the key files and their roles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;web/labeller.py&lt;/code&gt;&lt;/strong&gt;: The core backend file built with FastAPI. It handles route definitions, image manipulations, and interactions with the image segmentation model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;web/static/app.css&lt;/code&gt;&lt;/strong&gt;: Contains CSS styles to enhance the appearance of the application.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;web/static/app.js&lt;/code&gt;&lt;/strong&gt;: Manages the frontend logic, particularly the interactions on the HTML5 Canvas where users draw annotations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;web/templates/index.html.jinja&lt;/code&gt;&lt;/strong&gt;: The Jinja2 template for the HTML structure, dynamically integrating backend data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;web/resources.py&lt;/code&gt;&lt;/strong&gt;: Manages downloading necessary resources like images and model files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;web/sam.py&lt;/code&gt;&lt;/strong&gt;: Integrates the machine learning model for image segmentation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Out of these files, perhaps the most important ones are the one that manages the frontend logic and the core of the app; will try my best to describe them below:&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;&lt;code&gt;web/static/app.js&lt;/code&gt;&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The script starts by setting up an environment where users can draw rectangular boxes on an image loaded into a canvas element. This functionality is an essential part of the app, since these boxes will be the prompts to the segment anything model in the backend.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Initialization on Window Load:
&lt;/h3&gt;

&lt;p&gt;The script begins execution once the window has fully loaded, ensuring all HTML elements, especially the &lt;code&gt;&amp;lt;canvas&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt;, are available to manipulate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;load&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;canvas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;canvas&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;2d&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;image&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;results&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;contours&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Canvas and Context Setup:
&lt;/h3&gt;

&lt;p&gt;Here, the canvas dimensions are set to match the image dimensions, and the image is then drawn onto the canvas. This forms the base on which users will draw the bounding boxes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;    &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;width&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;width&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;height&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;height&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drawImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Drawing Interactions:
&lt;/h3&gt;

&lt;p&gt;Listeners for &lt;code&gt;mousedown&lt;/code&gt;, &lt;code&gt;mousemove&lt;/code&gt;, and &lt;code&gt;mouseup&lt;/code&gt; events are added to the canvas to handle drawing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a variables to hold the mouse position and the drawing state:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;startingMousePosition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;y&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;isDrawing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Start Drawing:&lt;/strong&gt; On &lt;code&gt;mousedown&lt;/code&gt;, it captures the starting point where the user begins the draw interaction.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mousedown&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;startingMousePosition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;offsetX&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;y&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;offsetY&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="nx"&gt;isDrawing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Drawing in Progress:&lt;/strong&gt; The &lt;code&gt;mousemove&lt;/code&gt; event updates the drawing in real-time, showing a visual feedback of the rectangle being drawn on the canvas via the &lt;code&gt;redrawCanvas&lt;/code&gt; and the &lt;code&gt;drawBox&lt;/code&gt; functions.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mousemove&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isDrawing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;currentX&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;offsetX&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;currentY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;offsetY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nf"&gt;redrawCanvas&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nf"&gt;drawBox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;startingMousePosition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;startingMouse&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;Position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;currentX&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;startingMousePosition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;currentY&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;startingMousePosition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;End Drawing:&lt;/strong&gt; The &lt;code&gt;mouseup&lt;/code&gt; event finalises the drawing and optionally sends the drawn box data to the server using the &lt;code&gt;sendBoxData&lt;/code&gt; function.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mouseup&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isDrawing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;endX&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;offsetX&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;endY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;offsetY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;box&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;startingMousePosition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;endX&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="na"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;startingMousePosition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;endY&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="na"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;startingMousePosition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;endX&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="na"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;startingMousePosition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;endY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
        &lt;span class="nf"&gt;sendBoxData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;box&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nf"&gt;redrawCanvas&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nx"&gt;isDrawing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Server Interaction:
&lt;/h3&gt;

&lt;p&gt;Upon completing a drawing, the box data is sent to the server using a &lt;code&gt;fetch&lt;/code&gt; call. This allows the application to process the box. This processing involves using the segment anything model to extract the candidate cutouts and returning them to be presented to the user using the &lt;code&gt;createForm&lt;/code&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;sendBoxData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;box&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/cut&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;box&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;innerHTML&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createForm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;appendChild&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Error:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Dynamic Form Generation:
&lt;/h3&gt;

&lt;p&gt;Responses from the server include an image and an identifier, that is used to create and populate forms dynamically. Using Mustache.js for templating, the script generates HTML forms based on this data, which are then inserted into the DOM, allowing further user interaction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
&amp;lt;div class="form-container"&amp;gt;
    &amp;lt;div class="image-container"&amp;gt;
        &amp;lt;img src="{{image}}"&amp;gt;
    &amp;lt;/div&amp;gt;
    &amp;lt;form action="/select_cutout" method="POST"&amp;gt;
        &amp;lt;input type="text" name="name" placeholder="Name"&amp;gt;
        &amp;lt;input type="hidden" name="id" value="{{id}}"&amp;gt;
        &amp;lt;button type="submit"&amp;gt;Select&amp;lt;/button&amp;gt;
    &amp;lt;/form&amp;gt;
&amp;lt;/div&amp;gt;
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createForm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rendered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Mustache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;template&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;image&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createElement&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;div&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;div&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;innerHTML&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rendered&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;div&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;firstChild&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Utility Functions:
&lt;/h3&gt;

&lt;p&gt;Several utility functions handle repetitive tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;redrawCanvas&lt;/code&gt;&lt;/strong&gt;: Clears and redraws the canvas, useful for updating the view when needed.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;redrawCancas&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clearRect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;height&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drawImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;contours&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;points&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;drawPolygon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;points&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;drawBox&lt;/code&gt;&lt;/strong&gt;: Draws rectangles based on coordinates, the pins that have already been cut.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;drawBox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;height&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;fill&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;beginPath&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;height&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;strokeStyle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;red&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fillStyle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#ff000033&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stroke&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;drawPolygon&lt;/code&gt;&lt;/strong&gt;: A more complex drawing function that can render polygons, used here to illustrate the capability to handle various shapes.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;drawPolygon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;points&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;beginPath&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;moveTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;points&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nx"&gt;points&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="nx"&gt;points&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;point&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lineTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;point&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nx"&gt;point&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fillStyle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#ff0000FF&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;closePath&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stroke&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These utility functions are essential for managing the visual elements on the canvas, allowing for efficient updates and complex graphical operations like drawing polygons and boxes.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;web/labeller.py&lt;/code&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Environment Setup and Initialisations:
&lt;/h3&gt;

&lt;p&gt;The application begins with setting up the FastAPI environment and configuring static file paths and template directories. This setup is crucial for serving static content like images and CSS, and for rendering HTML templates.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/static&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;StaticFiles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;directory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web/static&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;static&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;templates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Jinja2Templates&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;directory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web/templates&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The image to annotate and the model to use are downloaded or loaded into the application, ensuring that all necessary components are available for image processing and analysis.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;resources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;download_resources&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Image Loading and Preprocessing:
&lt;/h3&gt;

&lt;p&gt;The image to annotate is loaded and preprocessed. This involves reading the image from a path, converting it to an appropriate colour format, and resizing it to a manageable size. This resizing is particularly important to ensure that the processing is efficient.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;original_image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cvtColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])),&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COLOR_BGR2RGB&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;image_to_show&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromarray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;image_to_show&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image_to_show&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;desired_image_width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_to_show&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Model Loading for Image Segmentation:
&lt;/h3&gt;

&lt;p&gt;The segmentation model is loaded, configured, and prepared to predict masks based on user-defined annotations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;mask_predictor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_mask_predictor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;mask_predictor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Web Routing and Request Handling:
&lt;/h3&gt;

&lt;p&gt;FastAPI routes handle different types of web requests. The main route serves the annotated image along with tools for the user to interact with. This is done through a both POST and GET request which renders an HTML template with the image and existing annotations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;turns_image_to_base64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_to_show&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;existing_cutouts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;selected_folder&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;selected_folder&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;existing_cutouts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;width&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;image_to_show&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;height&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;image_to_show&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;existing_cutouts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;existing_cutouts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ratio&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;templates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TemplateResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;index.html.jinja&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Image Annotation and Segmentation:
&lt;/h3&gt;

&lt;p&gt;When a user submits a bounding box annotation, the coordinates are scaled back to the original, unresized image and processed to segment the image. The application uses the model to predict the mask and then extracts the relevant part of the image based on these masks.&lt;/p&gt;

&lt;p&gt;Apart from the extracted image cutouts, metadata is saved to a temporary folder, so that when a user selects a given cutout, they can be recovered.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/cut/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;post_cut&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BoundingBox&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;box&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;original_box&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;box&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt;
    &lt;span class="n"&gt;masks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mask_predictor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;original_box&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;multimask_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;masks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;uuid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;cutout&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract_from_mask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;base64_cutout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;turns_image_to_base64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cutout&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PNG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;base64_cutout&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;uuid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bbox&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;y1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;y2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;original_bbox&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;original_box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;y1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;original_box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;original_box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;y2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;original_box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;polygons&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;poly&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;poly&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mask_to_polygons&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;temp_folder&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;cutout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PNG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;temp_folder&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Handling user selection of cutouts
&lt;/h3&gt;

&lt;p&gt;After users mark and submit their desired cutout, this endpoint manages the user's selection, moving the annotated image data from temporary storage to a selected folder and updating its associated metadata with new user-provided information (like a name for the annotation):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/select_cutout/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;post_select_cutout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Form&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Form&lt;/span&gt;&lt;span class="p"&gt;()]):&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;shutil&lt;/span&gt;

    &lt;span class="c1"&gt;# Move the PNG image from temporary to selected folder
&lt;/span&gt;    &lt;span class="n"&gt;shutil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;temp_folder&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;selected_folder&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Load the existing metadata for the selected annotation
&lt;/span&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;temp_folder&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="c1"&gt;# Update the name field with user-provided name
&lt;/span&gt;
    &lt;span class="c1"&gt;# Write the updated metadata back to the selected folder
&lt;/span&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;selected_folder&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Redirect back to the main page after processing is complete
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;RedirectResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Utility Functions for Image Manipulation:
&lt;/h3&gt;

&lt;p&gt;Several utility functions facilitate image manipulation tasks like cropping the image based on the mask:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_from_mask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;crop_box&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;margin&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;image_rgba&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uint8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uint8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image_rgba&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;image_rgba&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;
    &lt;span class="n"&gt;image_pil&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromarray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_rgba&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;crop_box&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;bbox&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromarray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;getbbox&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;crop_box&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_pil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_pil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cropped_image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image_pil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;crop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;crop_box&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cropped_image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;crop_box&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And converting images to a web-friendly format to be sent as responses to the front end.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;turns_image_to_base64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;JPEG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;buffered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BytesIO&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;img_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;b64encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffered&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getvalue&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data:image/jpeg;base64,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;img_str&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These functions ensure that the application can handle image data efficiently and render it appropriately on the web interface.&lt;/p&gt;

&lt;h2&gt;
  
  
  Libraries I used
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;&lt;strong&gt;FastAPI&lt;/strong&gt;&lt;/a&gt;: A web framework for building APIs (and web pages). It is used as the backbone of the application to handle web requests, routing, and server logic, and orchestrates the overall API structure. Although not used here, FastAPI provides robust features such as data validation, serialisation, and asynchronous request handling.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.opencv.org/" rel="noopener noreferrer"&gt;&lt;strong&gt;OpenCV (cv2)&lt;/strong&gt;&lt;/a&gt;: OpenCV is a powerful library used for image processing operations. It is utilised to read and transform images, such as converting colour spaces and resizing images, which are essential pre-processing steps before any segmentation tasks.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://numpy.org/" rel="noopener noreferrer"&gt;&lt;strong&gt;NumPy&lt;/strong&gt;&lt;/a&gt;: This library is fundamental for handling arrays and matrices, such as for operations that involve image data. NumPy is used to manipulate image data and perform calculations for image transformations and mask operations.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pillow.readthedocs.io/en/stable/" rel="noopener noreferrer"&gt;&lt;strong&gt;PIL (Pillow)&lt;/strong&gt;&lt;/a&gt;: The Python Imaging Library (Pillow) is used for opening, manipulating, and saving many different image file formats. Here it is specifically used to convert images to different formats, handle image cropping, and integrate alpha channels to extract the cutouts.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://supervision.roboflow.com/latest/" rel="noopener noreferrer"&gt;&lt;strong&gt;Supervision&lt;/strong&gt;&lt;/a&gt;: Although not yet a widely known library, this powerful library provides a seamless process for annotating predictions generated by various object detection and segmentation models; in this case, I used it to evaluate the results of SAM, and to convert its predictions to Polygon masks.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/janl/mustache.js" rel="noopener noreferrer"&gt;&lt;strong&gt;Mustache.js&lt;/strong&gt;&lt;/a&gt;: This is a templating engine used for rendering templates on the web. In your application, Mustache.js is used to dynamically create HTML forms based on the data received from the server, such as image cutouts and identifiers.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Closing thoughts
&lt;/h2&gt;

&lt;p&gt;I hope I did not bore you to death with some of these deep dives into my (and some of my friendly coding assistant's) code – I tried my best to be thorough. But if you still have doubts do not hesitate to reach out to me.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/fferegrino/pin-detection-with-sam/tree/part3" rel="noopener noreferrer"&gt;&lt;strong&gt;Here is the code by the way&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Believe it or not, this app is not complete yet, there is some other functionality yet to be implemented:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A way to easily recover the selected cutouts&lt;/li&gt;
&lt;li&gt;A way to match already existing cutouts so that when the user selects the same cutout we don't duplicate entries&lt;/li&gt;
&lt;li&gt;A way to handle updated canvas pictures, because what is going to happen when I inevitably expand my collection?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I will explore these details in the next blog post in the series.&lt;/p&gt;

</description>
      <category>fastapi</category>
      <category>segmentanything</category>
      <category>imagesegmentation</category>
    </item>
    <item>
      <title>Documenting my pin collection with Segment Anything: Part 2</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Tue, 11 Jun 2024 17:33:13 +0000</pubDate>
      <link>https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-2-4pjc</link>
      <guid>https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-2-4pjc</guid>
      <description>&lt;p&gt;In a previous post &lt;a href="https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-1-4k3o"&gt;I shared my desire to create an interactive display for my pin collection&lt;/a&gt;. In it, I decided to use Meta AI’s Segment Anything Model to extract cutouts from my crowded canvas:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqcd0rg742latm62utn4p.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqcd0rg742latm62utn4p.jpg" width="800" height="328"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But as I discovered, with such a crowded and detailed image, the automatic segmentator struggles with identifying all the pins individually.&lt;/p&gt;

&lt;p&gt;Luckily for me, &lt;em&gt;segment anything&lt;/em&gt;, has other ways of extracting masks from an image, via the use of prompts; there are two kinds of prompts: boxes and points.&lt;/p&gt;

&lt;p&gt;In this post, I will show you these two features.&lt;/p&gt;

&lt;h2&gt;
  
  
  Load the model and image
&lt;/h2&gt;

&lt;p&gt;First thing, we load the model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;segment_anything&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sam_model_registry&lt;/span&gt;

&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sam_model_registry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;vit_b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sam_vit_b_01ec64.pth&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;device&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we load the image that contains the pins. We use OpenCV for reading the image and convert it to RGB color space, as the model expects the input in this format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;

&lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pins@high.jpg&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;image_rgb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cvtColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COLOR_BGR2RGB&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Create a Segment Anything Model Predictor
&lt;/h2&gt;

&lt;p&gt;Segment Anything offers a predictor that requires a model to be instantiated. Then we need to set an image using &lt;code&gt;set_image&lt;/code&gt;, which will process the image to produce an image embedding; The predictor will store this embedding and will use it for subsequent mask prediction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;segment_anything&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SamPredictor&lt;/span&gt;

&lt;span class="n"&gt;mask_predictor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SamPredictor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;mask_predictor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_rgb&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prompting with a box
&lt;/h2&gt;

&lt;p&gt;To prompt SAM with a bounding box it is necessary to define a NumPy array, where the order of the values is &lt;code&gt;x1,y1,x2,y2&lt;/code&gt;, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;box&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;759&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;913&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1007&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1174&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbe1p3y88eblukg8ceuos.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbe1p3y88eblukg8ceuos.png" alt="The image is just an illustration, the model operates on the image alone with the box as a NumPy array" width="794" height="285"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The image is just an illustration, the model operates on the image alone with the box as a NumPy array&lt;/p&gt;

&lt;p&gt;To prompt the model, one has to call the &lt;code&gt;predict&lt;/code&gt; method on the &lt;code&gt;mask_predictor&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;masks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;logits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mask_predictor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;multimask_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result will be a triplet, with the following values:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;masks&lt;/code&gt;: The output masks in CxHxW format, where C is the number of masks, and (H, W) is the original image size.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;scores&lt;/code&gt;: An array of length C containing the model's predictions for the quality of each mask.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;logits&lt;/code&gt;: An array of shape CxHxW, where C is the number of masks and H=W=256. These low resolution logits can be passed to a subsequent iteration as mask input.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the way, if you specify &lt;code&gt;multimask_output = True&lt;/code&gt; you will get three masks for each prediction, I find this ability truly useful, as some of the generated masks are not usable, so I rather keep my options with multiple masks.&lt;/p&gt;

&lt;p&gt;Ultimately, the result will be masks that when applied to the image, yield the following resit:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fba803qvelpaz614ytlpo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fba803qvelpaz614ytlpo.png" width="800" height="272"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompting with points
&lt;/h2&gt;

&lt;p&gt;The input to the model is comprised of two arrays:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;point_coords&lt;/code&gt;: A Nx2 array of point prompts to the model. Each point is in (X,Y) in pixels&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;point_labels&lt;/code&gt;: A length N array of labels for the point prompts. 1 indicates a foreground point and 0 indicates a background point.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;point_coords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;160&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;point_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we visualise the points, they look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6iwmh27hfevqxtegtps.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6iwmh27hfevqxtegtps.png" alt="The selected points in red" width="794" height="285"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The call to &lt;code&gt;predict&lt;/code&gt; looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;masks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;logits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mask_predictor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;point_coords&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;point_coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;point_labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;point_labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;multimask_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the results… well, they're not great:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx0ydnshw5aby8zb26n9q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx0ydnshw5aby8zb26n9q.png" width="800" height="272"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Speed
&lt;/h2&gt;

&lt;p&gt;When prompted the model takes significantly less time (&amp;lt;1 second) when compared to my previous attempt using the automatic segmentator.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;For my pin collection, manual prompting with bounding boxes proved more effective than using point prompts.&lt;/p&gt;

&lt;p&gt;In my &lt;a href="https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-3-4iam"&gt;next entry&lt;/a&gt;, I will demonstrate how I integrated this model into a custom web-based application, enhancing the interactive display of my collection.&lt;/p&gt;

</description>
      <category>imagesegmentation</category>
      <category>segmentanything</category>
      <category>python</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Documenting my pin collection with Segment Anything: Part 1</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Sun, 09 Jun 2024 19:46:04 +0000</pubDate>
      <link>https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-1-4k3o</link>
      <guid>https://dev.to/feregri_no/documenting-my-pin-collection-with-segment-anything-part-1-4k3o</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5w8eg2vw39446fqsau1i.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5w8eg2vw39446fqsau1i.jpg" alt="Image description" width="800" height="450"&gt;&lt;/a&gt;As a hobby that spans across various cultures and ages, &lt;a href="https://en.wikipedia.org/wiki/Pin_trading" rel="noopener noreferrer"&gt;pin collecting&lt;/a&gt; allows enthusiasts like me to hold onto pieces of art, history, and personal milestones. Whether they're enamel pins from theme parks or vintage lapel pins, I believe each piece in a collection tells a unique story.&lt;/p&gt;

&lt;p&gt;In this blog series, I'm excited to share my journey of documenting my extensive pin collection, which consists of gifts, purchases, and serendipitous finds from the streets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqcd0rg742latm62utn4p.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqcd0rg742latm62utn4p.jpg" alt="My pin collection" width="800" height="328"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Version 1
&lt;/h2&gt;

&lt;p&gt;With the help of ChatGPT I built a simple website that allows you to zoom into the whole canvas so that you can look at the pins in more detail, and while I liked the result (&lt;a href="https://pins.feregri.no/v1/" rel="noopener noreferrer"&gt;you can view it here!&lt;/a&gt;), it is far from what I wanted to document my collection.&lt;/p&gt;

&lt;h2&gt;
  
  
  My ideal collection display
&lt;/h2&gt;

&lt;p&gt;My ideal solution is to create an interactive website where viewers can hover over each pin to see it highlighted, and click for a detailed view and additional information about the pin's background. Using the canvas image shown above, I embarked on a project to bring this vision to life, leveraging modern machine learning techniques.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enter Segment Anything
&lt;/h2&gt;

&lt;p&gt;To extract the cutouts from the canvas I thought of using an image segmentation algorithm to extract the silhouettes of the pins. Now, the last time I tried to do something related object/edge detection, the model to go with was YOLO V2, with great surprise I discovered that advancements have led to YOLO V10!&lt;/p&gt;

&lt;p&gt;However, Intrigued by the capabilities of the latest models, I decided to experiment with Meta AI's Segment Anything Model (SAM), which was released with the promise of being a powerful image segmentation model so I tried it&lt;/p&gt;

&lt;p&gt;It turns out that now there is a V10 of YOLO, and it is more powerful than what I was already familiar with. But at the same time, I wanted to try the Segment Anything Model released by Meta AI… so that is what I did. &lt;/p&gt;

&lt;h2&gt;
  
  
  Installing SAM
&lt;/h2&gt;

&lt;p&gt;I wanted to run everything locally, so I set out to install everything in my Mac M2, it was a bit tricky and involved a lot of trial and error, but here is what in the end worked for me:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Create a new Python environment
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The version I used to create the environment was &lt;code&gt;3.10.12&lt;/code&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  2. Install &lt;code&gt;torch&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;I found there is specific &lt;a href="https://developer.apple.com/metal/pytorch/" rel="noopener noreferrer"&gt;Apple guidance&lt;/a&gt; on how to do this, given that on certain Macs it is possible to take advantage of the GPU:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;torch torchvision torchaudio &lt;span class="nt"&gt;--extra-index-url&lt;/span&gt; https://download.pytorch.org/whl/nightly/cpu
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the end, these are the versions I ended up having: &lt;code&gt;torch==2.3.1&lt;/code&gt;, &lt;code&gt;torchvision==0.18.1&lt;/code&gt; and &lt;code&gt;torchaudio==2.3.1&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Fix NumPy
&lt;/h3&gt;

&lt;p&gt;For some reason, I ran into an issue with &lt;code&gt;NumPy&lt;/code&gt; and &lt;code&gt;torch&lt;/code&gt;, &lt;a href="https://stackoverflow.com/questions/20518632/importerror-numpy-core-multiarray-failed-to-import/47433969#47433969" rel="noopener noreferrer"&gt;this StackOverflow answer&lt;/a&gt; helped me solving it, by re-installing it with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;numpy &lt;span class="nt"&gt;-I&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The final &lt;code&gt;numpy&lt;/code&gt; version was &lt;code&gt;1.26.4&lt;/code&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  4. Install Segment Anything
&lt;/h3&gt;

&lt;p&gt;As far as I know, the only way to install the necessary code for SAM is through their GitHub repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s1"&gt;'git+https://github.com/facebookresearch/segment-anything.git'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Download the SAM model
&lt;/h3&gt;

&lt;p&gt;The models and the code for &lt;em&gt;Segment Anything&lt;/em&gt; come separately, so to download a model to use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;wget &lt;span class="nt"&gt;-q&lt;/span&gt; https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are different versions of models, but &lt;code&gt;sam_vit_b_01ec64&lt;/code&gt; was my choice, mainly because as far as I know, it is the smallest.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Remaining tools
&lt;/h3&gt;

&lt;p&gt;To develop and test the model, I used Jupyter, and to visualise the results of the image segmentation I used a package called &lt;code&gt;supervision&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Accessing the SAM model
&lt;/h2&gt;

&lt;p&gt;In order to use the model, it is necessary to open it using Python, here is where you can configure where the model should run (either GPU or CPU, for example), in the code below you will see me configuring the model  &lt;code&gt;vit_b&lt;/code&gt;, I also attempted to use MPS (metal performance shaders) however I found an error and I just decided to run everything in the CPU:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;segment_anything&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sam_model_registry&lt;/span&gt;

&lt;span class="c1"&gt;# DEVICE = torch.device('mps' if torch.backends.mps.is_available() else 'cpu')
&lt;/span&gt;&lt;span class="n"&gt;DEVICE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;device&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;MODEL_TYPE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vit_b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;CHECKPOINT_PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sam_vit_b_01ec64.pth&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sam_model_registry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MODEL_TYPE&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;CHECKPOINT_PATH&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DEVICE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Opening the image
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;

&lt;span class="n"&gt;IMAGE_PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pins@high.jpg&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IMAGE_PATH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;image_rgb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cvtColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COLOR_BGR2RGB&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;cv2.cvtColor()&lt;/code&gt; function converts the colour space of an image. In this case, it's converting the colour format from BGR to RGB. This is often done because while OpenCV uses BGR, most image applications and libraries use RGB. The converted image is stored in the variable &lt;code&gt;image_rgb&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Generating Automated Masks
&lt;/h2&gt;

&lt;p&gt;SAM has different methods of generating masks, the one I wanted to try initially is by far the easiest one because all you need to do is provide an image and have the model generate the masks for you, all you need is to pass the &lt;code&gt;sam&lt;/code&gt; variable (containing the model) to an instance of the &lt;code&gt;SamAutomaticMaskGenerator&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;segment_anything&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SamAutomaticMaskGenerator&lt;/span&gt;

&lt;span class="n"&gt;mask_generator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SamAutomaticMaskGenerator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, to generating the masks is as easy as calling the &lt;code&gt;generate&lt;/code&gt; method of the automatic generator passing the RGB image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;output_mask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mask_generator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_rgb&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SAM is indeed a very powerful model, much more powerful than what I need, at least out of the box, this is the result I get from running the entire image through SAM:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdmlmyiv0mni9t1rn6fgi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdmlmyiv0mni9t1rn6fgi.png" alt="wrong-full-output.png" width="800" height="341"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Upon running SAM, the results were not as expected. The model struggled to accurately detect all pins and sometimes misinterpreted parts of pins as separate entities.&lt;/p&gt;

&lt;p&gt;I then decided to try to work on a smaller crop of the image, however, I got the same results:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnexkqk3f3nopumtamsv7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnexkqk3f3nopumtamsv7.png" alt="wrong-quarter-output.png" width="800" height="341"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are interested in how I managed to display the results, you can have a look at the function I wrote for this task:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;supervision&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sv&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;view_masks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;masks&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;""""&lt;/span&gt;&lt;span class="s"&gt;
    Display the source image, the segmented image and the binary mask

    :param source: The source image in BGR format
    :param masks: The result of the automatic mask generator call
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;mask_annotator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;MaskAnnotator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;color_lookup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ColorLookup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INDEX&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;detections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Detections&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_sam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sam_result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;masks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dark&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros_like&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;annotated_image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mask_annotator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;annotate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;detections&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;detections&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;masked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mask_annotator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;annotate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dark&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detections&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;detections&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot_images_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;images&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;annotated_image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;masked&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;grid_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;titles&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;source image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;segmented image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;binary mask&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;It may be possible to modify the behaviour of the &lt;code&gt;SamAutomaticMaskGenerator&lt;/code&gt; via arguments, however, when I modified some of these arguments I realised that (I did not know what I was doing, and)  sometimes the kernel died on me. I suppose my laptop does not have enough memory to run some combinations.&lt;/p&gt;

&lt;p&gt;While the initial attempts with SAM presented challenges, they provided valuable learning opportunities. In the next blog post, I will explore alternative methods and adjustments to enhance pin detection and achieve the interactivity I envision for my collection's display.&lt;/p&gt;

</description>
      <category>imagesegmentation</category>
      <category>python</category>
      <category>objectdetection</category>
    </item>
    <item>
      <title>Mi biblioteca de MLOps</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Thu, 03 Aug 2023 19:35:40 +0000</pubDate>
      <link>https://dev.to/feregri_no/mi-biblioteca-de-mlops-2gp1</link>
      <guid>https://dev.to/feregri_no/mi-biblioteca-de-mlops-2gp1</guid>
      <description>&lt;p&gt;En los últimos tres años los temas en mi biblioteca personal han cambiado para inclinarse más hacia MLOps; He leído demasiados libros sobre el tema que finalmente puedo elegir mis favoritos; echemos un vistazo:&lt;/p&gt;

&lt;h2&gt;
  
  
  Mis favoritos
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Building Machine Learning Powered Applications
&lt;/h3&gt;

&lt;p&gt;Este libro de &lt;strong&gt;Emmanuel Ameisen&lt;/strong&gt; (&lt;em&gt;O'Reilly&lt;/em&gt;) fue el que me empujó a dejar mi trabajo como científico de datos y comenzar a mover mi carrera hacia MLOps.&lt;/p&gt;

&lt;p&gt;El libro es muy ligero en contenido práctico, si bien trae código para practicar, las ideas que presenta son lo que más vale la pena; lo que me encantó de él es la vista de alto nivel que quita el enfoque del desarrollo del modelo de machine learning y lo pone en los componentes que rodean a una aplicación práctica de aprendizaje automático. Cómpralo en &lt;a href="https://www.amazon.com.mx/dp/149204511X?_encoding=UTF8&amp;amp;__mk_es_US=%EF%BF%BDM%EF%BF%BD%3F%EF%BF%BD%EF%BF%BD&amp;amp;crid=R4LSGTNBTBSM&amp;amp;keywords=Building+Machine+Learning+Powered+Applications&amp;amp;sprefix=building+machine+learning+powered+applications%2Caps%2C152&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=435acbc7f941ab0058c6f1a3591cf363&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/dp/149204511X?_encoding=UTF8&amp;amp;__mk_es_US=%EF%BF%BDM%EF%BF%BD%3F%EF%BF%BD%EF%BF%BD&amp;amp;crid=R4LSGTNBTBSM&amp;amp;keywords=Building+Machine+Learning+Powered+Applications&amp;amp;sprefix=building+machine+learning+powered+applications%2Caps%2C152&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=6d58b66c513c98cfab2649ad483a9ecb&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reliable Machine Learning
&lt;/h3&gt;

&lt;p&gt;Este libro de &lt;strong&gt;Cathy Chen et al.&lt;/strong&gt; (&lt;em&gt;O'Reilly&lt;/em&gt;) fue un hallazgo realmente agradable. No habla de una tecnología en específico y, a pesar de no mencionar MLOps en el título, MLOps es el tema de todo el libro.&lt;/p&gt;

&lt;p&gt;Pone ML a través de una lente del concepto de &lt;em&gt;"reliability"&lt;/em&gt; (confiabilidad), cubriendo una amplia gama de temas, desde problemas que pueden ocurrir en la capa de datos y almacenamiento, pasando por alto algunas técnicas para calcular costos, hasta quién debe estar a cargo de la calidad del modelo. Cómpralo en &lt;a href="https://www.amazon.com.mx/Reliable-Machine-Learning-Principles-Production/dp/1098106229?crid=1J5EEQLYRUJ3M&amp;amp;keywords=reliable+machine+learning&amp;amp;qid=1681674347&amp;amp;sprefix=reliable+machi%2Caps%2C199&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=fafd0161ab76d1cd3fe9fdaafc39dfdd&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Reliable-Machine-Learning-Principles-Production/dp/1098106229?crid=1J5EEQLYRUJ3M&amp;amp;keywords=reliable+machine+learning&amp;amp;qid=1681674347&amp;amp;sprefix=reliable+machi%2Caps%2C199&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=cbeba4a1b88da28abf15f68b516e7d1e&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Machine Learning Engineering in Action
&lt;/h3&gt;

&lt;p&gt;Este es un libro de Ben Wilson (Manning), es extremadamente denso, pero cada una de sus más de 500 páginas vale completamente la pena porque cubre todos los aspectos involucrados en poner en producción una aplicación exitosa de aprendizaje automático, incluida la planificación y el alcance de un proyecto, experimentación, pruebas, despliegue y seguimiento.&lt;/p&gt;

&lt;p&gt;Este libro está lleno de ejemplos y diagramas que facilitan la comprensión de los conceptos. Muy recomendable. Cómpralo en &lt;a href="https://www.amazon.com.mx/Machine-Learning-Engineering-Action-Wilson/dp/1617298719?__mk_es_MX=%C3%85M%C3%85%C5%BD%C3%95%C3%91&amp;amp;crid=48B1H38VVL8S&amp;amp;keywords=machine+learning+engineering+in+action&amp;amp;qid=1681674429&amp;amp;sprefix=machine+learning+engineering+in+action%2Caps%2C159&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=4310d644852196e76d559d4d34a54937&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Machine-Learning-Engineering-Action-Wilson/dp/1617298719?crid=5G4NYTG9P1NU&amp;amp;keywords=machine+learning+engineering+in+action&amp;amp;qid=1681674417&amp;amp;sprefix=machine+learning+engineering+in+action%2Caps%2C159&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=76ec94703d7b5f746cf500d0760c577c&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Effective Data Science Infrastructure
&lt;/h3&gt;

&lt;p&gt;Lo que me encanta de este libro de &lt;strong&gt;Ville Tuulos&lt;/strong&gt; (&lt;em&gt;Manning&lt;/em&gt;) es que, a pesar de ser un libro técnico centrado en una herramienta, Metaflow, los conceptos que toca son atemporales cuando se trata de una buena plataforma de experimentación y despliegue de modelos de machine learning.&lt;/p&gt;

&lt;p&gt;El autor se basa en su experiencia en la construcción de la infraestructura de ciencia de datos para una de las grandes empresas tecnológicas. Incluso si no estás interesado en Metaflow en absoluto, te recomiendo que consulte este libro. En mi caso, este libro fue una fuente de inspiración a la hora de diseñar la experiencia de ciencia de datos en mi empresa actual. Cómpralo en &lt;a href="https://www.amazon.com.mx/Effective-Data-Science-Infrastructure-Scientists/dp/1617299197?crid=3KQKCQFEQCUDD&amp;amp;keywords=effective+data+science+infrastructure&amp;amp;qid=1681674520&amp;amp;sprefix=effective+data+science%2Caps%2C166&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=396d54f9046e4d9f4fb83250ef407228&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Effective-Data-Science-Infrastructure-scientists/dp/1617299197?crid=T0TOVY8JC84P&amp;amp;keywords=Effective+Data+science+infrastructure&amp;amp;qid=1681674507&amp;amp;sprefix=effective+data+science+infrastructure%2Caps%2C166&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=822cf85711c5f2635f811b8923de6389&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Machine Learning Design Patterns
&lt;/h3&gt;

&lt;p&gt;Si bien este libro de &lt;strong&gt;Valliappa Lakshmanan&lt;/strong&gt; (&lt;em&gt;O'Reilly&lt;/em&gt;) no trata sobre MLOps como tal, me pareció un libro de referencia interesante que ayuda a resolver algunos de los problemas comunes que un ingeniero relacionado con ML puede encontrar en el día a día.&lt;/p&gt;

&lt;p&gt;No pienses en este libro como el que te guiará de principio a fin mientras crea una aplicación de aprendizaje automático; piensa en este libro como un libro de recetas al que puede seguir haciendo referencia. Hay algunas críticas con respecto a las opciones tecnológicas de los autores, pero para mí, el valor compartido en este libro supera la molesta promoción que los autores hacen de GCP, Tensorflow y BigQuery. Cómpralo en &lt;a href="https://www.amazon.com.mx/Machine-Learning-Design-Patterns-Preparation/dp/1098115783?__mk_es_MX=%C3%85M%C3%85%C5%BD%C3%95%C3%91&amp;amp;crid=11YVZVTSO4YPM&amp;amp;keywords=machine+learning+design+patterns&amp;amp;qid=1681674617&amp;amp;sprefix=machine+learning+design+patterns%2Caps%2C162&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=07b55146997aaa7a6455d1a5de3bd745&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Machine-Learning-Design-Patterns-Preparation/dp/1098115783?crid=YNXGEL5W45WM&amp;amp;keywords=machine+learning+design+patterns&amp;amp;qid=1681674596&amp;amp;sprefix=machine+learning+design+patterns%2Caps%2C160&amp;amp;sr=8-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=b1530679b48d76b6bc02a5fe6cffc1fe&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Designing Machine Learning Systems
&lt;/h3&gt;

&lt;p&gt;Todavía estoy leyendo este libro de &lt;strong&gt;Chip Huyen&lt;/strong&gt; (&lt;em&gt;O'Reilly&lt;/em&gt;), pero hasta ahora es prometedor, de ahí su posición en la lista. Actualizaré este listado cuando lo termine. Cómpralo en &lt;a href="https://www.amazon.com.mx/Designing-Machine-Learning-Systems-Production-Ready/dp/1098107969?pd_rd_w=kHEFY&amp;amp;content-id=amzn1.sym.073f9082-4467-4f70-939c-f1e02d32ade1&amp;amp;pf_rd_p=073f9082-4467-4f70-939c-f1e02d32ade1&amp;amp;pf_rd_r=YDKHMKV9Q5KXT7XMZJ4J&amp;amp;pd_rd_wg=rytW0&amp;amp;pd_rd_r=861328fd-21a8-491e-8643-62226915eabb&amp;amp;pd_rd_i=1098107969&amp;amp;psc=1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=421503f01f3ac13c6bcdf09c97d398a1&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Designing-Machine-Learning-Systems-Production-Ready/dp/1098107969?content-id=amzn1.sym.f6adc278-60fd-4516-8b02-0dbae28f44f1%3Aamzn1.sym.f6adc278-60fd-4516-8b02-0dbae28f44f1&amp;amp;crid=YNXGEL5W45WM&amp;amp;cv_ct_cx=machine+learning+design+patterns&amp;amp;keywords=machine+learning+design+patterns&amp;amp;pd_rd_i=1098107969&amp;amp;pd_rd_r=650d784b-3b29-46f3-8876-c006cbf77890&amp;amp;pd_rd_w=Z22fL&amp;amp;pd_rd_wg=eoh2M&amp;amp;pf_rd_p=f6adc278-60fd-4516-8b02-0dbae28f44f1&amp;amp;pf_rd_r=S4QQPMFSQYV63DVQEZ3G&amp;amp;qid=1681674596&amp;amp;sbo=RZvfv%2F%2FHxDF%2BO5021pAnSA%3D%3D&amp;amp;sprefix=machine+learning+design+patterns%2Caps%2C160&amp;amp;sr=1-1-9e7645f9-2d19-4bff-863e-f6cdbe50f990&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=7ad48552c07d050708697a6cdb6c902b&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Estos libros están buenos
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Machine Learning Engineering with Python
&lt;/h3&gt;

&lt;p&gt;Un libro de &lt;strong&gt;Andrew P. McMahon&lt;/strong&gt; (&lt;em&gt;Packt&lt;/em&gt;) que trata de cubrir muchos temas pero que la cantidad de páginas impide darle profundidad a alguno de ellos. Lo bueno es que cubre algunos de los aspectos que otros libros no cubren, como los flujos de trabajo de Git y el empaquetado de proyectos en Python.&lt;/p&gt;

&lt;p&gt;También tiene algunas preguntas de autoevaluación que son buenas para evaluar su conocimiento a medida que avanza en el libro. Cómpralo en &lt;a href="https://www.amazon.com.mx/Machine-Learning-Engineering-Python-production/dp/1801079250?__mk_es_MX=%C3%85M%C3%85%C5%BD%C3%95%C3%91&amp;amp;crid=38ZOFT7JUDGIN&amp;amp;keywords=Machine+Learning+Engineering+with+Python&amp;amp;qid=1681674746&amp;amp;s=books&amp;amp;sprefix=machine+learning+engineering+with+python%2Cstripbooks%2C168&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=9ca26a7f08086f920e8eddad49d0bcc7&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Machine-Learning-Engineering-Python-production/dp/1801079250?crid=2GRGU7LP9475S&amp;amp;keywords=Machine+Learning+Engineering+with+Python&amp;amp;qid=1681674739&amp;amp;s=books&amp;amp;sprefix=machine+learning+engineering+with+python%2Cstripbooks-intl-ship%2C170&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=606868ef969c919920198bfdc0a29142&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  MLOps engineering at Scale
&lt;/h3&gt;

&lt;p&gt;Un libro relativamente pequeño de &lt;strong&gt;Carl Osipov&lt;/strong&gt; (&lt;em&gt;Manning&lt;/em&gt;), es una introducción ligera a PyTorch y cómo servirlo en AWS. Tiene buenas ideas, pero no puedo deshacerme de la sensación de que su tamaño se interpone en el camino de una explicación profunda. Si trabajas con las tecnologías que cubre el libro, sería una buena adición a tu biblioteca. Cómpralo en &lt;a href="https://www.amazon.com.mx/Cloud-Native-Machine-Learning-Deploying/dp/1617297763?__mk_es_MX=%C3%85M%C3%85%C5%BD%C3%95%C3%91&amp;amp;crid=3RFHJK8DG52SP&amp;amp;keywords=MLOps+engineering+at+Scale&amp;amp;qid=1681674808&amp;amp;s=books&amp;amp;sprefix=mlops+engineering+at+scale%2Cstripbooks%2C153&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=987839bca1972d67c271ff15cded32cd&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Cloud-Native-Machine-Learning-Osipov/dp/1617297763?crid=2Q5RU2SE5JUFL&amp;amp;keywords=MLOps+engineering+at+Scale&amp;amp;qid=1681674797&amp;amp;s=books&amp;amp;sprefix=mlops+engineering+at+scale%2Cstripbooks-intl-ship%2C161&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=1333b3d7fefc8891b2b559a2de1042c6&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Machine Learning Engineering
&lt;/h3&gt;

&lt;p&gt;Un buen libro de referencia de &lt;strong&gt;Andriy Burkov&lt;/strong&gt; (&lt;em&gt;True Positive Inc&lt;/em&gt;) es una descripción general concisa, aunque desorganizada, del campo de la ingeniería de aprendizaje automático. No es un libro que te va a cambiar radicalmente tu visión sobre machine learning en producción, pero tiene algunas ideas valiosas.&lt;/p&gt;

&lt;p&gt;Se lo recomiendo a ese científico de datos que recién comienza y quiere comunicarse con científicos de datos experimentados o ingenieros de aprendizaje automático, pero si ya tienes experiencia, no será muy útil. Cómpralo en &lt;a href="https://www.amazon.com.mx/Machine-Learning-Engineering-Andriy-Burkov/dp/1777005469?__mk_es_MX=%C3%85M%C3%85%C5%BD%C3%95%C3%91&amp;amp;crid=15ZCNWUD2WY5S&amp;amp;keywords=Machine+Learning+Engineering&amp;amp;qid=1681674904&amp;amp;s=books&amp;amp;sprefix=machine+learning+engineering%2Cstripbooks%2C198&amp;amp;sr=1-7&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=29ba72ff6b47896baf0036f1aaff3748&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Machine-Learning-Engineering-Andriy-Burkov/dp/1999579577?crid=21V2219Q6OHYZ&amp;amp;keywords=Machine+Learning+Engineering&amp;amp;qid=1681674898&amp;amp;s=books&amp;amp;sprefix=machine+learning+engineering%2Cstripbooks-intl-ship%2C144&amp;amp;sr=1-2&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=68d41d9f991f87d50c5b80e0f241e93d&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Introducing MLOps
&lt;/h3&gt;

&lt;p&gt;Como sugiere el título, este libro de &lt;strong&gt;Mark Treveil et al.&lt;/strong&gt; (&lt;em&gt;O'Reilly&lt;/em&gt;) es una buena introducción a MLOps; Recomiendo este libro si eres completamente nuevo en el tema. Léelo sin esperar demasiada profundidad técnica y más una descripción general de alto nivel. Cómpralo en &lt;a href="https://www.amazon.com.mx/Introducing-Mlops-Machine-Learning-Enterprise/dp/1492083291?__mk_es_MX=%C3%85M%C3%85%C5%BD%C3%95%C3%91&amp;amp;crid=26H6ZKLCV3ASA&amp;amp;keywords=Introducing+MLOps&amp;amp;qid=1681674951&amp;amp;s=books&amp;amp;sprefix=introducing+mlops%2Cstripbooks%2C201&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=77dfa092aa19049ef1fd3456b70ba783&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Introducing-MLOps-Machine-Learning-Enterprise/dp/1492083291?crid=1I45VNS1A8E31&amp;amp;keywords=Introducing+MLOps&amp;amp;qid=1681674947&amp;amp;s=books&amp;amp;sprefix=introducing+mlops%2Cstripbooks-intl-ship%2C162&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=1a0b50ceff25858fa6a12adeab68309b&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recomendados solo si...
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Building Machine Learning Pipelines
&lt;/h3&gt;

&lt;p&gt;Libro de Hannes &lt;strong&gt;Hapke et al.&lt;/strong&gt; (&lt;em&gt;O'Reilly&lt;/em&gt;), es solo un libro de "recetas" que se parece a la documentación de Tensorflow Extended, excepto que está un poco desactualizado.&lt;/p&gt;

&lt;p&gt;No doy una peor opinión de este libro porque los autores fueron sinceros sobre el uso de Tensorflow en la portada; Sabía en lo que me estaba metiendo. Si Tensorflow es su tecnología preferida, léelo; sin embargo, ten en cuenta los cambios en la API de TFX desde que se lanzó el libro. Cómpralo en &lt;a href="https://www.amazon.com.mx/Building-Machine-Learning-Pipelines-Automating/dp/1492053198?__mk_es_MX=%C3%85M%C3%85%C5%BD%C3%95%C3%91&amp;amp;crid=35T8OC7EH7U6N&amp;amp;keywords=Building+Machine+Learning+Pipelines&amp;amp;qid=1681675423&amp;amp;s=books&amp;amp;sprefix=building+machine+learning+pipelines%2Cstripbooks%2C270&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=eea864b09fd14e0a9a9ba8eda79c07f1&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Building-Machine-Learning-Pipelines-Automating/dp/1492053198?crid=1DEI7X8MVOLOD&amp;amp;keywords=Building+Machine+Learning+Pipelines&amp;amp;qid=1681675419&amp;amp;s=books&amp;amp;sprefix=building+machine+learning+pipelines%2Cstripbooks-intl-ship%2C270&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=032342592ef3c2703815f09c7c4e1bc2&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Nota: uno de los autores me comentó que están trabajando en una nueva versión.&lt;/p&gt;

&lt;h3&gt;
  
  
  Engineering MLOps
&lt;/h3&gt;

&lt;p&gt;Lamentablemente, este libro de &lt;strong&gt;Emmanuel Raj&lt;/strong&gt; (&lt;em&gt;Packt&lt;/em&gt;) es demasiado superficial y trata de cubrir demasiado terreno, pero no logra abordar ningún tema en particular con confianza.&lt;/p&gt;

&lt;p&gt;El autor usa Azure para generar el proyecto al que se hace referencia en todo el libro (cuando compré el libro, esto fue una sorpresa para mí, ya que nunca se mencionó en ninguna parte de la descripción del libro). Puede que le guste este libro si ya conoces de MLOps y necesitas trabajar con Azure como back-end en la nube. De lo contrario, no puedo recomendarlo. Cómpralo en &lt;a href="https://www.amazon.com.mx/Engineering-MLOps-Rapidly-production-ready-learning/dp/1800562888?__mk_es_MX=%C3%85M%C3%85%C5%BD%C3%95%C3%91&amp;amp;crid=290MP13L07VTQ&amp;amp;keywords=Engineering+MLOps&amp;amp;qid=1681675518&amp;amp;s=books&amp;amp;sprefix=engineering+mlops%2Cstripbooks%2C161&amp;amp;sr=1-2&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=0fb2998b3ec2656521097c0199eb699e&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Engineering-MLOps-Rapidly-production-ready-learning/dp/1800562888?crid=ESOKMXXN12TU&amp;amp;keywords=Engineering+MLOps&amp;amp;qid=1681675514&amp;amp;s=books&amp;amp;sprefix=engineering+mlops%2Cstripbooks-intl-ship%2C209&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=52d5a56870081dbb2ab015c7eb3d3fc5&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Machine Learning Systems
&lt;/h3&gt;

&lt;p&gt;Si usas Scala para tu día a día, considera este libro de &lt;strong&gt;Jeff Smith&lt;/strong&gt; (&lt;em&gt;Manning&lt;/em&gt;). Si no te gusta Scala, evítalo. El libro es demasiado superficial para mi gusto y habla más sobre el desarrollo de software con Scala que sobre el aprendizaje automático (o su relación con ML). Cómpralo en &lt;a href="https://www.amazon.com.mx/Engineering-MLOps-Rapidly-production-ready-learning/dp/1800562888?__mk_es_MX=%C3%85M%C3%85%C5%BD%C3%95%C3%91&amp;amp;crid=290MP13L07VTQ&amp;amp;keywords=Engineering+MLOps&amp;amp;qid=1681675518&amp;amp;s=books&amp;amp;sprefix=engineering+mlops%2Cstripbooks%2C161&amp;amp;sr=1-2&amp;amp;linkCode=ll1&amp;amp;tag=thcgu02-20&amp;amp;linkId=0fb2998b3ec2656521097c0199eb699e&amp;amp;language=es_MX&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;Amazon México&lt;/a&gt; o en &lt;a href="https://www.amazon.com/Engineering-MLOps-Rapidly-production-ready-learning/dp/1800562888?crid=ESOKMXXN12TU&amp;amp;keywords=Engineering+MLOps&amp;amp;qid=1681675514&amp;amp;s=books&amp;amp;sprefix=engineering+mlops%2Cstripbooks-intl-ship%2C209&amp;amp;sr=1-1&amp;amp;linkCode=ll1&amp;amp;tag=antonioferegr-20&amp;amp;linkId=52d5a56870081dbb2ab015c7eb3d3fc5&amp;amp;language=en_US&amp;amp;ref_=as_li_ss_tl" rel="noopener noreferrer"&gt;otras partes del mundo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Estos no te los recomiendo
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Practical MLOps
&lt;/h3&gt;

&lt;p&gt;No puedo entender qué se trata este libro de &lt;strong&gt;Noah Gift et al.&lt;/strong&gt; (&lt;em&gt;O'Reilly&lt;/em&gt;); definitivamente no se trata de MLOps. Hay un montón de ejemplos irrelevantes y experiencias personales que se sienten forzadas. Pero lo que acaba con el libro son los cambios de tema tan radicales de un capítulo a otro; me recordó cuando tenía que trabajar en equipo con personas con las que no me llevaba bien en la universidad y cada quien hacía su parte por separado.&lt;/p&gt;

&lt;p&gt;Probablemente uno de los peores libros de O'Reilly que he leído en mi carrera tecnológica. Mantente alejado de él.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusión
&lt;/h2&gt;

&lt;p&gt;Espero que esta lista te dé una buena idea de qué libros agregar a tu biblioteca (o cuáles evitar), y si ya ha leído los que menciono, siéntase libre de comentar sobre ellos o discrepar conmigo en el comentarios.&lt;/p&gt;

&lt;p&gt;Siempre estoy abierto a sugerencias de libros, así que déjalas también en los comentarios.&lt;/p&gt;

</description>
      <category>mlops</category>
      <category>books</category>
      <category>machinelearning</category>
      <category>libros</category>
    </item>
    <item>
      <title>Automating Sketch with GitHub Actions</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Sun, 05 Jun 2022 15:57:18 +0000</pubDate>
      <link>https://dev.to/feregri_no/automating-sketch-with-github-actions-4abf</link>
      <guid>https://dev.to/feregri_no/automating-sketch-with-github-actions-4abf</guid>
      <description>&lt;p&gt;Sometimes I use Sketch to create graphics for my content; however, I have always found it challenging to keep track of my work. Saving files here and there, versioning them with what I thought were sensible name schemes, only to realise that following such schemes requires a lot of mental effort.&lt;/p&gt;

&lt;p&gt;My dream was always to be able to store my Sketch files in a &lt;em&gt;Git&lt;/em&gt; repo; for some reason, I always thought this would be impossible, that Sketch's files were binaries impossible to properly version.&lt;/p&gt;

&lt;p&gt;In the following post, I'll explain why I was wrong and how is it that you can version your &lt;em&gt;Sketch&lt;/em&gt; files as plain text documents.&lt;/p&gt;

&lt;p&gt;So I start with a base document, nothing too complex as I don't want to overcomplicate things:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0qn89pa0kcdqcnfpx2n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0qn89pa0kcdqcnfpx2n.png" alt="Simple image" width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have named this document &lt;em&gt;tcsg.sketch&lt;/em&gt;; then, the next thing to understand is how Sketch actually saves these documents as a single file. The most important thing is that a &lt;em&gt;.sketch&lt;/em&gt; file &lt;a href="https://developer.sketch.com/file-format/?_ga=2.160187325.1466637750.1654335985-1710079208.1653454852" rel="noopener noreferrer"&gt;is nothing more than a &lt;em&gt;.zip&lt;/em&gt;&lt;/a&gt; file with a bunch of &lt;em&gt;.json&lt;/em&gt; files inside.&lt;/p&gt;

&lt;h2&gt;
  
  
  De-&lt;em&gt;sketchify&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;We know that &lt;em&gt;Git&lt;/em&gt; does not play nice with binary files but plays very nicely with plain text files – and &lt;em&gt;JSON&lt;/em&gt; is just that. Why not decompress the &lt;em&gt;.sketch&lt;/em&gt; file and keep track of the &lt;em&gt;.json&lt;/em&gt; files alone; after all, when we want to open our file in Sketch again, we can compress those files again.&lt;/p&gt;

&lt;h3&gt;
  
  
  Decompress
&lt;/h3&gt;

&lt;p&gt;With this in mind, we can use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;unzip &lt;span class="nt"&gt;-d&lt;/span&gt; tcsg tcsg.sketch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To unzip the files into the &lt;code&gt;tcsg&lt;/code&gt; repository. A quick glance into the newly unzipped repository gives us the following repo structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tcsg
├── document.json
├── meta.json
├── pages
│   └── 4FB4BFA1-4E01-4EE8-9962-F07A85622B2F.json
├── previews
│   └── preview.png
└── user.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I will not discuss the details of the files, as they are well explained in Sketch's documentation.&lt;/p&gt;

&lt;p&gt;We should note that for optimisation purposes, the &lt;em&gt;JSON&lt;/em&gt; files are saved with no indentation, and all the contents are stored in a single line. As I want to embrace the full power of Git, I need to format these files to be able to view the diffs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Indent files
&lt;/h3&gt;

&lt;p&gt;There is a useful tool to work with &lt;em&gt;JSON&lt;/em&gt; fines from the &lt;em&gt;CLI&lt;/em&gt;, it is named &lt;em&gt;jq&lt;/em&gt;, I will use it to format the files with indentation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find ./tcsg &lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-exec&lt;/span&gt; sh &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"jq . {} | sponge {}"&lt;/span&gt;  &lt;span class="se"&gt;\;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An explanation of the above command:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;find ./tcsg&lt;/code&gt;: Searches for objects in the &lt;code&gt;./tcsg&lt;/code&gt; folder&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-type f&lt;/code&gt;: Specifies, with &lt;code&gt;f&lt;/code&gt; that we are looking for a regular file&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-name "*.json"&lt;/code&gt;: Filters the files we will find to all those ending in &lt;code&gt;.json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-exec [command]&lt;/code&gt;: Executes a command for each file; within this command we can use &lt;code&gt;{}&lt;/code&gt; to refer to the file name. The command to execute should be followed by &lt;code&gt;\;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sh -c "jq . {} | sponge {}"&lt;/code&gt;: In this case, the command that will be executed for each file is &lt;code&gt;jq . [filename] | sponge [filename]&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Delete previews (optional)
&lt;/h3&gt;

&lt;p&gt;There is a &lt;code&gt;previews&lt;/code&gt; folder where the last page edited by the user is preserved to be used as a thumbnail (and a preview) for the document. Again, this is an image, and for the time being, I will delete it since it is not needed for the file format.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; ./tcsg/previews/preview.png
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that is it! we now have a Sketch document as a series of plain text files.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;em&gt;Sketchify&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;Of course, I want this process to be reversible – I want to be able to open my documents in Sketch again.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create a temporary folder
&lt;/h3&gt;

&lt;p&gt;I rather not modify the original directory, so I will create a copy of the working directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; ./tcsg ./tcsg_temp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;em&gt;Un&lt;/em&gt;-indent files
&lt;/h3&gt;

&lt;p&gt;When I decompressed the files, I realised that the &lt;em&gt;JSON&lt;/em&gt; files contained all the information in a single line; to respect that format, let's apply the &lt;code&gt;jq -c . {} | sponge {}&lt;/code&gt; command to all those files. It is pretty similar to the format command above, with the difference of the &lt;code&gt;-c&lt;/code&gt; flat of &lt;em&gt;jq&lt;/em&gt;, which "compresses" the output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find ./tcsg_temp &lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-exec&lt;/span&gt; sh &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"jq -c . {} | sponge {}"&lt;/span&gt;  &lt;span class="se"&gt;\;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Remove previews (optional)
&lt;/h3&gt;

&lt;p&gt;Again, let's delete any preview image, for consistency with the process above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; ./tcsg_temp/previews/preview.png
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Putting everything together
&lt;/h3&gt;

&lt;p&gt;I placed all the above code into a single file called &lt;code&gt;desketchify.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

unzip &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; tcsg tcsg.sketch

find ./tcsg &lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-exec&lt;/span&gt;  sh &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"jq . {} | sponge {}"&lt;/span&gt; &lt;span class="se"&gt;\;&lt;/span&gt;

&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; ./tcsg/previews/preview.png
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Compress
&lt;/h3&gt;

&lt;p&gt;Finally, in the compression step, we need to change directory to the temporary folder I have been working on. Then apply the compression step using the &lt;code&gt;zip&lt;/code&gt; utility:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ./tcsg_temp&lt;span class="p"&gt;;&lt;/span&gt; zip &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; ../tcsg.sketch &lt;span class="k"&gt;*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The flag &lt;code&gt;-r&lt;/code&gt; specifies that &lt;code&gt;zip&lt;/code&gt; should recursively compress the files; the &lt;code&gt;-X&lt;/code&gt; flag specifies that the compression should not save any extra file attributes.&lt;/p&gt;

&lt;p&gt;At the end of this command I should have a &lt;code&gt;.sketch&lt;/code&gt; file that can be opened in the app.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cleanup
&lt;/h3&gt;

&lt;p&gt;Lastly, let's clean up what I just did:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ..&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; ./tcsg_temp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Putting everything together
&lt;/h3&gt;

&lt;p&gt;I placed all the above code into a single file called &lt;code&gt;sketchify.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; ./tcsg ./tcsg_temp

find ./tcsg_temp &lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-exec&lt;/span&gt;  sh &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"jq -c . {} | sponge {}"&lt;/span&gt; &lt;span class="se"&gt;\;&lt;/span&gt;

&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; ./tcsg_temp/previews/preview.png

&lt;span class="nb"&gt;cd&lt;/span&gt; ./tcsg_temp&lt;span class="p"&gt;;&lt;/span&gt; zip &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; ../tcsg.sketch &lt;span class="k"&gt;*&lt;/span&gt;

&lt;span class="nb"&gt;cd&lt;/span&gt; ..&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; ./tcsg_temp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Exporting artboards
&lt;/h2&gt;

&lt;p&gt;But why stop there? what if I want to export the contents of the file as images? This will make it easy to share the assets with people who do not have Sketch installed at all!&lt;/p&gt;

&lt;p&gt;This is surprisingly easy using GitHub Actions; all I need to do is use a kind of hidden gem in the Sketch ecosystem: their &lt;em&gt;sketchtool&lt;/em&gt; utility, &lt;a href="https://developer.sketch.com/cli/" rel="noopener noreferrer"&gt;read more about it here&lt;/a&gt;. It allows you to interact with Sketch documents without human interaction.&lt;/p&gt;

&lt;p&gt;In particular, the command that I am interested in the most is the one that exports artboards: &lt;code&gt;sketchtool export artboards [file]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The tool itself is free to use for my purposes, but we need to download Sketch, I wrote the following code to achieve that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;wget &lt;span class="nt"&gt;-O&lt;/span&gt; sketch.zip &lt;span class="se"&gt;\&lt;/span&gt;
    https://download.sketch.com/sketch-88.1-145978.zip
unzip &lt;span class="nt"&gt;-qq&lt;/span&gt; sketch.zip
Sketch.app/Contents/MacOS/sketchtool &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Leaving the &lt;em&gt;sketchtool&lt;/em&gt; accesible via the &lt;code&gt;Sketch.app/Contents/MacOS/sketchtool&lt;/code&gt; command. Obviously, be mindful of the version you are working with.&lt;/p&gt;

&lt;h2&gt;
  
  
  Full automation
&lt;/h2&gt;

&lt;p&gt;It is finally time to put everything together using GitHub actions, I want to run all these steps only when the source files of the Sketch document change:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Sketchify&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main"&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tcsg/**'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The jobs are organised in steps, where each step performs one and only one action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;generate_assets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Generate assets&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;macos-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout code&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v3&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install dependencies&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;brew install jq&lt;/span&gt;
          &lt;span class="s"&gt;brew install moreutils&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create Sketchfile&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./sketchify.sh&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install Sketch&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;wget -O sketch.zip https://download.sketch.com/sketch-88.1-145978.zip&lt;/span&gt;
          &lt;span class="s"&gt;unzip -qq sketch.zip&lt;/span&gt;
          &lt;span class="s"&gt;Sketch.app/Contents/MacOS/sketchtool -v&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Export artboards&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Sketch.app/Contents/MacOS/sketchtool export artboards tcsg.sketch --output=export --formats=jpg --scales=1,2&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Save exported images&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-artifact@v3&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;images&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;export/&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Save generated Sketch file&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-artifact@v3&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sketch-file&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;tcsg.sketch&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will rebuild my Sketch file every time new changes are made to the repository and will export all the artboards in it. The best part? the artefacts will be available for download in the GitHub UI:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zi051wbe4ajh0mzq7bv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zi051wbe4ajh0mzq7bv.png" alt="Artifact available to download" width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I consider this to be a pretty decent way to store Sketch files as assets in a Git repository; of course, depending on the changes you make, the diffs may still be monstrous; but at least they are more trackable than as a single &lt;em&gt;zip&lt;/em&gt; file.&lt;/p&gt;

&lt;p&gt;To use the code described in this post you will need to make some adjustments to it to refer to your own files.&lt;/p&gt;

&lt;p&gt;So, tell me, do you use Sketch? I hope this post was useful for you as it was helpful for me, I discovered so many things about Sketch. If you have any doubts, let me know on &lt;a href="https://twitter.com/feregri_no" rel="noopener noreferrer"&gt;Twitter at @feregri_no&lt;/a&gt;. As always, find the code for this post &lt;a href="https://github.com/fferegrino/tcsg-logos/tree/automating-sketch-with-github-actions" rel="noopener noreferrer"&gt;in GitHub&lt;/a&gt;. Happy &lt;em&gt;Sketch&lt;/em&gt;-ing.&lt;/p&gt;

</description>
      <category>sketch</category>
      <category>github</category>
      <category>githubactions</category>
      <category>automation</category>
    </item>
    <item>
      <title>Configuring GitHub Actions – Tweeting from a lambda</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Mon, 14 Feb 2022 11:38:13 +0000</pubDate>
      <link>https://dev.to/feregri_no/configuring-github-actions-tweeting-from-a-lambda-bii</link>
      <guid>https://dev.to/feregri_no/configuring-github-actions-tweeting-from-a-lambda-bii</guid>
      <description>&lt;p&gt;Here comes the automation part using GitHub through a CI/CD pipeline. The first thing we're going to do is create a file called &lt;em&gt;aws.yml&lt;/em&gt; in the &lt;em&gt;.github/workflows&lt;/em&gt; folder, as the extension suggests is a file that follows the YAML format.&lt;/p&gt;

&lt;p&gt;The first thing we are going to specify is the name of the pipeline and the conditions under which it should be executed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and deploy lambda-cycles image&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="nv"&gt;main&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="nv"&gt;main&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="c1"&gt;# Job definitions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For now I want the pipeline to run every time someone puts something into &lt;em&gt;main&lt;/em&gt; and every time someone opens a pull request to &lt;em&gt;main&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The next thing is to define the &lt;em&gt;jobs&lt;/em&gt; that are part of the workflow – in this case we will have two: one to prepare our application and the other to publish it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preparing the image – &lt;em&gt;build&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;To define a &lt;em&gt;job&lt;/em&gt; we must specify the &lt;code&gt;steps&lt;/code&gt; that form part of it, individually you can specify a more friendly name and the type of &lt;em&gt;runner&lt;/em&gt; in which it is executed. We won't be using anything complicated, so &lt;code&gt;ubuntu-latest&lt;/code&gt; works fine for us.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next thing to do is to specify the &lt;em&gt;steps&lt;/em&gt; that are part of the &lt;em&gt;job&lt;/em&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  Steps
&lt;/h3&gt;

&lt;p&gt;We need to get a copy of our newly pushed code to &lt;em&gt;main&lt;/em&gt;, we use the &lt;em&gt;checkout&lt;/em&gt; action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since we are going to interact with AWS, we need to configure the credentials in the runner*&lt;em&gt;, Amazon offers an action for this, what we must specify are our credentials (which we previously set as secrets in our *repo&lt;/em&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Configure AWS credentials&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v1&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;aws-access-key-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_ACCESS_KEY_ID }}&lt;/span&gt;
        &lt;span class="na"&gt;aws-secret-access-key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_SECRET_ACCESS_KEY }}&lt;/span&gt;
        &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eu-west-1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This couple of &lt;em&gt;steps&lt;/em&gt; are specific to my implementation since I am using &lt;em&gt;pipenv&lt;/em&gt;, so it is necessary to install Python, then pipenv and install the dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Set up Python &lt;/span&gt;&lt;span class="m"&gt;3.8&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-python@v1&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;python-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3.8&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install pipenv&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
        &lt;span class="s"&gt;pip install pipenv&lt;/span&gt;
        &lt;span class="s"&gt;pipenv install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next three steps are all about creating the image that will be used to create our &lt;em&gt;lambda&lt;/em&gt; instance.&lt;/p&gt;

&lt;p&gt;The first step calls &lt;code&gt;make container&lt;/code&gt;, the utility I added in past posts to build and tag as &lt;code&gt;lambda-cycles&lt;/code&gt;. The second step exports this image to a compressed file. The third step stores the newly exported Docker image as an &lt;em&gt;artifact&lt;/em&gt;, which we will use in the next &lt;em&gt;job&lt;/em&gt;, were we will deploy it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build lambda-cycles image&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;make container&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pack docker image&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker save lambda-cycles &amp;gt; ./lambda-cycles.tar&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Temporarily save Docker image and dependencies&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-artifact@v2&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lambda-cycles-build&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;./shapefiles/&lt;/span&gt;
          &lt;span class="s"&gt;./requirements.txt&lt;/span&gt;
          &lt;span class="s"&gt;./lambda-cycles.tar&lt;/span&gt;
        &lt;span class="na"&gt;retention-days&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We must configure, initialize and finally, plan the creation of the infrastructure using &lt;em&gt;terraform&lt;/em&gt;. For the first action, Hashicorp offers a pre-defined action, for the following two using the &lt;em&gt;terraform&lt;/em&gt; console tool is enough:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Set up terraform&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;hashicorp/setup-terraform@v1&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Terraform init&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform -chdir=terraform init&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Terraform plan&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform -chdir=terraform plan&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Creating infrastructure in AWS
&lt;/h2&gt;

&lt;p&gt;Once GitHub Actions has finished the build &lt;em&gt;job&lt;/em&gt;, we can move on to the deploy &lt;em&gt;job&lt;/em&gt;. To define it (in addition to the name and &lt;em&gt;runner&lt;/em&gt; information) I indicate that it depends on the &lt;code&gt;build&lt;/code&gt; job and very importantly, that it should only be executed when the branch that is going to execute this job is the &lt;code&gt;main&lt;/code&gt; branch, see the &lt;code&gt;if&lt;/code&gt; instruction?.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
    &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github.ref == 'refs/heads/main'&lt;/span&gt;

    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Steps
&lt;/h3&gt;

&lt;p&gt;As usual, we get a copy of the code with &lt;code&gt;actions/checkout@v2&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We configure our credentials, remember, each job is executed in a different &lt;em&gt;runner&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Configure AWS credentials&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v1&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;aws-access-key-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_ACCESS_KEY_ID }}&lt;/span&gt;
        &lt;span class="na"&gt;aws-secret-access-key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_SECRET_ACCESS_KEY }}&lt;/span&gt;
        &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eu-west-1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Do you remember that in the previous job we created an artifact named &lt;code&gt;lambda-cycles-build&lt;/code&gt; that contained a &lt;em&gt;Docker&lt;/em&gt; image and some other dependencies? – well, now we are going to download it, and after that we will use &lt;code&gt;docker load&lt;/code&gt; to import the image and make it available to be used by &lt;em&gt;docker&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Retrieve saved Docker image&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/download-artifact@v2&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lambda-cycles-build&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Docker load&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker load &amp;lt; ./lambda-cycles.tar&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we configure &lt;em&gt;terraform&lt;/em&gt; again, we initialise it and lastly, we apply the planned changes. Note that we are using the &lt;code&gt;-auto-approve&lt;/code&gt; option so that the changes are automatically approved without the need for human interaction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Set up terraform&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;hashicorp/setup-terraform@v1&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Terraform init&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform -chdir=terraform init&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Terraform apply&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform -chdir=terraform apply -auto-approve&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that is it, with this concludes the 6-part series explaining how to automatically build and deploy a &lt;em&gt;lambda&lt;/em&gt; from GitHub. &lt;/p&gt;

&lt;p&gt;This is &lt;a href="https://github.com/fferegrino/tweeting-cycles-lambda/tree/part-5-github-action" rel="noopener noreferrer"&gt;how the repository looks like&lt;/a&gt; by the end of this post.&lt;/p&gt;

&lt;p&gt;Remember that you can find me on Twitter &lt;a href="https://twitter.com/feregri_no" rel="noopener noreferrer"&gt;at @feregri_no&lt;/a&gt; to ask me about this post – if something is not so clear or you found a typo. The final code for this series is on &lt;a href="https://github.com/fferegrino/tweeting-cycles-lambda" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and the account tweeting the status of the bike network is &lt;a href="https://twitter.com/CyclesLondon" rel="noopener noreferrer"&gt;@CyclesLondon&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>github</category>
      <category>aws</category>
      <category>terraform</category>
      <category>githubactions</category>
    </item>
    <item>
      <title>Infrastructure with Terraform – Tweeting from a lambda</title>
      <dc:creator>Antonio Feregrino</dc:creator>
      <pubDate>Mon, 14 Feb 2022 11:37:51 +0000</pubDate>
      <link>https://dev.to/feregri_no/infrastructure-with-terraform-tweeting-from-a-lambda-1ael</link>
      <guid>https://dev.to/feregri_no/infrastructure-with-terraform-tweeting-from-a-lambda-1ael</guid>
      <description>&lt;p&gt;Since we are working in AWS with a &lt;em&gt;lambda&lt;/em&gt; we need to create infrastructure in there.&lt;/p&gt;

&lt;p&gt;As a programmer I like to define everything in code, however infrastructure provisioning is something that until recently needed to be managed manually – either through a graphical interface or a CLI with limited scripting capabilities.&lt;/p&gt;

&lt;p&gt;Over the years, tools have emerged that brought us closer to the dream of being able to create infrastructure just by defining it in code, tools such as Ansible, CloudFormation and Terraform allow us to do just that. And it is precisely the last one that I chose to create the necessary elements for this series of posts.&lt;/p&gt;

&lt;p&gt;It is not my interest to explain to you how Terraform works (I don't even know properly myself, in this post I did the minimum for the &lt;em&gt;lambda&lt;/em&gt; to work). The way I present this post is by describing the content of the &lt;em&gt;terraform/main.tf&lt;/em&gt; file that will contain the infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Providers
&lt;/h2&gt;

&lt;p&gt;Terraform interacts with remote systems (such as AWS) through plugins; these plugins are known as &lt;code&gt;providers&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Each terraform module must specify the &lt;code&gt;providers&lt;/code&gt; it needs via the block &lt;code&gt;required_providers&lt;/code&gt;, each provider has a name, a location, and a version. For example, in the lambda example that I am going to post, I am using 2 providers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;aws&lt;/code&gt;, which exists in &lt;code&gt;hashicorp/aws&lt;/code&gt; any version adhering to &lt;code&gt;3.27.X&lt;/code&gt; will work&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;null&lt;/code&gt;, it is an special provider, I'll tell you more about it later.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~&amp;gt; 3.27"
    }

    null = {
      version = "~&amp;gt; 3.0.0"
    }
  }
  required_version = "&amp;gt;= 0.14.9"

  backend "s3" {
    bucket = "feregrino-terraform-states"
    key    = "lambda-cycles-final"
    region = "eu-west-1"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Backend configuration
&lt;/h3&gt;

&lt;p&gt;Within the terraform configuration block you can also see that there is another block defined as &lt;code&gt;backend "s3"&lt;/code&gt;, this block helps us specify where the state file will be located, in this file we keep the state of the infrastructure that we have created with terraform so far. As I discussed in the first post of the series, this file will exist in an &lt;em&gt;S3 bucket&lt;/em&gt;, the specification of which we put in the &lt;code&gt;backend&lt;/code&gt; block.&lt;/p&gt;

&lt;h3&gt;
  
  
  Provider configuration
&lt;/h3&gt;

&lt;p&gt;Some providers require extra configuration, for example, AWS requires us to configure things like the region we want to connect to, the profile and the credentials we are going to use. Although the recommendation is that you do not put passwords or secrets in code, for example, in the AWS configuration I have:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;provider "aws" {
  profile = "default"
  region  = "eu-west-1"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Data Sources
&lt;/h2&gt;

&lt;p&gt;Terraform allows us to access data defined outside our configuration files, through &lt;code&gt;data&lt;/code&gt; blocks, through these we can access information about the user who is executing commands in AWS, using &lt;code&gt;aws_caller_identity&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data "aws_caller_identity" "current_identity" {}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Local Values
&lt;/h2&gt;

&lt;p&gt;I like to think of local values ​​as variables within each module, and we must define them within a &lt;code&gt;locals&lt;/code&gt; block; &lt;code&gt;locals&lt;/code&gt; can also take values ​​from other sources, such as variables or &lt;em&gt;data sources&lt;/em&gt; to simplify access to them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;locals {
  account_id          = data.aws_caller_identity.current_identity.account_id
  prefix              = "lambda-cycles-final"
  ecr_repository_name = "${local.prefix}-image-repo"
  region              = "eu-west-1"
  ecr_image_tag       = "latest"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  AWS
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Secretos
&lt;/h3&gt;

&lt;p&gt;Given the nature of the service I am trying to deploy, it is necessary to access the secrets stored in the AWS secret manager, these must be specified as &lt;em&gt;data sources&lt;/em&gt;, with &lt;code&gt;data&lt;/code&gt; blocks, in the case of secrets, it is necessary to access the secret with &lt;code&gt;aws_secretsmanager_secret&lt;/code&gt; and then to the latest version of it with &lt;code&gt;aws_secretsmanager_secret_version&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data "aws_secretsmanager_secret" "twitter_secrets" {
  arn = "arn:aws:secretsmanager:${local.region}:${local.account_id}:secret:lambda/cycles/twitter-2GMvKu"
}

data "aws_secretsmanager_secret_version" "current_twitter_secrets" {
  secret_id = data.aws_secretsmanager_secret.twitter_secrets.id
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ECR repository
&lt;/h3&gt;

&lt;p&gt;As the &lt;em&gt;lambda&lt;/em&gt; is going to be deployed using a docker container it is necessary to create a repository in ECR, we can use the &lt;code&gt;aws_ecr_repository&lt;/code&gt; resource by specifying the repository name from one of the local variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_ecr_repository" "lambda_image" {
  name                 = local.ecr_repository_name
  image_tag_mutability = "MUTABLE"

  image_scanning_configuration {
    scan_on_push = false
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Creating a Docker image
&lt;/h3&gt;

&lt;p&gt;Once the repository is created, it is necessary to upload an image to it, however Terraform is used to define infrastructure, not to perform tasks such as building a docker image, much less uploading it. I am going to assume that for this step, before executing the Terraform I already have an image built with the name &lt;code&gt;lambda-cycles&lt;/code&gt;, the only thing that would be missing then is uploading it to the ECR repository.&lt;/p&gt;

&lt;p&gt;We can use a little &lt;em&gt;hack&lt;/em&gt; to accomplish this with Terraform by using a &lt;em&gt;null&lt;/em&gt; resource (&lt;code&gt;null_resource&lt;/code&gt;) and a called provider &lt;code&gt;local-exec&lt;/code&gt; that allows you to specify commands to be executed on the local computer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "null_resource" "ecr_image" {
  triggers = {
    python_file_1 = filemd5("../app.py")
    python_file_2 = filemd5("../plot.py")
    python_file_3 = filemd5("../tweeter.py")
    python_file_4 = filemd5("../download.py")
    requirements  = filemd5("../requirements.txt")
    docker_file   = filemd5("../Dockerfile")
  }
  provisioner "local-exec" {
    command = &amp;lt;&amp;lt;EOF
           aws ecr get-login-password --region ${local.region} | docker login --username AWS --password-stdin ${local.account_id}.dkr.ecr.${local.region}.amazonaws.com
           docker tag lambda-cycles ${aws_ecr_repository.lambda_image.repository_url}:${local.ecr_image_tag}
           docker push ${aws_ecr_repository.lambda_image.repository_url}:${local.ecr_image_tag}
       EOF
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Did you notice the &lt;code&gt;triggers&lt;/code&gt; block? this block will help us track changes to files that will determine if the &lt;em&gt;lambda&lt;/em&gt; container has changed; with &lt;code&gt;filemd5&lt;/code&gt; we get a hash of the specified files. This would mean that if we make any changes to the &lt;em&gt;.py&lt;/em&gt; files the Docker image to be rebuilt and uploaded to the ECR repository.&lt;/p&gt;

&lt;h3&gt;
  
  
  Image information
&lt;/h3&gt;

&lt;p&gt;It is necessary to generate a &lt;em&gt;data source&lt;/em&gt; (in the form of a &lt;code&gt;aws_ecr_image&lt;/code&gt;) that specifies a dependency on the creation and publication of the image, we can do this thanks to &lt;code&gt;depends_on&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data "aws_ecr_image" "lambda_image" {
  depends_on = [
    null_resource.ecr_image
  ]
  repository_name = local.ecr_repository_name
  image_tag       = local.ecr_image_tag
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;em&gt;Policies&lt;/em&gt; and &lt;em&gt;permissions&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;Before creating the &lt;em&gt;lambda&lt;/em&gt;, I have to take care of other administrative tasks, the first is to create a role the &lt;em&gt;lambda&lt;/em&gt; can assume to be executed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_iam_role" "lambda" {
  name               = "${local.prefix}-lambda-role"
  assume_role_policy = &amp;lt;&amp;lt;EOF
{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Action": "sts:AssumeRole",
           "Principal": {
               "Service": "lambda.amazonaws.com"
           },
           "Effect": "Allow"
       }
   ]
}
 EOF
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, since I want to monitor my &lt;em&gt;lambda&lt;/em&gt;, and to know if any errors occurred during its execution, it is necessary to grant it permissions so that it can create logs in &lt;em&gt;CloudWatch&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data "aws_iam_policy_document" "lambda" {
  statement {
    actions = [
      "logs:CreateLogGroup",
      "logs:CreateLogStream",
      "logs:PutLogEvents"
    ]
    effect    = "Allow"
    resources = ["*"]
    sid       = "CreateCloudWatchLogs"
  }
}

resource "aws_iam_policy" "lambda" {
  name   = "${local.prefix}-lambda-policy"
  path   = "/"
  policy = data.aws_iam_policy_document.lambda.json
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Lambda – at least
&lt;/h2&gt;

&lt;p&gt;Now that I have almost everything in place, I can create the lambda via the &lt;code&gt;aws_lambda_function&lt;/code&gt; resource, this is one of the more convoluted definitions in this tutorial, so I'll try to explain it a bit more in detail:&lt;/p&gt;

&lt;p&gt;The first thing I do is add a dependency to my docker image build with &lt;code&gt;depends_on&lt;/code&gt;, then I specify the name of the &lt;em&gt;lambda&lt;/em&gt; and the role it should assume with &lt;code&gt;function_name&lt;/code&gt; and &lt;code&gt;role&lt;/code&gt;. I know in advance that this lambda can take a bit of time so I'll leave it &lt;code&gt;timeout&lt;/code&gt; a bit high.&lt;/p&gt;

&lt;p&gt;Once we create our image in ECR we must tell the lambda that the package_typeis an image, followed by the image_uriso that it knows where to find it.&lt;/p&gt;

&lt;p&gt;Once we create our image in ECR we must tell the &lt;em&gt;lambda&lt;/em&gt; that the &lt;code&gt;package_type&lt;/code&gt; is an image, followed by the &lt;code&gt;image_uri&lt;/code&gt; so that it knows where to find it.&lt;/p&gt;

&lt;p&gt;Finally, since my &lt;em&gt;lambda&lt;/em&gt; is going to send a Tweet, it is necessary to pass the necessary secrets to it. Again, in the interest of keeping everything as private as possible, we will have to define them as environment variables (instead of hardcoding them); I achieve this from the block &lt;code&gt;environment&lt;/code&gt; and extracting the secrets from –yeah, it is repetitivw– the secrets previously stored in AWS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_lambda_function" "lambda" {
  depends_on = [
    null_resource.ecr_image
  ]
  function_name = "${local.prefix}-lambda"
  role          = aws_iam_role.lambda.arn
  timeout       = 300
  image_uri     = "${aws_ecr_repository.lambda_image.repository_url}@${data.aws_ecr_image.lambda_image.id}"
  package_type  = "Image"
  environment {
    variables = {
      API_KEY             = jsondecode(data.aws_secretsmanager_secret_version.current_twitter_secrets.secret_string)["API_KEY"]
      API_SECRET          = jsondecode(data.aws_secretsmanager_secret_version.current_twitter_secrets.secret_string)["API_SECRET"]
      ACCESS_TOKEN        = jsondecode(data.aws_secretsmanager_secret_version.current_twitter_secrets.secret_string)["ACCESS_TOKEN"]
      ACCESS_TOKEN_SECRET = jsondecode(data.aws_secretsmanager_secret_version.current_twitter_secrets.secret_string)["ACCESS_TOKEN_SECRET"]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running every X minutes
&lt;/h3&gt;

&lt;p&gt;So far so good, if you run terraform up to this point we woill have created several resources: an ECR repository, a docker image, and a lambda. But the icing on the cake is missing, and that is that the point of turning the code into a lambda; I want to run it multiple times throughout the day, every so often.&lt;/p&gt;

&lt;p&gt;To achieve this task, I can use a trigger with the AWS CloudWatch service, something that executes my lambda at time intervals defined by me, this is possible with Terraform as well.&lt;/p&gt;

&lt;p&gt;The first thing is to define an event rule in CloudWatch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_cloudwatch_event_rule" "every_x_minutes" {
  name                = "${local.prefix}-event-rule-lambda"
  description         = "Fires every 20 minutes"
  schedule_expression = "cron(0/20 * * * ? *)"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This event needs a target, in this case it's my lambda:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_cloudwatch_event_target" "trigger_every_x_minutes" {
  rule      = aws_cloudwatch_event_rule.every_x_minutes.name
  target_id = "lambda"
  arn       = aws_lambda_function.lambda.arn
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And of course, like almost everything in AWS, we also need to grant it permissions so that the event can invoke the &lt;em&gt;lambda&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_lambda_permission" "allow_cloudwatch_to_call_lambda" {
  statement_id  = "AllowExecutionFromCloudWatch"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.lambda.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.every_x_minutes.arn
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;em&gt;et voilà !&lt;/em&gt;&lt;/strong&gt; – we already have all the necessary ingredients to run and create our lambda using Terraform.&lt;/p&gt;

&lt;p&gt;Remember, all of this content exists in the &lt;em&gt;terraform/main.tf&lt;/em&gt; file within the repository we've been working on.&lt;/p&gt;

&lt;p&gt;This is how &lt;a href="https://github.com/fferegrino/tweeting-cycles-lambda/tree/part-4-terraform" rel="noopener noreferrer"&gt;the repository looks like at this point&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Remember that you can find me on Twitter &lt;a href="https://twitter.com/feregri_no" rel="noopener noreferrer"&gt;at @feregri_no&lt;/a&gt; to ask me about this post – if something is not so clear or you found a typo. The final code for this series is on &lt;a href="https://github.com/fferegrino/tweeting-cycles-lambda" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and the account tweeting the status of the bike network is &lt;a href="https://twitter.com/CyclesLondon" rel="noopener noreferrer"&gt;@CyclesLondon&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>terraform</category>
      <category>ecr</category>
      <category>lambda</category>
    </item>
  </channel>
</rss>
