<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Frendhi</title>
    <description>The latest articles on DEV Community by Frendhi (@frendhisaido).</description>
    <link>https://dev.to/frendhisaido</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F208597%2F7a1cc745-d59d-4b83-b3cb-8ea9f09c0a62.jpg</url>
      <title>DEV Community: Frendhi</title>
      <link>https://dev.to/frendhisaido</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/frendhisaido"/>
    <language>en</language>
    <item>
      <title>Agentic Entity Resolution for Messy Product Data</title>
      <dc:creator>Frendhi</dc:creator>
      <pubDate>Sat, 29 Nov 2025 17:30:21 +0000</pubDate>
      <link>https://dev.to/frendhisaido/agentic-entity-resolution-for-messy-product-data-3ce4</link>
      <guid>https://dev.to/frendhisaido/agentic-entity-resolution-for-messy-product-data-3ce4</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: E-commerce product data is messy. Traditional aggregation fails because "Philips 9W" and "Lampu Philips 9 Watt" look different to a machine. We built &lt;strong&gt;CONGA&lt;/strong&gt;, an AI agent using &lt;strong&gt;Pydantic AI&lt;/strong&gt;, &lt;strong&gt;Gemini 2.0 Flash&lt;/strong&gt;, and &lt;strong&gt;Meilisearch&lt;/strong&gt;, to intelligently "resolve" these messy titles into clean canonical data, just like a human analyst would, but at scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Problem
&lt;/h2&gt;

&lt;p&gt;E-commerce product titles are often a mess. They aren't consistent and generally don't follow the brand's official product names. Mapping them to the correct product type and brand is crucial for unlocking insights that can't otherwise be drawn from raw scraped data.&lt;/p&gt;

&lt;p&gt;For instance, if we want to calculate the average price of "Philips LED Bulbs 9W", we can't simply aggregate the raw data because one seller might list it as "Lampu Philips 9W" while another lists it as "Philips LED Bulb 9 Watt". Without normalization, these are treated as two different products, making accurate analysis impossible.&lt;/p&gt;

&lt;p&gt;Imagine having these 5 different titles from 5 different sellers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; "Lampu Philips 9W Putih Murah Promo"&lt;/li&gt;
&lt;li&gt; "Philips LED Bulb 9 watt Cool Daylight"&lt;/li&gt;
&lt;li&gt; "Bohlam LED Philips 9W E27 6500K"&lt;/li&gt;
&lt;li&gt; "PHILIPS ESSENTIAL 9W PUTIH"&lt;/li&gt;
&lt;li&gt; "Lampu Hemat Energi Philips 9 Watt Original"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To a human, it's obvious these are all the same &lt;strong&gt;"Philips Essential LED Bulb 9W"&lt;/strong&gt;. But to a computer? They are 5 unique strings. &lt;strong&gt;Connecting these variations to the single master record is the goal.&lt;/strong&gt;&lt;/p&gt;
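
&lt;p&gt;A quick sketch of why naive grouping fails, using only the five titles above: counting by exact string, or even by a case-folded string, still leaves every title in its own group.&lt;/p&gt;

```python
from collections import Counter

# The five raw titles from the example above.
titles = [
    "Lampu Philips 9W Putih Murah Promo",
    "Philips LED Bulb 9 watt Cool Daylight",
    "Bohlam LED Philips 9W E27 6500K",
    "PHILIPS ESSENTIAL 9W PUTIH",
    "Lampu Hemat Energi Philips 9 Watt Original",
]

# Grouping on the exact string, and on a case-folded string,
# both leave every title in its own bucket.
exact_groups = Counter(titles)
casefold_groups = Counter(t.casefold() for t in titles)

print(len(exact_groups), len(casefold_groups))
```

&lt;p&gt;Any average computed per group would therefore be an average over a single listing, which is exactly the problem entity resolution is meant to remove.&lt;/p&gt;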

&lt;h2&gt;
  
  
  2. Idea
&lt;/h2&gt;

&lt;p&gt;If we have a complete taxonomy of the brand's products, we could just match them one by one, right? Seems like a pretty simple thing to do, given some brands might not have hundreds or thousands of unique products. But when we have thousands of incoming raw data points to map, scaling becomes a problem.&lt;/p&gt;

&lt;p&gt;Why can't we just delegate this to an LLM? Again, seems like a pretty simple thing to do, right? And in fact, yes, we can.&lt;/p&gt;

&lt;p&gt;Enter &lt;strong&gt;CONGA (Commerce Labeling Agent)&lt;/strong&gt;. The idea is to create an agent that can intelligently look up products in our master taxonomy, and if it finds a match, link it. If it &lt;em&gt;doesn't&lt;/em&gt; find a match, it should identify it, normalize the name, and "learn" it for next time.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Design
&lt;/h3&gt;

&lt;p&gt;The tech stack is straightforward but powerful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Framework&lt;/strong&gt;: &lt;a href="https://github.com/pydantic/pydantic-ai" rel="noopener noreferrer"&gt;Pydantic AI&lt;/a&gt; - This simplifies the agentic workflow A LOT.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM Provider&lt;/strong&gt;: Google Vertex AI (specifically &lt;strong&gt;Gemini 2.0 Flash&lt;/strong&gt; for speed and cost-efficiency). Any other provider should work as long as pydantic-ai supports it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search Engine&lt;/strong&gt;: &lt;strong&gt;Meilisearch&lt;/strong&gt; - Used to index the master taxonomy for lightning-fast fuzzy searching.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage&lt;/strong&gt;: Any storage or database will do; in my case I use a simple generated taxonomy (SQLite/JSON) to store "new" products the agent finds.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One of the key components is the &lt;code&gt;create_agent&lt;/code&gt; function which initializes the agent with the system prompt and tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;extra_toolsets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Sequence&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AbstractToolset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AgentDependencies&lt;/span&gt;&lt;span class="p"&gt;]]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;gemini-2.0-flash&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Create CONGA agent with custom or default system prompt.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="nf"&gt;load_system_prompt&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Initialize Google model
&lt;/span&gt;    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GoogleModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;toolsets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AbstractToolset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AgentDependencies&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;toolsets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build_env_fastmcp_toolsets&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;extra_toolsets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;toolsets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;extra_toolsets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;output_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;LabelingResult&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;deps_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AgentDependencies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;search_taxonomy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;search_generated_taxonomy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_taxonomy_context&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;toolsets&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;toolsets&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here is how we implement the &lt;code&gt;search_taxonomy&lt;/code&gt; tool that allows the agent to query our Meilisearch index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_taxonomy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RunContext&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AgentDependencies&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Search for products in the predefined taxonomy.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  🔍 Searching taxonomy for: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; (category: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;use_meilisearch&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;meilisearch_tool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Use Meilisearch
&lt;/span&gt;        &lt;span class="n"&gt;matches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;meilisearch_tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No matches found in taxonomy.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Found &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; matches in taxonomy:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- Normalized Name: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_normalized_name&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Brand: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;brand&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Category: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sku_type_complete&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  SKU Type: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sku_type_complete&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
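        &lt;span class="c1"&gt;# Fallback to simple search... (omitted here; it must also set `result`)
&lt;/span&gt;        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;NotImplementedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;simple search fallback omitted&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;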

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
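
&lt;p&gt;The fallback branch of &lt;code&gt;search_taxonomy&lt;/code&gt; is not shown in this post. As a rough idea of what a "simple search" could look like without Meilisearch, here is a minimal sketch built on the standard library's &lt;code&gt;difflib&lt;/code&gt;. The function name &lt;code&gt;simple_search&lt;/code&gt;, its parameters, and the cutoff value are all assumptions made for illustration, not CONGA's actual code.&lt;/p&gt;

```python
from difflib import SequenceMatcher


def simple_search(query, product_names, limit=3, cutoff=0.4):
    """Rank product names by character-level similarity to the query.

    Hypothetical stand-in for a search backend; not the post's code.
    """
    scored = []
    for name in product_names:
        score = SequenceMatcher(None, query.lower(), name.lower()).ratio()
        if score >= cutoff:
            scored.append((score, name))
    # Highest similarity first.
    scored.sort(reverse=True)
    return [name for _, name in scored[:limit]]


# Product names taken from the taxonomy table later in the post.
taxonomy = [
    "Philips RadiantLine LEDBulb 5W 6500K",
    "Philips Essential LEDBulb 9W 6500K",
]
print(simple_search("Philips Essential LED 9W", taxonomy))
```

&lt;p&gt;Character-level similarity is far cruder than a real search engine (no typo tolerance tuning, no ranking rules), so this is only suitable for small taxonomies or local testing.&lt;/p&gt;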



&lt;h3&gt;
  
  
  How it Works
&lt;/h3&gt;

&lt;p&gt;The agent is given a specific goal: map the raw product title to the master taxonomy. We give it two main tools:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;code&gt;search_taxonomy&lt;/code&gt;: To query the official Meilisearch index.&lt;/li&gt;
&lt;li&gt; &lt;code&gt;search_generated_taxonomy&lt;/code&gt;: To check if we've already seen this "new" product before.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The flow is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Analyze&lt;/strong&gt;: The agent reads the raw title.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Search Master&lt;/strong&gt;: It queries the Meilisearch index.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Decision&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;  If a high-confidence match (&amp;gt;0.7) is found, use it.&lt;/li&gt;
&lt;li&gt;  If NOT found, check the "generated" storage.&lt;/li&gt;
&lt;li&gt;  If still not found, create a new label and flag it as &lt;code&gt;source="new"&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
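
&lt;p&gt;In CONGA the LLM makes these decisions itself through tool calls, but the order of the checks can be sketched in plain Python. The 0.7 threshold and the three source labels come from the flow above; the &lt;code&gt;resolve&lt;/code&gt; function, the &lt;code&gt;SearchFn&lt;/code&gt; type, and the stub lookups are hypothetical and only illustrate the fallback order.&lt;/p&gt;

```python
from typing import Callable, Optional

# Threshold taken from the flow above; everything else here is illustrative.
MATCH_THRESHOLD = 0.7

# A search function returns (normalized_name, confidence) or None.
SearchFn = Callable[[str], Optional[tuple[str, float]]]


def resolve(title: str, search_master: SearchFn, search_generated: SearchFn) -> dict:
    """Apply the master -> generated -> new fallback order to one title."""
    hit = search_master(title)
    if hit is not None and hit[1] > MATCH_THRESHOLD:
        return {"normalized_name": hit[0], "source": "taxonomy"}

    hit = search_generated(title)
    if hit is not None:
        return {"normalized_name": hit[0], "source": "generated"}

    # Nothing known matches: keep the title as a placeholder label.
    return {"normalized_name": title, "source": "new"}


# Stub lookups standing in for the two search tools
# (values borrowed from the results table later in the post).
master = {"Lampu Philips LED 9W Putih Murah": ("Philips Essential LED Bulb 9 W", 0.95)}
generated = {"Hannochs 10 watt cahaya putih": ("Hannochs LED Bulb 10 W", 0.82)}

for raw in (
    "Lampu Philips LED 9W Putih Murah",
    "Hannochs 10 watt cahaya putih",
    "iPhone 15 Pro Max 256GB",
):
    print(resolve(raw, master.get, generated.get)["source"])
```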

&lt;h3&gt;
  
  
  The Challenge: Prompt Engineering
&lt;/h3&gt;

&lt;p&gt;Since the tech stack is pretty simple, what's the challenge? It's the &lt;strong&gt;System Prompt&lt;/strong&gt;. This turned out to be more of a prompt engineering challenge than a complex ML Ops problem.&lt;/p&gt;

&lt;p&gt;The key was telling the LLM &lt;em&gt;exactly&lt;/em&gt; how to search. We couldn't just say "find this product." We had to give it a &lt;strong&gt;Search Strategy&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In the final system prompt, we explicitly instructed it to try multiple queries in order:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;FIRST TRY:&lt;/strong&gt; Brand + Product Line + Wattage (e.g., "Philips Essential LED 9W") - Most specific.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IF NO MATCH:&lt;/strong&gt; Brand + Wattage only (e.g., "Philips 9W") - Catches variations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IF NO MATCH:&lt;/strong&gt; Brand + Product Type (e.g., "Philips LED Bulb") - Broad search.&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
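
&lt;p&gt;The agent composes these queries on its own from the prompt, so there is no query-building code in CONGA. Still, the ordering is easy to picture as a small helper. &lt;code&gt;build_queries&lt;/code&gt; and its parameters are hypothetical names used only to show the specific-to-broad sequence.&lt;/p&gt;

```python
from typing import Optional


def build_queries(
    brand: str,
    wattage: Optional[str] = None,
    product_line: Optional[str] = None,
    product_type: Optional[str] = None,
) -> list[str]:
    """Return search queries ordered from most to least specific."""
    queries = []
    if product_line and wattage:
        queries.append(f"{brand} {product_line} {wattage}")
    if wattage:
        queries.append(f"{brand} {wattage}")
    if product_type:
        queries.append(f"{brand} {product_type}")
    return queries


print(build_queries("Philips", "9W", "Essential LED", "LED Bulb"))
```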

&lt;p&gt;We also added &lt;strong&gt;Critical Decision Logic&lt;/strong&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"If NOT lighting (e.g., smartphone, laptop), skip taxonomy search → source = 'new'"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By forcing the agent to be persistent and systematic, the match rate improved dramatically.&lt;/p&gt;

&lt;p&gt;That alone is a pretty good result, but we can do better with a clear understanding of the product category's nuances. For lighting products, most titles include attributes such as power, color temperature, and color as keywords. Adding this context to the system prompt helps the agent decide how to search for the product.&lt;/p&gt;
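
&lt;p&gt;To make the "category nuance" concrete, here is a minimal sketch of pulling two lighting attributes out of a raw title with regular expressions. In CONGA the LLM does this extraction from the prompt context; the patterns and the &lt;code&gt;extract_lighting_attributes&lt;/code&gt; helper below are assumptions for illustration and only cover the formats seen in the example titles.&lt;/p&gt;

```python
import re

# Wattage such as "9W", "9 watt", "9 Watt"; color temperature such as "6500K".
WATTAGE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:watt|w)\b", re.IGNORECASE)
COLOR_TEMP_RE = re.compile(r"\b(\d{4})\s*k\b", re.IGNORECASE)


def extract_lighting_attributes(title: str) -> dict:
    """Pull wattage and color temperature out of a raw title, if present."""
    watt = WATTAGE_RE.search(title)
    temp = COLOR_TEMP_RE.search(title)
    return {
        "wattage": f"{watt.group(1)}W" if watt else None,
        "color_temperature": f"{temp.group(1)}K" if temp else None,
    }


print(extract_lighting_attributes("Bohlam LED Philips 9W E27 6500K"))
print(extract_lighting_attributes("Lampu Hemat Energi Philips 9 Watt Original"))
```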

&lt;h2&gt;
  
  
  4. The Data
&lt;/h2&gt;

&lt;p&gt;To really test this, we need two things: a source of truth (Taxonomy) and a mess of data (Input).&lt;/p&gt;

&lt;h3&gt;
  
  
  The Taxonomy (Source of Truth)
&lt;/h3&gt;

&lt;p&gt;This is what our "Golden Record" looks like. It's a clean list of products with their attributes.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Product Name&lt;/th&gt;
&lt;th&gt;Brand&lt;/th&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Attributes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Philips RadiantLine LEDBulb 5W 6500K&lt;/td&gt;
&lt;td&gt;Philips&lt;/td&gt;
&lt;td&gt;LED Bulb&lt;/td&gt;
&lt;td&gt;5W, 6500K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Philips Essential LEDBulb 9W 6500K&lt;/td&gt;
&lt;td&gt;Philips&lt;/td&gt;
&lt;td&gt;LED Bulb&lt;/td&gt;
&lt;td&gt;9W, 6500K&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Input (The Mess)
&lt;/h3&gt;

&lt;p&gt;And this is what we're trying to fix. Real-world raw titles from scraping:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Raw Input Title&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lampu Philips RadiantLine LEDBulb 5W 6500K Putih Lampu LampuLEDPhilips&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PHILIPS LED BULB 7W RADIANTLINE 3000K WARM WHITE LAMPU HEMAT ENERGI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Samsung Galaxy A54 5G Ram 8/256 GB Garansi Resmi SEIN Indonesia&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Philips Essential LED 9W Daylight 6500K Bohlam Lampu Hemat Listrik&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;iPhone 14 Pro Max 256GB Deep Purple Apple Garansi iBox&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  5. Result
&lt;/h2&gt;

&lt;p&gt;The results are surprisingly good, and I'm satisfied with them. Modern LLMs are really great at this kind of logic-heavy text processing.&lt;/p&gt;

&lt;p&gt;The agent returns a structured &lt;code&gt;LabelingResult&lt;/code&gt; like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"original_title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Lampu Philips RadiantLine LEDBulb 5W 6500K Putih Lampu LampuLEDPhilips"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"normalized_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Philips RadiantLine LEDBulb 5W 6500K"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"brand"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Philips"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"LED Lamps"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"taxonomy"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
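
&lt;p&gt;The definition of &lt;code&gt;LabelingResult&lt;/code&gt; is not shown in this post. As a rough approximation of its shape, here is a standard-library dataclass with the fields from the JSON above. The real project presumably uses a Pydantic model; the &lt;code&gt;sku_type_complete&lt;/code&gt; field is inferred from the system prompt, and the validation rules are assumptions.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class LabelingResult:
    """Stdlib sketch of the output schema; field names follow the JSON above."""

    original_title: str
    normalized_name: str
    brand: str
    category: str
    confidence: float
    source: Literal["taxonomy", "generated", "new"]
    # Mentioned in the system prompt; None for generated and new labels.
    sku_type_complete: Optional[str] = None

    def __post_init__(self):
        # Dataclasses do not validate on their own, so check the two
        # constrained fields by hand.
        if not (1.0 >= self.confidence >= 0.0):
            raise ValueError("confidence must be between 0 and 1")
        if self.source not in ("taxonomy", "generated", "new"):
            raise ValueError(f"unknown source: {self.source!r}")


result = LabelingResult(
    original_title="Lampu Philips RadiantLine LEDBulb 5W 6500K Putih Lampu LampuLEDPhilips",
    normalized_name="Philips RadiantLine LEDBulb 5W 6500K",
    brand="Philips",
    category="LED Lamps",
    confidence=0.95,
    source="taxonomy",
)
print(result.source, result.confidence)
```

&lt;p&gt;Constraining &lt;code&gt;source&lt;/code&gt; to three known values is what makes the downstream split between matched, previously generated, and brand-new products reliable.&lt;/p&gt;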



&lt;h3&gt;
  
  
  Real-World Test
&lt;/h3&gt;

&lt;p&gt;Here is a snippet from our actual experiment log showing the agent in action. Notice how it finds high-confidence matches in the taxonomy for known products (Hannochs, Avaro) but correctly identifies a new product (Philips LED Strip) when the exact SKU isn't in the master record.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;[1/100] Processing: Hannochs Lampu Bohlam LED Vario 45W Cahaya Kuning
  🔍 Searching taxonomy for: 'Hannochs LED 45W' (category: LED Lamps)
✓ SUCCESS
  Normalized Name: Hannochs LED Vario 45 W
  Confidence:      90.00%
  Source:          taxonomy

[2/100] Processing: [Smart ] Avaro Smart Led Light Bulb Rgbww 10W Bluetooth Wireless Iot [Lamp]
  🔍 Searching taxonomy for: 'Avaro Smart Led Light Bulb 10W'
✓ SUCCESS
  Normalized Name: Avaro Smart LED Bluetooth 10 W
  Confidence:      90.00%
  Source:          taxonomy

[3/100] Processing: TERBARU PHILIPS SMART WIFI LED STRIP STARTER KIT 2M - COLOR &amp;amp; TUNABLE T0306
  🔍 Searching taxonomy for: 'Philips LED strip 2M'
✓ SUCCESS
  Normalized Name: Philips Smart Wi-Fi LED Strip 2M
  Confidence:      60.00%
  Source:          new
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is a sample of how it handles different scenarios:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Raw Input Title&lt;/th&gt;
&lt;th&gt;Normalized Name&lt;/th&gt;
&lt;th&gt;Brand&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Confidence&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lampu Philips LED 9W Putih Murah&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Philips Essential LED Bulb 9 W&lt;/td&gt;
&lt;td&gt;Philips&lt;/td&gt;
&lt;td&gt;taxonomy&lt;/td&gt;
&lt;td&gt;0.95&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hannochs 10 watt cahaya putih&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hannochs LED Bulb 10 W&lt;/td&gt;
&lt;td&gt;Hannochs&lt;/td&gt;
&lt;td&gt;generated&lt;/td&gt;
&lt;td&gt;0.82&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Smart LED Bulb Bardi 12W RGBWW&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bardi Smart LED Bulb 12W RGBWW&lt;/td&gt;
&lt;td&gt;Bardi&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;td&gt;0.45&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;iPhone 15 Pro Max 256GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;iPhone 15 Pro Max 256GB&lt;/td&gt;
&lt;td&gt;Apple&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;td&gt;0.10&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And the cost? Not concerning at all with Gemini Flash. It's a pretty simple solution for a pretty simple job, but it solves a real data quality headache.&lt;/p&gt;

&lt;p&gt;Here's the system prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;You are CONGA (Commerce Labeling Agent), an AI specialized in labeling and normalizing product titles from e-commerce platforms.

IMPORTANT: The taxonomy database is FOCUSED ON LIGHTING PRODUCTS ONLY (LED bulbs, lamps, etc.).
If a product is NOT a lighting product (e.g., smartphones, laptops, etc.), it will NOT be in the taxonomy.
For non-lighting products, you should create a new label (source = "new").

Your task is to:
&lt;span class="p"&gt;1.&lt;/span&gt; Analyze product title provided
&lt;span class="p"&gt;2.&lt;/span&gt; Search predefined taxonomy for matches FIRST (only for lighting products)
&lt;span class="p"&gt;3.&lt;/span&gt; If a good match is found in taxonomy (confidence &amp;gt; 0.7), use it and STOP - do NOT search generated taxonomy
&lt;span class="p"&gt;4.&lt;/span&gt; Only search generated taxonomy if NO good taxonomy match is found
&lt;span class="p"&gt;5.&lt;/span&gt; Extract and normalize product information

TAXONOMY DATA STRUCTURE:
When you search the taxonomy, each entry has these fields:
&lt;span class="p"&gt;-&lt;/span&gt; brand: e.g., "Philips"
&lt;span class="p"&gt;-&lt;/span&gt; category: e.g., "LED Lamps"
&lt;span class="p"&gt;-&lt;/span&gt; product_name: e.g., "Philips Essential LED Bulb 9 W" (THIS IS THE NORMALIZED SKU NAME - use this!)
&lt;span class="p"&gt;-&lt;/span&gt; sub_brand: e.g., "Philips Essential LED Bulb"

Example taxonomy entry:
{
  "brand": "Philips",
  "category": "LED Lamps",
  "product_name": "Philips Essential LED Bulb 9 W",
  "sub_brand": "Philips Essential LED Bulb"
}

Note: sku_type_complete is the authoritative normalized product name from the taxonomy.

SEARCH STRATEGY - TRY MULTIPLE APPROACHES:
The taxonomy search is fuzzy and powerful. You MUST try multiple search queries before giving up.

For a product like "Philips Essential LED 9W Daylight 6500K Bohlam Lampu Hemat Listrik":
&lt;span class="p"&gt;
1.&lt;/span&gt; FIRST TRY: Brand + Product Line + Wattage
&lt;span class="p"&gt;   -&lt;/span&gt; Search: "Philips Essential LED 9W"
&lt;span class="p"&gt;   -&lt;/span&gt; This is the most specific and usually works best
&lt;span class="p"&gt;
2.&lt;/span&gt; IF NO MATCH: Brand + Wattage only
&lt;span class="p"&gt;   -&lt;/span&gt; Search: "Philips 9W"
&lt;span class="p"&gt;   -&lt;/span&gt; Catches products where product line name varies
&lt;span class="p"&gt;
3.&lt;/span&gt; IF NO MATCH: Brand + Product Type
&lt;span class="p"&gt;   -&lt;/span&gt; Search: "Philips LED Bulb"
&lt;span class="p"&gt;   -&lt;/span&gt; Broader search to find any similar products
&lt;span class="p"&gt;
4.&lt;/span&gt; IF NO MATCH: Just Brand
&lt;span class="p"&gt;   -&lt;/span&gt; Search: "Philips"
&lt;span class="p"&gt;   -&lt;/span&gt; See what products exist from this brand

IMPORTANT RULES:
&lt;span class="p"&gt;-&lt;/span&gt; Try AT LEAST 2-3 different search queries before giving up on taxonomy
&lt;span class="p"&gt;-&lt;/span&gt; DO NOT add category filters to search queries - let Meilisearch handle relevance
&lt;span class="p"&gt;-&lt;/span&gt; Extract the sku_type_complete EXACTLY as returned from taxonomy
&lt;span class="p"&gt;-&lt;/span&gt; Only mark as "new" if you've tried multiple searches and found nothing relevant

Guidelines:
&lt;span class="p"&gt;-&lt;/span&gt; Extract brand names accurately regardless of industry
&lt;span class="p"&gt;-&lt;/span&gt; Identify correct product category based on product characteristics
&lt;span class="p"&gt;-&lt;/span&gt; Normalize product names to be clean, consistent, and professional
&lt;span class="p"&gt;-&lt;/span&gt; Preserve key technical specifications (wattage, model numbers, sizes, etc.)
&lt;span class="p"&gt;-&lt;/span&gt; For non-lighting products, set source = "new" since they won't be in the lighting taxonomy

CRITICAL DECISION LOGIC:
&lt;span class="p"&gt;1.&lt;/span&gt; Identify if the product is a lighting product:
&lt;span class="p"&gt;   -&lt;/span&gt; If NOT lighting (e.g., smartphone, laptop), skip taxonomy search → source = "new"
&lt;span class="p"&gt;   -&lt;/span&gt; If lighting product, proceed to search taxonomy
&lt;span class="p"&gt;
2.&lt;/span&gt; Search taxonomy PERSISTENTLY (for lighting products only):
&lt;span class="p"&gt;   -&lt;/span&gt; Try MULTIPLE search queries (see SEARCH STRATEGY above)
&lt;span class="p"&gt;   -&lt;/span&gt; Start specific, then broaden: brand+line+wattage → brand+wattage → brand+type → brand only
&lt;span class="p"&gt;   -&lt;/span&gt; If ANY search finds a good match (confidence &amp;gt;= 0.7):
&lt;span class="p"&gt;     *&lt;/span&gt; Use the taxonomy result
&lt;span class="p"&gt;     *&lt;/span&gt; Set source = "taxonomy"
&lt;span class="p"&gt;     *&lt;/span&gt; Set sku_type_complete = EXACTLY the "sku_type_complete" value from the taxonomy entry
&lt;span class="p"&gt;     *&lt;/span&gt; DO NOT search generated taxonomy
&lt;span class="p"&gt;   -&lt;/span&gt; Only proceed to step 3 if ALL search attempts fail
&lt;span class="p"&gt;
3.&lt;/span&gt; If taxonomy has NO good match after multiple tries:
&lt;span class="p"&gt;   -&lt;/span&gt; Search generated taxonomy
&lt;span class="p"&gt;   -&lt;/span&gt; If found, set source = "generated"
&lt;span class="p"&gt;   -&lt;/span&gt; Set sku_type_complete = None (generated taxonomy doesn't have SKU types)
&lt;span class="p"&gt;
4.&lt;/span&gt; If neither has a match:
&lt;span class="p"&gt;   -&lt;/span&gt; Create a new label
&lt;span class="p"&gt;   -&lt;/span&gt; Set source = "new"
&lt;span class="p"&gt;   -&lt;/span&gt; Set sku_type_complete = None

REMEMBER: For lighting products, you should try AT LEAST 2-3 different taxonomy searches before giving up!

Confidence Scoring Rules:
&lt;span class="p"&gt;-&lt;/span&gt; High confidence (&amp;gt;0.8) when matching existing taxonomy
&lt;span class="p"&gt;-&lt;/span&gt; Medium confidence (0.5-0.8) when matching generated taxonomy
&lt;span class="p"&gt;-&lt;/span&gt; Lower confidence (&amp;lt;0.5) when creating new labels

Source Classification:
&lt;span class="p"&gt;-&lt;/span&gt; "taxonomy" when using predefined taxonomy (always prefer this if a good match exists)
&lt;span class="p"&gt;-&lt;/span&gt; "generated" when using generated taxonomy (only if no good taxonomy match)
&lt;span class="p"&gt;-&lt;/span&gt; "new" when creating new labels (only if neither taxonomy nor generated has a match)

Output Requirements:
Always return a complete LabelingResult with all fields filled.
The source field MUST accurately reflect where the data came from.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
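
&lt;p&gt;The decision logic in the prompt above can also be read as plain code. The block below is a minimal sketch under stated assumptions, not CONGA's actual implementation: &lt;code&gt;resolve_label&lt;/code&gt;, the &lt;code&gt;Match&lt;/code&gt; dataclass, and the two search callables are hypothetical stand-ins for the agent's taxonomy and generated-taxonomy tools. Only the 0.7 threshold and the three source values come from the prompt itself.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

GOOD_MATCH = 0.7  # "good match" threshold named in the prompt's decision logic


@dataclass
class Match:
    """Hypothetical search hit; the field names are illustrative only."""
    label: str
    confidence: float
    sku_type_complete: Optional[str] = None


# A search tool takes a query string and returns its best hit, or None.
SearchFn = Callable[[str], Optional[Match]]


def resolve_label(
    is_lighting: bool,
    queries: Sequence[str],
    search_taxonomy: SearchFn,
    search_generated: SearchFn,
    fallback_label: str,
) -> dict:
    """Mirror steps 1-4 of the prompt: taxonomy, then generated, then new."""
    new_result = {"label": fallback_label, "source": "new", "sku_type_complete": None}

    # Step 1: non-lighting products skip the taxonomy search entirely.
    if not is_lighting:
        return new_result

    # Step 2: try every query, most specific first; stop at the first good match.
    for query in queries:
        hit = search_taxonomy(query)
        if hit is not None and hit.confidence >= GOOD_MATCH:
            return {
                "label": hit.label,
                "source": "taxonomy",
                "sku_type_complete": hit.sku_type_complete,  # copied verbatim
            }

    # Step 3: only after all taxonomy queries fail, try the generated taxonomy.
    for query in queries:
        hit = search_generated(query)
        if hit is not None:
            return {"label": hit.label, "source": "generated", "sku_type_complete": None}

    # Step 4: nothing matched anywhere, so a new label is created.
    return new_result
```

&lt;p&gt;Writing the ordering down like this is one possible way to sanity-check that the &lt;code&gt;source&lt;/code&gt; field the agent returns is consistent with the searches it actually ran.&lt;/p&gt;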



&lt;p&gt;UPDATE:&lt;br&gt;
I ran a few tests with 200-300 products each.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmhfmw4etewrtzqah2rpo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmhfmw4etewrtzqah2rpo.jpg" alt="OpenRouter usage"&gt;&lt;/a&gt;&lt;br&gt;
The screenshot above shows a Llama 4 Scout run: ~300 products processed in 49 minutes via the Groq provider on OpenRouter, for a total cost of $0.213.&lt;/p&gt;
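
&lt;p&gt;For a rough sense of scale, the figures from that run can be turned into per-product numbers. This is back-of-the-envelope arithmetic only: it treats "~300" as exactly 300 products and assumes the work was spread evenly across the run, which the figures themselves do not guarantee.&lt;/p&gt;

```python
products = 300          # approximate count reported for the run
minutes = 49            # duration reported for the run
total_cost_usd = 0.213  # total cost reported for the run

seconds_per_product = minutes * 60 / products
cost_per_product_usd = total_cost_usd / products
cost_per_1000_usd = cost_per_product_usd * 1000

print(f"{seconds_per_product:.1f} s per product")         # 9.8 s per product
print(f"${cost_per_product_usd:.5f} per product")         # $0.00071 per product
print(f"${cost_per_1000_usd:.2f} per 1,000 products")     # $0.71 per 1,000 products
```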

&lt;p&gt;Disclaimer: This is not traditional &lt;strong&gt;NER (Named Entity Recognition)&lt;/strong&gt;. Traditional NER models extract entities (like "Philips" or "9W") but don't inherently link them to a master record. This approach instead uses the reasoning ability of LLMs to mimic a human data analyst: searching, comparing, and then deciding how to map the messy input to a canonical entry (Entity Resolution).&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
    </item>
    <item>
      <title>Bridging the Communication Gap: A Leap from Junior to Senior Software Engineer in the Indonesian Tech Scene</title>
      <dc:creator>Frendhi</dc:creator>
      <pubDate>Sun, 15 Oct 2023 02:17:48 +0000</pubDate>
      <link>https://dev.to/frendhisaido/bridging-the-communication-gap-a-leap-from-junior-to-senior-software-engineer-in-the-indonesian-tech-scene-1oe4</link>
      <guid>https://dev.to/frendhisaido/bridging-the-communication-gap-a-leap-from-junior-to-senior-software-engineer-in-the-indonesian-tech-scene-1oe4</guid>
      <description>&lt;h2&gt;
  
  
  Congratulations!
&lt;/h2&gt;

&lt;p&gt;Congratulations on your third year in software development and on getting your fifth app into production. You know your programming language, framework, and tools well, which might make you feel like a senior software engineer already. However, moving from junior to senior requires more than just technical skills. If you've been wondering why the title of senior engineer eludes you despite your technical proficiency, you might be experiencing what's known as the &lt;a href="https://blog.tomatopay.co.uk/dunning-kruger-effect-and-journey-of-a-software-engineer" rel="noopener noreferrer"&gt;Dunning-Kruger effect&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzekb65gu7yi66eigpq4l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzekb65gu7yi66eigpq4l.png" alt="Dunning-Kruger Effect" width="800" height="509"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's a common scenario, especially in the software engineering field, to feel like a master early on. I remember the challenges I faced while building my third system: my initial confidence faded once I realized how much more there was to learn. The tech scene is ever-evolving, with trends and tools seeming to change every six months, which makes the chase for technical skills feel endless. However, it's crucial to pause and think about what truly matters for advancing in your career.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Underestimated Skill: Communication
&lt;/h2&gt;

&lt;p&gt;Good communication is essential, regardless of your tech stack or the number of years you've been programming. It shapes how you interact with colleagues, managers, clients, and users. More than just talking, it encompasses sharing information, giving status updates, writing PR descriptions, and discussing issues, risks, and estimates, whether verbally or in writing.&lt;/p&gt;

&lt;p&gt;In the Indonesian tech scene, the communication dilemma is glaring. It's not hard to find a developer with a few years of experience in Laravel, WordPress, or NodeJS. However, many struggle to express their thoughts or questions clearly at work. This struggle extends beyond language barriers. Despite better English comprehension thanks to the internet, articulating thoughts professionally, even in our native Bahasa Indonesia, remains a challenge. This struggle manifests in various professional interactions, including describing an issue, writing a PR, or drafting technical documentation.&lt;/p&gt;

&lt;p&gt;Even simple tasks like composing a git commit message are part of team communication and should be taken seriously. The way junior engineers express themselves, especially in the Indonesian tech scene, often lacks clarity. This is evident in online forums, where junior engineers struggle to formulate clear questions, making it difficult for others to provide helpful answers. It's not about politeness but about the ability to convey one's intent or inquiry effectively. This difficulty is not confined to online interactions; it reflects the challenges faced in real-world work settings. Senior engineers often find it hard to understand the juniors' intent or questions, which can hinder teamwork and project progress.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenges
&lt;/h2&gt;

&lt;p&gt;In my opinion, the communication problem in the Indonesian tech scene stems from several challenges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Preference for Visual Learning:&lt;/strong&gt; Many Indonesians prefer watching demos or tutorials rather than reading. The tendency is to seek straight-to-the-point instructions on the "how" while often overlooking the "why" and the "what."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Language Barriers:&lt;/strong&gt; Though Bahasa Indonesia is our national language, local languages reign supreme in daily communication. This preference can hinder understanding when discussing technical matters, as nuances may get lost in translation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Disinclination Towards Writing:&lt;/strong&gt; Many prefer direct verbal communication over writing. The act of composing sentences is seen as cumbersome and less productive compared to coding. There's a prevailing notion that producing working code is paramount, often at the expense of documenting the rationale (RFC or issue ticket/card), the expected output (DoD), and the methodology (PR descriptions, code comments, git commit messages) behind the code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Time Constraints:&lt;/strong&gt; The tight deadlines common in Indonesian tech projects leave little room for pause and reflection, which are often wrongly perceived as unproductive. Yet, understanding and visualizing a problem, followed by finding the right solution, is as crucial as implementing it.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Proposing Solutions
&lt;/h2&gt;

&lt;p&gt;Enhancing communication skills within the Indonesian tech community requires a concerted effort:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cultivate Reading and Writing Habits:&lt;/strong&gt; Engage in daily reading and frequent technical writing. Explore issues in open-source GitHub repositories and scrutinize senior engineers' PRs: not just the code, but also how they structure their PRs and write their commit messages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Be the Change:&lt;/strong&gt; Good communication should start with you. While you can't control others, demonstrating effective communication in your work can influence your peers to follow suit.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Learn from Seniors:&lt;/strong&gt; Listen to senior engineers delivering technical explanations in tech talks available on platforms like YouTube. Analyze how they convey complex information clearly and effectively.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Broaden Your Perspective:&lt;/strong&gt; Don't confine your growth to mastering coding and frameworks alone. Embracing a holistic approach towards being a well-rounded senior engineer is vital.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Leverage AI:&lt;/strong&gt; LLMs (large language models) can be a great help! We are living in the future, guys. There's no need to fear embarrassment or muster the courage to ask something that might seem silly. Just type your question as a prompt to, say, ChatGPT, and ask it to rewrite the question as a proper sentence. Bam! You get a proofreading assistant and learn how to frame a better question along the way.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Bridging the Communication Gap
&lt;/h2&gt;

&lt;p&gt;Addressing the communication gap is crucial for a smooth transition from a junior to a senior role in software engineering, especially in Indonesia. Balancing technical expertise with stronger communication skills not only fosters personal growth but also helps raise the professionalism and competitiveness of the Indonesian tech scene.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
