<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Agile Actors Hellas</title>
    <description>The latest articles on DEV Community by Agile Actors Hellas (@agileactors).</description>
    <link>https://dev.to/agileactors</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F10890%2Fe6dfead2-3ccd-4c61-82d4-ed12ff9d5a25.png</url>
      <title>DEV Community: Agile Actors Hellas</title>
      <link>https://dev.to/agileactors</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/agileactors"/>
    <language>en</language>
    <item>
      <title>Yet another end-to-end streaming dashboarding example</title>
      <dc:creator>Agile Developer</dc:creator>
      <pubDate>Tue, 12 May 2026 11:23:13 +0000</pubDate>
      <link>https://dev.to/agileactors/yet-another-end-to-end-streaming-dashboarding-example-43dp</link>
      <guid>https://dev.to/agileactors/yet-another-end-to-end-streaming-dashboarding-example-43dp</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this post, we present an introductory example that uses &lt;a href="https://pinot.apache.org/" rel="noopener noreferrer"&gt;Apache Pinot&lt;/a&gt; to ingest an Apache Kafka stream. It builds upon &lt;strong&gt;existing&lt;/strong&gt; Apache Pinot material from the official trainings and documentation. The purpose is not to rehash the official docs, but to prepare the ground for a second part by adapting the official examples to that end. Moreover, when I tried to run these examples, I had some extra ideas on how to better present the material. Part of the setup is also based on another Apache Pinot example, from a complementary series of lectures written for JavaScript. Our focus here is &lt;strong&gt;Python&lt;/strong&gt;. Here are the two references I used:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lecture 4 (&lt;a href="https://github.com/startreedata/learn/tree/main/pinot-advanced/04-stream-ingestion" rel="noopener noreferrer"&gt;https://github.com/startreedata/learn/tree/main/pinot-advanced/04-stream-ingestion&lt;/a&gt;) of a series on advanced Pinot usage from StarTree. I ported the JS example to Python.&lt;/li&gt;
&lt;li&gt;The continuously updated &lt;a href="https://github.com/pinot-contrib/pinot-docs/blob/latest/tutorials/getting-started/streamlit.md" rel="noopener noreferrer"&gt;Streamlit example&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Another purpose of this introduction is to document my learning process, so that I can use it later as a reference or as personal notes. Consequently, the coherence of the material presented is of paramount importance.&lt;/p&gt;

&lt;p&gt;For a formal introduction to Apache Pinot, the excellent playlists below are highly recommended.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/playlist?list=PLihIrF0tCXdfN6y-twj9KtWaXM1GH4RSe" rel="noopener noreferrer"&gt;Apache Pinot 101&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.youtube.com/playlist?list=PLihIrF0tCXdckH2BSA1D8l-QPGfVXEuFV" rel="noopener noreferrer"&gt;Apache Pinot 201&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's start our journey.&lt;/p&gt;
&lt;h2&gt;
  
  
  Booting up the setup and running our first streaming session
&lt;/h2&gt;

&lt;p&gt;Our setup is completely local. We will use &lt;a href="https://podman.io/" rel="noopener noreferrer"&gt;Podman&lt;/a&gt; exclusively. All commands are executed on Windows 11, in Command Prompt terminals under VSCodium. You might need to apply some minor changes for your environment.&lt;/p&gt;

&lt;p&gt;The Docker Compose file is mostly covered &lt;a href="https://docs.pinot.apache.org/start-here/install/docker" rel="noopener noreferrer"&gt;here&lt;/a&gt;. We just added a .env file for convenience.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;podman compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This starts a single-node Apache Kafka cluster and an Apache Pinot cluster with one Controller, one Broker and one Server node. More on this later. You can visit the Apache Pinot Controller UI &lt;a href="http://localhost:9000" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;br&gt;
Having started Apache Kafka and Apache Pinot, we need to push some data to Apache Kafka and link Apache Pinot to Apache Kafka through a streaming table. As in both references, we will use the Wikipedia page edits event stream as a data source. Every page edit on Wikipedia is recorded as an event. Page edits happen all over the world, practically continuously, so this activity can be modeled as an event source. The event stream is public at &lt;a href="https://stream.wikimedia.org/v2/stream/recentchange" rel="noopener noreferrer"&gt;https://stream.wikimedia.org/v2/stream/recentchange&lt;/a&gt; and anyone can open it in a browser and watch the events arrive. Obviously, the typical web surfer is not interested in this overwhelming, ever-growing list of repetitive JSON content. It is so large that one has to resort to Data Analytics methods to make sense of it. Moreover, unlike a typical web page, this event stream is not structured to convey meaning to a human reader. Instead, Data Engineering methods are necessary to capture it in a streaming table (Apache Spark terminology is used here), apply whatever data transformations are necessary and then make it available to a Data Analytics system for visualizing its different aspects.&lt;br&gt;
First, we need to understand the data source. It is delivered in what is commonly referred to as Server-Sent Events (SSE) format. Unsurprisingly, Wikimedia has a very detailed documentation page on &lt;a href="https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams_HTTP_Service" rel="noopener noreferrer"&gt;this&lt;/a&gt;, which also lists various code snippets showing how to consume the stream. In terms of Data Engineering, EventStreams&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;is a web service that exposes continuous streams of structured event data. It does so over HTTP.&lt;/p&gt;
&lt;/blockquote&gt;
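
&lt;p&gt;To make the transport format a bit more concrete, here is a minimal sketch of how SSE lines group into events. It is an illustration only, not the parser used by any client library: the function name &lt;code&gt;parse_sse&lt;/code&gt; and the sample lines are hypothetical, and it only handles the basic rules of the format (a blank line ends an event, lines starting with a colon are comments).&lt;/p&gt;

```python
import json


def parse_sse(lines):
    """Group SSE lines into events; a blank line terminates an event."""
    events = []
    fields = {}
    for line in lines:
        if line == "":
            if fields:
                events.append(fields)
                fields = {}
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if name == "data" and "data" in fields:
            fields["data"] += "\n" + value  # multi-line data is joined with newlines
        else:
            fields[name] = value
    return events


# Hypothetical sample, shaped like two messages separated by a keep-alive comment.
sample_lines = [
    "event: message",
    "id: 1",
    'data: {"title": "Foo"}',
    "",
    ": keep-alive",
    "event: message",
    'data: {"title": "Bar"}',
    "",
]

events = parse_sse(sample_lines)
titles = [json.loads(e["data"])["title"] for e in events]
print(titles)
```

&lt;p&gt;In practice a client library does this work for us, which is why the rest of the post relies on one.&lt;/p&gt;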

&lt;p&gt;For a Data Engineer, the transport format of a source is only half the story. The other half is the schema. It is available &lt;a href="https://schema.wikimedia.org/repositories/primary/jsonschema/mediawiki/recentchange/latest.yaml" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In terms of software development, this means that we need a client library. There are many, but &lt;a href="https://pypi.org/project/sseclient/" rel="noopener noreferrer"&gt;SSE client&lt;/a&gt; stands out. It is also used in the Streamlit tutorial of Apache Pinot. For simplicity, we will follow the approach of the Wikimedia example.&lt;/p&gt;

&lt;p&gt;Here is the adapted code from Wikipedia.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
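&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="c1"&gt;# Assumption: EventSource comes from the requests-sse package, as in the Wikimedia example
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;requests_sse&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;EventSource&lt;/span&gt;

&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://stream.wikimedia.org/v2/stream/recentchange&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;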
&lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User-Agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;advanced_pinot_tutorial&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;change&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
                &lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

                &lt;span class="c1"&gt;# Kafka placeholder: the push logic goes here
&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the schema, what stands out for a streaming source is the timestamp:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;timestamp:&lt;br&gt;
    description: Unix timestamp (derived from rc_timestamp).&lt;br&gt;
    type: integer&lt;br&gt;
    maximum: 9007199254740991&lt;br&gt;
    minimum: -9007199254740991&lt;/p&gt;
&lt;/blockquote&gt;

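&lt;p&gt;In the code above, we rename &lt;code&gt;timestamp&lt;/code&gt; to &lt;code&gt;ts&lt;/code&gt; to avoid a conflict with any internal &lt;code&gt;timestamp&lt;/code&gt; function. We also convert the Unix timestamp from seconds to milliseconds. Keep this in mind: the Apache Pinot schema later declares &lt;code&gt;ts&lt;/code&gt; in epoch milliseconds.&lt;/p&gt;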
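
&lt;p&gt;As a small, self-contained illustration of that conversion, the sketch below applies it to a made-up event. The helper name &lt;code&gt;to_pinot_event&lt;/code&gt; and the sample values are hypothetical; only the rename and the multiplication by 1000 come from the code above.&lt;/p&gt;

```python
from datetime import datetime, timezone


def to_pinot_event(change):
    """Return a copy of the event with `timestamp` (seconds) replaced by `ts` (milliseconds)."""
    event = dict(change)
    event["ts"] = event.pop("timestamp") * 1000
    return event


# Hypothetical sample event, trimmed to the fields used here.
sample = {"user": "ExampleUser", "timestamp": 1700000000}
converted = to_pinot_event(sample)
print(converted)

# Dividing by 1000 recovers the original instant, here shown in UTC.
print(datetime.fromtimestamp(converted["ts"] / 1000, tz=timezone.utc).isoformat())
```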

&lt;p&gt;Now we need some code to push to an Apache Kafka topic. We use the &lt;a href="https://pypi.org/project/confluent-kafka/" rel="noopener noreferrer"&gt;confluent-kafka&lt;/a&gt; library.&lt;/p&gt;

&lt;p&gt;First we set up our Apache Kafka connection, which is pretty much self-explanatory (we implicitly assume the default Apache Kafka port, 9092):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
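&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;admin&lt;/span&gt;

&lt;span class="n"&gt;kafka_topic_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wikipedia-events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;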

&lt;span class="c1"&gt;# conf = {'bootstrap.servers': 'redpanda-0,redpanda-1,redpanda-2'}
&lt;/span&gt;&lt;span class="n"&gt;conf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bootstrap.servers&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;kafka&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;kafka_admin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;admin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AdminClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;kafka_admin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete_topics&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;kafka_topic_name&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;kafka_admin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_topics&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;admin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;NewTopic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kafka_topic_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;

&lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, in the Apache Kafka placeholder of the previous snippet, we put the push logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;produce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kafka_topic_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;meta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;callback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;acked&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;events_processed&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;events_processed&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; Flushing after &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;events_processed&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;events_processed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every 100 events, we flush the producer and log the push of the batch. Confluent has very good documentation on how this library is used.&lt;/p&gt;
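
&lt;p&gt;The &lt;code&gt;acked&lt;/code&gt; callback passed to &lt;code&gt;produce&lt;/code&gt; is not shown above. A minimal sketch of what it might look like follows; confluent-kafka invokes delivery callbacks as &lt;code&gt;callback(err, msg)&lt;/code&gt;, but the body, the helper &lt;code&gt;describe_delivery&lt;/code&gt; and the &lt;code&gt;FakeMessage&lt;/code&gt; stand-in are assumptions made here so that the example runs without a broker.&lt;/p&gt;

```python
def describe_delivery(err, msg):
    """Build a one-line delivery report from the (err, msg) pair."""
    if err is not None:
        return f"delivery failed: {err}"
    return f"delivered to {msg.topic()} [{msg.partition()}] at offset {msg.offset()}"


def acked(err, msg):
    """Delivery callback: called once per message when the producer is polled or flushed."""
    print(describe_delivery(err, msg))


class FakeMessage:
    """Stand-in for a Kafka message, exposing only the accessors used above."""

    def __init__(self, topic, partition, offset):
        self._topic = topic
        self._partition = partition
        self._offset = offset

    def topic(self):
        return self._topic

    def partition(self):
        return self._partition

    def offset(self):
        return self._offset


# Simulate one successful and one failed delivery.
acked(None, FakeMessage("wikipedia-events", 0, 42))
acked("broker unavailable", None)
```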

&lt;p&gt;We pack the application into a Docker image&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;podman build &lt;span class="nt"&gt;-t&lt;/span&gt; pinot-advanced/python-streaming-ingest ./producer-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and then, we run it&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;podman run &lt;span class="nt"&gt;-it&lt;/span&gt;  &lt;span class="nt"&gt;--network&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;pinot-advanced pinot-advanced/python-streaming-ingest:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmr9iqa9ozrhq2f6zoj1f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmr9iqa9ozrhq2f6zoj1f.png" alt="Producer app execution" width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now it is time to verify that the push to Apache Kafka is working properly. For convenience, a consumer Python app is provided. You can start it with similar commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;podman build &lt;span class="nt"&gt;-t&lt;/span&gt; pinot-advanced/python-kafka-consumer ./consumer-app
podman run &lt;span class="nt"&gt;-it&lt;/span&gt;  &lt;span class="nt"&gt;--network&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;pinot-advanced pinot-advanced/python-kafka-consumer:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frwp6m9ovysgwzummpo5q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frwp6m9ovysgwzummpo5q.png" alt="Consumer app execution" width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Everything seems to work fine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up Apache Pinot and running our first query
&lt;/h2&gt;

&lt;p&gt;In order to create the streaming table, we need to tell Apache Pinot both the transport format and the schema. The schema need not be exhaustive; it only has to include the subset of fields we need. We therefore need two files.&lt;/p&gt;

&lt;h3&gt;
  
  
  The schema file.
&lt;/h3&gt;

&lt;p&gt;Each column in Apache Pinot belongs to one of the following categories.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dimension&lt;/li&gt;
&lt;li&gt;Metric&lt;/li&gt;
&lt;li&gt;Date/Time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is pretty obvious what the last one is used for. Dimension columns are used for filtering (drilling down), while metric columns are used for aggregations. General-purpose relational databases typically do not ask for this distinction in the table definition; Apache Pinot does, which reflects its focus on analytical queries over streaming data.&lt;/p&gt;

&lt;p&gt;We will not need any metric fields, since we get a stream of page edits. We will compute distinct counts (&lt;code&gt;distinctcount&lt;/code&gt;), which are in reality aggregations, but the fields we will use are not numeric, so they cannot go into the metric fields section. Here is the schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"schemaName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"wikievents"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dimensionFieldSpecs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"metaJson"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"dataType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"STRING"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"dataType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"STRING"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"domain"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"dataType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"STRING"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"topic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"dataType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"STRING"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dateTimeFieldSpecs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"dataType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"LONG"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"format"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1:MILLISECONDS:EPOCH"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"granularity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1:MILLISECONDS"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The config file.
&lt;/h3&gt;

&lt;p&gt;Next is the table configuration, which also describes the transport format. See &lt;a href="https://github.com/fithisux/visualize-streamlit-pinot-example/blob/main/scripts/wikipedia_events_realtime_table_config.json" rel="noopener noreferrer"&gt;https://github.com/fithisux/visualize-streamlit-pinot-example/blob/main/scripts/wikipedia_events_realtime_table_config.json&lt;/a&gt; for the details.&lt;/p&gt;

&lt;p&gt;I will just focus on this snippet&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"transformConfigs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"columnName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"domain"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"transformFunction"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"JSONPATH(metaJson, '$.domain')"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"columnName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"topic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"transformFunction"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"JSONPATH(metaJson, '$.topic')"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is necessary in order to grab the fields from the JSON payload of the Apache Kafka message. The fields &lt;code&gt;topic&lt;/code&gt; and &lt;code&gt;domain&lt;/code&gt; are &lt;strong&gt;computed fields&lt;/strong&gt;, and for this reason we need to explicitly expose the &lt;code&gt;metaJson&lt;/code&gt; column.&lt;/p&gt;
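
&lt;p&gt;To illustrate what these two transforms compute, here is a rough plain-Python equivalent. It is only a sketch: it assumes that &lt;code&gt;metaJson&lt;/code&gt; holds the event's &lt;code&gt;meta&lt;/code&gt; object serialized as a JSON string, the helper name &lt;code&gt;extract&lt;/code&gt; is hypothetical, and the sample values are made up. Pinot itself evaluates &lt;code&gt;JSONPATH&lt;/code&gt; at ingestion time; no Python is involved.&lt;/p&gt;

```python
import json


def extract(meta_json, key):
    """Rough equivalent of JSONPATH(metaJson, '$.key') for a top-level key."""
    return json.loads(meta_json)[key]


# Hypothetical metaJson value, shaped like the `meta` object of an event.
meta_json = json.dumps(
    {
        "domain": "en.wikipedia.org",
        "topic": "example.mediawiki.recentchange",
        "id": "hypothetical-id",
    }
)

# The computed columns are derived from the exposed metaJson column.
row = {
    "metaJson": meta_json,
    "domain": extract(meta_json, "domain"),
    "topic": extract(meta_json, "topic"),
}
print(row["domain"], row["topic"])
```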

&lt;h3&gt;
  
  
  Our first query
&lt;/h3&gt;

&lt;p&gt;With the compose services and the streamer app up and running, we can now create our table in Apache Pinot.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;podman run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--network&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;pinot-advanced &lt;span class="nt"&gt;-v&lt;/span&gt; ./scripts/wikipedia_events_schema.json:/scripts/wikipedia_events_schema.json &lt;span class="nt"&gt;-v&lt;/span&gt; ./scripts/wikipedia_events_realtime_table_config.json:/scripts/wikipedia_events_realtime_table_config.json apachepinot/pinot:latest-25-ms-openjdk AddTable &lt;span class="nt"&gt;-schemaFile&lt;/span&gt; /scripts/wikipedia_events_schema.json &lt;span class="nt"&gt;-tableConfigFile&lt;/span&gt; /scripts/wikipedia_events_realtime_table_config.json &lt;span class="nt"&gt;-controllerHost&lt;/span&gt; pinot-controller &lt;span class="nt"&gt;-exec&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We mount the two files from &lt;code&gt;./scripts&lt;/code&gt; into a purpose-built container that uses the schema and the table config to create the table.&lt;/p&gt;

&lt;p&gt;You can view the table by navigating to the Pinot Controller &lt;a href="http://localhost:9000/#/query" rel="noopener noreferrer"&gt;locally here&lt;/a&gt; and running your first query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="k"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;wikievents&lt;/span&gt; &lt;span class="k"&gt;limit&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is a sample of what you should expect:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuyhtlb555qul99t7o164.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuyhtlb555qul99t7o164.png" alt="Sample query execution" width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Running the dashboard
&lt;/h2&gt;

&lt;p&gt;Deviating from the sample Streamlit app provided by &lt;a href="https://startree.ai/" rel="noopener noreferrer"&gt;StarTree&lt;/a&gt;, but in a similar spirit, we provide a dashboard. Before delving into the code base, let's clarify the business logic of the dashboard. We run a sampling query over a window that extends from the sampling time 1 minute back into the past. In this window we sample three important quantities:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The number of changes that happened&lt;/li&gt;
&lt;li&gt;The number of distinct users who made these changes&lt;/li&gt;
&lt;li&gt;The number of distinct domains where these changes took place&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Our dashboard will show the current sample together with a window of the 30 latest samples. For visualization, we record each sample and plot the 30-sample buffer as a visual summary. The dashboard is implemented with the &lt;a href="https://panel.holoviz.org/tutorials/basic/index.html" rel="noopener noreferrer"&gt;Panel Python package&lt;/a&gt; in a notebook. VSCodium is used for convenience. It is advisable to create a virtual environment, install the dependencies there and then use it as the kernel for executing the &lt;a href="https://github.com/fithisux/visualize-streamlit-pinot-example/blob/main/panel-dashboard-app/dashboard.ipynb" rel="noopener noreferrer"&gt;notebook&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fydp0eqdhezwrsyzq5hse.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fydp0eqdhezwrsyzq5hse.png" alt="VScodium setup" width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Obtaining the sample is just an Apache Pinot query away:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
   &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;events1Min&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;distinctcount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;users1Min&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;distinctcount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;domains1Min&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;wikievents_REALTIME&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ago&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'PT1M'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;limit&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ago&lt;/code&gt; function takes a duration in &lt;a href="https://en.wikipedia.org/wiki/ISO_8601" rel="noopener noreferrer"&gt;ISO 8601&lt;/a&gt; duration format (here &lt;code&gt;PT1M&lt;/code&gt;, i.e. one minute) to construct the lower bound of the window.&lt;/p&gt;
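
&lt;p&gt;To make the bound concrete, here is a minimal client-side sketch of the same arithmetic. It assumes that &lt;code&gt;ago&lt;/code&gt; yields an epoch-milliseconds value (as the Pinot function reference describes it) and that &lt;code&gt;ts&lt;/code&gt; is compared against such a value. The helper name &lt;code&gt;cutoff_epoch_millis&lt;/code&gt; and the fixed timestamp are hypothetical and not part of the notebook.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone


def cutoff_epoch_millis(now, window):
    """Return the instant `window` before `now` as epoch milliseconds."""
    return int((now - window).timestamp() * 1000)


# Hypothetical sampling time, fixed so that the example is reproducible.
now = datetime(2026, 5, 12, 12, 58, 12, tzinfo=timezone.utc)

# 'PT1M' denotes a duration of one minute.
one_minute = timedelta(minutes=1)

cutoff = cutoff_epoch_millis(now, one_minute)
print(cutoff)  # rows with ts greater than this value fall inside the window
```

&lt;p&gt;In the dashboard the computation stays on the Pinot side. The sketch only illustrates what the lower bound of the window represents.&lt;/p&gt;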

&lt;p&gt;This query is our main building block. To implement our sampling logic, here is the relevant notebook cell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pinotdb&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;connect&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8099&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/query/sql&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scheme&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;list_of_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_changes&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        select 
                count(*) AS events1Min,
                distinctcount(user) AS users1Min,
                distinctcount(domain) AS domains1Min
        from wikievents_REALTIME
        where ts &amp;gt; ago(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;PT1M&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
        limit 1;
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;curs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;curs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;temp_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;curs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;curs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;temp_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sample_time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Timestamp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;list_of_samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;temp_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;list_of_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;list_of_samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;temp_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;concat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;list_of_samples&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sample_time&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sample is returned as a dict, while the buffer of past samples is concatenated into a pandas data frame. A sample execution follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;events1Min&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2216&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;users1Min&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;362&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;domains1Min&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sample_time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Timestamp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2026-05-12 12:58:12.996165&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
    &lt;span class="n"&gt;events1Min&lt;/span&gt;  &lt;span class="n"&gt;users1Min&lt;/span&gt;  &lt;span class="n"&gt;domains1Min&lt;/span&gt;                &lt;span class="n"&gt;sample_time&lt;/span&gt;
 &lt;span class="mi"&gt;0&lt;/span&gt;        &lt;span class="mi"&gt;2216&lt;/span&gt;        &lt;span class="mi"&gt;362&lt;/span&gt;           &lt;span class="mi"&gt;80&lt;/span&gt; &lt;span class="mi"&gt;2026&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;05&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;58&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;12.996165&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
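
&lt;p&gt;The bounded buffer in the cell above is kept with a list and &lt;code&gt;pop(0)&lt;/code&gt;. As a possible alternative, and only as a sketch, the same "keep the latest 30" behaviour can be obtained from &lt;code&gt;collections.deque&lt;/code&gt; with &lt;code&gt;maxlen&lt;/code&gt;. The names &lt;code&gt;BUFFER_SIZE&lt;/code&gt; and &lt;code&gt;record_sample&lt;/code&gt; and the sample values are hypothetical, and plain dicts stand in for the one-row data frames of the notebook.&lt;/p&gt;

```python
from collections import deque

BUFFER_SIZE = 30  # same window length as in the notebook cell

# A deque with maxlen silently drops the oldest entry when a new one is appended.
samples = deque(maxlen=BUFFER_SIZE)


def record_sample(sample):
    """Store one sample and return (latest sample, buffer contents oldest first)."""
    samples.append(sample)
    return sample, list(samples)


# Hypothetical samples for illustration only, not real query results.
for i in range(35):
    latest, history = record_sample({"events1Min": i})

print(len(history), history[0], latest)
```

&lt;p&gt;After 35 appends only the last 30 samples remain, so no explicit length check is needed. In the notebook the returned list could be turned into a data frame before plotting.&lt;/p&gt;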



&lt;p&gt;The next cell sets up the reactivity of our data&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Necessary for reactive pandas
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;panel&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pn&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hvplot.pandas&lt;/span&gt; 

&lt;span class="n"&gt;pn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extension&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;sample_df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_changes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;table_changes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;samples_df_rx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;## Extract Data
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_table_changes&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;sample_df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_changes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;table_changes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample_df&lt;/span&gt;
    &lt;span class="n"&gt;samples_df_rx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;samples_df&lt;/span&gt;

&lt;span class="n"&gt;pn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_periodic_callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;update_table_changes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See the documentation of the Panel library &lt;a href="https://panel.holoviz.org/tutorials/basic/build_streaming_dashboard.html" rel="noopener noreferrer"&gt;here&lt;/a&gt;. The most important statement is the last one, which schedules the update every 60000 milliseconds, so the values feeding our plots are refreshed once per minute.&lt;/p&gt;

&lt;p&gt;The next cells create a dashboard with the absolute defaults. No effort is made to tinker with CSS. I will not spend time on the Panel components, since the documentation is very thorough. What is remarkable, though, is that you can serve the notebook directly with Panel. From your activated virtual environment run&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;panel serve .&lt;span class="se"&gt;\d&lt;/span&gt;ashboard.ipynb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and you can navigate to the URL &lt;a href="http://localhost:5006/dashboard" rel="noopener noreferrer"&gt;http://localhost:5006/dashboard&lt;/a&gt; to visit your dashboard.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f5ydhh5e4z5kjj82tlw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f5ydhh5e4z5kjj82tlw.png" alt="Wikipedia changes dashboard" width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Epilogue
&lt;/h2&gt;

&lt;p&gt;In this article we gave an example of an end-to-end dashboard backed by an Apache Pinot streaming table. The original stream comes from an Apache Kafka topic. The stream captures Wikipedia page edits and is customarily used for streaming tutorials. We gave a quick description of the Apache Kafka and Apache Pinot setup, how to ingest the page edits and how to visualize them. Redpanda can be used instead of Apache Kafka. See the related &lt;a href="https://github.com/fithisux/visualize-streamlit-pinot-example/blob/main/README.md" rel="noopener noreferrer"&gt;Readme.md&lt;/a&gt; for the necessary, but minimal, changes. As always, the code is &lt;a href="https://github.com/fithisux/visualize-streamlit-pinot-example" rel="noopener noreferrer"&gt;provided&lt;/a&gt;. If you find that something is unclear, spot a bug, or have any suggestion, do not hesitate to post in the comments. I hope you enjoyed it.&lt;/p&gt;

</description>
      <category>kafka</category>
      <category>pinot</category>
      <category>visualization</category>
    </item>
    <item>
      <title>Your ILP solver license has expired. Now what?</title>
      <dc:creator>Agile Developer</dc:creator>
      <pubDate>Mon, 04 May 2026 11:08:52 +0000</pubDate>
      <link>https://dev.to/agileactors/your-ilp-solver-license-has-expired-now-what-1b93</link>
      <guid>https://dev.to/agileactors/your-ilp-solver-license-has-expired-now-what-1b93</guid>
      <description>&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;h3&gt;
  
  
  A nasty surprise
&lt;/h3&gt;

&lt;p&gt;Last summer, while trying to deliver a feature for one of our customers, I ran into a nasty situation. The software we were developing depended on a production-grade license of &lt;a href="https://www.gurobi.com/" rel="noopener noreferrer"&gt;Gurobi&lt;/a&gt;. Most people were on vacation, apart from my team and some unrelated staff, so development of the feature was effectively blocked. As I learnt later, the research staff, who had the final say, could not update the license because of other commitments such as conference participation. Still, the situation was very uncomfortable for me, because the feature would be delayed a lot. Months earlier I had cautioned that depending solely on a closed-source solution was bad practice when free open-source alternatives like &lt;a href="https://highs.dev/" rel="noopener noreferrer"&gt;HiGHS&lt;/a&gt; exist. Gurobi is the leading player in the field, with a very performant product that offers many conveniences. In our case it was actually much more performant than the open-source solutions. But license disruptions can happen, and they would leave users of the feature in a difficult situation. &lt;br&gt;
In summary, the feature amounted to the following workflow. Users parametrize a process in a web GUI. These parameters are translated to an ILP (Integer Linear Programming) problem, which is subsequently solved, and the results are returned to the web GUI. We followed the standard approach of sending the parameters as a REST payload to a server. The server translates them to the ILP, solves it, and sends the results back.&lt;/p&gt;

&lt;p&gt;You can get a taste of such a setup &lt;a href="https://medium.com/@konpsar/evolving-a-flask-celery-example-into-an-api-for-linear-programming-problems-944d045d477e" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  The plan
&lt;/h3&gt;

&lt;p&gt;Having some time available, I decided to evaluate the possibility of providing an alternative implementation of the solving part instead of mocking it. This mattered because performance considerations were also in scope. The first attempt bombed because the code was not clean. It was written by researchers, after all. I was lucky enough to have some of their notebooks with outputs for comparison. So, given this opportunity, I went ahead and cleaned up their code considerably (and fixed a number of serious bugs, yay!!!). This post focuses on bringing up the alternative, not on the other parts of the feature, which were equally important. But first let's outline the plan of attack I decided upon. We are talking about a Python code base.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clean up the code so that the ILP problem is clear. Building on a colleague's earlier cleanup attempt, I was able to take the cleanup further, attach types, and make sense of the code. I will not get into more details, but it was not very pleasant.&lt;/li&gt;
&lt;li&gt;Given the Gurobi code, and the fact that there is an interchange format for ILP problems called &lt;a href="https://en.wikipedia.org/wiki/MPS_(format)" rel="noopener noreferrer"&gt;MPS&lt;/a&gt;, the workaround was to serialize the Gurobi formulation to an MPS file, load it, and solve the ILP with HiGHS. It involved some work, mostly writing a bunch of adapters and understanding how HiGHS works. This was the path of least resistance and worked fine. Since moving the huge MPS file across the network, instead of the much smaller set of parameters as originally planned, would be a bottleneck, I kept the file generation inside the computation server.&lt;/li&gt;
&lt;li&gt;Although this was not the best solution, I was now more confident. The whole feature was progressing, after all. I decided to give the re-implementation with HiGHS a shot, which would bring me to parity with the original plan and eliminate the serialization/deserialization of a big file. It turned out to be easier than I anticipated.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Obviously I will not be able to share the code, but I will use a toy example to highlight the principles.&lt;/p&gt;
&lt;h2&gt;
  
  
  Highlights of the porting
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As a toy example I will use the famous "assignment problem". It is a very common and simple ILP problem that pales in comparison to the customer's ILP problem. However, it is enough to highlight the main issues. I use this excellent &lt;a href="https://ics-websites.science.uu.nl/docs/vakken/stt/LectureNotesILP.pdf" rel="noopener noreferrer"&gt;reference&lt;/a&gt;, a good set of lecture notes on solving ILP problems. You can try to replicate what is presented here for the other problems in it. &lt;br&gt;
The typical assignment problem amounts to assigning &lt;strong&gt;M&lt;/strong&gt; people to &lt;strong&gt;N&lt;/strong&gt; jobs, where every possible assignment, say &lt;strong&gt;job -&amp;gt; person&lt;/strong&gt;, incurs a cost of &lt;strong&gt;C(job, person)&lt;/strong&gt;. The task is to find the minimum cost assignment. The constraints are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Every job must be assigned to exactly one person.&lt;/li&gt;
&lt;li&gt;Every person can be assigned to at most one job.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Obviously &lt;strong&gt;M&lt;/strong&gt; should be at least &lt;strong&gt;N&lt;/strong&gt; to cover all the jobs, and &lt;strong&gt;M&lt;/strong&gt; should be at most &lt;strong&gt;N&lt;/strong&gt; to not leave people out, so in what follows &lt;strong&gt;M&lt;/strong&gt; = &lt;strong&gt;N&lt;/strong&gt;. Our plan here is to solve this in three ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gurobi (Model in Gurobi and solve in Gurobi)&lt;/li&gt;
&lt;li&gt;Pseudo Gurobi (Model in Gurobi solve in HiGHS)&lt;/li&gt;
&lt;li&gt;HiGHS (Model in HiGHS and solve in HiGHS)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Code is &lt;a href="https://codeberg.org/fithisux/devto-ilp-article" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
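
&lt;p&gt;Before involving any solver, it can help to have an independent reference answer for tiny instances. With &lt;strong&gt;M&lt;/strong&gt; = &lt;strong&gt;N&lt;/strong&gt;, every feasible solution of the problem above is a permutation of the workers, so a brute-force search over all permutations gives the optimum. The following is only a sketch for sanity checks. The function names and the 3x3 cost table are hypothetical and not taken from the repository, and the approach is hopeless beyond a handful of jobs because the number of permutations grows factorially.&lt;/p&gt;

```python
from itertools import permutations


def assignment_cost(cost, perm):
    """Total cost when job j is assigned to worker perm[j]."""
    return sum(cost[job, worker] for job, worker in enumerate(perm))


def brute_force_assignment(cost, n):
    """Enumerate all one-to-one assignments of n workers to n jobs, keep the cheapest."""
    best = min(permutations(range(n)), key=lambda perm: assignment_cost(cost, perm))
    return assignment_cost(cost, best), best


# Hypothetical cost table, keyed by (job_index, worker_index).
cost = {
    (0, 0): 2.0, (0, 1): 1.0, (0, 2): 1.5,
    (1, 0): 1.0, (1, 1): 2.0, (1, 2): 1.5,
    (2, 0): 1.5, (2, 1): 1.5, (2, 2): 1.0,
}

best_cost, best = brute_force_assignment(cost, 3)
print(best_cost, best)  # 3.0 (1, 0, 2)
```

&lt;p&gt;Here job 0 goes to worker 1, job 1 to worker 0 and job 2 to worker 2. A solver run on the same cost table should report the same objective value, which makes this a cheap cross-check when porting a model from one solver to another.&lt;/p&gt;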
&lt;h3&gt;
  
  
  Gurobi approach
&lt;/h3&gt;

&lt;p&gt;First of all, we use named binary variables to refer to our potential assignments. A variable that takes the value 1 in the solution means that the corresponding assignment has been realized.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;gurobipy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;gp&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;gurobipy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GRB&lt;/span&gt;

&lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Env&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;job_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;var_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;worker_index&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;GRB&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;var_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we need to have some assignment costs as we said previously.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;

&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;job_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
         &lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We selected random costs (fixing the seed for reproducibility) because if all the costs were the same, every valid assignment would have the same total cost, and the trivial assignment &lt;strong&gt;i&lt;/strong&gt; -&amp;gt; &lt;strong&gt;i&lt;/strong&gt; for every i would already be optimal.&lt;/p&gt;

&lt;p&gt;Now it is time for the constraints and the objective, which model exactly what we said in the previous subsection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# every job must have exactly one assignment
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;job_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addConstr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;quicksum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="c1"&gt;# every worker must have at most one assignment
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addConstr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;quicksum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;job_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="c1"&gt;# objective function
&lt;/span&gt;&lt;span class="n"&gt;objective&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;quicksum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;job_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Njobs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setObjective&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objective&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;GRB&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MINIMIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This covers the first part, namely, the modeling of our problem. The second and last part is the solution.&lt;/p&gt;

&lt;p&gt;To solve the model, it is enough to set a few parameters and invoke the optimizer&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timeLimit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;200.0&lt;/span&gt; &lt;span class="c1"&gt;# seconds
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LogToConsole&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IntegralityFocus&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;optimize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rest of the code only displays the solution, which is not a big deal. The deal breaker is the following notification from the library&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Restricted license - for non-production use only - expires 2027-11-29
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means two things. The first is that we are living on borrowed time. The second has to do with the size of the problem we can solve. If we set &lt;strong&gt;Njobs&lt;/strong&gt; = 100, we are greeted with a crash.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GurobiError: Model too large for size-limited license; visit https://gurobi.com/unrestricted for more information
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This work is in the &lt;a href="https://codeberg.org/fithisux/devto-ilp-article/src/branch/main/gurobipy_formulation.ipynb" rel="noopener noreferrer"&gt;gurobipy_formulation.ipynb&lt;/a&gt; notebook.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pseudo-Gurobi and HiGHS approaches
&lt;/h3&gt;

&lt;p&gt;In my case I was greeted with an "Unauthenticated" error because the license had expired, and with the exact error above when I tried to run without the license. But not all is lost. The solving part, which is the selling point of Gurobi, no longer works. However, the modelling part works perfectly. Armed with this knowledge I decided to follow a hybrid method: model in Gurobi, solve in HiGHS. The HiGHS documentation takes a bit of getting used to, but I only had to make two changes. The first and more important one is to swap the solution process. Thanks to interoperability (an underappreciated concept haunting the Software Engineering business) it was painless. More specifically, we swap this&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timeLimit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;200.0&lt;/span&gt; &lt;span class="c1"&gt;# seconds
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LogToConsole&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IntegralityFocus&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;optimize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with this&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;highspy&lt;/span&gt;

&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;highspy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Highs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;highspy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Highs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;mymodel.mps&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;readModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;mymodel.mps&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Reading model file mymodel.mps returns a status of &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setOptionValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;time_limit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;solve&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Model has status &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getModelStatus&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simple as that. The second change, which is understandably HiGHS-specific, has to do with the pretty printing of the solutions.&lt;/p&gt;
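
&lt;p&gt;As a rough illustration of that pretty printing (a sketch, not the notebook's code): HiGHS hands back the solution as a flat list of column values, which highspy exposes as &lt;strong&gt;h.getSolution().col_value&lt;/strong&gt;. The helper below, whose name is made up for this example, assumes the variables were created job by job and worker by worker, so that column &lt;em&gt;job_index * Njobs + worker_index&lt;/em&gt; holds x[job_index, worker_index]. Verify that ordering against your own model before relying on it.&lt;/p&gt;

```python
def assignments_from_columns(col_value, n_jobs, threshold=0.5):
    """Turn a flat list of 0/1 column values into (job, worker) pairs.

    Assumes row-major ordering: column job * n_jobs + worker holds x[job, worker].
    """
    if len(col_value) != n_jobs * n_jobs:
        raise ValueError("expected n_jobs * n_jobs column values")
    pairs = []
    for job_index in range(n_jobs):
        for worker_index in range(n_jobs):
            # Solvers return floats, so compare against a threshold instead of == 1.
            if col_value[job_index * n_jobs + worker_index] > threshold:
                pairs.append((job_index, worker_index))
    return pairs


# Hypothetical 2x2 solution: job 0 goes to worker 1, job 1 goes to worker 0.
for job_index, worker_index in assignments_from_columns([0.0, 1.0, 1.0, 0.0], 2):
    print(f"job {job_index} is assigned to worker {worker_index}")
```

&lt;p&gt;In the hybrid setup one would pass &lt;strong&gt;list(h.getSolution().col_value)&lt;/strong&gt; and &lt;strong&gt;Njobs&lt;/strong&gt; instead of the hard-coded list.&lt;/p&gt;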

&lt;p&gt;This work is in &lt;a href="https://codeberg.org/fithisux/devto-ilp-article/src/branch/main/pseudogurobipy_formulation.ipynb" rel="noopener noreferrer"&gt;pseudogurobipy_formulation.ipynb&lt;/a&gt; notebook.&lt;/p&gt;

&lt;p&gt;Now, for the pure HiGHS approach, we replace the model instantiation. In other words, we swap&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;gurobipy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;gp&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;gurobipy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GRB&lt;/span&gt;

&lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Env&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with this&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;highspy&lt;/span&gt;
&lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;highspy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Highs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keep in mind that this is the same instantiation that opened the hybrid solution snippet. The difference is that we no longer need the MPS file. The solution process is simply a swap of this&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timeLimit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;200.0&lt;/span&gt; &lt;span class="c1"&gt;# seconds
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LogToConsole&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IntegralityFocus&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;optimize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with this&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setOptionValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;time_limit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;solve&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Model has status &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getModelStatus&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The modeling also changes slightly. The first change is that we have to define our own utility function &lt;strong&gt;quicksum&lt;/strong&gt; to mimic and replace the provided utility function &lt;strong&gt;gp.quicksum&lt;/strong&gt;.&lt;/p&gt;
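
&lt;p&gt;A minimal sketch of such a replacement is shown here; it is not necessarily the version used in the notebook. It only assumes that the terms can be added to each other with the + operator, and it starts from the first term instead of the integer 0. Whether your highspy version already ships a summation helper of its own is worth checking first.&lt;/p&gt;

```python
from functools import reduce
import operator


def quicksum(terms):
    """Sum an iterable of terms using only the + operator.

    Starts from the first term instead of the integer 0, so it also works
    for expression objects that only define + among themselves.
    """
    terms = list(terms)
    if not terms:
        return 0
    return reduce(operator.add, terms)


# Works with plain numbers; solver expressions are handled the same way.
print(quicksum(i * i for i in range(4)))
```

&lt;p&gt;With this in place, the constraints and the objective keep their shape and only the &lt;strong&gt;gp.&lt;/strong&gt; prefix disappears.&lt;/p&gt;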

&lt;p&gt;The second change has to do with how we instantiate a variable. We swap&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;GRB&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;var_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with this&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker_index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addBinary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;var_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, the swap itself is easy. What was not easy was cleaning up and debugging the modelling process, which is not straightforward at all.&lt;/p&gt;

&lt;p&gt;This work is in the &lt;a href="https://codeberg.org/fithisux/devto-ilp-article/src/branch/main/highspy_formulation.ipynb" rel="noopener noreferrer"&gt;highspy_formulation.ipynb&lt;/a&gt; notebook.&lt;/p&gt;

&lt;h2&gt;
  
  
  Epilogue
&lt;/h2&gt;

&lt;p&gt;We showed how a problem that seemed insurmountable had two solutions. Not ideal, but still solutions. Although the license of a production-ready commercial ILP solver had expired, we could still fall back on slower processing to keep the business moving. Beyond that, I had to carefully review my options and clean up the code base to make it amenable to the workaround. In the process the code became cleaner and bug free, and I re-evaluated some modelling approaches (which I did not mention earlier). These were approved by the researchers. The end result narrowed the memory and processing gap between the Gurobi and HiGHS approaches considerably. Since then, we have renewed the license and the feature has been delivered. This time, we are prepared for a possible outage. I hope you enjoyed the article.&lt;/p&gt;

&lt;p&gt;As always, the code is &lt;a href="https://codeberg.org/fithisux/devto-ilp-article" rel="noopener noreferrer"&gt;provided&lt;/a&gt;. Feel free to open an issue or add a comment if you see something wrong.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>ilp</category>
      <category>python</category>
    </item>
    <item>
      <title>Cross-Cloud Pipeline with ADF &amp; STS: Architecture, Troubleshooting &amp; Costs</title>
      <dc:creator>Panagiotis</dc:creator>
      <pubDate>Tue, 31 Mar 2026 12:15:23 +0000</pubDate>
      <link>https://dev.to/agileactors/cross-cloud-pipeline-with-adf-sts-architecture-troubleshooting-costs-5bj5</link>
      <guid>https://dev.to/agileactors/cross-cloud-pipeline-with-adf-sts-architecture-troubleshooting-costs-5bj5</guid>
      <description>&lt;p&gt;Every data engineer eventually ends up staring at a problem that shouldn't exist. Data that needs to be somewhere it isn't. Two systems that should talk to each other but don't. A business requirement that assumes clouds are just different tabs in the same browser.&lt;/p&gt;

&lt;p&gt;Our version of this problem was simple to describe and genuinely interesting to solve: operational data lived in PostgreSQL on Azure, while the analytics team (data scientists, BI developers, the people who actually make decisions from data) had built everything in BigQuery on GCP. Nobody was migrating either side, so my job was to make them talk.&lt;/p&gt;

&lt;p&gt;What followed was one of those projects that starts as "a quick pipeline" and ends up teaching you more about cloud architecture, cross-service authentication, and silent failure modes than you expected. Every layer worked beautifully in isolation, but the problems lived exclusively in the spaces between services, in the handoffs, the assumptions, the error messages that pointed everywhere except at the actual cause.&lt;/p&gt;

&lt;p&gt;This is that story. The architecture, yes, but more so the debugging sessions that shaped it. If you're building anything that crosses cloud boundaries, the troubleshooting sections alone might save you a few weeks.&lt;/p&gt;




&lt;h2&gt;
  
  
  How We Got Here
&lt;/h2&gt;

&lt;p&gt;Companies rarely end up multi-cloud by design. It usually happens through acquisitions, through teams making independent vendor decisions, or through the gravitational pull of a tool that's genuinely best-in-class for its purpose.&lt;/p&gt;

&lt;p&gt;In our case, the operational side of the business had grown up on Azure, with infrastructure, networking, and identity all running on Microsoft. PostgreSQL on Azure's managed Flexible Server made sense because it's a solid managed database with clean VNet integration and no public endpoint, which is a feature, not a limitation.&lt;/p&gt;

&lt;p&gt;The analytics side had independently converged on Google Cloud. BigQuery is genuinely exceptional for analytical workloads, dbt had become the transformation layer, and Looker sat on top. The team had invested years building in this ecosystem, so migrating to Azure wasn't realistic, and nobody had the appetite for it either.&lt;/p&gt;

&lt;p&gt;So we had two clouds, both legitimate, both entrenched, and we needed a bridge.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi1riekj177ttukzjlcvw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi1riekj177ttukzjlcvw.png" alt="This means war" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The First Surprise: ADF Can't Write to BigQuery
&lt;/h2&gt;

&lt;p&gt;The natural starting point was Azure Data Factory, Microsoft's managed data integration service that has connectors for hundreds of sources and sinks, including a Google BigQuery connector right there in the UI.&lt;/p&gt;

&lt;p&gt;What the marketing materials don't lead with: the BigQuery connector in ADF is &lt;strong&gt;source-only&lt;/strong&gt;. You can read data &lt;em&gt;from&lt;/em&gt; BigQuery into Azure, but you cannot write &lt;em&gt;to&lt;/em&gt; it. Same story with &lt;strong&gt;Google Cloud Storage, which is also not a supported sink.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I remember the exact moment I discovered this. I had already designed half the pipeline in my head, envisioning a clean Copy Activity with source PostgreSQL and sink BigQuery, done by lunch. I opened the sink configuration dropdown, scrolled through every Azure-native option on offer, and scrolled again. BigQuery wasn't among them. I scrolled one more time, but no, I hadn't missed it.&lt;/p&gt;

&lt;p&gt;This is one of those discoveries that reshapes an entire project in a single moment. It's not a bug or a misconfiguration, it's a fundamental constraint of how ADF's connector ecosystem works, and once you accept it, everything downstream changes. The tempting response is frustration, because you've just lost the simplest possible architecture.&lt;/p&gt;

&lt;p&gt;The productive response is to ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;what can ADF write to natively?&lt;/em&gt; Azure Blob Storage, obviously.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;what can Google Cloud pull data from natively?&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where things got interesting.&lt;/p&gt;




&lt;h2&gt;
  
  
  Finding the Right Shape
&lt;/h2&gt;

&lt;p&gt;When you can't go direct, you look for managed services designed for the exact gap you're trying to cross.&lt;/p&gt;

&lt;p&gt;Google Cloud Storage Transfer Service is exactly that, a managed GCP service whose entire job is moving data between storage systems, including Azure Blob Storage. It authenticates with Azure using a SAS token, reads files from a Blob container, and writes them into a GCS bucket, all without VMs, custom code, or an ETL framework.&lt;/p&gt;

&lt;p&gt;Once you see it, the architecture snaps into place:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq85n28l8qlmm67y5rhcw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq85n28l8qlmm67y5rhcw.png" alt="The complete pipeline. Five managed services, zero custom connectors." width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Azure Data Factory extracts from PostgreSQL through a Self-Hosted Integration Runtime and stages data as Parquet files in Blob Storage. Storage Transfer Service then moves those files from Azure to GCS, acting as the cross-cloud bridge. BigQuery's Jobs API loads the Parquet into raw tables, and dbt Cloud deduplicates and transforms the raw data into clean, analytics-ready tables.&lt;/p&gt;

&lt;p&gt;Five hops, but no custom code in any of them. That's the design philosophy that made this project work: &lt;strong&gt;use each provider's own tools for what they're designed to do&lt;/strong&gt;, and design the handoffs between them carefully.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Out of the Private Network
&lt;/h2&gt;

&lt;p&gt;Before we could even think about cross-cloud transfers, we had a more immediate challenge: PostgreSQL was deployed as an Azure Flexible Server with VNet integration, meaning it sat inside a private Azure VNet on a delegated subnet with no public endpoint. This is by design, but it creates a chain of constraints that narrows your options considerably. Firstly, &lt;a href="https://learn.microsoft.com/en-us/azure/postgresql/network/concepts-networking-private-link" rel="noopener noreferrer"&gt;Azure does not support private endpoint creation for VNet-integrated Flexible Servers&lt;/a&gt;, so there was no way to expose the database through Private Link. That rules out more than just direct access, because ADF's Managed Virtual Network integration runtime &lt;a href="https://learn.microsoft.com/en-us/azure/data-factory/managed-virtual-network-private-endpoint" rel="noopener noreferrer"&gt;connects to data sources exclusively through managed private endpoints&lt;/a&gt;, which means it can only reach resources that support Private Link. No private endpoint on Postgres means no managed VNet runtime either. The only remaining option was a Self-Hosted Integration Runtime, a Windows VM deployed inside the same VNet and registered with ADF, acting as its private agent.&lt;/p&gt;

&lt;p&gt;Think of it less as a separate component and more as ADF's arm reaching inside the locked room. Conceptually elegant, though setup is where the surprises live.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Java Mystery
&lt;/h3&gt;

&lt;p&gt;Our first pipeline run against a real table failed with a cryptic error. The Copy Activity connected to PostgreSQL successfully (we could see it reading rows in the logs), but the moment it tried to write the first Parquet file to Blob Storage, it crashed with something about a JRE not being found, which was not exactly self-documenting.&lt;/p&gt;

&lt;p&gt;If you're not already expecting this, you'd spend your first hour looking at network rules, storage account permissions, or the SHIR registration itself, which is exactly what we did. We checked the linked service credentials, verified the Blob container existed, and tested with a CSV sink instead of Parquet. The CSV worked, which narrowed it down to something specific about the Parquet writer.&lt;/p&gt;

&lt;p&gt;Here's what was actually happening: &lt;strong&gt;ADF's Copy Activity uses a Java-based Parquet writer under the hood.&lt;/strong&gt; Our SHIR VM was a clean Windows Server image with no Java runtime. The SHIR installed fine, registered fine, and connected to PostgreSQL fine, but when it needed to write Parquet, it looked for a JRE, found nothing, and threw an error that only mentioned Java obliquely.&lt;/p&gt;

&lt;p&gt;The fix took five minutes (install OpenJDK 17 and restart the runtime service), but finding it took most of a morning. The frustrating part is that the error message doesn't say "install Java." You have to mentally connect "JRE not found" to "Parquet writing requires Java, and this VM doesn't have it." In hindsight it's obvious, but in the moment, with ten other possible causes competing for attention, it's not.&lt;/p&gt;

&lt;h3&gt;
  
  
  The DNS Ghost
&lt;/h3&gt;

&lt;p&gt;With Java installed, the next run hung for two minutes and timed out with a connection error to PostgreSQL. I knew the SHIR was inside the VNet and I could RDP in and ping other resources, so everything looked connected, yet the SHIR couldn't resolve the PostgreSQL hostname.&lt;/p&gt;

&lt;p&gt;Azure Flexible Server uses a private DNS zone for hostname resolution, meaning the hostname resolves to a private IP only if that DNS zone is properly linked to the VNet where the SHIR lives. Our VNet was there, the DNS zone was there, but the link between them wasn't. The portal showed the zone as "active," just not active &lt;em&gt;for our VNet&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The error from ADF was a plain connection timeout with nothing DNS-related in it. The debugging path that cracked it: I opened a command prompt on the SHIR VM and ran an nslookup against the PostgreSQL hostname, which returned the public Azure DNS answer instead of a private IP. That was the tell.&lt;/p&gt;

&lt;p&gt;Linking the DNS zone took thirty seconds, but the lesson is broader: in Azure's private networking model, connectivity and name resolution are two entirely different things. You can have full network connectivity and still fail because DNS doesn't resolve correctly, and the errors don't help you distinguish between the two.&lt;/p&gt;
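
&lt;p&gt;The nslookup check is easy to script so it can run before a pipeline is ever triggered. The sketch below is an illustration rather than something from our setup: it resolves a hostname with Python's standard library and reports whether the answer is a private address. The function names are made up, and the lookup only tells you something useful when it runs on the SHIR VM itself.&lt;/p&gt;

```python
import ipaddress
import socket


def is_private_address(address):
    """Return True if the IP address string is in a private range."""
    return ipaddress.ip_address(address).is_private


def check_private_resolution(hostname):
    """Resolve hostname and report whether it maps to a private IP."""
    address = socket.gethostbyname(hostname)
    return address, is_private_address(address)


# Offline examples; no DNS lookup is performed here.
print(is_private_address("10.0.0.4"))
print(is_private_address("20.50.2.1"))
```

&lt;p&gt;Calling &lt;strong&gt;check_private_resolution&lt;/strong&gt; with the Flexible Server hostname should return a private address once the DNS zone is linked to the VNet; a public address means name resolution is bypassing the private zone.&lt;/p&gt;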




&lt;h2&gt;
  
  
  Making the Extraction Incremental
&lt;/h2&gt;

&lt;p&gt;Full reloads were never an option because some tables had billions of rows and were growing constantly, making a complete load on every run expensive, slow, and fragile. So we went with watermark-based incremental extraction, tracking the maximum timestamp from the last successful run and extracting only newer rows.&lt;/p&gt;

&lt;p&gt;Sounds simple, but there's a subtle data loss scenario hiding in the most natural approach.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Watermark Race Condition
&lt;/h3&gt;

&lt;p&gt;The intuitive pattern goes like this: read the last watermark, extract all rows newer than that, then record the current maximum as the next starting point. Clean and simple, and broken in one specific case that took us a while to find.&lt;/p&gt;

&lt;p&gt;While the copy is running (say it takes eight minutes for a large table), new rows are being inserted into PostgreSQL with timestamps between the old watermark and the current moment. The copy finishes, captures the maximum timestamp from the data it extracted, and records that as the new watermark, but rows inserted &lt;em&gt;during&lt;/em&gt; the copy, after the query started reading that portion of the table, weren't in the batch. On the next run, they're below the new watermark, which means they're gone. Silently.&lt;/p&gt;

&lt;p&gt;The insidious part is the scale: you don't lose thousands of rows, just a handful per run, the ones that happened to be inserted in that narrow window. Row counts still look roughly right, dashboards still update, and everything appears healthy until someone runs a precise reconciliation and the numbers are off by a fraction of a percent. That's how we found it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; capture the current maximum &lt;em&gt;before&lt;/em&gt; the copy starts and use it as an upper bound. Your extraction becomes a bounded window containing everything between the old watermark and the pre-captured ceiling, with anything above that ceiling waiting for the next run. Nothing falls through.&lt;/p&gt;
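
&lt;p&gt;The bounded window can be sketched in a few lines. This is an illustration of the idea, not our ADF pipeline definition: the table and column names are placeholders, the placeholders in the SQL follow the named-parameter style of common PostgreSQL drivers, and timestamps are plain integers to keep the example short.&lt;/p&gt;

```python
def build_bounded_query(table, ts_column):
    """Build the bounded-window SELECT with two named placeholders.

    table and ts_column must come from trusted configuration, never user input.
    """
    # Lower bound is exclusive (already extracted), upper bound is inclusive.
    return (
        f"SELECT * FROM {table} "
        f"WHERE {ts_column} > %(old_watermark)s "
        f"AND %(ceiling)s >= {ts_column}"
    )


def rows_in_window(rows, old_watermark, ceiling):
    """In-memory equivalent of the query, for illustration only."""
    return [row for row in rows if ceiling >= row["updated_at"] > old_watermark]


rows = [{"id": i, "updated_at": i} for i in range(1, 8)]
ceiling = 5  # captured BEFORE the copy starts
first_run = rows_in_window(rows, old_watermark=2, ceiling=ceiling)
# Rows 6 and 7 arrived "during" the copy; the next run picks them up.
second_run = rows_in_window(rows, old_watermark=ceiling, ceiling=7)
print([r["id"] for r in first_run], [r["id"] for r in second_run])
```

&lt;p&gt;The important detail is that the next watermark is the pre-captured ceiling, not the maximum found in the extracted batch, so consecutive windows touch without overlapping or leaving a gap.&lt;/p&gt;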

&lt;p&gt;This pattern is in Microsoft's documentation, but it's not the first result when you search for "ADF incremental load." The first results show the simpler version, the one with the race condition. You have to dig deeper to find the bounded window variant, and by the time you're digging, you've usually already lost some data.&lt;/p&gt;




&lt;h3&gt;
  
  
  Why Parquet Matters More Than You Think
&lt;/h3&gt;

&lt;p&gt;Parquet as the staging format goes beyond performance because it's what makes the whole pipeline schema-agnostic. Parquet embeds schema information inside the file itself, so when BigQuery receives a Parquet file, it reads the schema from the headers and creates the target table automatically. Adding a new table to the pipeline is a single configuration entry with no manual schema definitions and no migrations.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Schema drift works the same way: a new column appears in PostgreSQL, BigQuery adds it, old rows show null, and the pipeline doesn't need to know or care.&lt;/em&gt;&lt;/p&gt;
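
&lt;p&gt;To make that concrete, here is a sketch of the request body a load job of this kind might send to the BigQuery Jobs API. The field names come from the load job configuration in the REST API; the project, dataset, table, and bucket are placeholders, and allowing field addition on append is an assumption about how schema drift is handled rather than a copy of our configuration.&lt;/p&gt;

```python
import json


def build_load_job_body(project, dataset, table, source_uri):
    """Return a request body for a BigQuery load job over Parquet files."""
    return {
        "configuration": {
            "load": {
                "sourceUris": [source_uri],
                "sourceFormat": "PARQUET",  # schema is read from the files
                "destinationTable": {
                    "projectId": project,
                    "datasetId": dataset,
                    "tableId": table,
                },
                "createDisposition": "CREATE_IF_NEEDED",
                "writeDisposition": "WRITE_APPEND",
                # Assumption: new source columns are allowed to extend the table.
                "schemaUpdateOptions": ["ALLOW_FIELD_ADDITION"],
            }
        }
    }


# All names below are placeholders, not the project's real resources.
body = build_load_job_body(
    "my-project", "raw", "orders", "gs://my-staging-bucket/orders/*.parquet"
)
print(json.dumps(body, indent=2))
```

&lt;p&gt;Because nothing in the body describes columns, adding a table really is just another set of four strings.&lt;/p&gt;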

&lt;p&gt;&lt;strong&gt;One wrinkle:&lt;/strong&gt; PostgreSQL has a richer type system than BigQuery, with spatial types, custom domains, and array columns that don't translate directly. What ADF does is quietly cast any incompatible type to plain text before writing the file, with no error and no warning. We didn't know it was happening until a data scientist asked why a column that should have been an array was showing up as a string. The lesson: when bridging type systems, always verify what arrives, not just what was sent.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Cross-Cloud Handoff
&lt;/h2&gt;

&lt;p&gt;Storage Transfer Service is elegant in theory, but getting it to work in production revealed a series of gotchas that the documentation glosses over. I'm going to walk through each one in the order we hit them, because the order matters: each looks like the previous problem until you realize it's something entirely different.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feiu9km7t59jv0pgf3kbc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feiu9km7t59jv0pgf3kbc.png" alt="Moving Data" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Firewall Problem
&lt;/h3&gt;

&lt;p&gt;We'd configured the Blob Storage account with firewall rules allowing only our VNet and known IPs, which is standard practice. Then we created the STS job, which started, ran for ten seconds, and failed with an authentication error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The actual problem:&lt;/strong&gt; Google's transfer agents connect from IP ranges that are large, dynamic, and change frequently, so you cannot whitelist them statically. The storage account needs to be open to all networks, with security coming from the SAS token instead: short-lived, read-only, HTTPS-only, and automatically rotated. The token is the lock, not the firewall. This requires a mental model shift, but it's actually more robust than maintaining a firewall against a moving target.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Permission Nobody Told You About
&lt;/h3&gt;

&lt;p&gt;With encoding fixed, STS could authenticate with Azure, but job creation failed with a FAILED_PRECONDITION error on the GCP side. It turns out STS verifies that the destination bucket exists, and reading the bucket's metadata requires the &lt;code&gt;legacyBucketReader&lt;/code&gt; role, an older role that doesn't overlap with the newer IAM roles the way you'd expect. We'd already granted &lt;code&gt;objectAdmin&lt;/code&gt; on the bucket, but that role covers the objects, not the bucket itself, and the error message said nothing about which permission was missing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Project Number vs. Project ID
&lt;/h3&gt;

&lt;p&gt;When referencing secrets from an STS job, the configuration expects the project number (the numeric identifier), not the human-readable project ID. Using the project ID produces yet another FAILED_PRECONDITION error with no mention of the format. By this point, we'd developed a reflex: when STS throws FAILED_PRECONDITION, the problem is almost never what the error implies.&lt;/p&gt;




&lt;h2&gt;
  
  
  Automating the Credential Rotation
&lt;/h2&gt;

&lt;p&gt;SAS tokens expire, and a pipeline that works today but silently breaks in 90 days isn't production engineering, it's technical debt with a countdown timer.&lt;/p&gt;

&lt;p&gt;We solved this with an Azure Function on a weekly timer that generates a new SAS token, URL-decodes it (the hard-won lesson), and pushes the decoded token to both Azure Key Vault and GCP Secret Manager. STS then reads the latest version automatically on the next transfer. The function runs on a Consumption plan, and the monthly bill rounds to zero.&lt;/p&gt;
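&lt;p&gt;The decoding step on its own is small. The sketch below shows only that step, using Python's standard library; generating the token and writing it to Key Vault and Secret Manager are left out, and both the helper name &lt;code&gt;decode_sas_value&lt;/code&gt; and the sample signature are made up for illustration.&lt;/p&gt;

```python
from urllib.parse import unquote


def decode_sas_value(encoded):
    """URL-decode a SAS token (or one of its values) before storing it."""
    # unquote, not unquote_plus: unquote_plus would also turn any literal
    # "+" into a space, which corrupts a base64 signature.
    return unquote(encoded)


# Made-up signature value, for illustration only.
encoded_sig = "q1w2%2Be3r4%2Ft5y6%3D"
decoded_sig = decode_sas_value(encoded_sig)
```

&lt;p&gt;Decoding a value that contains no percent-escapes returns it unchanged, so running the step twice by accident does no harm in that case.&lt;/p&gt;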

&lt;p&gt;One nuance worth mentioning: the Function itself needs credentials to write to GCP Secret Manager, which we handle with a GCP service account key stored in Azure Key Vault. Yes, there's a philosophical irony in storing a GCP credential in Azure to rotate an Azure credential into GCP. Welcome to multi-cloud.&lt;/p&gt;




&lt;h2&gt;
  
  
  Loading into BigQuery
&lt;/h2&gt;

&lt;p&gt;Once files land in GCS, the BigQuery Jobs API loads them into raw tables in append mode, so reruns are safe by design: a repeated load adds rows but never overwrites what is already there. The Jobs API works well, but it has one behavior that caught us off guard.&lt;/p&gt;

&lt;h3&gt;
  
  
  When "DONE" Doesn't Mean "Succeeded"
&lt;/h3&gt;

&lt;p&gt;BigQuery returns a status of DONE for both successful and failed jobs, with the difference being a separate error field that's only present on failure. This is documented, but it's the kind of API behavior you read once, think "that's odd," and then forget about until it bites you.&lt;/p&gt;

&lt;p&gt;Our initial implementation polled for DONE and moved on, and for weeks this worked because no jobs were failing. The pipeline hummed along, watermarks advanced, dashboards updated, and everything seemed healthy.&lt;/p&gt;

&lt;p&gt;Then one day a schema mismatch caused a load to fail: a column that had been integer upstream had changed to string, so the load job rejected the file. BigQuery returned DONE, our pipeline marked the run as successful, the watermark advanced, and the data simply wasn't in BigQuery.&lt;/p&gt;

&lt;p&gt;Nobody noticed for four days until a BI developer flagged that a dashboard was showing stale numbers. We traced it to the failed load and then to our status-checking logic. The fix took ten minutes (check the error field alongside the status), but recovering four days of missed data took considerably longer because the watermark had already advanced past the missing rows. We had to manually reset watermarks, re-extract, and re-load, exactly the kind of manual intervention the pipeline was designed to avoid.&lt;/p&gt;

&lt;p&gt;Always check both fields. BigQuery's error messages are specific and actionable when you actually look at them.&lt;/p&gt;
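&lt;p&gt;As a rough sketch of that check: a job resource returned by the Jobs API carries &lt;code&gt;status.state&lt;/code&gt; and, on failure, &lt;code&gt;status.errorResult&lt;/code&gt;. The Python below works on plain dictionaries shaped like that response. The function name &lt;code&gt;job_outcome&lt;/code&gt; and the sample payloads (including the error message text) are hypothetical, and this is not the ADF expression we actually used.&lt;/p&gt;

```python
def job_outcome(job):
    """Classify a BigQuery job resource as 'running', 'failed' or 'succeeded'."""
    status = job.get("status", {})
    if status.get("state") != "DONE":
        return "running"  # PENDING or RUNNING: keep polling
    if status.get("errorResult"):
        return "failed"  # DONE, but the job reported an error
    return "succeeded"  # DONE with no error field


# Hand-written payloads in the shape of the API response.
ok_job = {"status": {"state": "DONE"}}
bad_job = {
    "status": {
        "state": "DONE",
        "errorResult": {"reason": "invalid", "message": "schema mismatch (example text)"},
    }
}
pending_job = {"status": {"state": "RUNNING"}}
```

&lt;p&gt;Only the &lt;code&gt;succeeded&lt;/code&gt; outcome should be allowed to advance the watermark.&lt;/p&gt;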




&lt;h2&gt;
  
  
  dbt: Making Sense of Append-Only Data
&lt;/h2&gt;

&lt;p&gt;Appending rows every run means duplicates accumulate, which is intentional because it keeps the loading layer simple and safe, but it also means raw tables can't be used directly for analytics. You need a deduplication layer, and that's where dbt comes in.&lt;/p&gt;

&lt;p&gt;dbt's incremental models handle exactly this. Configured with a unique key, each run generates a MERGE statement that updates changed rows and inserts new ones. The deduplication logic lives in a well-tested SQL model, version-controlled in Git, not in a fragile Python script or an ADF expression buried three menus deep.&lt;/p&gt;
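&lt;p&gt;For readers who have not used a MERGE before, here is a toy Python illustration of what deduplicating by a unique key means. It is not dbt and not the SQL that dbt generates; the column names &lt;code&gt;id&lt;/code&gt; and &lt;code&gt;loaded_at&lt;/code&gt; and the function &lt;code&gt;merge_latest&lt;/code&gt; are assumptions made for the example.&lt;/p&gt;

```python
def merge_latest(silver, raw_rows, key="id", loaded_at="loaded_at"):
    """Upsert raw rows into silver, keeping the most recently loaded row per key."""
    merged = dict(silver)  # copy so the caller's table is left untouched
    # Apply rows oldest-first so the newest version of each key wins.
    for row in sorted(raw_rows, key=lambda r: r[loaded_at]):
        merged[row[key]] = row
    return merged


# Raw layer after two runs: id 1 was loaded twice, once per version.
raw = [
    {"id": 1, "status": "pending", "loaded_at": 1},
    {"id": 2, "status": "shipped", "loaded_at": 1},
    {"id": 1, "status": "paid", "loaded_at": 2},
]
silver = merge_latest({}, raw)
```

&lt;p&gt;Three raw rows collapse into two silver rows, and the changed row keeps its latest state.&lt;/p&gt;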

&lt;p&gt;The result is a clean two-layer architecture. Raw tables hold every row ever loaded with ingestion timestamps, which is useful for debugging, auditing, and reprocessing. If something goes wrong downstream, you can always go back to the raw layer and replay. dbt silver tables hold deduplicated, partitioned, clustered data, the kind that analysts actually query. The complexity of the multi-cloud pipeline is invisible to data consumers.&lt;/p&gt;

&lt;p&gt;When something looks wrong in the analytics layer, you trace it through the dbt model to the raw load and see exactly what arrived and when. This audit trail doesn't seem important until the first time it saves you from a long debugging session.&lt;/p&gt;

&lt;p&gt;After all loads complete, ADF retrieves the dbt Cloud API token from Key Vault and triggers the transformation job automatically, so the entire pipeline runs end to end without human involvement.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Metadata-Driven Design
&lt;/h2&gt;

&lt;p&gt;The decision that paid off the most was making the pipeline entirely metadata-driven from day one. I almost didn't, because the first prototype was hardcoded for three tables, and the temptation to just keep adding tables manually was real. But the upfront investment in a configuration layer saved us weeks of work over the following months.&lt;/p&gt;

&lt;p&gt;Every table is a single row in a configuration table stored in Azure SQL Database, and that row tracks where data comes from, where it's going, how far the last run got, and what happened. ADF reads this table at the start of every run, with nothing hardcoded in the pipeline itself. Adding a new table means adding one row, with no pipeline changes, no GCP console work, and no manual STS job creation. On first run, ADF creates the STS transfer job automatically, BigQuery creates the target table from the Parquet schema, and data starts flowing.&lt;/p&gt;

&lt;p&gt;The same table doubles as the operational dashboard. The error column tells you what went wrong, the watermark tells you where each table stands, the timestamp tells you when each was last loaded, and the row count tells you if a run loaded suspiciously few rows. A single query gives you the health of every table in the pipeline at a glance.&lt;/p&gt;

&lt;p&gt;It also made the project easier to hand off, because everything about the pipeline's configuration lives in a table anyone can read, with no tribal knowledge buried in JSON that requires ADF Studio access to understand.&lt;/p&gt;

&lt;h3&gt;
  
  
  One More Thing: ADF's Nesting Limits
&lt;/h3&gt;

&lt;p&gt;ADF has a limitation that isn't widely documented: you cannot nest certain activity types inside other activities beyond a certain depth. We discovered this when trying to put a polling loop inside a conditional block, and while the pipeline validated fine in ADF Studio, at runtime ADF threw a validation error about unsupported nesting.&lt;/p&gt;

&lt;p&gt;The solution was to break the nested logic into a separate child pipeline connected via Execute Pipeline. The child contains the polling loop, isolated from any conditional wrapper, which means more pipelines to manage, but each one is simpler and the nesting constraint disappears.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Cost Reality
&lt;/h2&gt;

&lt;p&gt;Cross-cloud pipelines have a reputation for being expensive, but this one isn't, though you do need to account for a cost that's easy to overlook.&lt;/p&gt;

&lt;p&gt;The largest ongoing cost is the SHIR VM, which runs continuously. The Azure SQL Database runs on Basic tier at around €4/month, the Azure Function runs on Consumption for single-digit euros, and Blob Storage staging costs near zero because files are deleted after each load.&lt;/p&gt;

&lt;p&gt;The cost that catches most people off guard in multi-cloud architectures is the &lt;strong&gt;cross-cloud data transfer&lt;/strong&gt;. When STS pulls files from Azure Blob Storage, that data leaves Azure's network as egress to the public internet, which Azure charges at roughly $0.087/GB for the first 10 TB. On the GCP side, ingress into Cloud Storage is free, so you're only paying the Azure side of the transfer. For our workload of a dozen tables with incremental loads, this amounts to a few euros per month because we're only moving deltas, not full table dumps. If you were moving terabytes daily, though, this line item would dominate the bill, and you'd want to look into Azure ExpressRoute or Google Cloud Interconnect to bring those rates down significantly.&lt;/p&gt;
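&lt;p&gt;The arithmetic is simple enough to sanity-check in a few lines of Python. This is a back-of-the-envelope sketch: it applies the rough rate quoted above as a flat price, ignores pricing tiers and any free allowance, and the daily volumes are assumed figures, not measurements from our pipeline.&lt;/p&gt;

```python
EGRESS_USD_PER_GB = 0.087  # rough rate quoted above, treated as flat


def monthly_egress_cost(gb_per_day, days=30, rate=EGRESS_USD_PER_GB):
    """Estimate monthly egress cost in USD for a steady daily volume."""
    return round(gb_per_day * days * rate, 2)


# Assumed volumes: small incremental deltas versus a much heavier workload.
delta_cost = monthly_egress_cost(1)    # about 1 GB of deltas per day
heavy_cost = monthly_egress_cost(300)  # about 9 TB per month
```

&lt;p&gt;The same rate that is a rounding error for deltas becomes a real line item once daily volumes reach the hundreds of gigabytes.&lt;/p&gt;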

&lt;p&gt;On the GCP side beyond ingress, Storage Transfer Service is free for Azure-to-GCS transfers, and BigQuery load jobs are free as well since Google charges for storage and queries, not ingestion. The GCS staging bucket costs a few euros.&lt;/p&gt;

&lt;p&gt;Total for a dozen tables with incremental loads: well under €150 per month. The comparison that matters isn't against doing nothing, it's against a self-managed ETL tool on a VM, a Python script on a scheduler, or an Airbyte instance you're responsible for operating. Those trade low licensing cost for high operational burden, while managed services invert that trade-off.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Documentation Doesn't Tell You
&lt;/h2&gt;

&lt;p&gt;Looking back, a pattern emerges: the hardest problems were always at the boundaries between services. Within any single cloud service, the documentation is generally good, but at the handoffs, where Azure talks to GCP, where ADF talks to the SHIR, where BigQuery interprets what "done" means, the documentation assumes things will go smoothly.&lt;/p&gt;

&lt;p&gt;A summary of what actually bit us, roughly in order of encounter:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install Java on the SHIR VM before running any Parquet-based Copy Activity.&lt;/li&gt;
&lt;li&gt;Verify the Private DNS Zone is linked to the correct VNet before assuming connectivity works.&lt;/li&gt;
&lt;li&gt;Always use a bounded watermark window to prevent the incremental extraction race condition.&lt;/li&gt;
&lt;li&gt;URL-decode SAS tokens before storing them in Secret Manager.&lt;/li&gt;
&lt;li&gt;Open the storage account to all networks when using STS. The token is the security layer, not the firewall.&lt;/li&gt;
&lt;li&gt;Grant &lt;code&gt;legacyBucketReader&lt;/code&gt; to the STS service agent. Use the numeric project number in secret references, not the human-readable project ID.&lt;/li&gt;
&lt;li&gt;Check BigQuery's error field, not just the status.&lt;/li&gt;
&lt;li&gt;Split ADF logic across child pipelines to avoid nesting limits.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;None of these are difficult once you know them, but all of them are invisible until you hit them. The list above represents roughly two and a half weeks of cumulative debugging time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two Clouds, One Pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa57hs8v5i6qzirj8teyl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa57hs8v5i6qzirj8teyl.png" alt="Two Clouds, One Pipeline" width="800" height="287"&gt;&lt;/a&gt;&lt;br&gt;
The pipeline has been running in production for months without manual intervention. Watermarks advance automatically, new tables go live in minutes, SAS tokens rotate on schedule, dbt keeps the silver layer clean, and the configuration table is the single source of truth.&lt;/p&gt;

&lt;p&gt;The architecture isn't elegant in the way a single-cloud pipeline can be. There are five hops where a native solution might have two, there are IAM permissions to manage across two providers, and there are encoding quirks and API behaviors you have to learn once and then never forget.&lt;/p&gt;

&lt;p&gt;But it works, it's observable, it costs less per month than a team dinner, and it was built entirely from managed services the team already understood, with no new tools to learn, no new infrastructure to operate, and no new vendor relationships to manage.&lt;/p&gt;

&lt;p&gt;The hardest part wasn't the code, because there is almost no code. It was understanding what each managed service was designed to do, what it quietly assumed, and building the handoffs between them well enough that when something goes wrong, it fails loudly, not silently and slowly, weeks later, when the damage is already done.&lt;/p&gt;

&lt;p&gt;If this story has a thesis, it's this: the documentation for any individual cloud service is generally good, but the gaps are always in the spaces between services. That's where the interesting engineering happens, and it's where most of the debugging time goes. Plan for it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;That understanding is the actual deliverable. The pipeline is just what you get when you have it.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>azure</category>
      <category>googlecloud</category>
      <category>dataengineering</category>
      <category>analytics</category>
    </item>
    <item>
      <title>Which endpoints are tested? Answered, instantly</title>
      <dc:creator>Georgios Pligoropoulos</dc:creator>
      <pubDate>Fri, 20 Mar 2026 13:30:43 +0000</pubDate>
      <link>https://dev.to/agileactors/which-endpoints-are-tested-answered-instantly-3enj</link>
      <guid>https://dev.to/agileactors/which-endpoints-are-tested-answered-instantly-3enj</guid>
      <description>&lt;p&gt;They told us it was impossible. They were wrong.&lt;/p&gt;

&lt;p&gt;And they kept asking the same anxious question...&lt;br&gt;
Which endpoints are tested?&lt;br&gt;
A question that usually shows up right when you are trying to enjoy that lunch break where you promised yourself you would not open a laptop.&lt;/p&gt;

&lt;p&gt;You want this answered now. Instantly. For hundreds of scenarios.&lt;br&gt;
So you open Swagger UI.&lt;br&gt;
You stare at the endpoints.&lt;br&gt;
You map an endpoint to whatever name the autogenerated client felt like giving it.&lt;br&gt;
You search.&lt;br&gt;
Multiple versions.&lt;br&gt;
Same method names.&lt;br&gt;
Different clients.&lt;br&gt;
...Of course!&lt;br&gt;
You filter results.&lt;br&gt;
Wrong client.&lt;br&gt;
Ignore that.&lt;br&gt;
Ignore this.&lt;br&gt;
Not a scenario.&lt;br&gt;
Still not a scenario.&lt;br&gt;
You finally find the right class.&lt;br&gt;
You count invocations.&lt;br&gt;
One. Two. Maybe three.&lt;br&gt;
Was that all of them?&lt;br&gt;
Now do it again.&lt;br&gt;
Every endpoint.&lt;br&gt;
Every version.&lt;br&gt;
Every Swagger file.&lt;/p&gt;

&lt;p&gt;Somewhere around here you realize you’re not testing anymore.&lt;br&gt;
And this could end here, as a sad story of a low budget.&lt;br&gt;
But every story has a moment where everything changes. The year is 2025 and LLMs are any developer's best pals, cheaply available.&lt;/p&gt;

&lt;p&gt;Frankly speaking, you can buy an electric drill. You can take the conscious decision to not care how the electric drill is built, and only care that it does its job well enough. An LLM could code the whole thing, teach us how to use it and even write this blog post for the tool as well, if we really wanted to.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 0: Symmetry Doesn't Happen
&lt;/h2&gt;

&lt;p&gt;Before we began, we confirmed that NSwag was already used consistently across the entire project, without exception. NSwag is a library that parses the Swagger JSON and generates C# classes and methods that correspond to endpoints. This matters because, without a generated client, the same endpoint might be called in ten different ways across the codebase, and then your coverage question turns into archaeology. Who said obsession does not pay off?&lt;/p&gt;

&lt;p&gt;Symmetry in the code, a purely technical project, no business-specific context... It sounds like the perfect recipe for automation, step by step.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 1: Get Requests from Swagger
&lt;/h2&gt;

&lt;p&gt;The Swagger JSON file looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"/HealthCheck"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"get"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"HealthCheck"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Method for health checking api version 1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Accept-Language"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"in"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"header"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"responses"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"200"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"OK"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read the NSwag configuration JSON. Inside it you will find the link to the Swagger JSON.&lt;/p&gt;

&lt;p&gt;Fetch the Swagger JSON, parse it, and collect all paths along with their HTTP methods (GET, POST, PUT, etc.). In the current implementation, the key is simply Method + Path.&lt;/p&gt;

&lt;p&gt;Yes, you could go further and track different scenario variants by parameter combinations. But if your first milestone is "every request is covered at least once", that extra complexity is just glitter on a fire alarm.&lt;/p&gt;
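&lt;p&gt;The collection step described above can be sketched in a few lines. The tool itself lives in a C# codebase; this Python version is only an illustration of the idea, and the function name &lt;code&gt;collect_requests&lt;/code&gt;, the &lt;code&gt;"METHOD path"&lt;/code&gt; key format, and the trimmed sample document are assumptions.&lt;/p&gt;

```python
import json

# HTTP verbs that may appear as keys of an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def collect_requests(swagger):
    """Return the set of 'METHOD path' keys declared in a Swagger document."""
    requests = set()
    for path, path_item in swagger.get("paths", {}).items():
        for key in path_item:
            # Path items can also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                requests.add(f"{key.upper()} {path}")
    return requests


# Trimmed, hand-written document in the shape shown above.
swagger = json.loads("""
{
  "paths": {
    "/HealthCheck": {"get": {"tags": ["HealthCheck"]}},
    "/Product/productDetails/{productId}": {
      "parameters": [],
      "get": {},
      "put": {}
    }
  }
}
""")

requests = collect_requests(swagger)
```

&lt;p&gt;Filtering on known HTTP verbs keeps path-level keys such as &lt;code&gt;parameters&lt;/code&gt; from being counted as requests.&lt;/p&gt;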

&lt;h2&gt;
  
  
  Step 2: Forget about Regex and bring a magician on Board
&lt;/h2&gt;

&lt;p&gt;Today's magician is Microsoft's Code Analysis aka Roslyn:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;PackageReference&lt;/span&gt; &lt;span class="na"&gt;Include=&lt;/span&gt;&lt;span class="s"&gt;"Microsoft.CodeAnalysis"&lt;/span&gt; &lt;span class="na"&gt;Version=&lt;/span&gt;&lt;span class="s"&gt;"4.11.0"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;PackageReference&lt;/span&gt; &lt;span class="na"&gt;Include=&lt;/span&gt;&lt;span class="s"&gt;"Microsoft.CodeAnalysis.CSharp"&lt;/span&gt; &lt;span class="na"&gt;Version=&lt;/span&gt;&lt;span class="s"&gt;"4.11.0"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;PackageReference&lt;/span&gt; &lt;span class="na"&gt;Include=&lt;/span&gt;&lt;span class="s"&gt;"Microsoft.CodeAnalysis.CSharp.Workspaces"&lt;/span&gt; &lt;span class="na"&gt;Version=&lt;/span&gt;&lt;span class="s"&gt;"4.11.0"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;PackageReference&lt;/span&gt; &lt;span class="na"&gt;Include=&lt;/span&gt;&lt;span class="s"&gt;"Microsoft.CodeAnalysis.Workspaces.MSBuild"&lt;/span&gt; &lt;span class="na"&gt;Version=&lt;/span&gt;&lt;span class="s"&gt;"4.11.0"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We are building something beautiful, and beauty needs structure. Roslyn allows you to navigate the codebase properly and search with semantics.&lt;/p&gt;

&lt;p&gt;A quick &lt;code&gt;Cmd+F&lt;/code&gt; search through the generated client reveals a comment called Operation Path that matches the URL, with the method name a few lines above it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;virtual&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Threading&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;GetProductDetailsResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;ProductDetailsAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;accept_Language&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;productId&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ArgumentNullException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"productId"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;client_&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_httpClient&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;disposeClient_&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;request_&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Net&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HttpRequestMessage&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accept_Language&lt;/span&gt; &lt;span class="p"&gt;!=&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;request_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;TryAddWithoutValidation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Accept-Language"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;ConvertToString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accept_Language&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Globalization&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CultureInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InvariantCulture&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
            &lt;span class="n"&gt;request_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Method&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Net&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HttpMethod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"GET"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;request_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Accept&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Net&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MediaTypeWithQualityHeaderValue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text/plain"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;urlBuilder_&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;StringBuilder&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IsNullOrEmpty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_baseUrl&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="n"&gt;urlBuilder_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_baseUrl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="c1"&gt;// Operation Path: "Product/productDetails/{productId}"&lt;/span&gt;
            &lt;span class="n"&gt;urlBuilder_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Product/productDetails/"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So Regex and pray or Roslyn and play? Either way, now you have an automated mapping from Swagger request to NSwag-generated method name.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: The Brute Force Part
&lt;/h2&gt;

&lt;p&gt;Don't get me wrong, humans are great, but humans are not meant to do the same search 600 times.&lt;/p&gt;

&lt;p&gt;Finding where those client methods are called throughout the repository sounds like a simple &lt;code&gt;Cmd+Shift+F&lt;/code&gt; for &lt;code&gt;ProductDetailsAsync&lt;/code&gt;, right?&lt;br&gt;
Well... do you remember that time you picked the username definitely-not-taken, sure that it was unique, and it wasn't after all? This is one of those times!&lt;br&gt;
You soon realize that the method has exactly the same name across API versions, which you didn't think of. On top of that, the method is invoked inside the autogenerated code itself, and, as luck would have it, some library uses the same method name for a completely different purpose.&lt;/p&gt;

&lt;p&gt;Let code analysis scan the entire solution. Iterate every project, every C# document (.cs file), and collect invocations of any kind.&lt;br&gt;
If the string representation of an invocation matches one of the method names you collected, keep it.&lt;/p&gt;

&lt;p&gt;What you want out of each invocation, and can get thanks to Roslyn, is a couple of things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Filepath: the file in which the invocation is found&lt;/li&gt;
&lt;li&gt;Containing Class: the first class encountered when walking up the ancestors in the syntax tree&lt;/li&gt;
&lt;li&gt;Line &amp;amp; Column Number: to pinpoint the invocation exactly in the file&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And, the most useful of all, the Definition of the Method:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Filepath: the file where the method is defined, or a dummy string if it is defined outside of the project&lt;/li&gt;
&lt;li&gt;Definition Class: the class that defines the method that was invoked&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Voila! With the dictionary &lt;code&gt;method name -&amp;gt; all the invocation info&lt;/code&gt; you can now start filtering, filtering, filtering to ensure that only the ones involved in the scenarios of the suite are included.&lt;/p&gt;

&lt;p&gt;In other words:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep only invocations that belong to the current NSwag client (and the correct API version).&lt;/li&gt;
&lt;li&gt;Exclude invocations inside NSwag-generated code.&lt;/li&gt;
&lt;li&gt;Exclude calls from places unrelated to scenarios, so the numbers reflect real test coverage.&lt;/li&gt;
&lt;/ul&gt;
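
&lt;p&gt;The three filters above can be sketched in a few lines. This is a minimal Python illustration, not the Roslyn-based tool itself: the record fields, the client class name &lt;code&gt;ShopClientV2&lt;/code&gt;, the &lt;code&gt;.g.cs&lt;/code&gt; marker for generated files and the scenario folder are all assumptions made up for the example.&lt;/p&gt;

```python
from collections import defaultdict


def filter_invocations(invocations, client_class, generated_marker, scenario_dirs):
    """Group scenario-relevant invocations by method name."""
    kept = defaultdict(list)
    for inv in invocations:
        # Keep only calls whose definition belongs to the current client (and version).
        if inv["definition_class"] != client_class:
            continue
        # Drop calls made from inside the generated client code itself.
        if generated_marker in inv["file_path"]:
            continue
        # Drop calls from files outside the scenario folders.
        if not any(inv["file_path"].startswith(d) for d in scenario_dirs):
            continue
        kept[inv["method"]].append(inv)
    return dict(kept)


# Hypothetical records; in the real tool they would be produced by the code analysis.
invocations = [
    {"method": "ProductDetailsAsync", "definition_class": "ShopClientV2",
     "file_path": "tests/Scenarios/ProductSteps.cs", "line": 42},
    {"method": "ProductDetailsAsync", "definition_class": "ShopClientV1",
     "file_path": "tests/Scenarios/LegacySteps.cs", "line": 10},
    {"method": "ProductDetailsAsync", "definition_class": "ShopClientV2",
     "file_path": "src/Generated/ShopClientV2.g.cs", "line": 900},
    {"method": "ProductDetailsAsync", "definition_class": "ThirdParty.Catalog",
     "file_path": "tests/Scenarios/OtherSteps.cs", "line": 7},
]

result = filter_invocations(invocations, "ShopClientV2", ".g.cs", ["tests/Scenarios/"])
print(result)
```

&lt;p&gt;Only the first record survives: the second belongs to another API version, the third lives in generated code, and the fourth is a different method that merely shares the name.&lt;/p&gt;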

&lt;h2&gt;
  
  
  Step 4: Show it to the World
&lt;/h2&gt;

&lt;p&gt;As the CPU fan slows down, count, export to CSV and, if you feel like showing off, plot the statistics as a bar chart.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Request&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GET /MeinePost&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GET /order/parcelStamp/size&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GET /order/parcelStamp/config&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;POST /AddressValidation&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
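
&lt;p&gt;Counting and exporting can be sketched as follows. This is a minimal Python illustration that assumes the filtered invocations are already grouped per request. The function names are hypothetical, and the sample data is shaped so that the counts match the table above.&lt;/p&gt;

```python
import csv
import io


def coverage_rows(all_requests, invocations_by_request):
    """Return (request, count) rows sorted by count, keeping zero-count requests."""
    rows = [(req, len(invocations_by_request.get(req, []))) for req in all_requests]
    return sorted(rows, key=lambda row: (row[1], row[0]))


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Request", "Count"])
    writer.writerows(rows)
    return buffer.getvalue()


# Every request from the Swagger document, including those never invoked.
all_requests = [
    "GET /MeinePost",
    "GET /order/parcelStamp/size",
    "GET /order/parcelStamp/config",
    "POST /AddressValidation",
]
# Placeholder invocation lists; only their lengths matter here.
invocations_by_request = {
    "GET /order/parcelStamp/size": ["call"] * 1,
    "GET /order/parcelStamp/config": ["call"] * 3,
    "POST /AddressValidation": ["call"] * 7,
}

rows = coverage_rows(all_requests, invocations_by_request)
csv_text = to_csv(rows)
uncovered = [req for req, count in rows if count == 0]
print(csv_text)
```

&lt;p&gt;Starting from the full list of requests, rather than from the invocations, is what keeps the zero-count rows in the report.&lt;/p&gt;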

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgn56a2z52mf5azdckz47.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgn56a2z52mf5azdckz47.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Step: Remove the blindfold
&lt;/h2&gt;

&lt;p&gt;The call to action becomes obvious. If a request has a zero count, it is not involved in any scenario at all.&lt;/p&gt;

&lt;p&gt;From experience, covering each request at least once is the first meaningful milestone. Once you hit that milestone, the conversation can become creative and interesting: deeper scenario variants, data combinations, edge cases, and all the fun stuff.&lt;/p&gt;

&lt;h2&gt;
  
  
  Eagle's Eye View
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fivctrgik09hnzqsupll1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fivctrgik09hnzqsupll1.png" alt=" " width="800" height="796"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  You can, but should you?
&lt;/h2&gt;

&lt;p&gt;The coding of the project was faster than the writing of this blog post. Think about it. Efficiency was at its peak. But when speed increases, something else must give, and it’s neither computing power nor electricity.&lt;/p&gt;

&lt;p&gt;You used the drill, but you did not learn how to build a drill, did you?&lt;br&gt;
Sure, you learned how prompt engineering can construct the entire project, but merely understanding what you see does not mean that you actually learned how to do it.&lt;br&gt;
Learning requires what the education industry now calls productive struggle, and there is a great TED talk explaining it, if you want to know more: &lt;a href="https://www.youtube.com/watch?v=YBH8rQv4aTQ" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=YBH8rQv4aTQ&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You will not believe how much I hesitate to write a suggestion here, as the temptation not to follow it myself is real, but here it is: give the LLM work that you already know how to do yourself and that is simply boring and slow to do on your own. Just don't let the LLM think on your behalf.&lt;/p&gt;

&lt;p&gt;There's no going back. Choose wisely.&lt;/p&gt;

&lt;p&gt;"Which endpoints are tested?" Answered instantly.&lt;/p&gt;

&lt;p&gt;"Why do they matter?" That meeting is still on Monday.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;At Agile Actors, we thrive on challenges with a bold and adventurous spirit. We confront problems directly, using cutting-edge technologies in the most innovative and daring ways. If you’re excited to join a dynamic learning organization where knowledge flows freely and skills are refined to excellence, consider joining our exceptional team. Let’s conquer new frontiers together. Check out our &lt;a href="https://apply.workable.com/agileactors/" rel="noopener noreferrer"&gt;openings&lt;/a&gt; and choose the Agile Actors Experience!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>automation</category>
      <category>productivity</category>
      <category>testing</category>
    </item>
    <item>
      <title>Building Intelligent, Metadata-Driven Pipelines with Azure Data Factory</title>
      <dc:creator>Sotiria Vernikou</dc:creator>
      <pubDate>Tue, 18 Nov 2025 12:35:43 +0000</pubDate>
      <link>https://dev.to/agileactors/building-intelligent-metadata-driven-pipelines-with-azure-data-factory-4ebf</link>
      <guid>https://dev.to/agileactors/building-intelligent-metadata-driven-pipelines-with-azure-data-factory-4ebf</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In today’s data-driven landscape, organizations are increasingly relying on automated, scalable, and intelligent data pipelines to streamline their analytics workflows. Among the many tools available, Azure Data Factory (ADF) stands out as a powerful orchestrator for building robust ETL processes. But when paired with metadata-driven design and integrated with services like Logic Apps, SharePoint, and Azure SQL Pools, ADF transforms from a simple data mover into a dynamic engine capable of handling complex ingestion scenarios with precision and resilience.&lt;/p&gt;

&lt;p&gt;This article explores how to master metadata-driven pipelines in Azure Data Factory, using a real-world scenario where Excel files are ingested from a dedicated SharePoint folder into a SQL pool. The workflow is designed to be intelligent and fault-tolerant: it archives successfully ingested files, flags and reroutes erroneous data, and sends automated alerts when failures occur. At the heart of this system lies a metadata-driven approach that allows the pipeline to adapt dynamically to different file structures and destinations—without hardcoding logic for each case.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv0hpo0h8zfaojfgf70o6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv0hpo0h8zfaojfgf70o6.png" alt=" " width="800" height="299"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The process begins with a Logic App scanning SharePoint, which acts as the entry point to the workflow. When a scan finds a new Excel file in the designated folder, the Logic App springs into action. It not only initiates the pipeline but also extracts critical metadata from the file name (such as sheet identifiers and target table mappings), using predefined rules stored in a SQL pool. This metadata is essential for guiding the ingestion process and ensuring that each file is routed correctly.&lt;/p&gt;

&lt;p&gt;Once the metadata is retrieved, the Logic App coordinates the movement of the file to a Storage Account, leveraging connectors that ensure secure and efficient data transfer. From there, Azure Data Factory takes over as the ingestion engine. It reads the metadata to determine which sheet to process and which SQL table to target. Using its powerful Copy Data activity, ADF performs upserts and deduplication, ensuring that only clean, unique records make it into the SQL pool.&lt;/p&gt;

&lt;p&gt;But what happens when things go wrong? Whether it’s a malformed file, missing metadata, or invalid data types, the system is designed to respond gracefully. ADF returns detailed error messages to the Logic App, which then triggers an automated email alert to notify stakeholders of the issue. Simultaneously, the problematic file is moved to a dedicated error folder for further inspection, preserving the integrity of the pipeline and preventing bad data from contaminating the SQL pool.&lt;/p&gt;

&lt;p&gt;After successful ingestion, the Logic App completes the cycle by archiving the processed files, ensuring that the SharePoint folder remains clean and ready for new uploads. This not only improves operational hygiene but also provides a historical trail for auditing and compliance purposes.&lt;/p&gt;

&lt;p&gt;By combining the strengths of Azure Data Factory, Logic Apps, SharePoint, and SQL pools, this architecture exemplifies how metadata-driven design can elevate traditional ETL workflows into intelligent, self-adjusting systems. Whether you're a data engineer looking to optimize your pipelines or an architect designing scalable solutions, mastering this approach will empower you to build resilient, maintainable, and future-proof data workflows in the Azure ecosystem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Power Behind the Pipeline: A Synergistic Use of Azure Tools
&lt;/h2&gt;

&lt;p&gt;Behind every seamless data pipeline lies a thoughtful orchestration of technologies, each chosen not just for its capabilities, but for how well it integrates into the broader architecture. In our case, the pipeline is more than a sum of its parts—it’s a carefully choreographed dance between automation, intelligence, and resilience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔗 SharePoint&lt;/strong&gt;&lt;br&gt;
We begin with SharePoint, not just because it's widely adopted, but because it offers a user-friendly interface for business users to drop files without needing to understand the backend. It acts as the gateway—simple, accessible, and secure—where data enters the system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⚙️ Logic Apps&lt;/strong&gt;&lt;br&gt;
Logic Apps are the unsung heroes of this architecture. They don’t just automate—they orchestrate. Like a conductor guiding an orchestra, Logic Apps ensure that each service plays its part at the right time. From detecting new files to coordinating metadata queries and triggering ingestion, they bring harmony to what could otherwise be a chaotic process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📦 Azure Storage Account&lt;/strong&gt;&lt;br&gt;
Rather than ingesting directly from SharePoint, we use Azure Storage as a buffer zone. This design choice is strategic—it decouples the source from the ingestion engine, allowing for better control, scalability, and error handling. It’s the staging ground where data is prepped before entering the SQL pool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🚀 Azure Data Factory&lt;/strong&gt;&lt;br&gt;
Azure Data Factory is where the heavy lifting happens. But it’s not just a brute-force tool—it’s intelligent. Guided by metadata, it adapts to different file structures, performs upserts, and ensures deduplication. It’s the engine room of the pipeline, transforming raw input into structured, usable data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧠 SQL Pool&lt;/strong&gt;&lt;br&gt;
The SQL pool serves a dual purpose. It’s the brain, holding metadata that guides the pipeline’s decisions, and it’s the vault, storing the final, cleaned data. This duality makes it central to the pipeline’s adaptability and long-term value.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📧 Office 365&lt;/strong&gt;&lt;br&gt;
Finally, Office 365 steps in as the messenger. When things go wrong—or right—it ensures that the right people know. Through automated emails, it closes the feedback loop, turning a technical process into a transparent experience for stakeholders.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the Metadata-Driven Pipeline: A Step-by-Step Breakdown
&lt;/h2&gt;

&lt;p&gt;To implement a resilient and metadata-driven ingestion pipeline in Azure, we orchestrate a combination of &lt;strong&gt;SharePoint&lt;/strong&gt;, &lt;strong&gt;Logic Apps&lt;/strong&gt;, &lt;strong&gt;Azure Data Factory&lt;/strong&gt;, and &lt;strong&gt;SQL Pools&lt;/strong&gt;. This section walks through each component and its role in the end-to-end process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fogo0fi851xqd9cm6nsbr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fogo0fi851xqd9cm6nsbr.png" alt=" " width="800" height="606"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. File Upload and Triggering the Workflow&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The journey begins when a user uploads an Excel (.xls) file to a dedicated SharePoint folder. This folder acts as the monitored entry point for the ingestion pipeline.&lt;/p&gt;

&lt;p&gt;A Logic App is configured to run on a daily schedule, scanning the folder for new files. This trigger ensures that the workflow is initiated automatically without manual intervention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Metadata Extraction and Workflow Initialization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once a new file is detected, the Logic App:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extracts metadata from the file name, such as sheet identifiers and target table names.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Queries the SQL pool to retrieve additional metadata, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Expected sheet number&lt;/li&gt;
&lt;li&gt;Target table schema&lt;/li&gt;
&lt;li&gt;Validation rules&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;This metadata-driven approach allows the pipeline to dynamically adapt to different file structures and destinations, reducing the need for hardcoded logic.&lt;/p&gt;
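
&lt;p&gt;To make the idea concrete, here is a minimal Python sketch of the lookup. It assumes a hypothetical naming convention in which the part of the file name before the first underscore selects the rule. The rules dictionary stands in for the metadata table in the SQL pool, and all table and column names are invented for illustration.&lt;/p&gt;

```python
import os

# Hypothetical rules; in the article this metadata lives in the SQL pool.
METADATA_RULES = {
    "sales": {"sheet_index": 1, "target_table": "stg.Sales",
              "required_columns": ["OrderId", "Amount"]},
    "stock": {"sheet_index": 2, "target_table": "stg.Stock",
              "required_columns": ["Sku", "Quantity"]},
}


def resolve_metadata(file_name, rules):
    """Derive ingestion metadata from a file name such as 'Sales_20251118.xls'."""
    stem, extension = os.path.splitext(file_name)
    # Assumed convention: the prefix before the first underscore selects the rule.
    prefix = stem.split("_")[0].lower()
    if prefix not in rules:
        raise KeyError(f"No metadata rule for file prefix '{prefix}'")
    metadata = dict(rules[prefix])
    metadata["file_name"] = file_name
    metadata["file_type"] = extension.lstrip(".").lower()
    return metadata


metadata = resolve_metadata("Sales_20251118.xls", METADATA_RULES)
print(metadata)
```

&lt;p&gt;Supporting a new file type then means adding one entry to the rules, with no change to the code. That is the property the metadata-driven design is after.&lt;/p&gt;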

&lt;p&gt;&lt;strong&gt;3. Moving the File to Azure Storage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Logic App then moves the file from SharePoint to a Storage Account, using the Storage Account connector. This step decouples the ingestion process from SharePoint and prepares the file for processing by Azure Data Factory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Data Ingestion via Azure Data Factory&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Azure Data Factory (ADF) is the core engine responsible for ingesting the data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It reads the metadata from the SQL pool to determine the correct sheet and target table.&lt;/li&gt;
&lt;li&gt;Using the Copy Data activity, ADF ingests the data from the Storage Account into the SQL pool.&lt;/li&gt;
&lt;li&gt;The pipeline performs upserts and deduplication, ensuring data integrity and avoiding duplicates.&lt;/li&gt;
&lt;/ul&gt;
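
&lt;p&gt;The upsert and deduplication semantics can be illustrated in plain Python. This sketch shows what the step is meant to achieve, not how Azure Data Factory implements it. The key column &lt;code&gt;OrderId&lt;/code&gt; and the in-memory dictionary standing in for the target table are assumptions.&lt;/p&gt;

```python
def deduplicate(rows, key):
    """Keep the last occurrence of each key value."""
    latest = {}
    for row in rows:
        latest[row[key]] = row
    return list(latest.values())


def upsert(target, rows, key):
    """Update rows whose key already exists in target and insert the rest."""
    inserted, updated = 0, 0
    for row in deduplicate(rows, key):
        if row[key] in target:
            updated += 1
        else:
            inserted += 1
        target[row[key]] = row
    return inserted, updated


# The target table is modelled as a dictionary keyed by the business key.
target = {1: {"OrderId": 1, "Amount": 10}}
incoming = [
    {"OrderId": 1, "Amount": 15},
    {"OrderId": 2, "Amount": 20},
    {"OrderId": 2, "Amount": 25},  # duplicate key within the same file
]

inserted, updated = upsert(target, incoming, "OrderId")
print(inserted, updated, target)
```

&lt;p&gt;Deduplicating before the upsert matters: without it, a key that appears twice in one file would be counted as both an insert and an update.&lt;/p&gt;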

&lt;p&gt;If the data fails validation (e.g., wrong format, missing fields), ADF returns an error to the Logic App.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Error Handling and Notifications&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Upon receiving an error from ADF, the Logic App:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sends an automated email to the relevant stakeholders via Office 365, detailing the failure and its cause.&lt;/li&gt;
&lt;li&gt;Moves the problematic file to a dedicated error folder in SharePoint for further inspection.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures that bad data is quarantined and does not contaminate the SQL pool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Archiving Successfully Ingested Files&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For files that are successfully ingested:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Logic App moves them to an archive folder in SharePoint.&lt;/li&gt;
&lt;li&gt;This keeps the working folder clean and provides a historical trail for auditing and compliance.&lt;/li&gt;
&lt;/ul&gt;
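
&lt;p&gt;Steps 5 and 6 boil down to one branch on the pipeline outcome. The following Python sketch shows that decision only. The folder names, the shape of the result dictionary and the status value &lt;code&gt;Succeeded&lt;/code&gt; are assumptions for illustration, and the actual moving and emailing is left to the Logic App connectors.&lt;/p&gt;

```python
def route_after_ingestion(file_name, pipeline_result):
    """Decide where a file goes and whether to alert, based on the pipeline outcome."""
    if pipeline_result["status"] == "Succeeded":
        return {"move_to": f"Archive/{file_name}", "send_alert": False, "message": ""}
    error = pipeline_result.get("error", "unknown error")
    message = f"Ingestion of {file_name} failed: {error}"
    return {"move_to": f"Error/{file_name}", "send_alert": True, "message": message}


ok = route_after_ingestion("Sales_20251118.xls", {"status": "Succeeded"})
failed = route_after_ingestion(
    "Stock_20251118.xls",
    {"status": "Failed", "error": "column Quantity is not numeric"},
)
print(ok)
print(failed)
```

&lt;p&gt;Keeping the decision in one place makes it easy to guarantee that every file ends up in exactly one of the two folders.&lt;/p&gt;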

&lt;p&gt;&lt;strong&gt;7. Monitoring and Feedback Loop&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Finally, the Logic App queries the pipeline status from Azure Data Factory and includes this information in the notification email. This feedback loop ensures transparency and allows users to track the success or failure of each ingestion run.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: Why Metadata-Driven Pipelines Matter
&lt;/h2&gt;

&lt;p&gt;By leveraging metadata stored in SQL pools and orchestrating services like Logic Apps and Azure Data Factory, this architecture achieves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scalability:&lt;/strong&gt; Easily handles new file types and destinations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resilience:&lt;/strong&gt; Automatically detects and handles errors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maintainability:&lt;/strong&gt; Reduces hardcoded logic and manual intervention.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transparency:&lt;/strong&gt; Keeps stakeholders informed through automated notifications.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach is ideal for organizations looking to build intelligent, automated, and future-proof data pipelines in Azure.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>azure</category>
      <category>logicapps</category>
    </item>
    <item>
      <title>A Complete Guide to Building Enterprise-Grade AI Assistants on Google Cloud (No-Code)</title>
      <dc:creator>Valia Vlachopoulou</dc:creator>
      <pubDate>Wed, 15 Oct 2025 09:40:10 +0000</pubDate>
      <link>https://dev.to/agileactors/a-complete-guide-to-building-enterprise-grade-ai-assistants-on-google-cloud-no-code-29ha</link>
      <guid>https://dev.to/agileactors/a-complete-guide-to-building-enterprise-grade-ai-assistants-on-google-cloud-no-code-29ha</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Enterprises are under pressure to deliver AI solutions quickly, but the demand for talent and the complexity of integrations often slow progress. This has led to the rise of low-code platforms, which empower teams to design and deploy applications visually, reduce development time, and connect seamlessly to existing systems.&lt;/p&gt;

&lt;p&gt;Google Cloud is aligning closely with this shift. Its &lt;strong&gt;AI Applications&lt;/strong&gt; provide a low-code environment for building AI systems and &lt;strong&gt;Conversational Agents&lt;/strong&gt; that can ground responses in enterprise data and take real actions through APIs. The platform offers data stores for uploading documents, pre-built connectors for popular enterprise tools (like Jira, ServiceNow, and SharePoint), and OpenAPI support for integrating custom backends, all inside a single ecosystem. This integration enables organizations to build agentic AI systems that are fast to deploy, secure, and governed, within a low-code environment embedded into daily workflows. In practice, that means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Agents that can reason and act, grounded in enterprise data sources like PDFs, CRMs, ticketing, or HR systems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A single cohesive ecosystem rather than a patchwork of disconnected tools.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Built-in security, scalability, and logging across the stack, because everything runs within Google Cloud.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this article, I’ll walk you through building a &lt;strong&gt;three-agent system&lt;/strong&gt; using Google Cloud’s &lt;strong&gt;no-code tooling&lt;/strong&gt;, connected to &lt;strong&gt;real PDFs&lt;/strong&gt; and &lt;strong&gt;a ticket API&lt;/strong&gt;, and exposed through &lt;strong&gt;Slack&lt;/strong&gt; with &lt;strong&gt;Cloud Logging as the observability layer&lt;/strong&gt;. You’ll see how quickly you can go from a blank project to a fully functional, grounded enterprise chatbot team, all inside the same cloud ecosystem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Agentic System
&lt;/h2&gt;

&lt;p&gt;Before we start building, let’s understand the architecture of the agentic system we’ll implement. The setup simulates a small enterprise IT helpdesk built with Google Cloud’s Conversational Agents, featuring one Supervisor Agent and two Specialized Agents, each connected to its own data source and responsible for distinct tasks.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;PDF Retriever Agent&lt;/strong&gt; handles policy-related questions by retrieving grounded information from two key documents: the VPN Policy Template and the Database Credentials Standard (SANS, April 2025). These files are stored in a &lt;strong&gt;Data Store tool&lt;/strong&gt;, which indexes the PDFs so the agent can extract relevant policy sections and summarize them into clear, contextual answers.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;API Caller Agent&lt;/strong&gt; manages ticket-related operations using an &lt;strong&gt;OpenAPI tool&lt;/strong&gt; connected to a mock ticketing API implemented in Google Cloud Functions. The API exposes simple endpoints to create and check support tickets, allowing the agent to simulate realistic IT helpdesk interactions during the conversation.&lt;/p&gt;
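
&lt;p&gt;For intuition, the behaviour behind such a mock API can be as small as the following in-memory Python sketch. It is not the Cloud Functions code used in this article: the class name, the ticket ID format and the field names are assumptions, and the HTTP layer is left out entirely.&lt;/p&gt;

```python
import itertools


class TicketStore:
    """In-memory stand-in for a mock ticketing backend."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._tickets = {}

    def create_ticket(self, title, description):
        """Create a ticket and return its stored record."""
        ticket_id = f"TCK-{next(self._ids):04d}"
        ticket = {"id": ticket_id, "title": title,
                  "description": description, "status": "open"}
        self._tickets[ticket_id] = ticket
        return ticket

    def get_ticket(self, ticket_id):
        """Return the ticket record, or None when the ID is unknown."""
        return self._tickets.get(ticket_id)


store = TicketStore()
created = store.create_ticket("VPN not connecting", "Client times out after login.")
fetched = store.get_ticket(created["id"])
missing = store.get_ticket("TCK-9999")
print(created, fetched, missing)
```

&lt;p&gt;The two methods correspond to the create and check operations that the agent reaches through the OpenAPI tool.&lt;/p&gt;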

&lt;p&gt;At the center of this workflow is the &lt;strong&gt;Supervisor Agent&lt;/strong&gt;, the brain of the system that &lt;strong&gt;interprets user intent&lt;/strong&gt; and &lt;strong&gt;delegates each request to the correct specialized agent&lt;/strong&gt;. When a user asks a question or submits a request, the Supervisor routes it either to the PDF Retriever (for policy guidance) or the API Caller (for ticket operations). Each worker performs its task and responds directly to the user, after which the Supervisor automatically regains control to confirm completion and offer further help.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnykkexet9ui99hxit6fc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnykkexet9ui99hxit6fc.png" alt="System Flow"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's build the Agent!&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started: Set Up Your Google Cloud Project
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Create a Google Cloud Platform Account
&lt;/h3&gt;

&lt;p&gt;Before creating your AI application, you’ll need to set up a new Google Cloud environment.&lt;br&gt;
If you don’t already have one, go to &lt;a href="https://accounts.google.com/InteractiveLogin/signinchooser?continue=https%3A%2F%2Fconsole.cloud.google.com%2F&amp;amp;followup=https%3A%2F%2Fconsole.cloud.google.com%2F&amp;amp;osid=1&amp;amp;passive=1209600&amp;amp;service=cloudconsole&amp;amp;ifkv=ARZ0qKJ5dLT16AWdxJo6Db6DnQkbTsMOUnJyOvFVepGeD4DqeEEZ9yFt0HZRBYZWSc4Hf3WVtmgKiw&amp;amp;theme=mn&amp;amp;ddm=0&amp;amp;flowName=GlifWebSignIn&amp;amp;flowEntry=ServiceLogin" rel="noopener noreferrer"&gt;Google Cloud Console&lt;/a&gt;&lt;br&gt;
and sign in with your Google account.&lt;/p&gt;
&lt;h3&gt;
  
  
  Create a New Project
&lt;/h3&gt;

&lt;p&gt;In the top navigation bar, click the project dropdown → “&lt;em&gt;New Project&lt;/em&gt;”.&lt;br&gt;
Give your project a descriptive name and select your billing account (if prompted).&lt;br&gt;
Choose an organization or leave it under “&lt;em&gt;No organization&lt;/em&gt;” if you’re testing.&lt;/p&gt;

&lt;p&gt;Click &lt;strong&gt;Create&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foajdblbxsf5drkd5xb79.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foajdblbxsf5drkd5xb79.png" alt="Google Cloud Project"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Enable APIs and Integrations
&lt;/h3&gt;

&lt;p&gt;After creating your project, the next step is to enable the necessary APIs that power your Conversational Agents and integrations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.&lt;/strong&gt; &lt;strong&gt;Enable the AI Applications API&lt;/strong&gt;&lt;br&gt;
    In the Google Cloud Console&lt;/p&gt;

&lt;p&gt;→ Use the search bar at the top&lt;/p&gt;

&lt;p&gt;→ Type “&lt;em&gt;AI Applications&lt;/em&gt;”.&lt;br&gt;
    &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiuaz0pov9e1fbnr3lfsc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiuaz0pov9e1fbnr3lfsc.png" alt="AI applications"&gt;&lt;/a&gt;&lt;br&gt;
    Select AI Applications API from the results and click "&lt;em&gt;Enable&lt;/em&gt;" to activate it for your project.&lt;br&gt;
    &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdtxvle65bh1u3bdj1zle.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdtxvle65bh1u3bdj1zle.png" alt="AI applications API"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.&lt;/strong&gt; &lt;strong&gt;Enable the Dialogflow API&lt;/strong&gt;&lt;br&gt;
    Go back to the API Library.&lt;br&gt;
    → Search for "&lt;em&gt;Dialogflow API&lt;/em&gt;"&lt;br&gt;
    → Click "&lt;em&gt;Enable&lt;/em&gt;".&lt;br&gt;
    &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx78lzd1dx7nlx2wwmb9a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx78lzd1dx7nlx2wwmb9a.png" alt="Dialogflow API"&gt;&lt;/a&gt;&lt;br&gt;
    Dialogflow is required for integrating your conversational agents with chat platforms (e.g. Slack, Google Chat).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3.&lt;/strong&gt; &lt;strong&gt;Set Up Slack Integration (Optional)&lt;/strong&gt;&lt;br&gt;
    If you intend to make your agent accessible directly within Slack, you can configure the integration as an optional step.&lt;br&gt;
Before proceeding, ensure you have:&lt;br&gt;
    - A Slack account&lt;br&gt;
    - Access to a Slack workspace&lt;/p&gt;
&lt;h3&gt;
  
  
  Agent Architecture
&lt;/h3&gt;

&lt;p&gt;In Google Cloud’s Conversational Agents, playbooks come in two flavors: &lt;strong&gt;routine playbooks&lt;/strong&gt; and &lt;strong&gt;task playbooks&lt;/strong&gt;.&lt;br&gt;
A &lt;strong&gt;routine playbook&lt;/strong&gt; manages the overall flow of a conversation, while a &lt;strong&gt;task playbook&lt;/strong&gt; performs a specific, well-defined function before handing control back.&lt;br&gt;
In our system, we’ll combine both — a routine playbook to coordinate the conversation and task playbooks to handle specialized actions.&lt;/p&gt;

&lt;p&gt;This modular approach keeps the design clean, scalable, and easy to maintain — each agent focuses on its own responsibility while working together as one system.&lt;/p&gt;

&lt;p&gt;We’ll build the agentic system in three layers:&lt;br&gt;
&lt;strong&gt;Tools&lt;/strong&gt; → Data Store (for PDFs) and OpenAPI (for the Ticket API)&lt;br&gt;
&lt;strong&gt;Task Playbooks&lt;/strong&gt; → PDF Retriever and API Caller Agent&lt;br&gt;
&lt;strong&gt;Routine Playbook&lt;/strong&gt; → Supervisor Agent that coordinates everything&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfw39rojxgnoszybha2f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfw39rojxgnoszybha2f.png" alt="Agentic System Building"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Let’s Create the Agentic System!
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Create a New Conversational Agent
&lt;/h3&gt;

&lt;p&gt;In the Google Cloud Console, go to &lt;em&gt;AI Applications&lt;/em&gt; → &lt;em&gt;Conversational Agents&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;→ Click &lt;strong&gt;Create an Agent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;→ Choose &lt;strong&gt;Build your own&lt;/strong&gt; to start from scratch.&lt;/p&gt;

&lt;p&gt;→ Give your agent a clear name (e.g., IT Assistant), pick your preferred location, set the correct time zone, and choose your default language.&lt;/p&gt;

&lt;p&gt;→ Finally, under Agent type, select &lt;strong&gt;Start with Playbook&lt;/strong&gt;.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5l4suouiypx763mmr55u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5l4suouiypx763mmr55u.png" alt="Create New Agent"&gt;&lt;/a&gt;&lt;br&gt;
Once the agent is created, you’ll be redirected to the &lt;em&gt;Default Generative Playbook&lt;/em&gt; page — this is your routine playbook, which will become the Supervisor Agent in our system.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9m3l75xwsurdhvp4349p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9m3l75xwsurdhvp4349p.png" alt="Default Generative Playbook"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For now, we’ll pause here. Click the ← arrow in the top-left corner to return to the main agent view.&lt;br&gt;
The Supervisor Agent should be created last — after we first build the tools and the task playbooks (PDF Retriever and API Caller) that depend on them.&lt;/p&gt;
&lt;h3&gt;
  
  
  Setting Up the Tools
&lt;/h3&gt;

&lt;p&gt;In this system, we’ll connect two tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;Data Store&lt;/strong&gt; to index and retrieve information from PDF policy documents.&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;OpenAPI&lt;/strong&gt; tool to handle ticket-related operations such as creating and checking IT tickets.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Data Store&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In the left sidebar, select &lt;strong&gt;Tools&lt;/strong&gt;, click Create, and fill in the fields as follows:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool Name&lt;/strong&gt;: ITPolicyDocs&lt;br&gt;
&lt;strong&gt;Type&lt;/strong&gt;: Data Store&lt;br&gt;
&lt;strong&gt;Description&lt;/strong&gt;: Searches the organization’s IT policy PDFs (e.g., Database Credentials Standard, VPN Policy Template) to provide grounded answers to user questions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy1wt8z9fwuzeyysrdxid.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy1wt8z9fwuzeyysrdxid.gif" alt="last data store"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, it’s time to create the Data Store by indexing and ingesting the policy documents. From the Tools page, select &lt;em&gt;Cloud Storage&lt;/em&gt; (unstructured data) since the source materials are PDFs stored in a &lt;a href="https://cloud.google.com/storage/docs/creating-buckets" rel="noopener noreferrer"&gt;Bucket&lt;/a&gt;. Open the &lt;strong&gt;Advanced options&lt;/strong&gt; to gain finer control over the ingestion and indexing process.&lt;/p&gt;

&lt;p&gt;Import your documents from Cloud Storage and move to the configuration screen. Set the name of your Data Store — for example, policies_store_1 — and apply the following recommended settings based on &lt;a href="https://cloud.google.com/generative-ai-app-builder/docs/parse-chunk-documents?_gl=1*1cn7wbq*_ga*NjQyODkyMDMuMTcyNjIyMzQzOA..*_ga_WH2QY8WWF5*czE3NjAwNDU2NjAkbzU5JGcxJHQxNzYwMDQ1NjgxJGozOSRsMCRoMA.." rel="noopener noreferrer"&gt;Vertex AI Search guides&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.&lt;/strong&gt; Parser → &lt;strong&gt;Layout parser&lt;/strong&gt;: Best suited for PDFs and DOCX files, this parser maintains the original document layout and hierarchy, which helps the model retrieve information more accurately in retrieval-augmented generation (RAG) workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.&lt;/strong&gt; &lt;strong&gt;Document Chunking&lt;/strong&gt; → Keep the default chunk size of 500 tokens, which suits the moderate section length of the policy documents and preserves context without fragmenting the content. Enable “&lt;em&gt;Include ancestor headings in chunks&lt;/em&gt;” so that each chunk retains its section headers and stays grounded even when it comes from the middle of a document.&lt;/p&gt;
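
&lt;p&gt;As a rough mental model of what this setting does, the Python sketch below splits sections into fixed-size chunks and prefixes each chunk with its heading path. It is not how Vertex AI Search implements chunking: the function name &lt;em&gt;chunk_with_headings&lt;/em&gt; is hypothetical, and whitespace-separated words stand in for tokens.&lt;/p&gt;

```python
def chunk_with_headings(sections, chunk_size=500):
    """Split each section into chunks and prefix the ancestor headings.

    sections: list of (heading_path, text) pairs, where heading_path is a
    list such as ["VPN Policy", "4. Policy"]. Words stand in for tokens
    (an assumption made to keep the sketch dependency-free).
    """
    chunks = []
    for heading_path, text in sections:
        words = text.split()
        prefix = " > ".join(heading_path)
        for start in range(0, len(words), chunk_size):
            body = " ".join(words[start:start + chunk_size])
            # Every chunk carries its headings, even mid-section ones.
            chunks.append(f"{prefix}\n{body}")
    return chunks
```

&lt;p&gt;Because every chunk starts with its heading path, a passage retrieved from the middle of a long section still tells the model which policy and section it belongs to.&lt;/p&gt;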

&lt;p&gt;Once indexing begins, return to your tool configuration and select the newly created Data Store. Under Tool Settings, click “&lt;em&gt;Customize&lt;/em&gt;”  to adjust the grounding parameters. In the Grounding section, set the Lowest score allowed (grounding threshold) to &lt;strong&gt;Medium&lt;/strong&gt; — this ensures that only sources with moderate-to-high confidence are used, improving reliability while avoiding overly strict filtering.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3k8xm74mqxah9hs1xoj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3k8xm74mqxah9hs1xoj.png" alt="Configuration"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Other settings such as the &lt;em&gt;Rewriter&lt;/em&gt; and &lt;em&gt;Summarization model&lt;/em&gt; (here using gemini-2.0-flash-001) can remain at their default values, as they already provide concise, high-quality summarizations of retrieved content. This configuration ensures your agent gives grounded, trustworthy answers directly from your IT policy PDFs. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAPI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In the left sidebar, go to Tools → Create, then fill in the fields:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool Name&lt;/strong&gt;: Ticket API&lt;br&gt;
&lt;strong&gt;Type&lt;/strong&gt;: OpenAPI&lt;br&gt;
&lt;strong&gt;Description&lt;/strong&gt;: Use createTicket to open a new IT request (required fields: summary, description, priority, requester). Use getTicket to check the status of an existing ticket by ID (required field: id).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffems2ki9oyii933sog1t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffems2ki9oyii933sog1t.png" alt="OpenAPI Tool"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For demo purposes, this tool connects to a mock ticketing service I implemented with Google Cloud Functions and deployed via Cloud Run. This lightweight setup simulates a simple helpdesk system, allowing agents to create tickets and check their status as if they were interacting with a real backend. The function is exposed as a REST API through Cloud Run and defined using an OpenAPI YAML specification, making it easy to integrate directly into Google Cloud’s Conversational Agents as a tool. It doesn’t persist to a database; instead, it stores tickets in memory to mimic realistic interactions. When a ticket is created, the API returns a generated ID (for example, IT-3B7A12) with status "Open". A status check returns the ticket ID, summary, description, and current status. This gives us a reliable, controlled environment to demonstrate real API calls inside Conversational Agents.&lt;/p&gt;
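
&lt;p&gt;To make the mock backend’s behavior concrete, here is a minimal Python sketch of an in-memory ticket store that matches the description above. It is an illustration, not the deployed service: the names &lt;em&gt;TicketStore&lt;/em&gt;, &lt;em&gt;create_ticket&lt;/em&gt;, and &lt;em&gt;get_ticket&lt;/em&gt; are hypothetical, and the ID format (IT- followed by six hexadecimal characters) is inferred from the example ID IT-3B7A12.&lt;/p&gt;

```python
import secrets


class TicketStore:
    """Hypothetical in-memory ticket store; data is lost on restart."""

    def __init__(self):
        self._tickets = {}

    def create_ticket(self, summary, description, priority, requester):
        # Assumed ID format: "IT-" plus six uppercase hex characters.
        ticket_id = "IT-" + secrets.token_hex(3).upper()
        self._tickets[ticket_id] = {
            "id": ticket_id,
            "status": "Open",
            "summary": summary,
            "description": description,
            "priority": priority,
            "requester": requester,
        }
        # Return the generated ID together with the initial status.
        return {"id": ticket_id, "status": "Open"}

    def get_ticket(self, ticket_id):
        ticket = self._tickets.get(ticket_id)
        if ticket is None:
            return None  # a real API would likely answer with a 404 here
        # A status check returns the ID, status, summary, and description.
        return {key: ticket[key] for key in ("id", "status", "summary", "description")}
```

&lt;p&gt;Keeping the store in a plain dictionary means every restart begins with an empty helpdesk, which is acceptable for a demo but not for anything that needs durable tickets.&lt;/p&gt;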

&lt;p&gt;In the "&lt;em&gt;Schema&lt;/em&gt;" section, choose YAML and paste an OpenAPI spec like the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;openapi: 3.0.0
info:
  title: Ticket API
  version: 1.0.0
servers:
  - url: https://&amp;lt;YOUR-URL&amp;gt;     # e.g., https://ticketapi-xxxxx.run.app
paths:
  /tickets:
    post:
      operationId: createTicket
      summary: Create a ticket
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [summary, description, priority, requester]
              properties:
                summary: { type: string }
                description: { type: string }
                priority: { type: string, enum: [Low, Medium, High] }
                requester: { type: string, format: email }
      responses:
        "200":
          description: Ticket created
          content:
            application/json:
              schema:
                type: object
                properties:
                  id: { type: string }
                  status: { type: string }
  /tickets/{id}:
    get:
      operationId: getTicket
      summary: Get ticket status
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: string }
      responses:
        "200":
          description: Ticket status
          content:
            application/json:
              schema:
                type: object
                properties:
                  id: { type: string }
                  status: { type: string }
                  summary: { type: string }
                  description: { type: string }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
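
&lt;p&gt;Before wiring the spec into the agent, it can help to check the endpoints by hand. The Python sketch below builds the two requests the spec describes using only the standard library. The helper names are hypothetical, and the base URL is whatever your own Cloud Run deployment exposes.&lt;/p&gt;

```python
import json
import urllib.request


def build_create_ticket_request(base_url, summary, description, priority, requester):
    """Build (but do not send) a POST /tickets request matching the spec."""
    payload = {
        "summary": summary,
        "description": description,
        "priority": priority,
        "requester": requester,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/tickets",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get_ticket_request(base_url, ticket_id):
    """Build (but do not send) a GET /tickets/{id} request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/tickets/" + ticket_id,
        method="GET",
    )
```

&lt;p&gt;Passing either request object to &lt;em&gt;urllib.request.urlopen&lt;/em&gt; sends it. The JSON that comes back should contain the fields listed in the corresponding response schema.&lt;/p&gt;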



&lt;p&gt;Let's build our task playbooks!&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up the Task Playbooks
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1.&lt;/strong&gt; &lt;strong&gt;PDF Retriever&lt;/strong&gt;&lt;br&gt;
From the left sidebar, navigate to &lt;em&gt;Playbooks&lt;/em&gt; → &lt;em&gt;Create&lt;/em&gt;, and select &lt;strong&gt;Task Playbook&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We’ll begin with the first specialized agent — the one responsible for handling policy-related queries using the PDF documents.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Playbook name&lt;/strong&gt;: PDF Retriever&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Goal&lt;/strong&gt;: Answers IT policy questions by retrieving grounded passages from the uploaded PDFs. Always cite the policy title or section when possible.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next, connect the Data Store tool you created earlier (ITPolicyDocs) so this playbook can search and retrieve content from the indexed policy PDFs.&lt;br&gt;
This connection happens through the playbook’s instructions, where we explicitly reference the tool to guide the agent’s retrieval behavior.&lt;/p&gt;

&lt;p&gt;Now, add the following &lt;strong&gt;instructions&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are the PDF Retriever Agent. Your role is to handle IT policy queries by consulting the organization’s PDF policy documents.
Always search the attached PDFs (e.g., Database Credentials Standard, VPN Policy Template) to find relevant passages using the tool ${TOOL:ITPolicyDocs}.
- If the user’s question is ambiguous or missing context, ask clarifying questions.
Search the data store and answer based only on returned content.
Quote or paraphrase the relevant passage.
Keep responses concise and in plain language.
When possible, mention the document title/section.
If the policy clearly allows or denies, state that plainly.
- Output expectations
Provide the grounded answer text plus lightweight metadata (e.g., source_title, section, and optionally a proposed_action like create_ticket with collected fields if the user asked for escalation).
Once you have answered the user, set the parameter $route_to_supervisor=True
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, define an output parameter so the playbook can signal back to the Supervisor when it finishes responding:&lt;/p&gt;

&lt;p&gt;Go to the &lt;em&gt;Parameters tab&lt;/em&gt; → &lt;em&gt;Create Output Parameter&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Fill the fields as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Parameter name&lt;/strong&gt;: route_to_supervisor&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type&lt;/strong&gt;: Boolean&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Description&lt;/strong&gt;: Give control back to the Supervisor.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures that each time the PDF Retriever provides an answer, the parameter is set to &lt;em&gt;True&lt;/em&gt;, allowing the Supervisor Agent to automatically regain control of the conversation flow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.&lt;/strong&gt; &lt;strong&gt;API Caller Agent&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Next, let’s create the second task playbook — the one that handles ticket-related operations by interacting with the mock Ticket API.&lt;/p&gt;

&lt;p&gt;From the left sidebar, go to &lt;em&gt;Playbooks&lt;/em&gt; → &lt;em&gt;Create&lt;/em&gt; → &lt;em&gt;Task Playbook&lt;/em&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Playbook name&lt;/strong&gt;: API Caller Agent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Goal&lt;/strong&gt;: Handles IT support ticket operations by calling the Ticket API to create new requests or check ticket status.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once the playbook is created, connect it to the OpenAPI tool you previously built (Ticket API). This allows the agent to interact directly with the mock ticketing backend through the predefined endpoints.&lt;br&gt;
Now, add the &lt;strong&gt;instructions&lt;/strong&gt; that define how the playbook will use the API tool to perform ticket operations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are the API Caller Agent. Your role is to handle all ticketing operations for the IT Helpdesk system. Always use the tool ${TOOL:Ticket API}.
- Use the createTicket action whenever the user instructs you to open a new support ticket.
- Always supply the required fields:
summary: a short title of the request
description: detailed explanation of the request
priority: Low, Medium, or High
requester: the requester’s email address

If required fields are missing or invalid, ask the user for them (you own follow-ups and validation).
Validate priority is one of Low/Medium/High.
Validate requester looks like an email.
For getTicket, require a ticket id; if absent, ask.
- After calling the tool ${TOOL:Ticket API}, return the ticket ID and its initial status.
- Use the getTicket action whenever you are asked to check the progress of an existing ticket.
- Provide the ticket id.
Never invent or guess ticket IDs or fields; only use what is provided.
After calling the tool, return the ID, current status, and any available details (summary and description).
Once you have answered the user, set the parameter $route_to_supervisor=True.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
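
&lt;p&gt;The validation rules in these instructions are carried out by the model at run time. As a rough illustration of what they amount to, this Python sketch checks a createTicket payload. The function name &lt;em&gt;validate_ticket_fields&lt;/em&gt; and the deliberately loose email pattern are assumptions, not part of the platform.&lt;/p&gt;

```python
import re

REQUIRED_FIELDS = ("summary", "description", "priority", "requester")
ALLOWED_PRIORITIES = {"Low", "Medium", "High"}
# Loose check only: something@something.tld with no whitespace.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_ticket_fields(payload):
    """Return a list of problems; an empty list means the payload is usable."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(payload.get(field) or "").strip():
            problems.append(f"missing {field}")
    priority = payload.get("priority")
    if priority and priority not in ALLOWED_PRIORITIES:
        problems.append("priority must be Low, Medium, or High")
    requester = payload.get("requester")
    if requester and not EMAIL_PATTERN.fullmatch(requester):
        problems.append("requester must look like an email address")
    return problems
```

&lt;p&gt;Running the same checks server-side in the ticket backend would be a sensible safeguard, since a prompt can only ask the model to validate and cannot guarantee that it does.&lt;/p&gt;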



&lt;p&gt;Next, define the same boolean output parameter "&lt;em&gt;route_to_supervisor&lt;/em&gt;" on this playbook.&lt;/p&gt;

&lt;p&gt;This ensures that once the API Caller Agent completes a task, the conversation flow automatically returns to the Supervisor Agent, maintaining centralized control and continuity in the user experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up the Routine Playbook
&lt;/h3&gt;

&lt;p&gt;Next, we will set up the Supervisor Agent, the core routine playbook that controls the overall flow of the conversation. This agent acts as the orchestrator: it greets the user, identifies intent, delegates tasks, and regains control after each task completes.&lt;/p&gt;

&lt;p&gt;Go back to "&lt;em&gt;Playbooks&lt;/em&gt;" and open the "&lt;em&gt;Default Generative Playbook&lt;/em&gt;" we skipped earlier.&lt;br&gt;
Fill the fields as follows:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Playbook name&lt;/strong&gt;: Supervisor Agent&lt;br&gt;
&lt;strong&gt;Goal&lt;/strong&gt;: You are the Supervisor Agent. You own the conversation shell (greeting and closing) and delegate every user request to the correct task playbook. You do not answer policy or ticket details yourself.&lt;/p&gt;

&lt;p&gt;Now, add the following &lt;strong&gt;instructions&lt;/strong&gt; that define the agent’s behavior and control logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Greet the user on the first turn.
- Intent routing (every turn):
- If the user asks about IT rules, acceptable use, VPN, credentials, or policies → delegate to ${PLAYBOOK:PDF Retriever}.
- If the user wants to open a ticket or check ticket status → delegate to ${PLAYBOOK:API Caller Agent}.
- Do not ask follow-up questions to collect missing information, and do not validate details.
- Do not include any summary of previous conversation history.
- After any worker playbook finishes and the parameter $route_to_supervisor=True, immediately take back control and ask:
- “Anything else I can help you with? 🙂”
- If the user indicates they’re done, say goodbye politely and end.
- Tone: concise, professional, friendly
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, let’s connect the Supervisor Agent to the rest of the system!&lt;br&gt;
Go to the &lt;strong&gt;Parameters&lt;/strong&gt; tab and click &lt;strong&gt;Add new read parameter&lt;/strong&gt;.&lt;br&gt;
Fill in the fields as follows:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parameter name&lt;/strong&gt;: route_to_supervisor&lt;br&gt;
&lt;strong&gt;Type&lt;/strong&gt;: Boolean&lt;br&gt;
&lt;strong&gt;Description&lt;/strong&gt;: Gives control back to the Supervisor.&lt;/p&gt;

&lt;p&gt;This parameter mirrors the output parameter you created earlier in both the PDF Retriever and API Caller Agent playbooks.&lt;br&gt;
By reading this value from the &lt;strong&gt;session memory&lt;/strong&gt;, the Supervisor knows exactly when a worker has completed its task and when it should take back control of the conversation. Once the parameter &lt;em&gt;route_to_supervisor&lt;/em&gt; becomes &lt;em&gt;True&lt;/em&gt;, the Supervisor automatically resumes interaction, prompting the user with: “Anything else I can help you with? 🙂”&lt;/p&gt;

&lt;p&gt;This step closes the loop in your agentic workflow — ensuring smooth handoffs between agents and keeping the overall experience consistent and natural.&lt;/p&gt;
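
&lt;p&gt;The handoff can be pictured as a small control loop. The Python sketch below is a toy simulation only: in the real system the model performs intent routing from the playbook instructions, not keyword matching, and the names &lt;em&gt;route&lt;/em&gt;, &lt;em&gt;run_worker&lt;/em&gt;, and &lt;em&gt;supervisor_turn&lt;/em&gt; are hypothetical.&lt;/p&gt;

```python
POLICY_KEYWORDS = ("policy", "vpn", "credentials", "acceptable use")
TICKET_KEYWORDS = ("ticket",)


def route(user_message):
    """Pick a task playbook name from the user message (toy keyword match)."""
    text = user_message.lower()
    if any(word in text for word in TICKET_KEYWORDS):
        return "API Caller Agent"
    if any(word in text for word in POLICY_KEYWORDS):
        return "PDF Retriever"
    return None


def run_worker(playbook, user_message):
    """Stand-in for a task playbook: answer, then hand control back."""
    answer = f"[{playbook}] handled: {user_message}"
    return {"answer": answer, "route_to_supervisor": True}


def supervisor_turn(user_message):
    """One supervisor turn: delegate, then resume once the flag is True."""
    playbook = route(user_message)
    if playbook is None:
        # Fallback invented for this sketch; the article defines no such case.
        return ["Is this a policy question or a ticket request?"]
    result = run_worker(playbook, user_message)
    replies = [result["answer"]]
    if result["route_to_supervisor"]:
        replies.append("Anything else I can help you with? 🙂")
    return replies
```

&lt;p&gt;The point of the sketch is the shape of the exchange: the worker answers, sets the flag, and only then does the Supervisor speak again.&lt;/p&gt;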

&lt;h3&gt;
  
  
  Toggle Simulator
&lt;/h3&gt;

&lt;p&gt;You can now test the overall conversational flow using the &lt;strong&gt;Toggle Simulator&lt;/strong&gt;, accessible from the top navigation bar. This built-in tool allows you to preview and validate interactions between your agents directly within the Conversational Agents interface. It provides a real-time view of how intents are detected, which playbook is triggered, and how parameters, such as &lt;em&gt;route_to_supervisor&lt;/em&gt;, are passed between agents. Thus, the Toggle Simulator also serves as an effective debugging environment — allowing you to inspect conversation states, verify routing logic, and observe when each tool is invoked, which inputs are provided, and what outputs are returned during the interaction.&lt;/p&gt;

&lt;p&gt;When starting a conversation in the Toggle Simulator, you can define the starting node of your agentic system. &lt;strong&gt;By default&lt;/strong&gt;, the conversation begins with the &lt;strong&gt;Routine Playbook&lt;/strong&gt;, which in this case is the Supervisor Agent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkjdcxkltgly7vjg38xq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkjdcxkltgly7vjg38xq.png" alt="Start the conversation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Additionally, the simulator allows you to experiment with different AI models to evaluate performance and response quality. For this example, select &lt;strong&gt;Gemini 2.5 Flash&lt;/strong&gt;, which offers fast and contextually accurate responses.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fid8i1c92pa67lbrfwkgs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fid8i1c92pa67lbrfwkgs.png" alt="Greeting Supervisor"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For instance, you can evaluate the system by submitting a query such as:&lt;/p&gt;

&lt;p&gt;“&lt;em&gt;Am I allowed to use my personal computer to connect to the company VPN?&lt;/em&gt;”&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5bny0gn60xl1j1ij9py.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5bny0gn60xl1j1ij9py.png" alt="Allowance on personal computer"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this case, the &lt;strong&gt;Supervisor Agent identifies the intent&lt;/strong&gt; as policy-related and &lt;strong&gt;delegates&lt;/strong&gt; the query to the &lt;strong&gt;PDF Retriever Agent&lt;/strong&gt;, which searches the VPN Policy Template document.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdme4ek4uas5go3xrd9ld.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdme4ek4uas5go3xrd9ld.png" alt="PDF Retriever Delegation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;PDF Retriever&lt;/strong&gt; invokes the &lt;strong&gt;ITPolicyDocs tool&lt;/strong&gt;, searches the indexed VPN Policy Template, and returns a grounded, policy-based answer. Once the answer is delivered, the PDF Retriever completes its execution with &lt;em&gt;State: OK&lt;/em&gt;, indicating a successful run, and sets the output parameter &lt;em&gt;route_to_supervisor=True&lt;/em&gt;, signaling the Supervisor Agent to regain control of the conversation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvjq1m7dcn9yyut4m2c81.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvjq1m7dcn9yyut4m2c81.png" alt="Control Back of the conversation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Supervisor then resumes interaction smoothly, prompting the user with “&lt;em&gt;Anything else I can help you with?&lt;/em&gt; 🙂” — demonstrating the seamless orchestration between agents within the system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up Some Examples
&lt;/h3&gt;

&lt;p&gt;According to &lt;a href="https://cloud.google.com/dialogflow/cx/docs/concept/playbook/example" rel="noopener noreferrer"&gt;Google Cloud’s documentation&lt;/a&gt;, examples act as few-shot cues that help the model understand the types of user inputs it should recognize and how to respond effectively. They guide the playbook in interpreting intent, selecting the right tools, and maintaining an appropriate tone and context throughout the conversation.&lt;br&gt;
A practical advantage of Google Cloud’s Conversational Agents platform is that you can add examples directly from the Toggle Simulator. After testing an interaction, simply click “&lt;em&gt;Save as example&lt;/em&gt;” to capture the full conversational flow — including the user’s input, the playbook transitions, and the model’s response. This feature allows you to link real interaction data to the relevant playbook, turning it into a reference example that improves the model’s understanding of similar future queries.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wps69v32j5kultdmq4m.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wps69v32j5kultdmq4m.gif" alt="Insert Examples"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If something doesn’t go as expected during testing (for instance, a playbook routes incorrectly or a response needs refinement), you can inspect and adjust the full sequence of messages, tool calls, and playbook states directly in the simulator. Once you’ve configured the flow to behave exactly as intended, you can save it as an example for the specific playbook. This makes it easy to refine your agent iteratively, so that future runs are more likely to follow the corrected behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Configuration: Generative AI Settings
&lt;/h2&gt;

&lt;p&gt;Google Cloud’s Conversational Agents offer flexible configuration options for fine-tuning how your agents generate, process, and manage responses.&lt;/p&gt;

&lt;p&gt;Under &lt;strong&gt;Settings → Generative AI&lt;/strong&gt;, you can adjust model behavior and generation parameters to align with your organization’s conversational goals.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;In the &lt;strong&gt;Generative Model Selection&lt;/strong&gt; section, you can choose from available Gemini models (for instance, gemini-2.5-flash), define input and output token limits, and set the temperature, which controls creativity versus precision. Lower temperature values (close to 0) produce more deterministic, consistent outputs, while higher values introduce greater variation and expressiveness in responses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;strong&gt;Context Token Limits&lt;/strong&gt; option determines how much conversation history is preserved between turns — useful for maintaining long-term context in multi-step workflows without exceeding model constraints.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
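
&lt;p&gt;To see why lower temperatures give more consistent output, consider how temperature rescales a model’s scores before sampling. This Python sketch applies a temperature-scaled softmax to three made-up scores; the numbers are illustrative and not taken from any Gemini model.&lt;/p&gt;

```python
import math


def softmax_with_temperature(scores, temperature):
    """Convert raw scores to probabilities; temperature must be positive."""
    scaled = [score / temperature for score in scores]
    highest = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - highest) for value in scaled]
    total = sum(weights)
    return [weight / total for weight in weights]


scores = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (0.2, 1.0, 2.0):
    probabilities = softmax_with_temperature(scores, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
```

&lt;p&gt;At a low temperature nearly all of the probability lands on the top-scoring candidate, while at higher temperatures the alternatives receive a larger share, which is where the extra variation comes from.&lt;/p&gt;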

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4liau2586gg6jn4mekac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4liau2586gg6jn4mekac.png" alt="Generative AI configuration"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Beyond generative tuning, the &lt;strong&gt;General tab&lt;/strong&gt; under the same menu provides safety and compliance controls. Here you can define &lt;strong&gt;banned phrases&lt;/strong&gt;, preventing the model from generating or processing specific terms in both prompts and responses. This helps ensure content safety and brand compliance, especially in enterprise deployments. You can also customize &lt;strong&gt;safety filters&lt;/strong&gt;, configuring how strictly the system blocks sensitive or harmful content categories such as hate speech or explicit language.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tmq3yocyldm0kc2vcv6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tmq3yocyldm0kc2vcv6.png" alt="Safety configuration"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Logging
&lt;/h2&gt;

&lt;p&gt;Monitoring and evaluating your agent’s performance is a crucial part of maintaining a reliable conversational system. Google Cloud’s Conversational Agents platform provides two ways to track and analyze interactions: &lt;strong&gt;Conversation History&lt;/strong&gt; and &lt;strong&gt;Cloud Logging&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In the top navigation bar:&lt;/p&gt;

&lt;p&gt;→ Open &lt;strong&gt;Settings&lt;/strong&gt;&lt;br&gt;
→ Select &lt;strong&gt;Logging Settings&lt;/strong&gt;&lt;br&gt;
→ Click on "&lt;em&gt;Enable conversation history&lt;/em&gt;" and "&lt;em&gt;Enable Cloud Logging&lt;/em&gt;".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0e3kztp7mgopezc7cyx0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0e3kztp7mgopezc7cyx0.png" alt="Enable Logging"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conversation History&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Conversation History automatically captures every exchange between users and your agents. You can review full transcripts right in the Conversation History panel — perfect for debugging, validating flow logic, or simply seeing how users engage with your agents over time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4yeeglfyiz3kl52vvu7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4yeeglfyiz3kl52vvu7.png" alt="Conversation history"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cloud Logging&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Enable Cloud Logging to export detailed query and debugging data to Google Cloud’s &lt;strong&gt;Logs Explorer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This integration provides deeper visibility into your agentic system’s behavior — including request timing, playbook transitions, tool invocations, and message trends. With Cloud Logging, you can perform analytics, identify common user intents, and monitor system performance metrics across all conversations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14hgbf7urg97x5xk9dw0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14hgbf7urg97x5xk9dw0.png" alt="Logs Explorer"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Slack Integration
&lt;/h2&gt;

&lt;p&gt;To make your conversational agent accessible directly from your organization’s Slack workspace, you can integrate it using Google Cloud’s Slack integration feature.&lt;/p&gt;

&lt;p&gt;To set it up, we will follow Google Cloud’s official guide:&lt;br&gt;
👉 &lt;a href="https://cloud.google.com/dialogflow/cx/docs/concept/integration/slack" rel="noopener noreferrer"&gt;Integrate Dialogflow with Slack&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.&lt;/strong&gt; Prerequisites&lt;/p&gt;

&lt;p&gt;You need a Slack account and a Slack workspace where you are allowed to install custom apps.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.&lt;/strong&gt; Create the Slack app (from a manifest)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96qxixrfyptigc16len2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96qxixrfyptigc16len2.png" alt="App Manifest"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to &lt;a href="https://api.slack.com/apps" rel="noopener noreferrer"&gt;Slack Apps&lt;/a&gt; and create a new app from an app manifest. &lt;/li&gt;
&lt;li&gt;Use the manifest structure shown in Google’s doc as a template, ensuring these parts are present:

&lt;ul&gt;
&lt;li&gt;Bot token scopes (e.g., app_mentions:read, chat:write, im:read, im:write, im:history, incoming-webhook).&lt;/li&gt;
&lt;li&gt;Event subscriptions with a Request URL (you’ll paste the URL generated by Google Cloud in step 4).&lt;/li&gt;
&lt;li&gt;Bot events like app_mention and message.im.&lt;/li&gt;
&lt;li&gt;Keep Socket Mode disabled (per the example).
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;display_information:
  name: Conversational Agents (Dialogflow CX)
  description: Conversational Agents (Dialogflow CX) integration
  background_color: "#1148b8"
features:
  app_home:
    home_tab_enabled: false
    messages_tab_enabled: true
    messages_tab_read_only_enabled: false
  bot_user:
    display_name: CX
    always_online: true
oauth_config:
  scopes:
    bot:
      - app_mentions:read
      - chat:write
      - im:history
      - im:read
      - im:write
      - incoming-webhook
settings:
  event_subscriptions:
    request_url: https://dialogflow-slack-4vnhuutqka-uc.a.run.app
    bot_events:
      - app_mention
      - message.im
  org_deploy_enabled: false
  socket_mode_enabled: false
  token_rotation_enabled: false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3.&lt;/strong&gt; Install the app to your workspace&lt;br&gt;
Once it is installed, copy these two values from your app's settings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Bot User OAuth Token (Slack: Install App → OAuth Tokens for Your Workspace).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Signing Secret (Slack: Basic Information → App Credentials). &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
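
&lt;p&gt;The Signing Secret is what lets the receiving endpoint confirm that a request really came from Slack. Google's integration performs this check for you, so the following is only a minimal Python sketch of Slack's v0 request-signing scheme, included to show what the secret is used for. The function names and sample values are hypothetical.&lt;/p&gt;

```python
import hashlib
import hmac


def slack_signature(signing_secret, timestamp, body):
    """Compute the v0 signature Slack sends in the X-Slack-Signature header."""
    # Slack signs the string "v0:{timestamp}:{raw request body}".
    base_string = "v0:" + timestamp + ":" + body
    digest = hmac.new(
        signing_secret.encode("utf-8"),
        base_string.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    return "v0=" + digest


def is_valid_request(signing_secret, timestamp, body, received_signature):
    """Return True when the received signature matches the expected one."""
    expected = slack_signature(signing_secret, timestamp, body)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, received_signature)


# Hypothetical values, for illustration only.
secret = "example-signing-secret"
ts = "1700000000"
payload = '{"type": "event_callback"}'
signature = slack_signature(secret, ts, payload)
print(is_valid_request(secret, ts, payload, signature))
print(is_valid_request(secret, ts, payload, "v0=tampered"))
```

&lt;p&gt;A production endpoint would also reject requests whose timestamp is more than a few minutes old, to guard against replayed requests.&lt;/p&gt;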

&lt;p&gt;(If you’re curious about Slack scopes in general, &lt;a href="https://docs.slack.dev/tools/python-slack-sdk/tutorial/understanding-oauth-scopes/" rel="noopener noreferrer"&gt;Slack’s developer docs&lt;/a&gt; explain how scopes map to bot capabilities.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.&lt;/strong&gt; Connect Slack inside Google Cloud (Conversational Agents)&lt;/p&gt;

&lt;p&gt;In the Conversational Agents console, open your agent and select &lt;em&gt;Integrations&lt;/em&gt; in the left sidebar.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkr8o9z4fy92ff4b2bqzp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkr8o9z4fy92ff4b2bqzp.png" alt="Integrations"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click &lt;strong&gt;Slack&lt;/strong&gt; → &lt;strong&gt;Connect&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Paste the &lt;strong&gt;Access token&lt;/strong&gt; (your Slack Bot User OAuth Token) and &lt;strong&gt;Signing token&lt;/strong&gt; (Slack Signing Secret) from step 3.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Choose the environment where your agent is deployed (e.g., Draft).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;Start&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Copy the generated Webhook URL. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;5.&lt;/strong&gt; Point Slack to your agent&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Return to your Slack app and open &lt;em&gt;Event Subscriptions&lt;/em&gt; → &lt;em&gt;Enable Events&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Paste the Webhook URL you copied in step 4 into the &lt;em&gt;Request URL&lt;/em&gt; field and save.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
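
&lt;p&gt;When you save the Request URL, Slack first sends a one-time &lt;code&gt;url_verification&lt;/code&gt; request and expects the endpoint to return the &lt;code&gt;challenge&lt;/code&gt; value it contains. The Google-generated webhook answers this automatically. As a rough sketch of that handshake (the handler name and the sample payload are hypothetical, and a JSON echo is only one of the response shapes Slack accepts):&lt;/p&gt;

```python
import json


def handle_slack_event(raw_body):
    """Return the JSON response body for an incoming Events API request."""
    event = json.loads(raw_body)
    if event.get("type") == "url_verification":
        # Echo the challenge so Slack marks the Request URL as verified.
        return json.dumps({"challenge": event["challenge"]})
    # Other events (app_mention, message.im) would be passed on to the agent.
    return json.dumps({"ok": True})


sample = json.dumps({"type": "url_verification", "challenge": "abc123"})
print(handle_slack_event(sample))
```

&lt;p&gt;If Slack reports that the URL could not be verified, this handshake is the step that failed, so check that the integration was started in step 4 before retrying.&lt;/p&gt;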

&lt;p&gt;&lt;strong&gt;6.&lt;/strong&gt; Configure Incoming Webhooks and Channel Access&lt;/p&gt;

&lt;p&gt;In your Slack App configuration page, go to &lt;strong&gt;Features → Incoming Webhooks → Webhook URLs for Your Workspace&lt;/strong&gt;.&lt;br&gt;
Here, you can add Webhook URLs for the specific channels or direct messages (DMs) where you want your bot to communicate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8ercjd8f3ogeygkpx97.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8ercjd8f3ogeygkpx97.png" alt="Webhook channels"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In public or private channels, the bot responds only when it is mentioned by name, so it engages only when prompted. In DMs, it responds directly to user queries.&lt;/p&gt;
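
&lt;p&gt;For context, an incoming webhook is simply a URL that accepts a JSON body with a &lt;code&gt;text&lt;/code&gt; field and posts it to the channel the URL was created for. The sketch below builds such a request with Python's standard library without sending it. The URL is a placeholder and the helper name is hypothetical.&lt;/p&gt;

```python
import json
import urllib.request


def build_webhook_request(webhook_url, text):
    """Build (but do not send) the POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; real ones are listed under "Webhook URLs for Your Workspace".
request = build_webhook_request(
    "https://hooks.slack.com/services/EXAMPLE/PLACEHOLDER/URL",
    "Hello from the IT Assistant",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```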

&lt;p&gt;&lt;strong&gt;7.&lt;/strong&gt; Customize your Agent&lt;br&gt;
You can personalize your agent’s appearance and behavior in Slack to better reflect your organization’s branding and communication style.&lt;br&gt;
From the Slack app configuration page, navigate to &lt;em&gt;Features&lt;/em&gt; → &lt;em&gt;App Home&lt;/em&gt;, where you can adjust the display name, bot icon, and description shown in your workspace.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9es5tc4ic0q6cv859rxf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9es5tc4ic0q6cv859rxf.png" alt="Customization"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8.&lt;/strong&gt; Test the integration&lt;/p&gt;

&lt;p&gt;In Slack, mention the bot in a channel or DM the bot to start chatting.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd3kq05a41j6ocvubwpre.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd3kq05a41j6ocvubwpre.gif" alt="slack"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the above conversation, the user initiates a chat with a greeting, and the IT Assistant — acting as the Supervisor Agent — responds politely, ready to assist.&lt;/p&gt;

&lt;p&gt;The user then asks what company policy says when database credentials may have been exposed. The Supervisor detects this as a policy inquiry and routes the request to the PDF Retriever, which searches the Database Credentials Standard document. The retriever provides a grounded answer explaining that credentials must not be stored in clear text or in web-accessible locations, citing the relevant policy section.&lt;/p&gt;

&lt;p&gt;Once the policy response is delivered, the Supervisor Agent resumes control of the conversation and courteously asks if further help is needed. The user then requests to create a ticket for review. Recognizing this as an operational task, the Supervisor delegates the request to the API Caller Agent, which interacts with the mock ticketing API. The API processes the input details — summary, description, requester, and priority — and responds with a generated ticket ID and an open status.&lt;/p&gt;

&lt;p&gt;Finally, the Supervisor politely confirms the ticket creation and ends the interaction after the user says goodbye.&lt;/p&gt;

&lt;p&gt;This example demonstrates the end-to-end flow of intent detection, delegation, and seamless orchestration between the agents — from grounded policy retrieval to action execution through the OpenAPI integration. It also highlights how the system operates smoothly within Slack, where users can interact naturally with the IT Assistant in their everyday workspace without leaving the chat environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building agentic systems inside Google Cloud’s AI Applications is more than just a technical exercise — it’s a glimpse into the next evolution of enterprise automation. In this walkthrough, we saw how easy it is to design, orchestrate, and deploy a multi-agent helpdesk system using no-code tools — integrating policy retrieval, ticket creation, and chat-based interaction, all within a single, governed cloud environment.&lt;/p&gt;

&lt;p&gt;The resulting architecture — one Supervisor Agent coordinating multiple specialized playbooks — provides a powerful blueprint for scalable enterprise AI systems. It allows organizations to design modular, transparent workflows where every agent serves a clear purpose, grounded in data and capable of performing real actions through APIs.&lt;/p&gt;

&lt;p&gt;What makes this approach especially impactful is that everything happens within the same ecosystem: data security, access control, observability, and scalability are built-in through Google Cloud’s infrastructure. You can test, debug, and monitor your entire system with tools like Cloud Logging and Conversation History, or even deploy it directly to Slack for real-world usage with your team — no complex deployment pipeline required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next Steps and Opportunities&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;While this no-code setup covers the full lifecycle of a conversational system, advanced teams can take it further by blending low-code flexibility with custom logic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Add custom actions or logic through Cloud Functions or Cloud Run — for example, to validate inputs, enrich data from other APIs, or trigger workflows in external tools like Jira or ServiceNow.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Integrate structured data sources, such as BigQuery, for even richer, context-aware responses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use Cloud Logging and BigQuery exports to build analytics dashboards — tracking usage, intent distribution, and success rates over time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Implement advanced integrations, such as email responders or internal portals, to expand where and how users can access your AI assistant.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At its core, Google Cloud’s low-code AI platform allows enterprises to prototype fast and scale safely, bridging the gap between no-code experimentation and full-scale production AI. Whether you’re automating IT requests, HR inquiries, or customer service operations, this approach gives your teams the flexibility to innovate — without waiting on a long development cycle.&lt;/p&gt;

&lt;p&gt;The next step? Start experimenting with your own data and APIs — and turn your organization’s workflows into intelligent, conversational systems.&lt;/p&gt;

&lt;p&gt;👉 For further reading, explore Google Cloud’s &lt;a href="https://cloud.google.com/dialogflow/cx/docs/concept/playbook/best-practices" rel="noopener noreferrer"&gt;Best Practices&lt;/a&gt; for playbooks to design reliable, maintainable, and scalable agentic architectures.&lt;/p&gt;

&lt;p&gt;At Agile Actors, we thrive on challenges with a bold and adventurous spirit. We confront problems directly, using cutting-edge technologies in the most innovative and daring ways. If you’re excited to join a dynamic learning organization where knowledge flows freely and skills are refined to excellence, come join our exceptional team. Let’s conquer new frontiers together. Check out our &lt;a href="https://apply.workable.com/agileactors/" rel="noopener noreferrer"&gt;openings&lt;/a&gt; and choose the Agile Actors Experience!&lt;/p&gt;

</description>
      <category>nocode</category>
      <category>googlecloud</category>
      <category>tutorial</category>
      <category>ai</category>
    </item>
    <item>
      <title>From Pipelines to Product: My Journey from Data Engineer to Data Product Owner</title>
      <dc:creator>Panagiotis</dc:creator>
      <pubDate>Tue, 14 Oct 2025 07:58:26 +0000</pubDate>
      <link>https://dev.to/agileactors/from-pipelines-to-product-my-journey-from-data-engineer-to-data-product-owner-53n1</link>
      <guid>https://dev.to/agileactors/from-pipelines-to-product-my-journey-from-data-engineer-to-data-product-owner-53n1</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09zz55qlxnlmekz5ise0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09zz55qlxnlmekz5ise0.jpg" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most career transitions happen quietly: one project ends, another begins, and slowly a new title appears on your LinkedIn. Mine didn’t. Mine started with a single, uncomfortable question in a demo meeting:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“Okay… and what do you want me to do with that?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That question revealed a blind spot in my work as a data engineer and set me on a journey I didn’t expect — from building technically flawless pipelines to owning the vision of a data platform as a product. This is the story of how I moved from the comfort of code to the ambiguity of human needs, and what I learned along the way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Haunting Question of 'Why'&lt;/strong&gt;&lt;br&gt;
We were showcasing our latest work to the client's logistics leadership: a dynamic heatmap tracking parcel congestion across logistics centers in near real-time. We had built it using a streaming pipeline that ingested tens of thousands of scan events per minute. The UI was sleek, the data was fresh, and the latency was under 15 minutes. It was, by every engineering measure, a win.&lt;/p&gt;

&lt;p&gt;As we walked through the interface, I zoomed in on a distribution center. “You can see here,” I said proudly, “we’re detecting a 43% spike in inbound volume over baseline for this time of day.”&lt;/p&gt;

&lt;p&gt;There was a pause. Then one of the senior ops managers leaned forward and asked, “Okay... and what do you want me to do with that?”&lt;/p&gt;

&lt;p&gt;That one question knocked the wind out of me. He wasn’t being dismissive—he was being honest. In that moment, I realized the painful truth: we hadn’t built a decision-support tool—we had built a statistics mirror. It was technically elegant but operationally incomplete.&lt;/p&gt;

&lt;p&gt;I had given him the signal, but not the meaning. I had shown him something interesting, but not something useful. The spike was real, the data was right—but I hadn’t connected it to the decisions he was responsible for: rerouting vans, calling in night shift early, delaying outbound dispatches. To him, the number was noise until it came packaged with a recommendation or an alert.&lt;/p&gt;

&lt;p&gt;That question—“What do you want me to do with that?”—echoed in my mind for weeks. It marked a shift in my thinking: from delivering outputs to enabling outcomes. From answering what, to relentlessly chasing the so what.&lt;/p&gt;

&lt;p&gt;In a different environment, the feedback might have been logged as a feature request for "v2.0." But our culture values impact over output. That manager's question wasn't a critique; it was an invitation to solve a deeper problem.&lt;/p&gt;

&lt;p&gt;As a data engineer, I had built my career on the bedrock of how. I found contentment in the elegant logic of a well-designed pipeline. Yet that heatmap dashboard marked a turning point. It wasn't enough for the data to be fast and correct; I needed it to be meaningful. The "why" behind the request was no longer a background detail; it was becoming the only thing that mattered. That obsession with purpose marked the beginning of my transition to Data Platform Product Owner, a journey from the certainty of code to the ambiguity of human needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Culture of Curiosity, Not Just Code&lt;/strong&gt;&lt;br&gt;
My transition was made possible by the exceptional dynamic I share with my employer, Agile Actors. I’ve heard tales from peers whose career paths are rigid, but my experience was the opposite. I was the beneficiary of a dual culture that saw its people as evolving investments.&lt;/p&gt;

&lt;p&gt;This wasn't just a poster on the wall. During a planning session, we were reviewing a list of upcoming data pipeline tasks, mostly prioritized by technical effort. As I looked through it, I found myself asking, “Which of these will actually help someone on the business side in the next couple of months?”&lt;/p&gt;

&lt;p&gt;It wasn't a bold challenge, just a quiet question, but it shifted the discussion. We ended up rethinking the priorities, reached out to a few internal users for input, and adjusted our plan based on real impact rather than just complexity. My Agile Actors chapter leader heard about this, and instead of seeing it as scope creep, he saw it as me embodying our value of 'continuous improvement'. He went beyond acknowledgment, setting up a meeting to discuss my development path, seeing an opportunity for me to create more value for our client by moving closer to the business.&lt;/p&gt;

&lt;p&gt;This support system was crucial and my chapter leader became my advocate. When Agile Actors sponsored my PSPO certifications, it wasn't an exception; it was an extension of a belief that investing in an employee’s curiosity pays the highest dividends. They weren't just training a data engineer; they were cultivating a future leader who could bridge the gap between their technical teams and their client's business goals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;From Building Pipelines to Charting a Product Vision&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This unwavering support transformed a personal ambition into a clear career path. My mentors introduced me to a revolutionary concept for a centuries-old postal service: treating our entire data platform as an internal product.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyr6oow45x45wzsx3yld.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyr6oow45x45wzsx3yld.jpg" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Traditionally, we were seen as a service team—implementing requests, building pipelines, fixing bugs. But the “platform as a product” mindset changed everything. Our infrastructure, tools, and datasets weren’t just technical assets—they were products with internal customers: analysts, data scientists, developers, and decision-makers across the business. My new job was to be the Product Owner for this data platform.&lt;/p&gt;

&lt;p&gt;One of my first major initiatives was the development of a reusable ingestion framework to power our Databricks lakehouse. Until then, bringing in a new data source meant writing custom Spark code, managing brittle workflows, and duplicating logic across teams.&lt;/p&gt;

&lt;p&gt;We flipped that model. We built a framework that allowed data engineers to onboard new sources using only configuration files—defining schema mappings, update frequency, and quality rules in YAML, with minimal code. It abstracted away complexity and gave teams a standard, governed, and scalable way to land their data in the lake.&lt;/p&gt;

&lt;p&gt;Beyond the framework, the product delivered an ecosystem: documentation, onboarding guides, reusable templates, and SLAs that teams could trust. What used to take weeks could now be done in a few hours. At its core, the difference was cultural, not only technical.&lt;br&gt;
We gave teams autonomy, while ensuring consistency and quality across the platform.&lt;/p&gt;

&lt;p&gt;Soon, I was creating roadmaps for feature rollouts, prioritizing enhancements based on internal feedback, and aligning delivery with cross-functional use cases. The shift from the technical how to the strategic why felt like stepping back from coding individual pipelines to shaping the way our entire organization worked with data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I Kept, What I Learned&lt;/strong&gt;&lt;br&gt;
Moving from engineering to product wasn't about erasing my past; it was about building upon it.&lt;/p&gt;

&lt;p&gt;What I Kept:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Systems Thinking:&lt;/em&gt;&lt;/strong&gt; The ability to see the entire data ecosystem—from a mail carrier's handheld scanner to the final delivery confirmation—was invaluable for understanding downstream consequences.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Problem Decomposition:&lt;/em&gt;&lt;/strong&gt; Breaking down a massive problem like "improve delivery efficiency" into logical, manageable steps is the same skill used to design a complex data pipeline.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;A Respect for Quality:&lt;/em&gt;&lt;/strong&gt; Obsession with data integrity became a secret weapon in discussions about building robust, reliable data products that the business could trust.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What I Had to Learn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Stakeholder Management:&lt;/em&gt;&lt;/strong&gt; My world expanded to include logistics, sales, finance, and executive leadership. I had to learn their languages and negotiate compromises.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;The Art of Saying 'No':&lt;/em&gt;&lt;/strong&gt; The Head of Regional Distribution wanted a real-time dashboard tracking every single delivery truck on a map, refreshed every second. My engineering gut knew it was feasible. But my new Product Owner brain had to ask why. After interviewing the dispatchers, I discovered they didn't need a flashy map; they needed a reliable alert when a truck was projected to be more than 30 minutes late. We built the simpler, more valuable alerting system instead. Saying 'no' to the 'wow' feature in favor of the 'working' feature was terrifying, but it was the right call.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Embracing Ambiguity:&lt;/em&gt;&lt;/strong&gt; I had to get comfortable making decisions with incomplete information, moving forward to learn and iterate rather than waiting for the "perfect" answer.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Finding Rhythm in the Chaos with Scrum&lt;/strong&gt;&lt;br&gt;
When Agile Actors offered to sponsor my Professional Scrum Product Owner (PSPO) certification, I was skeptical. I associated Scrum with rigid project management rituals. The training was a revelation: Scrum turned out to be an empirical framework designed to navigate ambiguity deliberately.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy4sn6ouqw1wktc427r4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy4sn6ouqw1wktc427r4.png" alt=" " width="572" height="567"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For a data product, "value" can be elusive. It's an insight that prevents a sorting machine from breaking down, an automated process that optimizes a delivery route to save fuel, or a model that improves address correction. The PSPO training taught me to make this concrete. I learned to define a clear Product Goal (our north star) and break it down into tangible Sprint Goals.&lt;/p&gt;

&lt;p&gt;This transformed our work. Our Sprint Goal was no longer "build a pipeline," but something like: "Provide the 'Address Quality' team with a reliable daily source of truth for returned mail, so they can validate their new correction algorithm."&lt;/p&gt;

&lt;p&gt;The Sprint Retrospective became the embodiment of our dual-company growth mindset. In one retro, we realized our planning was failing because the client's subject matter expert was only available on Thursdays. To solve this, our Agile Actors team proposed a new "Co-creation Wednesday" meeting. It wasn't in the Scrum guide, but it was our adaptation to make the framework succeed in our unique client environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trading the Keyboard for a Compass&lt;/strong&gt;&lt;br&gt;
The most challenging part was internal. My confidence came from my hands-on ability to solve problems. I remember a critical project where the team was wrestling with a nasty performance bug in our dbt models processing scanner data from the hubs. The build was taking three hours instead of thirty minutes. My fingers itched to dive into the Jinja macros and start debugging. I felt a pang of anxiety, a fear of losing my technical credibility.&lt;/p&gt;

&lt;p&gt;My chapter leader said, "You’re proving you can still handle the technical work. But the team doesn’t need another set of hands—they need someone to set direction and show them where to focus."&lt;/p&gt;

&lt;p&gt;That was a breakthrough. I had to learn to lead through influence, not instruction. My value was no longer in the code; it was in the clarity of the vision. I had to empower the engineering team and then get out of their way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A New Definition of 'Done'&lt;/strong&gt;&lt;br&gt;
 Today, my work starts with a need for data and ends with someone being able to act on it confidently. My definition of "done" has evolved. It’s no longer writing a custom pipeline to bring in a single source; it’s a new dataset flowing into the lakehouse through our ingestion framework with nothing more than a configuration file. It’s an engineer onboarding a system in hours instead of weeks, or an analyst querying consistent, well-documented data without worrying about hidden transformations. It’s a data scientist running experiments on fresh, trusted data because the platform makes quality and availability a given.&lt;/p&gt;

&lt;p&gt;I’ve shifted from building pipelines myself to enabling others to move faster, safer, and with more autonomy. “Done” is no longer code that works — it’s a platform that empowers. It’s a data scientist deploying a new address validation algorithm in minutes instead of weeks because our platform is robust. I've shifted from completing tasks to enabling outcomes.&lt;/p&gt;

&lt;p&gt;Becoming a Data Product Owner didn’t erase my engineering roots—it gave them purpose. The journey was a personal transformation, made possible by the unique partnership between a consultancy that invests in its people and a client that trusts them to solve real problems. I learned that the most powerful growth happens when you have the courage and the support to build not just the right thing, but the right thing together.&lt;/p&gt;

&lt;p&gt;Looking back, the hardest part wasn’t learning product frameworks or stakeholder management. It was letting go of the idea that my value was in the code I could write. My value became the clarity I could bring, the questions I could ask, and the outcomes I could enable for others.&lt;/p&gt;

&lt;p&gt;That shift — from outputs to outcomes, from what to why — changed not only my career, but also the way I see impact in any technical role.&lt;/p&gt;

&lt;p&gt;For anyone standing at a similar crossroads, my advice is simple: stay curious, ask the uncomfortable questions, and don’t be afraid to trade your keyboard for a compass. The right environment will see that curiosity not as scope creep, but as leadership in the making.&lt;/p&gt;

&lt;p&gt;At Agile Actors, we thrive on challenges with a bold and adventurous spirit. We confront problems directly, using cutting-edge technologies in the most innovative and daring ways. If you’re excited to join a dynamic learning organization where knowledge flows freely and skills are refined to excellence, come join our exceptional team. Let’s conquer new frontiers together. Check out our &lt;a href="https://apply.workable.com/agileactors/" rel="noopener noreferrer"&gt;openings&lt;/a&gt; and choose the Agile Actors Experience!&lt;/p&gt;

</description>
      <category>dataplatform</category>
      <category>dataproduct</category>
      <category>analytics</category>
      <category>scrum</category>
    </item>
    <item>
      <title>WebdriverIO Visual Click Service</title>
      <dc:creator>Thanos Tsiamis</dc:creator>
      <pubDate>Fri, 29 Aug 2025 06:56:06 +0000</pubDate>
      <link>https://dev.to/agileactors/webdriverio-visual-click-service-bpe</link>
      <guid>https://dev.to/agileactors/webdriverio-visual-click-service-bpe</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Automating user interfaces has come a long way—but there are still situations where traditional methods fall flat. One of the biggest challenges arises when working with canvas-based applications, where no DOM elements exist for key interactive components. This makes it nearly impossible for standard test frameworks to simulate interactions like clicks, taps, or hovers using selectors alone.&lt;/p&gt;

&lt;p&gt;This blog post introduces a novel solution to this problem: the wdio-visual-click-service, a new plugin for WebdriverIO that allows test scripts to interact with UI components using image matching instead of DOM queries.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;In modern UI automation, developers and software engineers in test often run into limitations when trying to interact with components that don’t expose reliable DOM selectors—especially in canvas-based interfaces like lottery games, drawing tools, or dynamic third-party widgets. Traditional approaches using CSS or XPath selectors fall short in these scenarios.&lt;/p&gt;

&lt;p&gt;Consider a fictional arcade game called Whack a Guacamole. It's a lighthearted twist on the classic whack-a-mole—but with avocados instead of moles.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4x18gltzsdq4essg5vj.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4x18gltzsdq4essg5vj.jpg" alt="WhackaGuacaMole"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Avocados pop up at random positions.&lt;/p&gt;

&lt;p&gt;Your objective is to click on as many avocados as possible before time runs out.&lt;/p&gt;

&lt;p&gt;Occasionally, a pufferfish appears as a trap—clicking it penalizes you with -10 points.&lt;/p&gt;

&lt;p&gt;Simple concept. Complex automation.&lt;/p&gt;

&lt;p&gt;When you inspect the DOM while the game is running, you’ll notice something alarming for any automation engineer: no individual HTML elements represent the avocados or the pufferfish. All visual components are drawn directly onto the canvas using JavaScript’s rendering context.&lt;/p&gt;

&lt;p&gt;Standard testing tools like WebdriverIO rely on querying the DOM to locate elements. In Whack a Guacamole, a selector such as:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$('img[src="avocado.png"]')&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;…will yield nothing.&lt;/p&gt;




&lt;p&gt;That’s because the avocado isn’t an &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; or a &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt;; it’s just a group of pixels rendered directly on the canvas.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Question
&lt;/h2&gt;

&lt;p&gt;How can we verify click functionality or automate interactions with components that don’t exist in the DOM at all?&lt;/p&gt;

&lt;p&gt;This is where the &lt;strong&gt;wdio-visual-click-service&lt;/strong&gt; (VCS) comes in. Instead of relying on the DOM, this service uses visual data—scanning the screen for a reference image and simulating a click at the detected location.&lt;/p&gt;

&lt;h3&gt;
  
  
  What It Supports
&lt;/h3&gt;

&lt;p&gt;The VCS supports two image-matching engines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;OpenCV: For robust, multi-scale template matching using grayscale comparison&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Pixelmatch (via Jimp): A lighter, pixel-by-pixel fallback engine&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Usage
&lt;/h3&gt;

&lt;p&gt;Once the plugin is installed, it automatically registers a new browser command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clickByMatchingImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;referenceImagePath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;?);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You do not need to register this manually in a hook. Just enable the service in your &lt;code&gt;wdio.conf.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;WebdriverIO&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;visual-click&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, in your test, call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clickByMatchingImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./images/avocado.png&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The plugin takes care of everything else—from taking a screenshot to matching it with the reference image, to simulating the click.&lt;/p&gt;

&lt;h2&gt;
  
  
  Under the Hood: How It Works
&lt;/h2&gt;

&lt;p&gt;The wdio-visual-click-service defines a WebdriverIO service that registers a new command in the &lt;code&gt;before()&lt;/code&gt; lifecycle hook. This command, &lt;code&gt;clickByMatchingImage&lt;/code&gt;, can be invoked in your test scripts to locate a reference image on screen and perform a click at the match location.&lt;/p&gt;

&lt;p&gt;The plugin attempts to load the &lt;code&gt;@u4/opencv4nodejs&lt;/code&gt; module. If OpenCV is available, it uses it for precise and scalable image recognition. If not, it gracefully falls back to a lighter image comparison engine using Jimp and Pixelmatch.&lt;/p&gt;

&lt;h3&gt;
  
  
  OpenCV Engine: Scalable, Precise Matching
&lt;/h3&gt;

&lt;p&gt;When OpenCV is available, the plugin uses template matching to scan the screenshot for the reference image.&lt;/p&gt;

&lt;p&gt;At a high level, the process works as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A screenshot of the browser viewport is captured.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The reference image (e.g., an avocado) is resized to multiple scales (e.g., 1.0, 0.9, 0.8) to account for potential visual differences in size.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s the key snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;matched&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;grayScreenshot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;matchTemplate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;resizedRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TM_CCOEFF_NORMED&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;maxVal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;maxLoc&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;matched&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;minMaxLoc&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This does two important things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;matchTemplate()&lt;/code&gt; produces a correlation map: a matrix where each cell contains a similarity score representing how well the reference matches that region of the screenshot.&lt;br&gt;
&lt;code&gt;cv.TM_CCOEFF_NORMED&lt;/code&gt; is the matching method used. It is OpenCV's normalized correlation coefficient method, which gives a match score between -1 and 1. A score of 1 means a perfect match.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;minMaxLoc()&lt;/code&gt; then retrieves the best match from that matrix: &lt;code&gt;maxVal&lt;/code&gt; is the confidence score of the best match and &lt;code&gt;maxLoc&lt;/code&gt; is the top-left coordinate where that best match was found.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If maxVal exceeds the confidence threshold (e.g., 0.7), the plugin computes the center point of the match and simulates a click at that location.&lt;br&gt;
This process is repeated across different scales of the reference image, ensuring reliable matches even if the UI is resized or rendered differently.&lt;/p&gt;
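&lt;p&gt;As a rough illustration of that last step, the sketch below picks the best-scoring scale and derives the click point. It is not the plugin's source: the &lt;code&gt;ScaleMatch&lt;/code&gt; shape and the &lt;code&gt;bestMatchCenter&lt;/code&gt; helper are hypothetical, and choosing the highest score across all scales (rather than the first one above the threshold) is an assumption.&lt;/p&gt;

```typescript
// Hypothetical shapes for illustration; these are not the plugin's actual types.
interface Point {
  x: number;
  y: number;
}

interface ScaleMatch {
  scale: number; // scale factor applied to the reference image
  maxVal: number; // best similarity score found at this scale
  maxLoc: Point; // top-left corner of the best match, in screenshot pixels
}

// Pick the highest-scoring scale and return the center of its matched region,
// or null when no score exceeds the confidence threshold.
function bestMatchCenter(
  matches: ScaleMatch[],
  refWidth: number,
  refHeight: number,
  confidence: number = 0.7,
): Point | null {
  let best: ScaleMatch | null = null;
  for (const candidate of matches) {
    if (best === null || candidate.maxVal > best.maxVal) {
      best = candidate;
    }
  }
  if (best === null || !(best.maxVal > confidence)) {
    return null;
  }
  // The resized reference covers (refWidth * scale) by (refHeight * scale) pixels.
  return {
    x: Math.round(best.maxLoc.x + (refWidth * best.scale) / 2),
    y: Math.round(best.maxLoc.y + (refHeight * best.scale) / 2),
  };
}
```

&lt;p&gt;With a 40×20 reference matched at scale 0.5 and its top-left corner at (100, 40), the helper returns (110, 45), the middle of the 20×10 region that was actually matched.&lt;/p&gt;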
&lt;h3&gt;
  
  
  Pixelmatch Fallback Engine: Lightweight but Effective
&lt;/h3&gt;

&lt;p&gt;If OpenCV is not available, the plugin falls back to a custom pixel comparison engine built on Jimp and Pixelmatch.&lt;/p&gt;

&lt;p&gt;This approach involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Iteratively cropping and comparing regions of the screenshot with the reference image&lt;/li&gt;
&lt;li&gt;Using a configurable stride to balance performance and granularity&lt;/li&gt;
&lt;li&gt;Calculating a match confidence as the ratio of identical pixels&lt;/li&gt;
&lt;li&gt;Refining the match by scanning a smaller area near the best initial result&lt;/li&gt;
&lt;/ul&gt;
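&lt;p&gt;The four steps above can be sketched in a few lines. This is a simplified stand-in, not the plugin's code: images are plain arrays of grayscale values, pixels are compared with strict equality instead of Pixelmatch's tolerance-based comparison, and the names &lt;code&gt;GrayImage&lt;/code&gt;, &lt;code&gt;scanRange&lt;/code&gt; and &lt;code&gt;findBestMatch&lt;/code&gt; are hypothetical.&lt;/p&gt;

```typescript
// Hypothetical representation: an image as rows of grayscale pixel values.
type GrayImage = number[][];

interface Candidate {
  x: number;
  y: number;
  confidence: number;
}

// Ratio of reference pixels identical to the screenshot region at (left, top).
function regionConfidence(shot: GrayImage, ref: GrayImage, left: number, top: number): number {
  let identical = 0;
  ref.forEach((row, dy) => {
    row.forEach((value, dx) => {
      if (shot[top + dy][left + dx] === value) {
        identical += 1;
      }
    });
  });
  return identical / (ref.length * ref[0].length);
}

// Scan a rectangular range of top-left positions with the given stride.
function scanRange(
  shot: GrayImage,
  ref: GrayImage,
  fromLeft: number,
  toLeft: number,
  fromTop: number,
  toTop: number,
  stride: number,
  start: Candidate,
): Candidate {
  let best = start;
  for (let top = fromTop; toTop >= top; top += stride) {
    for (let left = fromLeft; toLeft >= left; left += stride) {
      const confidence = regionConfidence(shot, ref, left, top);
      if (confidence > best.confidence) {
        best = { x: left, y: top, confidence };
      }
    }
  }
  return best;
}

// Coarse pass with a stride, then a pixel-by-pixel pass around the coarse winner.
function findBestMatch(shot: GrayImage, ref: GrayImage, stride: number): Candidate {
  const maxLeft = shot[0].length - ref[0].length;
  const maxTop = shot.length - ref.length;
  const none: Candidate = { x: 0, y: 0, confidence: -1 };
  const coarse = scanRange(shot, ref, 0, maxLeft, 0, maxTop, stride, none);
  return scanRange(
    shot,
    ref,
    Math.max(0, coarse.x - stride),
    Math.min(maxLeft, coarse.x + stride),
    Math.max(0, coarse.y - stride),
    Math.min(maxTop, coarse.y + stride),
    1,
    coarse,
  );
}
```

&lt;p&gt;A larger stride skips more positions in the coarse pass, which is faster but relies on the refinement pass to recover the exact location.&lt;/p&gt;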

&lt;p&gt;Though not as fast or robust as OpenCV, this fallback engine still provides accurate results for most use cases—particularly when the screen resolution and content are relatively stable.&lt;/p&gt;
&lt;h2&gt;
  
  
  Click Accuracy: Handling Screen Resolution
&lt;/h2&gt;

&lt;p&gt;Whether using OpenCV or the fallback engine, the final match coordinates are adjusted based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The current device pixel ratio (DPR)&lt;/li&gt;
&lt;li&gt;The browser viewport dimensions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is handled by the internal &lt;code&gt;clickAt(x, y)&lt;/code&gt; function, which scales coordinates appropriately and simulates the click using WebDriver's native pointer actions. It ensures that the click is placed exactly where a human would expect it—regardless of display density or zoom level.&lt;/p&gt;
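&lt;p&gt;To make the scaling concrete, here is a minimal sketch of the kind of conversion involved. It assumes the screenshot is captured in device pixels while pointer actions expect CSS pixels, so coordinates are divided by the DPR and clamped to the viewport. The &lt;code&gt;ViewportInfo&lt;/code&gt; shape and the &lt;code&gt;toViewportPoint&lt;/code&gt; helper are hypothetical and not taken from the plugin.&lt;/p&gt;

```typescript
// Hypothetical description of the current viewport.
interface ViewportInfo {
  devicePixelRatio: number; // e.g. 2 on high-density displays
  width: number; // viewport width in CSS pixels
  height: number; // viewport height in CSS pixels
}

// Convert a point measured in screenshot pixels to CSS pixels inside the viewport.
function toViewportPoint(
  px: number,
  py: number,
  viewport: ViewportInfo,
): { x: number; y: number } {
  const clamp = (value: number, max: number) => Math.min(Math.max(value, 0), max);
  return {
    x: clamp(Math.round(px / viewport.devicePixelRatio), viewport.width - 1),
    y: clamp(Math.round(py / viewport.devicePixelRatio), viewport.height - 1),
  };
}
```

&lt;p&gt;On a 1280×720 viewport with a DPR of 2, a match centered at screenshot pixel (800, 600) maps to the CSS point (400, 300).&lt;/p&gt;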
&lt;h2&gt;
  
  
  Configuration Options
&lt;/h2&gt;

&lt;p&gt;To give Software Engineers in Test flexibility and precision, the clickByMatchingImage command supports an optional options object. This allows you to control how aggressively and accurately the service searches for a match. Here's what you can configure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clickByMatchingImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;images/avocado.png&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;scales&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.75&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  scales:
&lt;/h3&gt;

&lt;p&gt;Control Matching Resilience to Size Changes&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Type: number[]&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Default: [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The scales array determines how many different sizes of the reference image are tried during the matching phase. This is particularly useful when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The same UI element may appear larger or smaller depending on screen size or resolution.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The canvas is rendered at different sizes in different test environments (e.g., mobile vs. desktop).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The browser zoom level or device pixel ratio affects the apparent size of the image.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By default, the plugin tries 1.0 (full size), then scales down in steps as low as 0.3. This wide range ensures high robustness but may increase execution time. If you know what size to expect, you can limit the array to just a few values for faster tests:&lt;br&gt;
e.g.&lt;br&gt;
&lt;code&gt;scales: [1.0, 0.95, 0.9]  // Faster but still tolerant to slight resizing&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This level of configurability helps tailor matching performance to your environment's predictability.&lt;/p&gt;
&lt;h3&gt;
  
  
  confidence:
&lt;/h3&gt;

&lt;p&gt;Set the Minimum Match Quality&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Type: number&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Default: 0.7&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The confidence setting determines the minimum similarity score required for a match to be accepted. The score ranges from 0 to 1, where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;1 means a perfect match&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;0 means no similarity&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This threshold is critical for avoiding false positives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A higher value like 0.9 ensures that only highly accurate matches are accepted—ideal for static, predictable UIs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A lower value like 0.6 can help in visually noisy or dynamically styled applications, where minor differences (e.g., shadows, gradients, or anti-aliasing) could otherwise block the match.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's how it might look in use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clickByMatchingImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;images/target.png&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.85&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the best match on screen doesn’t reach the specified confidence, the command will throw an error—indicating that no satisfactory match was found.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-World Examples
&lt;/h3&gt;

&lt;p&gt;In a lottery scratch card UI where card pieces appear in slightly different positions and sizes due to animation, you'd want a broader scale range (e.g., scales: [1.0, 0.95, 0.9, 0.85]) and a moderate confidence (confidence: 0.75).&lt;/p&gt;

&lt;p&gt;For a CAPTCHA click test, where visual accuracy is paramount, you'd use a tighter scale range and a high confidence threshold (confidence: 0.9) to avoid false clicks.&lt;/p&gt;

&lt;p&gt;In a responsive game like Whack a Guacamole, where avocados may scale down on smaller screens, a wider scale range is essential, but confidence could remain at a medium level depending on how stylized the visuals are.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;Automating canvas-based interfaces has long been a gap in the test automation landscape. With the introduction of the wdio-visual-click-service, you can now simulate human-like interactions in scenarios where DOM-based selectors fail. Whether you’re testing mini-games, dynamic visualizations, or embedded third-party tools, this plugin offers a powerful new way to bring reliability and precision to your tests.&lt;/p&gt;

&lt;p&gt;The future of UI automation isn’t just in the DOM—it’s on the screen. And with visual matching, you’re one step closer to full coverage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Repository
&lt;/h2&gt;

&lt;p&gt;You can find the source code, installation instructions, and usage examples in the GitHub repository:&lt;br&gt;
&lt;a href="https://github.com/webdriverio-community/wdio-visual-click-service" rel="noopener noreferrer"&gt;wdio-visual-click-service&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In addition, the Whack a Guacamole game example shown above can be found &lt;a href="https://github.com/webdriverio-community/wdio-visual-click-service/tree/main/example" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you're passionate about solving hard problems, building tools like this, and working with top-tier engineers, Agile Actors is hiring! Check out our &lt;a href="https://apply.workable.com/agileactors/" rel="noopener noreferrer"&gt;open positions&lt;/a&gt; and join the team.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Hands-on Monitoring and Alerting guide for Azure resources</title>
      <dc:creator>Gregory Savvidis</dc:creator>
      <pubDate>Wed, 18 Jun 2025 09:04:25 +0000</pubDate>
      <link>https://dev.to/agileactors/hands-on-monitoring-and-alerting-guide-for-azure-resources-3a12</link>
      <guid>https://dev.to/agileactors/hands-on-monitoring-and-alerting-guide-for-azure-resources-3a12</guid>
      <description>&lt;p&gt;When talking about software quality and detecting flaws early, what immediately comes to mind is writing tests and enforcing them as soon as possible in the CI/CD process. Overall, quality is about ensuring reliability throughout the entire implemented solution. This can be tightly coupled with monitoring resources, tracking performance and setting up early alerting mechanisms. By proactively detecting issues like high CPU usage, memory leaks, or slow response times, teams can prevent failures before they impact users.&lt;/p&gt;

&lt;p&gt;In this article we focus on aspects of quality that do not necessarily require writing and executing tests. Instead, we use the metrics and logs available directly in the Azure Portal and visualize them in an Azure Workbook, an interactive and customizable data visualization tool within the portal.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting the scene
&lt;/h2&gt;

&lt;p&gt;Imagine you're part of a DevOps team responsible for maintaining an application hosted on Azure. Before going to production, you want to be able to detect slowdowns and occasional service disruptions early. Without a clear picture of the system's health and performance, it's difficult to pinpoint the cause and respond quickly. This lack of visibility and proactive alerting leads to longer downtime and frustrated customers. To address this, we need a robust monitoring and alerting strategy using Azure's built-in tools - starting with identifying where the problem lies, setting up monitoring for relevant metrics and building alerting rules that help us react before users are affected.&lt;/p&gt;

&lt;p&gt;Let's say we're responsible for maintaining an Orders API, which handles incoming HTTP requests from a web frontend app to process customer orders. It's hosted on Azure App Service and backed by an Azure SQL Database, with Application Insights and/or a Log Analytics workspace enabled. Recently, support tickets have reported that requests to the &lt;code&gt;/submit-order&lt;/code&gt; endpoint occasionally take too long or fail, especially during high traffic periods.&lt;/p&gt;

&lt;p&gt;To diagnose and resolve this, we want to answer the following questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is the API experiencing high response times or failures?&lt;/li&gt;
&lt;li&gt;What's causing the slowdown - CPU/memory pressure, database latency, or something else?&lt;/li&gt;
&lt;li&gt;Would it be useful to set up alerts notifying us as soon as performance degrades?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our approach will follow these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Monitor metrics&lt;/strong&gt; to understand the API's real-time performance (e.g., response time, request count, error rate)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable Diagnostic Logs&lt;/strong&gt; to capture deeper insights into failures and long-term trends using Log Analytics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use KQL Queries&lt;/strong&gt; to investigate patterns and detect anomalies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create a Workbook&lt;/strong&gt; to visualize the data in a centralized, interactive dashboard&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Define Alerts&lt;/strong&gt; with thresholds that will notify us when performance degrades or errors spike.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This structured approach ensures we're not just reacting to problems, but actively detecting and preventing them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Monitor metrics
&lt;/h2&gt;

&lt;p&gt;To begin troubleshooting the performance issues on the &lt;code&gt;/submit-order&lt;/code&gt; endpoint, we start by examining the metrics provided by the Azure App Service that hosts our Orders API. These metrics give us a snapshot of how the application is performing in real time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Navigate to Metrics in Azure Portal
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to the Azure Portal&lt;/li&gt;
&lt;li&gt;In the search bar, type and select your &lt;strong&gt;App Service&lt;/strong&gt; (e.g., orders-api-prod)&lt;/li&gt;
&lt;li&gt;In the left-hand menu under &lt;strong&gt;Monitoring&lt;/strong&gt;, click &lt;strong&gt;Metrics&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqpid6zy0lyyrydy9wypt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqpid6zy0lyyrydy9wypt.png" alt="Image description" width="261" height="617"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After clicking on Metrics, we can choose the metric we want to monitor and see a graphical representation of it. For example, selecting Response time from the dropdown produces the following graph:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fax9nkt5nmv3ybkvcf0d2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fax9nkt5nmv3ybkvcf0d2.png" alt="Image description" width="800" height="323"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Other metrics can be utilized to address user complaints and align with our system architecture. For example we can choose from the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Server response time&lt;/strong&gt; - Tells us how long it takes to respond to HTTP requests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Requests&lt;/strong&gt; - Shows the number of incoming requests. Spikes here may correlate with performance issues&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP 5xx errors&lt;/strong&gt; - Indicates server-side errors, which can be tied to crashes or overload&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CPU Percentage&lt;/strong&gt; - Helps determine if the instance is under CPU pressure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory Working Set&lt;/strong&gt; - Tracks memory usage over time&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Monitor logs
&lt;/h2&gt;

&lt;p&gt;While metrics give us a real-time snapshot of the Orders API's performance, Application Insights and/or Log Analytics workspace logs provide a deeper and more granular view of what's actually happening inside the application. Logs can help answer questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which specific requests are failing and why?&lt;/li&gt;
&lt;li&gt;Are there specific error messages or exceptions being thrown?&lt;/li&gt;
&lt;li&gt;How is the backend database responding?&lt;/li&gt;
&lt;li&gt;What patterns can we identify over time?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Access and Explore Logs
&lt;/h2&gt;

&lt;p&gt;Once logging is enabled and data starts flowing into your workspace:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to your &lt;strong&gt;Log Analytics Workspace&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click on &lt;strong&gt;Logs&lt;/strong&gt; (under Monitoring, as in the Metrics screenshot above)&lt;/li&gt;
&lt;li&gt;In the query editor, you'll see several &lt;strong&gt;predefined tables&lt;/strong&gt; such as:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;AppRequests&lt;/code&gt; – HTTP request data (e.g., method, URL, duration)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AppExceptions&lt;/code&gt; – Exceptions thrown by your app&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AppTraces&lt;/code&gt; – Custom traces or log messages from your code&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AppDependencies&lt;/code&gt; – External calls, e.g., to databases or APIs&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In the query editor we use &lt;strong&gt;&lt;a href="https://learn.microsoft.com/en-us/kusto/query/?view=microsoft-fabric" rel="noopener noreferrer"&gt;Kusto Query Language (KQL)&lt;/a&gt;&lt;/strong&gt;, a read-only query language optimized for fast and efficient data exploration, enabling users to filter, aggregate and visualize large datasets easily.&lt;/p&gt;

&lt;p&gt;Here are a few useful KQL queries to start exploring what's happening behind the scenes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slow Requests to &lt;code&gt;/submit-order&lt;/code&gt;:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fah3crfnocyqrz4w6jy6h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fah3crfnocyqrz4w6jy6h.png" alt="Image description" width="800" height="207"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Count of Failed Requests:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcr9c6jx7nri9aro2ipll.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcr9c6jx7nri9aro2ipll.png" alt="Image description" width="600" height="136"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Top Exception Messages:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ferf89amsuw3p7idjf4fa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ferf89amsuw3p7idjf4fa.png" alt="Image description" width="594" height="128"&gt;&lt;/a&gt;&lt;/p&gt;
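&lt;p&gt;Since the queries above are only shown as screenshots, here is a hedged text sketch of what comparable queries might look like, kept as TypeScript string builders so they can be reused (for example, pasted into the query editor or a Workbook). They are not transcriptions of the screenshots: the column names (&lt;code&gt;TimeGenerated&lt;/code&gt;, &lt;code&gt;Url&lt;/code&gt;, &lt;code&gt;DurationMs&lt;/code&gt;, &lt;code&gt;Success&lt;/code&gt;, &lt;code&gt;OuterMessage&lt;/code&gt;) are assumed from the workspace-based Application Insights tables, and the 24-hour window, thresholds and function names are illustrative.&lt;/p&gt;

```typescript
// Assumed column names follow the workspace-based Application Insights tables;
// verify them against the schema shown in your own workspace.

// Requests to a given path that took longer than thresholdMs, slowest first.
function slowRequestsQuery(path: string, thresholdMs: number): string {
  return [
    'AppRequests',
    '| where TimeGenerated > ago(24h)',
    `| where Url contains "${path}"`,
    `| where DurationMs > ${thresholdMs}`,
    '| project TimeGenerated, Name, Url, DurationMs, ResultCode',
    '| order by DurationMs desc',
  ].join('\n');
}

// Number of failed requests per hour.
function failedRequestsQuery(): string {
  return [
    'AppRequests',
    '| where TimeGenerated > ago(24h)',
    '| where Success == false',
    '| summarize FailedRequests = count() by bin(TimeGenerated, 1h)',
    '| order by TimeGenerated asc',
  ].join('\n');
}

// Most frequent exception messages.
function topExceptionsQuery(limit: number): string {
  return [
    'AppExceptions',
    '| where TimeGenerated > ago(24h)',
    '| summarize Occurrences = count() by ExceptionType, OuterMessage',
    `| top ${limit} by Occurrences desc`,
  ].join('\n');
}
```

&lt;p&gt;For instance, &lt;code&gt;slowRequestsQuery('/submit-order', 2000)&lt;/code&gt; yields a query that lists requests to the endpoint slower than two seconds.&lt;/p&gt;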

&lt;h2&gt;
  
  
  Configure Diagnostic Settings
&lt;/h2&gt;

&lt;p&gt;If the AppExceptions table, or any other table you need, is not available, we can enable Diagnostic settings to send these logs to a specific &lt;strong&gt;Log Analytics Workspace&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;To start capturing logs, we need to ensure our App Service is sending data to a &lt;strong&gt;Log Analytics Workspace&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to your &lt;strong&gt;Orders API App Service&lt;/strong&gt; in the Azure Portal&lt;/li&gt;
&lt;li&gt;Under &lt;strong&gt;Monitoring&lt;/strong&gt;, click &lt;strong&gt;Diagnostic settings&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Add diagnostic setting&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Give your setting a name and check:

&lt;ul&gt;
&lt;li&gt;Application Logging&lt;/li&gt;
&lt;li&gt;Request Logs&lt;/li&gt;
&lt;li&gt;Failed request tracing&lt;/li&gt;
&lt;li&gt;AppServiceHTTPLogs&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Send to Log Analytics Workspace&lt;/strong&gt; and choose an existing workspace or create a new one&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Save&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Note: Logs can differ depending on the resource type. For App Services, HTTP logs and application logs are particularly useful.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once the Diagnostic settings are in place, the steps are the same as before: we run KQL queries against the Log Analytics workspace.&lt;/p&gt;

&lt;h2&gt;
  
  
  Workbooks
&lt;/h2&gt;

&lt;p&gt;Understanding metrics, logs, and queries is the first step in enabling Azure resource monitoring. Once this foundation is established, we can analyze individual resources by visiting them and monitoring their behavior. However, for a more comprehensive and centralized approach, it is essential to consolidate metrics and logs in a single, structured view.&lt;/p&gt;

&lt;p&gt;One of the visualization tools provided by the Azure Portal is Azure Workbooks. This feature allows users to analyze and visualize data from various Azure resources, logs, and metrics within a single, interactive interface.&lt;/p&gt;

&lt;p&gt;Creating an Azure Workbook is a straightforward process. Simply type &lt;em&gt;Azure Workbooks&lt;/em&gt; in the Azure Portal search bar, select the service, and click on the &lt;em&gt;Create&lt;/em&gt; button. From this point, users can choose to create either an empty Workbook or select from preconfigured templates that cater to common monitoring scenarios.&lt;/p&gt;

&lt;p&gt;Regardless of the option chosen, users can click on &lt;em&gt;Edit&lt;/em&gt; to customize the Workbook according to their requirements. Within the edit mode, clicking on the &lt;em&gt;Add&lt;/em&gt; button allows the inclusion of various visualization components.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkpd7924z6bnl2g4stqlp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkpd7924z6bnl2g4stqlp.png" alt="Image description" width="134" height="296"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As shown in the image above, several component types are available to make the Workbook meet our needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Text&lt;/strong&gt; - Add markdown or HTML-based text to provide descriptions, explanations, or headers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query&lt;/strong&gt; - Run Kusto Query Language (KQL) queries to fetch data from Log Analytics, Azure Resource Graph, or Application Insights&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parameters&lt;/strong&gt; - Define dropdowns, text inputs, or checkboxes to make Workbooks dynamic and interactive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Links &amp;amp; Tabs&lt;/strong&gt; - Add navigation links or tabs to switch between different sections of a Workbook&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metrics&lt;/strong&gt; - Fetch real-time Azure Metrics (e.g., CPU usage, memory utilization) and display them visually&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Group&lt;/strong&gt; - Organize content logically, making the Workbook easier to read&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can choose &lt;strong&gt;Metrics&lt;/strong&gt; where the predefined metrics (per resource) are available to be displayed or &lt;strong&gt;Query&lt;/strong&gt; where the same KQL query from before can be applied.&lt;/p&gt;

&lt;p&gt;Once the data is loaded we can choose the preferred visualization option:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Charts (area, bar, line, pie, scatter, time)&lt;/li&gt;
&lt;li&gt;Grids&lt;/li&gt;
&lt;li&gt;Tiles&lt;/li&gt;
&lt;li&gt;Stats&lt;/li&gt;
&lt;li&gt;Graphs&lt;/li&gt;
&lt;li&gt;Maps&lt;/li&gt;
&lt;li&gt;Text visualization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Custom Workbooks give both technical and non-technical people a graphical view of the resources.&lt;/p&gt;

&lt;h2&gt;
  
  
  Alerting
&lt;/h2&gt;

&lt;p&gt;Creating Alert rules is straightforward, as we can simply reuse the same metrics and/or queries that we used in our Azure Workbook. The following steps set up an alert:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click &lt;em&gt;Create Alert rule&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Under &lt;em&gt;Scope&lt;/em&gt;, select the Azure resource you want to monitor&lt;/li&gt;
&lt;li&gt;Under &lt;em&gt;Condition&lt;/em&gt;, define the metric or query condition that should trigger the alert&lt;/li&gt;
&lt;li&gt;Under &lt;em&gt;Actions&lt;/em&gt;, select or create an Action Group to define who gets notified&lt;/li&gt;
&lt;li&gt;Provide a name and severity level for the alert rule&lt;/li&gt;
&lt;li&gt;Click &lt;em&gt;Create&lt;/em&gt; to finalize the alert rule&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5wltyvmozmk4zqqugq8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5wltyvmozmk4zqqugq8.png" alt="Image description" width="760" height="279"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In conclusion, effective Monitoring and Alerting in Azure is essential for maintaining visibility, performance, and security across cloud resources. Azure Workbooks provide a centralized and interactive way to visualize metrics and logs, enabling teams to analyze data efficiently. Meanwhile, Azure Alerts ensure proactive monitoring by automatically notifying the right people and triggering automated actions when predefined conditions are met. By leveraging Action groups, organizations can streamline alert management and ensure timely responses to potential issues.&lt;/p&gt;

&lt;p&gt;Combining these tools allows for a comprehensive monitoring strategy, where teams can track, analyze, and respond to system behavior in real time. With proper Workbook customization, Alert rule configuration, and Action group management, businesses can optimize performance, reduce downtime, and enhance overall cloud reliability.&lt;/p&gt;

&lt;p&gt;In case you are looking for a dynamic and knowledge-sharing workplace that respects and encourages your personal growth as part of its own development, we invite you to explore our current &lt;a href="https://apply.workable.com/agileactors/" rel="noopener noreferrer"&gt;job opportunities&lt;/a&gt; and be part of Agile Actors.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Kubernetes cluster Tenancy and OIDC Login made easy with Capsule, Keycloak and Kubelogin</title>
      <dc:creator>Stelios Mantzouranis</dc:creator>
      <pubDate>Thu, 05 Jun 2025 08:34:27 +0000</pubDate>
      <link>https://dev.to/agileactors/kubernetes-cluster-tenancy-and-oidc-login-made-easy-with-capsule-keycloak-and-kubelogin-376p</link>
      <guid>https://dev.to/agileactors/kubernetes-cluster-tenancy-and-oidc-login-made-easy-with-capsule-keycloak-and-kubelogin-376p</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As organizations pursue greater scalability and operational efficiency, microservices have become a preferred architectural approach. This shift often leads to development teams being organized around individual microservices, with each team owning and maintaining its specific service. These microservices are typically deployed within a shared Kubernetes cluster. &lt;/p&gt;

&lt;p&gt;However, this setup can introduce logistical challenges for cluster administrators. Team members often have varying levels of familiarity with Kubernetes concepts, and developer experience can differ significantly across teams. As a result, there is a growing need to isolate each team within its own partition of the cluster while still providing them with &lt;strong&gt;API&lt;/strong&gt; access (for &lt;strong&gt;kubectl&lt;/strong&gt; or &lt;strong&gt;k9s&lt;/strong&gt;) to manage their workloads independently. &lt;/p&gt;

&lt;p&gt;In this article, you’ll learn how to partition a Kubernetes cluster into separate tenants and provide tenant administrators and users with &lt;strong&gt;API&lt;/strong&gt; access to their specific environments. This will be achieved using Capsule for multi-tenancy, Keycloak for user management, and &lt;strong&gt;kubelogin&lt;/strong&gt; for dynamic context creation. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setting up a development environment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before setting up the development environment, ensure that &lt;strong&gt;kubectl&lt;/strong&gt; and &lt;strong&gt;helm&lt;/strong&gt; are installed on your local machine. &lt;/p&gt;

&lt;p&gt;To test our solution, we’ll need a local development environment that simulates a Kubernetes cluster. There are several options available, but one of the most popular and user-friendly tools is &lt;strong&gt;Minikube&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;You can follow this guide to install Minikube: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://minikube.sigs.k8s.io/docs/start/?arch=%2Fwindows%2Fx86-64%2Fstable%2F.exe+download" rel="noopener noreferrer"&gt;Mac/Windows/Linux: Minikube official installation guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Optional&lt;/strong&gt;: It is recommended to use &lt;strong&gt;k9s&lt;/strong&gt; to easily view, edit and delete our cluster resources without typing &lt;strong&gt;kubectl&lt;/strong&gt; commands. You will find installation instructions &lt;a href="https://k9scli.io/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installing Dependencies to our Minikube cluster&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Installing Keycloak&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We can now start by deploying Keycloak, which will serve as our identity provider for managing users and authentication. &lt;/p&gt;

&lt;p&gt;We’ll use Bitnami’s &lt;strong&gt;Helm&lt;/strong&gt; chart for Keycloak, which makes the installation and configuration process straightforward.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create ns keycloak
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install keycloak bitnami/keycloak -n keycloak --set auth.adminUser=admin --set auth.adminPassword=admin123 --set postgresql.enabled=true --set postgresql.auth.postgresPassword=admin123 --set postgresql.auth.username=keycloak --set postgresql.auth.password=keycloak123 --set postgresql.auth.database=keycloak
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After a short while, verify the installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n keycloak 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Installing Capsule&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://projectcapsule.dev/" rel="noopener noreferrer"&gt;Capsule&lt;/a&gt; is a Kubernetes multi-tenancy operator that helps isolate workloads between teams while sharing the same cluster. In this step, we’ll install Capsule and configure it to recognize three specific user groups the default one &lt;strong&gt;capsule.clastix.io&lt;/strong&gt;, &lt;strong&gt;group-a&lt;/strong&gt; and &lt;strong&gt;group-b&lt;/strong&gt;.    &lt;/p&gt;

&lt;p&gt;Save the following as &lt;strong&gt;capsule-values.yaml&lt;/strong&gt;. &lt;br&gt;
This file overrides a few of the chart's default values: the cleanup TTL for the chart's kubectl jobs, the forced tenant prefix for namespaces, and the user groups that Capsule recognizes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;global:
  jobs:
    kubectl:
      ttlSecondsAfterFinished: 60
manager:
  options:
    forceTenantPrefix: true
    capsuleUserGroups: ["capsule.clastix.io", "group-a", "group-b"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install Capsule with this configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add projectcapsule https://projectcapsule.github.io/charts 
helm repo update   
kubectl create ns capsule-system 
helm install capsule projectcapsule/capsule -n capsule-system --version 0.7.4 -f capsule-values.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After a short while, verify the installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n capsule-system 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see the Capsule manager pod running in the capsule-system namespace. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Install kubelogin&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To create the kubectl contexts dynamically when authenticating via OIDC, we will need to install kubelogin, which works as a plugin for kubectl. The installation instructions can be found &lt;a href="https://github.com/int128/kubelogin" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setting up OIDC configuration with Minikube + Keycloak + Kube OIDC Login&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Install Ingress Controller in Minikube&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To give our Minikube cluster API Server an HTTPS &lt;strong&gt;OIDC_ISSUER_URL&lt;/strong&gt;, we first need to enable an ingress controller in our minikube installation.&lt;/p&gt;

&lt;p&gt;With the minikube cluster up and running, enable the ingress addon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;minikube addons enable ingress
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After a while an ingress controller will be installed in our minikube cluster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Install mkcert and create a local certificate&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/FiloSottile/mkcert" rel="noopener noreferrer"&gt;Mkcert&lt;/a&gt; is a zero-config tool that will allow us to create a local certificate.&lt;/p&gt;

&lt;p&gt;After installing we will use it in order to create a certificate for keycloak.local.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkcert -cert-file tls.crt -key-file tls.key keycloak.local 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Reconfigure Keycloak to include Ingress configuration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With the certificate "at hand" we will update our keycloak installation to include ingress configuration.&lt;/p&gt;

&lt;p&gt;But first, let us create a TLS secret for the certificate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create secret tls keycloak-tls --cert=tls.crt --key=tls.key --namespace=keycloak 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Afterwards, update the existing Keycloak release:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade keycloak bitnami/keycloak -n keycloak --set auth.adminUser=admin --set auth.adminPassword=admin123 --set postgresql.enabled=true --set postgresql.auth.postgresPassword=admin123 --set postgresql.auth.username=keycloak --set postgresql.auth.password=keycloak123 --set postgresql.auth.database=keycloak --set ingress.enabled=true --set ingress.ingressClassName=nginx --set ingress.tls=true --set ingress.extraTls[0].hosts[0]=keycloak.local --set ingress.extraTls[0].secretName=keycloak-tls
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now our Keycloak server is exposed, but our browser still needs to resolve keycloak.local to the minikube IP. That is achieved by editing the &lt;strong&gt;C:\Windows\System32\drivers\etc\hosts&lt;/strong&gt; file (or &lt;strong&gt;/etc/hosts&lt;/strong&gt; on Linux and macOS) and adding a line in the format "MINIKUBE_IP keycloak.local". You can get the &lt;strong&gt;minikube&lt;/strong&gt; IP with the following command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;minikube ip 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
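&lt;p&gt;As a minimal sketch for Linux or macOS, where the hosts file lives at /etc/hosts, the line can be built from the two values. The name &lt;code&gt;hosts_entry&lt;/code&gt; is a hypothetical helper introduced here for illustration; it is not part of minikube.&lt;/p&gt;

```shell
# Hypothetical helper: print one hosts-file line that maps an IP to a hostname.
hosts_entry() {
  printf '%s %s\n' "$1" "$2"
}
```

&lt;p&gt;Assuming you have sudo rights, something like &lt;code&gt;hosts_entry "$(minikube ip)" keycloak.local | sudo tee -a /etc/hosts&lt;/code&gt; would append the entry. On Windows, add the same line by hand.&lt;/p&gt;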



&lt;p&gt;After a brief moment you should be able to see the Keycloak login page in your browser at &lt;a href="https://keycloak.local" rel="noopener noreferrer"&gt;https://keycloak.local&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Create our test realm and user&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now that the Keycloak front end is reachable, we will use it to create our first test user (remember that the username is admin and the password is admin123). First we need a test realm: navigate to &lt;strong&gt;Manage realms&lt;/strong&gt;&amp;gt;&lt;strong&gt;Create Realm&lt;/strong&gt; and fill out the form:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14k0pa84ui841ut5qxg3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14k0pa84ui841ut5qxg3.png" alt="Screenshot of keycloak realm creation" width="800" height="559"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Afterwards we will navigate to &lt;strong&gt;Users&lt;/strong&gt;&amp;gt;&lt;strong&gt;Add User&lt;/strong&gt; and submit the creation form as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fljcrx8kp8ngong4j3pst.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fljcrx8kp8ngong4j3pst.png" alt="Screenshot of keycloak user creation" width="800" height="576"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Having done that, we need to configure a password for our user by navigating to &lt;strong&gt;Users&lt;/strong&gt;&amp;gt;&lt;strong&gt;Our user&lt;/strong&gt;&amp;gt;&lt;strong&gt;Credentials&lt;/strong&gt;&amp;gt;&lt;strong&gt;Set Password&lt;/strong&gt;, where we will add our password as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F12fwzr6an0n0qmva2c31.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F12fwzr6an0n0qmva2c31.png" alt="Screenshot of password creation in keycloak" width="710" height="397"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important Notice&lt;/strong&gt;: Keycloak is a very active project and these instructions may be outdated at time of reading.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Create a Kubernetes client&lt;/strong&gt;&lt;br&gt;
In Keycloak, a client represents an application or service that wants to authenticate users or access protected resources.&lt;br&gt;
Clients can be web applications, mobile apps, APIs, or any system that needs to integrate with Keycloak for authentication and authorization. Each client is configured with specific settings like redirect URIs, authentication flows, and access permissions that define how it can interact with Keycloak's identity and access management features.&lt;br&gt;
So we will create a client named kubernetes. By clicking on &lt;strong&gt;Clients&lt;/strong&gt;&amp;gt;&lt;strong&gt;Create Client&lt;/strong&gt; we will create the client as follows, page by page.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fighf348iarxhedr6i01i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fighf348iarxhedr6i01i.png" alt=" " width="800" height="325"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdh3rj8bkxkyu6vq1no5h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdh3rj8bkxkyu6vq1no5h.png" alt=" " width="800" height="273"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4nh8gdbwb6nt9ttx7o1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4nh8gdbwb6nt9ttx7o1.png" alt=" " width="800" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Create a Kubernetes client dedicated mapper&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A Keycloak mapper dedicated to one client is a configuration that defines how user data (like roles, attributes, or groups) is included in tokens only for a specific client. It customizes the token content that the client receives, without affecting others.&lt;br&gt;
First of all, we need to navigate to &lt;strong&gt;Clients&lt;/strong&gt;&amp;gt;&lt;strong&gt;kubernetes&lt;/strong&gt;&amp;gt;&lt;strong&gt;Client scopes&lt;/strong&gt;&amp;gt;&lt;strong&gt;kubernetes-dedicated&lt;/strong&gt;&amp;gt;&lt;strong&gt;Configure a new mapper&lt;/strong&gt;. There, select &lt;strong&gt;group membership&lt;/strong&gt; and fill it out as follows:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp2uznv07utvx2vikw5oj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp2uznv07utvx2vikw5oj.png" alt="Screenshot of keycloak app" width="800" height="495"&gt;&lt;/a&gt;&lt;br&gt;
Afterwards we will repeat the process and select &lt;strong&gt;audience&lt;/strong&gt; and fill it out as follows:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frgtxa2brng6nvbmlwv0r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frgtxa2brng6nvbmlwv0r.png" alt="screenshot of keycloak app" width="800" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Test our user and client setup&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In order to execute this step we will need first to export some variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export KEYCLOAK=keycloak.local
export REALM=demo
export OIDC_ISSUER=${KEYCLOAK}/realms/${REALM}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
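&lt;p&gt;Note that &lt;strong&gt;OIDC_ISSUER&lt;/strong&gt; above carries no scheme, while the API server and kubelogin later expect the full https URL. The sketch below derives the full URLs from a host and a realm. The function names are hypothetical helpers introduced here; the /realms/REALM and /protocol/openid-connect/token paths are the ones used in this article, and /.well-known/openid-configuration is the standard OpenID Connect discovery path.&lt;/p&gt;

```shell
# Hypothetical helpers that derive Keycloak OIDC URLs from a host and a realm.

# Issuer URL, as later passed to the API server and to kubelogin.
oidc_issuer_url() {
  printf 'https://%s/realms/%s\n' "$1" "$2"
}

# Standard OpenID Connect discovery document, relative to the issuer.
oidc_discovery_url() {
  printf '%s/.well-known/openid-configuration\n' "$(oidc_issuer_url "$1" "$2")"
}

# Token endpoint, as called by the curl command in this step.
oidc_token_url() {
  printf '%s/protocol/openid-connect/token\n' "$(oidc_issuer_url "$1" "$2")"
}
```

&lt;p&gt;For example, &lt;code&gt;curl -k -s "$(oidc_discovery_url "$KEYCLOAK" "$REALM")"&lt;/code&gt; should return a JSON document whose issuer field matches the URL that the API server will be configured with.&lt;/p&gt;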



&lt;p&gt;Then execute the command below. You can find the client secret by navigating to &lt;strong&gt;Clients&lt;/strong&gt; &amp;gt; &lt;strong&gt;kubernetes&lt;/strong&gt; &amp;gt; &lt;strong&gt;Credentials&lt;/strong&gt;; copy it and export it as &lt;strong&gt;OIDC_CLIENT_SECRET&lt;/strong&gt; first, since the command reads it from that variable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -k -s https://${OIDC_ISSUER}/protocol/openid-connect/token \
     -d grant_type=password \
     -d response_type=id_token \
     -d scope=openid \
     -d client_id=kubernetes \
     -d client_secret=${OIDC_CLIENT_SECRET} \
     -d username=test \
     -d password=test | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The expected result is like the one below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"access_token":"**token gibberish**","not-before-policy":0,"session_state":"e9cfe1a8-5d84-41db-a2ef-0cac8aa7787d","scope":"openid email audience groups profile"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: It is critical that you see groups and audience in the scope field of the response. We will leverage this info later for Capsule integration.&lt;/p&gt;
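&lt;p&gt;To look inside the token itself, a JWT payload can be decoded locally. The following is a minimal sketch, not part of kubelogin or Keycloak: &lt;code&gt;jwt_payload&lt;/code&gt; is a hypothetical helper, it assumes a &lt;code&gt;base64&lt;/code&gt; command that accepts &lt;code&gt;-d&lt;/code&gt; (GNU coreutils and recent macOS), and it does not verify the signature.&lt;/p&gt;

```shell
# Hypothetical helper: print the JSON payload (claims) of a JWT.
# The body runs in a subshell so its variables do not leak into the caller.
# The signature is NOT verified; this is for inspection only.
jwt_payload() (
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the trailing "=" padding, so restore it before decoding.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
)
```

&lt;p&gt;Assuming ACCESS_TOKEN holds the access_token field from the response above, &lt;code&gt;jwt_payload "$ACCESS_TOKEN" | jq '{aud, groups}'&lt;/code&gt; would show the two claims that the API server is configured to read later on, provided the mappers add them to that token.&lt;/p&gt;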

&lt;p&gt;&lt;strong&gt;8. Configure Minikube API Server to use our Keycloak server as its OIDC Issuer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To authenticate our users based on Keycloak's response, we need to make our Kube API server trust Keycloak.&lt;/p&gt;

&lt;p&gt;First things first we will need to create a custom directory in our &lt;strong&gt;minikube&lt;/strong&gt; node.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;minikube ssh -- sudo mkdir -p /var/lib/minikube/certs/custom
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, we need to copy the tls.crt file, which we used as a certificate, to our &lt;strong&gt;minikube&lt;/strong&gt; node.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;minikube cp /path/to/tls.crt /var/lib/minikube/certs/custom/tls.crt 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally we will restart our &lt;strong&gt;minikube&lt;/strong&gt; cluster with our new configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;minikube start --extra-config=apiserver.oidc-issuer-url=https://keycloak.local/realms/demo --extra-config=apiserver.oidc-username-claim=preferred_username --extra-config=apiserver.oidc-ca-file=/var/lib/minikube/certs/custom/tls.crt --extra-config=apiserver.oidc-groups-claim=groups --extra-config=apiserver.oidc-username-prefix=- --extra-config=apiserver.oidc-client-id=kubernetes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
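&lt;p&gt;The command above is long because every API server option is wrapped in the same &lt;code&gt;--extra-config=apiserver.KEY=VALUE&lt;/code&gt; form. As a readability sketch only, a small helper can build each flag; &lt;code&gt;apiserver_flag&lt;/code&gt; is a hypothetical name introduced here, not a minikube feature.&lt;/p&gt;

```shell
# Hypothetical helper: format one API server option as a minikube start flag.
apiserver_flag() {
  printf '%s%s=%s\n' '--extra-config=apiserver.' "$1" "$2"
}
```

&lt;p&gt;For instance, &lt;code&gt;apiserver_flag oidc-client-id kubernetes&lt;/code&gt; prints the last flag of the command above, so a script could assemble the full list one option per line before calling &lt;code&gt;minikube start&lt;/code&gt;.&lt;/p&gt;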



&lt;p&gt;For more details on OpenID Connect authentication with minikube, see the tutorial &lt;a href="https://minikube.sigs.k8s.io/docs/tutorials/openid_connect_auth/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;9. Connect to the cluster via kube oidc login&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now it is time to validate that we can log in to our cluster through Keycloak using kubectl oidc-login.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl oidc-login setup --oidc-issuer-url=https://keycloak.local/realms/demo --oidc-client-id=kubernetes --oidc-client-secret=$OIDC_CLIENT_SECRET --certificate-authority=./tls.crt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you were prompted to visit localhost:8000 and authenticated with username test and password test, then congratulations: you have successfully connected your kubectl to the minikube cluster via Keycloak. That is great, but we are not done yet. Now it is time to set up the tenancy side of things.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Configuring Cluster Tenancy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Back when we configured Capsule we specified three different &lt;strong&gt;capsuleUserGroups&lt;/strong&gt; in our YAML configuration (&lt;strong&gt;capsule-values.yaml&lt;/strong&gt;).&lt;br&gt;
These three groups are the key to partitioning the cluster, so we will leverage them to complete our endeavour.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Create Keycloak User Groups&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These three groups should exist not only in Capsule but also in Keycloak, so we will navigate to &lt;strong&gt;Groups&lt;/strong&gt;&amp;gt;&lt;strong&gt;Create Group&lt;/strong&gt; and create a group called &lt;strong&gt;capsule.clastix.io&lt;/strong&gt;. After creating the group we will click &lt;strong&gt;capsule.clastix.io&lt;/strong&gt; and create two &lt;strong&gt;child&lt;/strong&gt; groups, one called &lt;strong&gt;group-a&lt;/strong&gt; and one called &lt;strong&gt;group-b&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Create Capsule Tenants&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A tenant is &lt;strong&gt;Capsule&lt;/strong&gt;'s way of partitioning the cluster and designating partition (tenant) admins. More information about the kubernetes resource can be found &lt;a href="https://projectcapsule.dev/docs/tenants/" rel="noopener noreferrer"&gt;here&lt;/a&gt;. We will create two tenants one called &lt;strong&gt;group-a&lt;/strong&gt; and one called &lt;strong&gt;group-b&lt;/strong&gt;. Copy the code blocks below into a yaml file and then use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f /path/to/file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: group-a
spec:
  owners:
  - name: group-a
    kind: Group
---
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: group-b
spec:
  owners:
  - name: group-b
    kind: Group
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
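&lt;p&gt;Since the two tenants differ only by name, the manifests can also be generated. This is a minimal sketch: &lt;code&gt;tenant_manifest&lt;/code&gt; is a hypothetical helper, and it assumes, as in the YAML above, that each tenant is owned by a group with the same name.&lt;/p&gt;

```shell
# Hypothetical helper: print a Capsule Tenant manifest whose owner is a
# group with the same name as the tenant (mirrors the YAML shown above).
tenant_manifest() {
  printf '%s\n' \
    '---' \
    'apiVersion: capsule.clastix.io/v1beta2' \
    'kind: Tenant' \
    'metadata:' \
    "  name: $1" \
    'spec:' \
    '  owners:' \
    "  - name: $1" \
    '    kind: Group'
}
```

&lt;p&gt;Piping the output into kubectl, for example &lt;code&gt;for t in group-a group-b; do tenant_manifest "$t"; done | kubectl apply -f -&lt;/code&gt;, should be equivalent to applying the file by hand.&lt;/p&gt;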



&lt;p&gt;&lt;strong&gt;3. Create our tenant admins&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our test user has proven invaluable so far, but we will need to create two more users in our keycloak demo realm. You can follow the exact same process for our new users with the only addition being that you can make them join groups on the user creation form. Choose the group that corresponds to their name accordingly. The article will be referencing the two new users from now on as &lt;strong&gt;group-a-admin&lt;/strong&gt; and &lt;strong&gt;group-b-admin&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Login as group-a admin&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To log in as the group-a tenant admin, initiate the OIDC login process from your terminal with the same command as before. On the login page, use the &lt;strong&gt;group-a-admin&lt;/strong&gt; credentials to log in. kubelogin will then prompt you to run the following command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl config set-credentials oidc \
  --exec-api-version=client.authentication.k8s.io/v1 \
  --exec-interactive-mode=Never \
  --exec-command=kubectl \
  --exec-arg=oidc-login \
  --exec-arg=get-token \
  --exec-arg="--oidc-issuer-url=https://keycloak.local/realms/demo" \
  --exec-arg="--oidc-client-id=kubernetes" \
  --exec-arg="--oidc-client-secret=mVBu9OyoBX6YPmuD0TgwZtNRHKjNAoc9" \
  --exec-arg="--certificate-authority=./tls.crt"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command sets up credentials for a kubectl user named oidc, which we can use to log in as any Keycloak user whose credentials we have. But before testing it we need to configure our kubectl context. Here is the kubectl command to configure it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl config set-context oidc@minikube --cluster='minikube'  --namespace='default' --user='oidc'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify the login by first changing your kubectl context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl config use-context oidc@minikube 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create ns test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the result was the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error from server (Forbidden): admission webhook "namespaces.projectcapsule.dev" denied the request: The namespace doesn't match the tenant prefix, expected group-a-test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then congratulations: Capsule recognized you as a group-a tenant owner and enforced the tenant prefix, which means tenancy is configured in the cluster. &lt;/p&gt;
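&lt;p&gt;The rule behind that error can be sketched as a simple check. This assumes, as the error message suggests, that with &lt;strong&gt;forceTenantPrefix&lt;/strong&gt; enabled a namespace name must start with the tenant name followed by a dash. Both function names are hypothetical helpers, not Capsule commands.&lt;/p&gt;

```shell
# Hypothetical helper: build a namespace name that carries the tenant prefix.
tenant_namespace() {
  printf '%s-%s\n' "$1" "$2"
}

# Hypothetical helper: succeed only if namespace $2 starts with "$1-"
# and has at least one character after the dash.
has_tenant_prefix() {
  case "$2" in
    "$1"-?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

&lt;p&gt;With these, &lt;code&gt;has_tenant_prefix group-a test&lt;/code&gt; fails, while &lt;code&gt;tenant_namespace group-a test&lt;/code&gt; prints group-a-test, the name that the webhook asked for.&lt;/p&gt;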

&lt;p&gt;&lt;strong&gt;Experimenting with our solution&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;First of all, let us start by creating a group-a tenant namespace.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create ns group-a-test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, let us try creating an nginx deployment in our new group-a namespace.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create deployment test-deployment --image=nginx -n group-a-test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Awesome! Now let us see whether another tenant can interact with our nginx deployment in the group-a tenant. Use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl oidc-login clean
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This removes your token and session from kubectl. Next, go to &lt;strong&gt;Sessions&lt;/strong&gt; in Keycloak and remove the existing group-a session.&lt;br&gt;
Now execute the following command, which will prompt you to log in again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Log in as the group-b tenant admin and try the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete deployment test-deployment -n group-a-test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you get the following error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error from server (Forbidden): deployments.apps "test-deployment" is forbidden: User "group-b" cannot delete resource "deployments" in API group "apps" in the namespace "group-a-test"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;then the tenancy has been successfully set up. Now the possibilities are endless. You can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create many tenants and many admins.&lt;/li&gt;
&lt;li&gt;Make users part of many groups.&lt;/li&gt;
&lt;li&gt;Create tenant admins that are service accounts for automation pipelines.&lt;/li&gt;
&lt;li&gt;Create cluster-wide admin groups.&lt;/li&gt;
&lt;li&gt;Create different roles that tenant owners will adopt to restrict permissions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This solution may be a little configuration-heavy, but once set up it is as pliable as Play-Doh. So have fun experimenting!&lt;/p&gt;

&lt;p&gt;In case you are looking for an environment where learning and experimenting with new solutions is key, we invite you to explore our &lt;a href="https://apply.workable.com/agileactors/" rel="noopener noreferrer"&gt;current job opportunities&lt;/a&gt; and be part of Agile Actors.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>development</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>ECESCON 15 years later…</title>
      <dc:creator>Kostas Sidiropoulos</dc:creator>
      <pubDate>Wed, 28 May 2025 06:44:19 +0000</pubDate>
      <link>https://dev.to/agileactors/ecescon-15-years-later-1of9</link>
      <guid>https://dev.to/agileactors/ecescon-15-years-later-1of9</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx1ui1u99vb4457hndyo4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx1ui1u99vb4457hndyo4.png" width="700" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It’s been a few years (maybe not so few…) since I entered the software development industry and started out on a dynamic career that took me from developing software to creating tests that would challenge it to the breaking point! There have been many twists and turns in my journey, as expected, but one thing has remained constant throughout: my appreciation for the ever-evolving technologies that drive us forward.&lt;/p&gt;

&lt;p&gt;I can recall my student years filled with curiosity, working on different projects, and attending meetups and conferences to keep up with what was going on in the industry, one of which was EESTEC.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Allow me to take you on a journey along the timeline of my EESTEC experience!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;May 2007:&lt;/strong&gt; I had come across a new conference. It was the first time that the Electrical and Computer Engineering Students Conference was taking place, and I remember thinking how it was unlike any other. OK, I had attended conferences before, mainly ones organised by the Hellenic Telecommunications and Post Commission, but this was something different. It was organised by people like us, by students, who wanted to see how science and technology were evolving and what was coming next for them (me included, of course). It was a relatively small venue, where a group of friends from university and I (I was a student at the National Technical University of Athens at the time) were waiting in a conference room for the conference to start. The presentations were delivered by professors of the field. I can’t recall the topics, but what I do remember is the feeling that the experience left me with!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Over a decade and a pandemic later…&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;April 2022:&lt;/strong&gt; It’s been a long time since that May, and I’m currently the Chapter Lead of Software Engineers in Test and Infrastructure at Agile Actors, living out my early professional dreams of working on exciting projects, implementing the newest technologies and staying at the forefront of the trends. I’m sitting in the office and our communications officer, Reem, tells me that ECESCon Patra is going to take place and we are sponsoring the event. Oh, dear! I travelled back 15 years. Without hesitation, I volunteered to be actively involved, and on a Friday morning Alexis, our Chapter Lead of Full stack in Java, Maria, part of our Talent Acquisition team, and I left for 3 days. I was instantly wondering what new graduates are looking for and what we should focus on, but eventually everything came to me naturally. What was I looking for as an attendee 15 years ago? To see the future of the industry, what’s going on in the market, what the latest trends are and what technologies and disciplines to focus on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log day 1:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We arrived in the afternoon but got the chance to meet students and have very interesting discussions with them. Most questions were about internship opportunities, how a graduate can start their career and how Agile Actors can help. We got the opportunity to explain our unique model and received valuable feedback in the process. We felt their passion for programming and wanted to see how we could help them in their development, discussing what they want to do and how we can help them start their journey. And that was just the beginning, as we had a live demo of a developer’s typical day planned for the next day!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log day 2:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Today was the big day! We had decided to present something hands-on, and what better than to start implementing a Java Spring Boot backend service from scratch. Alexis started implementing the service whilst I was implementing the tests. In what was a very real scenario, he started modelling DB entities, at which point I had to stop him… Why? We had discussed nothing about requirements, specs or acceptance criteria. In actual fact, we didn’t know WHAT to build. And that’s where our discussion started, to give more insight into what we wanted to implement. This was also probably the most interesting thing that someone could take away from the session.&lt;/p&gt;

&lt;p&gt;There is no framework or library that can do the trick and be the absolute solution to our problems. Nothing can replace communication, and at the end of the day this is the key success factor for delivering and for having a good time in the process! Through a 3-hour session, we tried to incorporate and present most of what happens during a typical day on the job.&lt;/p&gt;

&lt;p&gt;We got very interesting questions and started discussing them with attendees. Suddenly I found myself in a time warp yet again, thinking about how at some point I was in their shoes, joining the discussion on the presentation with questions and queries, eager to learn as much as I could. We went on to explain how our coaching and mentoring model works and focused a lot on our external coaching model, which could potentially be a good fit for a lot of the graduates and open up career opportunities for them. This was of significant interest to the students, as a candidate is taken on by our engineers, who coach and mentor them specifically for an internal position, ultimately leading to an employment opportunity when successful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log day 3:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Final day, and after a weekend filled with interesting discussions with students, it was about time to return to base. What a great experience! Discussing with new joiners in the market and potential new colleagues was very interesting and refreshing. Would I go again? Definitely yes! In the meantime, I am waiting to see the attendees again, but now as new colleagues!&lt;/p&gt;

&lt;p&gt;If you want to join the discussion within our Team check out our openings &lt;a href="https://apply.workable.com/agileactors/" rel="noopener noreferrer"&gt;here&lt;/a&gt; and apply today!&lt;/p&gt;

&lt;p&gt;Til next time!&lt;/p&gt;

</description>
      <category>javascript</category>
    </item>
    <item>
      <title>Measure the Quality of Your Tests with Mutation Testing</title>
      <dc:creator>Kostas Sidiropoulos</dc:creator>
      <pubDate>Wed, 28 May 2025 06:43:58 +0000</pubDate>
      <link>https://dev.to/agileactors/measure-the-quality-of-your-tests-with-mutation-testing-1bcd</link>
      <guid>https://dev.to/agileactors/measure-the-quality-of-your-tests-with-mutation-testing-1bcd</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhp6lq6cpscl3c5icw1m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhp6lq6cpscl3c5icw1m.png" width="700" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Measure the Quality of Your Tests with Mutation Testing
&lt;/h1&gt;

&lt;p&gt;Unit testing is an integral part of the current development process. Especially with the introduction of Continuous Delivery and Continuous Deployment, as well as the business need to deliver as quickly as possible, we need a way to deliver software quickly with the best possible quality. Unit tests help with this by offering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Less time performing integration tests&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Protection against regression&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Executable documentation&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Less coupled code&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once we start implementing unit tests, the first question that arises is how we can measure their quality. It is actually more of a philosophical question.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If tests are the guardians, then who guards the guardians?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In other words, what metrics should I use to measure the quality of my tests and ideally how can I do it in an automated way?&lt;/p&gt;

&lt;h1&gt;
  
  
  Mutation Testing
&lt;/h1&gt;

&lt;p&gt;The most traditional way to measure the quality of tests is code coverage. This is the most commonly used method, and its main purpose is to find how many lines and branches of our code our unit tests have executed. And this is where the problem starts.&lt;/p&gt;

&lt;p&gt;Code coverage does not check that your tests are actually able to detect faults in the executed code. It is therefore only able to identify code that is definitely not tested.&lt;/p&gt;

&lt;p&gt;For example, assertless tests (tests that only perform actions on the system under test (SUT) without any assertion, so they are not actual tests) cannot be identified with code coverage tools. These are extreme cases (although some would disagree with that statement), but there are still other issues that might exist.&lt;/p&gt;

&lt;p&gt;Let’s take a look at the examples below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl8dqg5k23uks84tlf2es.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl8dqg5k23uks84tlf2es.png" width="474" height="161"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are some unit tests that will check it:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqu9va7paeswzplsqau0b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqu9va7paeswzplsqau0b.png" width="429" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the above example, although we check that calls to collaborator classes are made, we do NOT check the return value. However, code coverage tools will show 100% coverage, because the tests execute the code but do not actually test it! Even if this seems a straightforward example, in more complex cases it could create issues.&lt;/p&gt;

&lt;p&gt;So, mutation testing comes to the rescue. It is not a new technique; it is actually older than JUnit, but it was mainly confined to academia. Now, after some appropriate adaptations, it has started gaining popularity in the industry.&lt;/p&gt;

&lt;p&gt;Mutation testing is a simple technique to evaluate the quality of your unit tests. It is a form of failure injection testing. Faults (or mutations) are automatically seeded into your code, then your tests are run. If your tests fail, the mutation is killed; if your tests pass, the mutation lives.&lt;/p&gt;
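
&lt;p&gt;To make the killed/lives distinction concrete, here is a toy sketch in Python. It is not how any real mutation testing tool is implemented: the mutant is written by hand instead of being seeded automatically, and all names (&lt;code&gt;is_adult&lt;/code&gt;, &lt;code&gt;suite_passes&lt;/code&gt;, &lt;code&gt;mutant_status&lt;/code&gt;) are hypothetical. A test suite without the boundary case lets the mutant live, while adding that case kills it:&lt;/p&gt;

```python
def is_adult(age):
    return age >= 18


def is_adult_mutant(age):
    # Seeded fault: the boundary operator >= was changed to >.
    return age > 18


def suite_passes(implementation, cases):
    # A "test suite" here is just a list of (input, expected) pairs.
    return all(implementation(arg) == expected for arg, expected in cases)


def mutant_status(cases):
    # The suite must be green on the original code before mutating it.
    assert suite_passes(is_adult, cases)
    return "lives" if suite_passes(is_adult_mutant, cases) else "killed"


weak_cases = [(30, True), (5, False)]
strong_cases = weak_cases + [(18, True)]

print(mutant_status(weak_cases))    # lives
print(mutant_status(strong_cases))  # killed
```

&lt;p&gt;Both suites execute every line of &lt;code&gt;is_adult&lt;/code&gt;, so coverage is the same; only the mutant reveals that the weak suite never checks the boundary.&lt;/p&gt;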

&lt;h1&gt;
  
  
  PITest
&lt;/h1&gt;

&lt;p&gt;&lt;a href="http://pitest.org" rel="noopener noreferrer"&gt;PITest&lt;/a&gt; is the best-known mutation testing tool for Java-based applications. The main reasons to choose it are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  PIT is fast — can analyse in minutes what would take earlier systems days&lt;/li&gt;
&lt;li&gt;  PIT is easy to use — works with ant, maven, gradle and others&lt;/li&gt;
&lt;li&gt;  PIT is actively developed&lt;/li&gt;
&lt;li&gt;  PIT is actively supported&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Usage
&lt;/h1&gt;

&lt;p&gt;We add the PITest Maven plugin to our pom.xml like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg8hbpo7t1vyzejzwhevu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg8hbpo7t1vyzejzwhevu.png" width="422" height="338"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In excludedClasses we add the classes that we don’t want to mutate. In excludedTestClasses we exclude the tests that we don’t want to run (e.g. functional tests).&lt;/p&gt;

&lt;p&gt;In order to run mutation coverage, run &lt;em&gt;mvn org.pitest:pitest-maven:mutationCoverage&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Stryker
&lt;/h1&gt;

&lt;p&gt;In the JavaScript and C# world, the best-known tool is &lt;a href="https://stryker-mutator.io/" rel="noopener noreferrer"&gt;Stryker&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Stryker is easy to use&lt;/li&gt;
&lt;li&gt;  Stryker is actively developed&lt;/li&gt;
&lt;li&gt;  Stryker is actively supported&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In conclusion, automated tests are a great tool that can help our development team deliver faster and with higher quality. But these tests should also be treated in the same way as production code and have clear quantitative metrics that show us their quality. Mutation testing techniques are the best ones we currently have for this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffrorkc8jbu7r564ujpyb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffrorkc8jbu7r564ujpyb.png" width="700" height="132"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
    </item>
  </channel>
</rss>
