<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Romina Elena Mendez Escobar</title>
    <description>The latest articles on DEV Community by Romina Elena Mendez Escobar (@r_elena_mendez_escobar).</description>
    <link>https://dev.to/r_elena_mendez_escobar</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F719582%2F2d700dae-2335-4c2f-9a32-4435184a4f4f.jpeg</url>
      <title>DEV Community: Romina Elena Mendez Escobar</title>
      <link>https://dev.to/r_elena_mendez_escobar</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/r_elena_mendez_escobar"/>
    <language>en</language>
    <item>
      <title>Apple WWDC 2026: Apple Intelligence, Privacy-First AI, and the Future of Digital Experiences</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Wed, 10 Jun 2026 17:25:47 +0000</pubDate>
      <link>https://dev.to/r_elena_mendez_escobar/apple-wwdc-2026-when-ai-stops-being-a-feature-and-becomes-the-foundation-d6i</link>
      <guid>https://dev.to/r_elena_mendez_escobar/apple-wwdc-2026-when-ai-stops-being-a-feature-and-becomes-the-foundation-d6i</guid>
      <description>&lt;p&gt;At &lt;strong&gt;WWDC 2026 (its annual developers conference)&lt;/strong&gt;, &lt;strong&gt;Apple&lt;/strong&gt; announced a new generation of Apple Intelligence, a completely redesigned Siri, and a host of AI-powered features across its ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F29cy1fu27qc3br99iyb9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F29cy1fu27qc3br99iyb9.png" alt=" " width="800" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What was interesting was observing how &lt;strong&gt;Apple&lt;/strong&gt; is building a vision where design, privacy, accessibility, and artificial intelligence are no longer separate issues, but rather integrated into a single product strategy.&lt;br&gt;
Between the evolution of &lt;code&gt;Liquid Glass&lt;/code&gt;, an AI architecture that prioritizes local data processing, new parental control tools, and features that expand accessibility through natural language and visual understanding, &lt;strong&gt;WWDC 2026&lt;/strong&gt; offered several clues about where the next generation of digital products might be headed.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Design Is Never Finished: The Evolution of Liquid Glass
&lt;/h2&gt;

&lt;p&gt;Last year, Apple introduced Liquid Glass as one of the most ambitious visual changes in its ecosystem. This year, they did something that we often forget in technology: they iterated on it.&lt;br&gt;
Apple explicitly acknowledged feedback from users and developers, adjusting aspects related to legibility, contrast, visual depth, and customization. The addition of transparency controls and the refinement of core components reflects a practice we see in the most successful digital products: &lt;strong&gt;ship, observe, learn, evolve&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key updates in this iteration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ &lt;strong&gt;Improved legibility&lt;/strong&gt;: the system now diffuses complex content behind surfaces differently, creating greater separation and visual depth between interface layers&lt;/li&gt;
&lt;li&gt; 〰️ &lt;strong&gt;Customization slider&lt;/strong&gt;: users can now adjust Liquid Glass appearance from ultra-clear to fully tinted directly in Settings&lt;/li&gt;
&lt;li&gt; 〰️ &lt;strong&gt;Icon depth&lt;/strong&gt;: additional Liquid Glass layers are now integrated directly into app icon artwork for sharper definition in the dock and home screen&lt;/li&gt;
&lt;li&gt; 〰️ &lt;strong&gt;Structural consistency&lt;/strong&gt;: sidebars expand to window edges to reduce distractions while maintaining the characteristic glass refractions; window corners use a tighter radius for visual coherence
The evolution of Liquid Glass tells a broader story: design at this scale is not a launch event, it's a continuous discipline.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rzxuie61khtlu1sv98k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rzxuie61khtlu1sv98k.png" alt=" " width="800" height="670"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Privacy-First: An Architecture That Could Change How We Build AI
&lt;/h2&gt;

&lt;p&gt;This is arguably the most relevant part of WWDC 2026 for those of us who work in technology.&lt;/p&gt;

&lt;p&gt;Apple reinforced an idea it has been building for years: &lt;strong&gt;privacy in AI is not a marketing slogan… it's a non-negotiable architectural feature&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The new generation of Apple Intelligence combines on-device processing with Private Cloud Compute, allowing most personal information to remain on the device. This opens important questions for architects, data engineers, and developers — because for years we assumed AI required sending large amounts of data to centralized services to function. Apple is exploring a different path: bringing intelligence closer to where the data actually lives.&lt;/p&gt;

&lt;p&gt;The shift is conceptual as much as technical. Privacy stops being a constraint and becomes a design principle.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the architecture works
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Private Cloud Compute (PCC)&lt;/strong&gt; is the key piece for data engineers. Complex requests are sent to servers running Apple Silicon, with a guarantee that data is never stored or accessible (not even to Apple)  and that this promise can be verified by independent experts at any time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxb28qmrhrmnozbgox7c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxb28qmrhrmnozbgox7c.png" alt=" " width="800" height="670"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;•••&lt;/center&gt;

&lt;h3&gt;
  
  
  Real examples of this in practice
&lt;/h3&gt;

&lt;p&gt;〰️🧭 &lt;strong&gt;Safari&lt;/strong&gt;: unlike other AI-powered browsers, Safari does not share sensitive browsing history with anyone&lt;br&gt;
 〰️ 📱&lt;strong&gt;Phone app:&lt;/strong&gt;  the "Call Context" feature analyzes who is calling and searches for relevant information across your apps, but the entire process runs on-device with nothing shared externally&lt;br&gt;
 〰️🏞️ &lt;strong&gt;Image Playground:&lt;/strong&gt;  even when powerful cloud models generate photorealistic images, personal photos used as reference are never stored or shared&lt;br&gt;
For sectors like healthcare, finance, or legal — this model matters. On-device AI inference at this quality level removes one of the core objections to AI adoption: data residency and third-party access. When the compute moves to the device and the results are verifiable, many compliance conversations change.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Technology to Learn With, Not Just to Consume: Child Safety Reimagined
&lt;/h2&gt;

&lt;p&gt;One of the announcements that caught my attention most was the focus on children and teenagers.&lt;br&gt;
The conversation about kids and devices often reduces to a single question: are they good or bad? Apple proposes a more interesting frame. The question becomes how to use them safely and intentionally.&lt;br&gt;
What Apple introduced is not just parental controls — it's a framework for intentional digital access, built on recommendations from child development experts and organizations including the American Academy of Pediatrics.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key tools announced
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;〰️🚸 &lt;strong&gt;Child Account&lt;/strong&gt;:  the first step: activates age-appropriate safeguards automatically, including adult site blocking and App Store media restrictions&lt;/li&gt;
&lt;li&gt;〰️? &lt;strong&gt;Ask to Browse&lt;/strong&gt;: extends the "Ask to Buy" model to web navigation; children request permission to visit new sites, parents review and approve from their own devices&lt;/li&gt;
&lt;li&gt; 〰️✔️ &lt;strong&gt;Contact approval&lt;/strong&gt;: parents must approve any new contact added within apps&lt;/li&gt;
&lt;li&gt; 〰️💬 &lt;strong&gt;Communication Safety&lt;/strong&gt;: proactive intervention before children see violent or graphic content in shared images and videos, in addition to existing nudity protection&lt;/li&gt;
&lt;li&gt; 〰️⏰ &lt;strong&gt;Time Allowances&lt;/strong&gt;: daily recommendations by category (games, social media, entertainment) based on the child's age and validated by the AAP — with parents retaining full control to adjust&lt;/li&gt;
&lt;li&gt; 〰️🗓️ &lt;strong&gt;Schedules&lt;/strong&gt;: define which apps are available during school hours to reduce distraction and support focus
The goal is not to take the device away. It's to make it useful, safe, and age-appropriate. A child can use educational apps during school time, creative apps in the afternoon, and have social apps gated for weekends.
### APIs for developers
Apple extended this safety ecosystem to third parties:&lt;/li&gt;
&lt;li&gt;〰️ Declared Age Range API — lets developers privately adapt their app experience to the child's verified age range, without exposing the actual age&lt;/li&gt;
&lt;li&gt; 〰️ Contact control resources — tools to ensure parents approve new contacts within third-party applications
The underlying principle resonates beyond child safety: the device is not the problem. The design of access is.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  4. Siri AI, Visual Intelligence, and AI-Powered Accessibility
&lt;/h2&gt;

&lt;p&gt;The new Siri is probably the most visible announcement of the event — but its implications go well beyond a conversational assistant.&lt;br&gt;
Siri is no longer a command interface. It is a context-aware, multimodal agent that understands what you see, what you are doing, and what you have said before — across all your devices.&lt;br&gt;
The architecture diagram Apple shared illustrates this clearly:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrqb361df2z7qwns7n08.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrqb361df2z7qwns7n08.png" alt=" " width="800" height="673"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;•••&lt;/center&gt;

&lt;h3&gt;
  
  
  Visual Intelligence in Practice
&lt;/h3&gt;

&lt;p&gt;The examples Apple showed during the session represent architectural shifts in how humans interact with technology:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; 〰️ Point the camera at a restaurant bill → Siri splits it automatically using Apple Cash&lt;/li&gt;
&lt;li&gt; 〰️ Scan the food on your plate → get nutritional information in real time&lt;/li&gt;
&lt;li&gt; 〰️ Look at something in your environment → ask what it is and get context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo1nfweszw15u09bets0y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo1nfweszw15u09bets0y.png" alt=" " width="800" height="670"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  AI as Accessibility Infrastructure
&lt;/h3&gt;

&lt;p&gt;The new capabilities Apple introduced through voice interactions and visual understanding were primarily presented as productivity and assistance features. However, from my perspective, they also represent a meaningful step forward in accessibility.&lt;/p&gt;

&lt;p&gt;For decades, digital accessibility has often been approached as an additional layer added on top of products: screen readers, magnification tools, voice commands, or specialized interfaces designed for specific needs. Those tools remain essential, but what Apple appears to be demonstrating points in a different direction.&lt;/p&gt;

&lt;p&gt;When a system can understand what is happening on the screen, interpret what the camera sees, maintain conversational context, and respond through natural language, accessibility stops being a separate feature and becomes part of the interaction model itself.&lt;/p&gt;

&lt;p&gt;From the perspective of Universal Design, several principles begin to emerge naturally through these capabilities.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Flexibility in Use&lt;/strong&gt;: Users can interact with the same functionality in different ways depending on their preferences, abilities, or context. A person can type, speak, show an image, or combine multiple interaction methods to achieve the same outcome. The technology no longer imposes a single path; it adapts to different ways of engaging with it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Simple and Intuitive Use&lt;/strong&gt;: Natural language interactions reduce the need to learn complex navigation structures, extensive menus, or specific commands. Users can express their intent using their own words and receive contextual assistance without needing to understand how the system works behind the scenes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Perceptible Information&lt;/strong&gt;: Visual Intelligence can transform visual information into verbal or contextual information. Users can receive descriptions of objects, understand documents, identify elements in their surroundings, or obtain explanations about what appears in front of the camera. Information is no longer tied to a single sensory channel.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Low Physical Effort&lt;/strong&gt;: Many tasks that previously required multiple steps, manual searches, or switching between applications can now be completed through a conversation or a single image capture. Reducing the number of actions required to accomplish a task lowers the effort needed to interact with the system.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;center&gt;•••&lt;/center&gt;

&lt;h4&gt;
  
  
  Beyond Traditional Accessibility
&lt;/h4&gt;

&lt;p&gt;Accessibility becomes less about specialized accommodations and more about creating technology that understands people, their context, and their intent.&lt;br&gt;
Perhaps the most significant shift is that technology is no longer asking people to adapt to interfaces... because the interface is beginning to adapt to people.&lt;/p&gt;

&lt;center&gt;•••&lt;/center&gt;

&lt;h4&gt;
  
  
  Intelligence That Follows the User
&lt;/h4&gt;

&lt;p&gt;Another interesting aspect of the new Siri architecture is cross-device continuity.&lt;br&gt;
Conversations can now move seamlessly between iPhone, iPad, Mac, and Vision Pro while preserving context. A user can start a conversation on one device and continue it on another without losing the flow of the interaction.&lt;br&gt;
This may seem like a convenience feature, but it reflects a deeper design principle. The context window is no longer tied to a specific device. It follows the user.&lt;br&gt;
In summary Users do not think in terms of devices; they think in terms of tasks, goals, and conversations. Apple's approach suggests a future where intelligence becomes a persistent layer across the entire ecosystem rather than a capability attached to individual products.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. From Features to Platform
&lt;/h2&gt;

&lt;p&gt;Beyond the end-user experience, Apple also introduced a set of frameworks, APIs, and development tools that open Apple Intelligence to the broader ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧠 Foundation Models Framework&lt;/strong&gt;&lt;br&gt;
Developers can access Apple's foundation models directly from Swift, extend them with custom capabilities, and use the same programming model across different environments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❤️ Core AI&lt;/strong&gt;&lt;br&gt;
Applications can run third-party models locally on Apple Silicon, taking advantage of hardware acceleration while keeping inference close to where the data resides.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔗 App Intents&lt;/strong&gt;&lt;br&gt;
App Intents continue to become a foundational integration layer. They allow Siri and Apple Intelligence to understand and interact with application functionality. Apps that expose meaningful intents become part of the broader intelligence ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🤖 Agentic Coding in Xcode&lt;/strong&gt;&lt;br&gt;
Apple introduced agentic development capabilities directly into Xcode, allowing developers to connect AI assistants to tools such as GitHub, Figma, and simulators as part of the development workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📱 Device Hub&lt;/strong&gt;&lt;br&gt;
A unified interface for testing applications across real and simulated devices, simplifying validation across Apple's growing ecosystem.&lt;/p&gt;

&lt;center&gt;•••&lt;/center&gt;

&lt;h3&gt;
  
  
  APIs Extending the Ecosystem
&lt;/h3&gt;

&lt;p&gt;Apple also expanded several APIs that allow developers to integrate these capabilities into their own applications.&lt;br&gt;
〰️ &lt;strong&gt;Image Playground API&lt;/strong&gt; enables AI-powered image generation directly within third-party applications.&lt;br&gt;
〰️ &lt;strong&gt;Declared Age Range API&lt;/strong&gt; allows applications to provide age-appropriate experiences while preserving user privacy.&lt;br&gt;
〰️ &lt;strong&gt;App Intents API&lt;/strong&gt; enables applications to become deeply integrated with Siri and Apple Intelligence.&lt;/p&gt;




&lt;h1&gt;
  
  
  Final Thoughts
&lt;/h1&gt;

&lt;p&gt;If I had to summarize WWDC 2026 in one sentence, I would say Apple presented a vision where design, privacy, intelligence, accessibility, and digital safety are no longer separate concerns. They become part of the same product architecture.&lt;/p&gt;

&lt;p&gt;In summary, WWDC 2026 highlights several important lessons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The evolution of &lt;strong&gt;Liquid Glass&lt;/strong&gt; showed that design is a continuous process of learning and refinement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apple Intelligence&lt;/strong&gt; demonstrated that AI can be built around privacy rather than treating it as a constraint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The new child safety features&lt;/strong&gt; reframed the conversation from restricting technology to designing intentional access.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual Intelligence&lt;/strong&gt; suggested a future where accessibility emerges naturally from systems that understand context, language, images, and user intent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For developers, the message was equally significant. Apple did not only introduce new user-facing experiences; it exposed the frameworks, APIs, and intelligence layers required to build on top of them.&lt;/p&gt;

&lt;p&gt;What I take away from this event is that we need to focus less on individual features and more on complete solutions that help people accomplish tasks across different contexts. The platform is no longer a standalone destination; it becomes part of a continuous experience that follows the user across devices and moments.&lt;/p&gt;

&lt;p&gt;As intelligence becomes more integrated into everyday interactions, personalization and security increasingly emerge as foundational design principles. Building products now requires thinking beyond a single application or device and considering how experiences evolve across an entire ecosystem.&lt;/p&gt;

&lt;p&gt;The most important takeaway is not any individual feature announced during WWDC. It is the direction these announcements collectively point toward: technology that becomes more contextual, more personal, more accessible, and more deeply integrated into people's daily lives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How will these principles reshape the way we design digital products, platforms, and enterprise architectures over the next decade?&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Reference
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Apple.&lt;/strong&gt; (2026, June 8). Apple WWDC 2026 June 8: Introducing Siri AI and more [Video]. YouTube. Retrieved June 9, 2026, from &lt;a href="https://www.youtube.com/watch?v=hF8swzNR1-o" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=hF8swzNR1-o&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>mobile</category>
      <category>ios</category>
      <category>news</category>
    </item>
    <item>
      <title>Serverless Research Paper Intelligence: Docling, Lambda Containers, and Amazon Bedrock</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Wed, 27 May 2026 13:25:25 +0000</pubDate>
      <link>https://dev.to/aws-builders/serverless-research-paper-intelligence-docling-lambda-containers-and-amazon-bedrock-5987</link>
      <guid>https://dev.to/aws-builders/serverless-research-paper-intelligence-docling-lambda-containers-and-amazon-bedrock-5987</guid>
      <description>&lt;h1&gt;
  
  
  1.🚀 Introduction
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Processing scientific PDFs is not as simple as extracting text&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Many papers include tables, multiple columns, formulas, figures, and structures that can easily break when we use traditional extractors.&lt;br&gt;
The problem becomes even bigger when those documents are private. We do not always want to depend completely on multimodal models to analyze them, and the cost can also grow quickly when we work with many files.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5em26jfi9wb7z5b57tq2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5em26jfi9wb7z5b57tq2.png" alt=" " width="799" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A few months ago, I attended &lt;strong&gt;PyData Berlin&lt;/strong&gt; and during one of the talks I discovered IBM &lt;strong&gt;Docling&lt;/strong&gt;, an open source project focused on intelligent document processing. What caught my attention the most was its ability to extract structured information from complex PDFs, especially scientific documents with tables, multiple columns, formulas, and layouts that are difficult to process with traditional tools.&lt;/p&gt;

&lt;p&gt;From that moment, I started thinking about how to bring this type of processing to the cloud in a simple and scalable way, while also keeping costs under control. Some current solutions for analyzing complex documents with generative AI rely heavily on multimodal models, but in scenarios where we work with large volumes of papers or private documents, cost and privacy can quickly become a problem.&lt;/p&gt;

&lt;p&gt;If you have read some of my previous articles, you have probably seen that I like to build content around a real use case. In this tutorial, I decided to work with scientific papers related to research on &lt;strong&gt;GLP-1 receptor agonists&lt;/strong&gt;, a class of medications widely studied for type 2 diabetes and obesity.&lt;/p&gt;

&lt;p&gt;These treatments are currently very popular because many people use them for weight loss purposes.&lt;/p&gt;


&lt;h2&gt;
  
  
  The objective of the tutorial
&lt;/h2&gt;

&lt;p&gt;The idea is not to build a generic search engine over the internet, but something much more interesting: a private knowledge base where you can query only your own research documents in a secure environment.&lt;br&gt;
To solve this, we are going to build an architecture based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📦 AWS Lambda Containers&lt;/li&gt;
&lt;li&gt;📑 Amazon Bedrock Knowledge Bases&lt;/li&gt;
&lt;li&gt;🐣 PDF processing with Docling&lt;/li&gt;
&lt;li&gt;🪣 Storage in Amazon S3&lt;/li&gt;
&lt;li&gt;✂️ Chunking strategies to improve information retrieval&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;During the tutorial, I will also show several real problems that I found while implementing this solution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ size limits in Lambda,&lt;/li&gt;
&lt;li&gt;〰️ timeouts caused by model downloads,&lt;/li&gt;
&lt;li&gt;〰️ Docker image optimization,&lt;/li&gt;
&lt;li&gt;〰️ scientific document processing,&lt;/li&gt;
&lt;li&gt;〰️ and architecture decisions to keep a serverless and low cost approach.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The final objective will be to transform a set of scientific papers into a knowledge base that can be queried using natural language. This will allow us to ask questions about adverse effects, clinical criteria, study results, and comparisons between different research papers.&lt;/p&gt;


&lt;h1&gt;
  
  
  2.🧪 Use case
&lt;/h1&gt;

&lt;p&gt;In this tutorial, we are going to work with a set of scientific papers related to research on &lt;strong&gt;GLP-1 receptor agonists (Glucagon-Like Peptide-1)&lt;/strong&gt;, a natural hormone involved in glucose regulation, insulin secretion, and the feeling of fullness.&lt;/p&gt;

&lt;p&gt;In recent years, different treatments based on this family of molecules have appeared, and a large number of clinical studies, academic papers, and research documents have been published. These documents are related to cardiovascular outcomes, weight loss, adverse effects, and inclusion or exclusion criteria in clinical trials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The objective of this use case&lt;/strong&gt; is not to build a search engine over the internet or use public information in real time. The idea is to &lt;code&gt;work with a private and curated set of scientific documents&lt;/code&gt;, simulating a scenario where researchers, medical teams, or research areas need to query only their own papers in a secure environment.&lt;/p&gt;

&lt;p&gt;For this &lt;strong&gt;MVP&lt;/strong&gt;, I am going to use 10 public papers as an example dataset, but the architecture is designed for scenarios where the documents can be private or belong to internal research processes.&lt;br&gt;
From these documents, we are going to build a knowledge base that allows queries using natural language, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ identify adverse effects reported in different studies,&lt;/li&gt;
&lt;li&gt;〰️ compare results between treatments,&lt;/li&gt;
&lt;li&gt;〰️ validate exclusion criteria in clinical trials,&lt;/li&gt;
&lt;li&gt;〰️ analyze cardiovascular outcomes,&lt;/li&gt;
&lt;li&gt;〰️ retrieve specific information across multiple scientific papers.&lt;/li&gt;
&lt;/ul&gt;


&lt;h1&gt;
  
  
  3. 🏗️ Solution Architecture
&lt;/h1&gt;

&lt;p&gt;Before going into the theoretical concepts, we are going to describe the solution that we will build.&lt;/p&gt;

&lt;p&gt;This solution is based on a &lt;strong&gt;serverless architecture&lt;/strong&gt; that processes scientific papers in &lt;code&gt;PDF&lt;/code&gt; format and later uses them as input for an &lt;strong&gt;Amazon Bedrock Knowledge Base&lt;/strong&gt; to build a &lt;code&gt;RAG&lt;/code&gt; system.&lt;/p&gt;

&lt;p&gt;The architecture clearly separates the &lt;strong&gt;ingestion and processing flow&lt;/strong&gt; from the &lt;strong&gt;intelligent query flow&lt;/strong&gt;, while keeping the solution simple and scalable.&lt;/p&gt;

&lt;p&gt;The following blueprint shows how each component connects inside the complete pipeline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs0t4fwuk7qudsweekbwr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs0t4fwuk7qudsweekbwr.png" alt=" " width="800" height="664"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In summary, this pipeline processes &lt;code&gt;PDF&lt;/code&gt; files using a &lt;strong&gt;Python Docker image&lt;/strong&gt; with &lt;strong&gt;Docling&lt;/strong&gt;, running inside an &lt;strong&gt;AWS Lambda function based on a container image&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This Lambda function transforms the files into structured documents in &lt;code&gt;Markdown&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Then, these documents are stored in &lt;strong&gt;Amazon S3&lt;/strong&gt; and indexed by &lt;strong&gt;Amazon Bedrock&lt;/strong&gt;, which generates &lt;code&gt;embeddings&lt;/code&gt; and allows semantic queries over the content.&lt;/p&gt;


&lt;h1&gt;
  
  
  4. 📑 Docling: structured document extraction
&lt;/h1&gt;

&lt;p&gt;One of the main challenges when working with scientific &lt;code&gt;PDFs&lt;/code&gt; is that they are not “simple” documents. They are full of &lt;strong&gt;tables&lt;/strong&gt;, &lt;strong&gt;columns&lt;/strong&gt;, &lt;strong&gt;formulas&lt;/strong&gt;, &lt;strong&gt;figures&lt;/strong&gt;, and complex &lt;code&gt;layouts&lt;/code&gt; that are not always preserved correctly when text is extracted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IBM Docling&lt;/strong&gt; is an open source library designed for &lt;code&gt;PDF&lt;/code&gt; extraction and document structuring. Its goal is not only to extract text, but also to convert complex documents into a &lt;strong&gt;structured representation&lt;/strong&gt; that can be used in artificial intelligence pipelines and &lt;code&gt;RAG&lt;/code&gt; systems.&lt;/p&gt;

&lt;p&gt;Instead of returning messy plain text, &lt;strong&gt;Docling&lt;/strong&gt; tries to preserve the structure of the document, including the &lt;strong&gt;reading order&lt;/strong&gt;, &lt;strong&gt;tables&lt;/strong&gt;, &lt;strong&gt;formulas&lt;/strong&gt;, &lt;strong&gt;images&lt;/strong&gt;, and other key elements of the content.&lt;/p&gt;

&lt;p&gt;The following image summarizes some of the key benefits of using Docling for complex document processing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flj24wb85j54tgnwe8uxl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flj24wb85j54tgnwe8uxl.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Why use Docling?
&lt;/h2&gt;

&lt;p&gt;Traditional tools like &lt;code&gt;PyPDF&lt;/code&gt;, &lt;code&gt;PDFPlumber&lt;/code&gt;, or classic &lt;code&gt;OCR&lt;/code&gt; are usually enough for simple documents, but they can struggle when working with scientific papers that have complex &lt;code&gt;layouts&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In these cases, important information can be lost, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️&lt;strong&gt;table structure&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;column separation&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️&lt;strong&gt;relationship between text and figures&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️&lt;strong&gt;mathematical formulas&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Docling&lt;/strong&gt; appears as an alternative that tries to solve exactly these problems, generating a much more consistent output for later analysis.&lt;/p&gt;


&lt;h2&gt;
  
  
  Docling features
&lt;/h2&gt;

&lt;p&gt;Below, you can find the main features published by the library on &lt;strong&gt;Hugging Face&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;🏷️ DocTags for Efficient Tokenization – Introduces DocTags an efficient and minimal representation for documents that is fully compatible with DoclingDocuments.&lt;/li&gt;
&lt;li&gt;🔍 OCR (Optical Character Recognition) – Extracts text accurately from images.&lt;/li&gt;
&lt;li&gt;📐 Layout and Localization – Preserves document structure and document element bounding boxes.&lt;/li&gt;
&lt;li&gt;💻 Code Recognition – Detects and formats code blocks including identation.&lt;/li&gt;
&lt;li&gt;🔢 Formula Recognition – Identifies and processes mathematical expressions.&lt;/li&gt;
&lt;li&gt;📊 Chart Recognition – Extracts and interprets chart data.&lt;/li&gt;
&lt;li&gt;📑 Table Recognition – Supports column and row headers for structured table extraction.&lt;/li&gt;
&lt;li&gt;🖼️ Figure Classification – Differentiates figures and graphical elements.&lt;/li&gt;
&lt;li&gt;📝 Caption Correspondence – Links captions to relevant images and figures.&lt;/li&gt;
&lt;li&gt;📜 List Grouping – Organizes and structures list elements correctly.&lt;/li&gt;
&lt;li&gt;📄 Full-Page Conversion – Processes entire pages for comprehensive document conversion including all page elements (code, equations, tables, charts etc.)&lt;/li&gt;
&lt;li&gt;🔲 OCR with Bounding Boxes – OCR regions using a bounding box.&lt;/li&gt;
&lt;li&gt;📂 General Document Processing – Trained for both scientific and non-scientific documents.&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  🏥 Practical example: processing a medical record with Docling
&lt;/h2&gt;

&lt;p&gt;In this example, we will use a &lt;strong&gt;synthetically generated clinical record&lt;/strong&gt; in &lt;code&gt;PDF&lt;/code&gt; format to show how Docling can extract and structure information from a healthcare document.&lt;/p&gt;

&lt;p&gt;All patient data, medical records, and clinical findings are completely fictional and were created only for educational purposes. No real patient information was used.&lt;/p&gt;

&lt;p&gt;This example represents a common use case in the healthcare industry, where medical documents need to be processed, structured, and prepared for analysis with AI.&lt;/p&gt;

&lt;p&gt;In the next steps, we will use Docling to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ &lt;strong&gt;load and convert the PDF&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;explore the document structure and identify sections&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;extract structured patient data into a pandas DataFrame&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;📌The following image shows part of the clinical record that we will process in this example.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff5ellk68vn0kb5ek5354.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff5ellk68vn0kb5ek5354.png" alt=" " width="773" height="688"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  📑 Loading and converting the PDF
&lt;/h3&gt;

&lt;p&gt;In this step, we load the clinical record &lt;code&gt;PDF&lt;/code&gt; using Docling's &lt;code&gt;DocumentConverter&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Docling automatically detects the document structure and exports the result in two formats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️&lt;strong&gt;Markdown&lt;/strong&gt;: a human readable output to preview the content&lt;/li&gt;
&lt;li&gt;〰️&lt;strong&gt;Dictionary&lt;/strong&gt;: programmatic access to text, tables, images, and metadata&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This structured output is what makes &lt;strong&gt;Docling&lt;/strong&gt; more powerful than a basic &lt;code&gt;PDF&lt;/code&gt; text extractor.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docling.document_converter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DocumentConverter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PdfFormatOption&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="n"&gt;converter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DocumentConverter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;converter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;convert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clinical_history_structured.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# export markdown
&lt;/span&gt;&lt;span class="n"&gt;data_markdown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;export_to_markdown&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# export dict
&lt;/span&gt;&lt;span class="n"&gt;data_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;export_to_dict&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;texts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;texts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🗂️ Exploring document sections
&lt;/h3&gt;

&lt;p&gt;Every clinical record is organized into sections. Here, we extract all the &lt;strong&gt;section headers&lt;/strong&gt; detected by &lt;strong&gt;Docling&lt;/strong&gt;, such as &lt;code&gt;Patient Identification&lt;/code&gt;, &lt;code&gt;Chief Complaint&lt;/code&gt;, and &lt;code&gt;Laboratory Results&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This gives us:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ A &lt;strong&gt;map of the document structure&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ The ability to &lt;strong&gt;target specific sections&lt;/strong&gt; for downstream processing
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;data_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;texts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;section_header&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;['CITYVIEW MEDICAL CENTER CLINICAL HISTORY AND RECORD',
 '1. PATIENT IDENTIFICATION',
 '4. PAST MEDICAL HISTORY',
 '5. MEDICATIONS',
 '6. ALLERGIES',
 '2. CHIEF COMPLAINT',
 '3. HISTORY OF PRESENT ILLNESS',
 '7. FAMILY HISTORY',
 '8. SOCIAL HISTORY',
 '9. REVIEW OF SYSTEMS',
 '10. PHYSICAL EXAMINATION',
 '12. LABORATORY RESULTS',
 '13. ASSESSMENT',
 '14. PLAN',
 '11. IMAGING']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🧩 Extracting patient data as a structured table
&lt;/h3&gt;

&lt;p&gt;Now we extract the content of the first section, &lt;strong&gt;Patient Identification&lt;/strong&gt;, by filtering the items that belong to &lt;code&gt;#/groups/0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docling&lt;/strong&gt; preserves the &lt;code&gt;key value&lt;/code&gt; layout of the original &lt;code&gt;PDF&lt;/code&gt;, so we can split the flat list into field names and values using Python slice notation.&lt;/p&gt;

&lt;p&gt;The result is a clean &lt;code&gt;pandas DataFrame&lt;/code&gt; ready for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ Analysis&lt;/li&gt;
&lt;li&gt;〰️ Storage&lt;/li&gt;
&lt;li&gt;〰️ Downstream AI processing
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Filter group 0
&lt;/span&gt;&lt;span class="n"&gt;group_0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;orig&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt; 
           &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;parent&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;$ref&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#/groups/0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="c1"&gt;# Convert the flat list into key value pairs
&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;group_0&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  
&lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;group_0&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;field&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fastxdp5eceu16vp235po.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fastxdp5eceu16vp235po.png" alt=" " width="533" height="306"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tables&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;export_to_dataframe&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4k6owfjn0fbxebg21c5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4k6owfjn0fbxebg21c5.png" alt=" " width="516" height="145"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h1&gt;
  
  
  5. ⚡ AWS Lambda
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;AWS Lambda&lt;/strong&gt; is a &lt;code&gt;serverless&lt;/code&gt; service that allows you to run code without managing infrastructure. It scales automatically and you only pay for what you use.&lt;br&gt;
It is commonly used for &lt;strong&gt;file processing&lt;/strong&gt;, &lt;strong&gt;service integration&lt;/strong&gt;, &lt;strong&gt;scheduled tasks&lt;/strong&gt;, and &lt;strong&gt;real time event processing&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;However, even though it is one of the most used services in the &lt;code&gt;serverless&lt;/code&gt; ecosystem, some limitations appear quickly when we start working with heavier &lt;code&gt;workloads&lt;/code&gt; or complex dependencies.&lt;br&gt;
Some of the main limitations are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ &lt;strong&gt;memory and CPU limits&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;maximum execution timeout&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;deployment package size restrictions&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;the need to use ZIP files or Layers for dependencies&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;cold starts in heavier workloads&lt;/strong&gt;
These restrictions mean that, in some cases, traditional Lambda is not enough to run &lt;code&gt;workloads&lt;/code&gt; such as intensive &lt;code&gt;PDF&lt;/code&gt; processing or libraries with large dependencies.&lt;/li&gt;
&lt;/ul&gt;


&lt;h1&gt;
  
  
  6. 🐳 AWS Lambda Containers
&lt;/h1&gt;

&lt;p&gt;To solve part of these limitations, &lt;strong&gt;AWS Lambda&lt;/strong&gt; allows you to run functions using &lt;strong&gt;container images&lt;/strong&gt; instead of &lt;code&gt;ZIP&lt;/code&gt; packages.&lt;br&gt;
This approach allows you to package the function as a &lt;strong&gt;Docker image&lt;/strong&gt;, push it to &lt;strong&gt;Amazon Elastic Container Registry&lt;/strong&gt;, and run it directly from &lt;strong&gt;Lambda&lt;/strong&gt;.&lt;br&gt;
The main advantage is that it significantly increases the size limit, up to &lt;code&gt;10 GB&lt;/code&gt;. This makes it possible to include heavy dependencies, predownloaded models, or complex libraries like &lt;strong&gt;Docling&lt;/strong&gt; without needing workarounds with &lt;strong&gt;Layers&lt;/strong&gt;.&lt;br&gt;
In this project, this option is key because it allows us to run &lt;strong&gt;Docling&lt;/strong&gt; inside &lt;strong&gt;Lambda&lt;/strong&gt; without compromising dependencies or the &lt;code&gt;runtime&lt;/code&gt;.&lt;br&gt;
The following image summarizes the key benefits of using &lt;strong&gt;Lambda Containers&lt;/strong&gt; for this type of workload.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96yap6duqm3g96601trt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96yap6duqm3g96601trt.png" alt=" " width="799" height="444"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Deploying a Docling Lambda Container to AWS
&lt;/h2&gt;

&lt;p&gt;As we saw in the previous section, the limitations of traditional &lt;strong&gt;Lambda&lt;/strong&gt; make it difficult to run heavy libraries like &lt;strong&gt;Docling&lt;/strong&gt; using &lt;code&gt;ZIP&lt;/code&gt; packages or &lt;strong&gt;Layers&lt;/strong&gt;.&lt;br&gt;
To solve this, we are going to run &lt;strong&gt;AWS Lambda&lt;/strong&gt; from a &lt;strong&gt;container image&lt;/strong&gt;. This allows us to package &lt;strong&gt;Docling&lt;/strong&gt;, its dependencies, and its models inside a &lt;strong&gt;Docker image&lt;/strong&gt;, and deploy it using &lt;strong&gt;Amazon Elastic Container Registry&lt;/strong&gt; (&lt;code&gt;ECR&lt;/code&gt;).&lt;br&gt;
In this section, we are going to build the image, push it to &lt;strong&gt;AWS&lt;/strong&gt;, and use it inside &lt;strong&gt;Lambda&lt;/strong&gt; to process our scientific papers.&lt;br&gt;
The following image shows the deployment flow that we will follow step by step.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgp5p3jbntzcqfaabcpjv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgp5p3jbntzcqfaabcpjv.png" alt=" " width="799" height="444"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;Before starting, you need to have:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;AWS CLI&lt;/strong&gt; installed and configured &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker&lt;/strong&gt; installed with &lt;code&gt;buildx&lt;/code&gt; support &lt;/li&gt;
&lt;li&gt;Repository cloned locally &lt;/li&gt;
&lt;li&gt;Create &lt;strong&gt;Amazon S3&lt;/strong&gt; bucket named &lt;code&gt;docling-papers-tutorial&lt;/code&gt;, with the &lt;code&gt;PDFs&lt;/code&gt; that we are going to process already uploaded &lt;/li&gt;
&lt;li&gt;You also need an &lt;strong&gt;IAM user&lt;/strong&gt; with permissions to create images in &lt;strong&gt;ECR&lt;/strong&gt; and deploy &lt;strong&gt;Lambda&lt;/strong&gt; functions. In the repository, you will find &lt;code&gt;JSON&lt;/code&gt; files with the required policies inside &lt;code&gt;iam/user_policies&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Github repository&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/RominaElenaMendezEscobar" rel="noopener noreferrer"&gt;
        RominaElenaMendezEscobar
      &lt;/a&gt; / &lt;a href="https://github.com/RominaElenaMendezEscobar/docling-bedrock-research-rag" rel="noopener noreferrer"&gt;
        docling-bedrock-research-rag
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Serverless RAG pipeline for turning research papers into a private knowledge base using Docling, AWS Lambda Containers, Amazon S3, and Amazon Bedrock.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;p&gt;&lt;a href="https://www.buymeacoffee.com/r0mymendez" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/b96fd4ea89ea15fcec30a4f86382eef0bbd17454aa3a8d4de8c8c5e92b55cf6c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4275792532304d6525323041253230436f666665652d737570706f72742532306d79253230776f726b2d4646444430303f7374796c653d666c6174266c6162656c436f6c6f723d313031303130266c6f676f3d6275792d6d652d612d636f66666565266c6f676f436f6c6f723d7768697465" alt="Buy Me A Coffee"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Serverless Research Paper Intelligence: Docling, Lambda Containers, and Amazon Bedrock&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/RominaElenaMendezEscobar/docling-bedrock-research-rag/img/01-preview.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FRominaElenaMendezEscobar%2Fdocling-bedrock-research-rag%2FHEAD%2Fimg%2F01-preview.png" alt="01-preview"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;1.🚀 Introduction&lt;/h1&gt;
&lt;/div&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;The objective of the tutorial&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;The idea is not to build a generic search engine over the internet, but something much more interesting: a private knowledge base where you can query only your own research documents in a secure environment.
To solve this, we are going to build an architecture based on:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;📦 AWS Lambda Containers&lt;/li&gt;
&lt;li&gt;📑 Amazon Bedrock Knowledge Bases&lt;/li&gt;
&lt;li&gt;🐣 PDF processing with Docling&lt;/li&gt;
&lt;li&gt;🗑️ Storage in Amazon S3&lt;/li&gt;
&lt;li&gt;✂️ Chunking strategies to improve information retrieval&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;During the tutorial, I will also show several real problems that I found while implementing this solution:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;〰️ size limits in Lambda,&lt;/li&gt;
&lt;li&gt;〰️ timeouts caused by model downloads,&lt;/li&gt;
&lt;li&gt;〰️ Docker image optimization,&lt;/li&gt;
&lt;li&gt;〰️ scientific document processing,&lt;/li&gt;
&lt;li&gt;〰️and architecture decisions to keep a serverless and low cost approach.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The final objective will be to transform a set of scientific papers into…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/RominaElenaMendezEscobar/docling-bedrock-research-rag" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;





&lt;h3&gt;
  
  
  Building the Docker image
&lt;/h3&gt;

&lt;p&gt;Once the repository is cloned, we start by configuring the environment variables required for the deployment.&lt;/p&gt;

&lt;center&gt;•••••&lt;/center&gt;

&lt;h4&gt;
  
  
  Setup
&lt;/h4&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file with your credentials:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_access_key
&lt;span class="nv"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_secret_key
&lt;span class="nv"&gt;AWS_DEFAULT_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Then export the variables:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s1"&gt;'^#'&lt;/span&gt; .env | xargs&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCOUNT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws sts get-caller-identity &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--query&lt;/span&gt; Account &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ECR_REPO_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;docling-lambda
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;docling-lambda
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;IMAGE_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;docling-lambda
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;⚠️ Remember to add &lt;code&gt;.env&lt;/code&gt; to your &lt;code&gt;.gitignore&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 1: Verify your AWS identity
&lt;/h4&gt;

&lt;p&gt;Before deploying, verify which &lt;strong&gt;AWS account&lt;/strong&gt; and &lt;strong&gt;IAM user&lt;/strong&gt; are currently configured in your environment.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws sts get-caller-identity
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;center&gt;•••••&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 2: Authenticate Docker with Amazon ECR
&lt;/h4&gt;

&lt;p&gt;This command generates a temporary &lt;strong&gt;ECR authentication token&lt;/strong&gt; and passes it to &lt;code&gt;docker login&lt;/code&gt;, so Docker can push images to your private &lt;strong&gt;ECR registry&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecr get-login-password &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="nv"&gt;$AWS_DEFAULT_REGION&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
  docker login &lt;span class="nt"&gt;--username&lt;/span&gt; AWS &lt;span class="nt"&gt;--password-stdin&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;$AWS_ACCOUNT_ID&lt;/span&gt;.dkr.ecr.&lt;span class="nv"&gt;$AWS_DEFAULT_REGION&lt;/span&gt;.amazonaws.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;⚠️ This token expires after 12 hours. Run this step again if you get authentication errors.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 3: Build the Docker image
&lt;/h4&gt;

&lt;p&gt;Now we build the Docker image from the &lt;code&gt;Dockerfile&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker buildx build &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--platform&lt;/span&gt; linux/amd64 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provenance&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--sbom&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--no-cache&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--load&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="nv"&gt;$IMAGE_NAME&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The most important flags are:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Flag&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--platform linux/amd64&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Forces the &lt;code&gt;x86_64&lt;/code&gt; architecture required by AWS Lambda. This is required if you are building on an Apple Silicon Mac, such as M1, M2, or M3.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--provenance=false&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Disables build attestation metadata, which can cause issues with Lambda image deployments.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--sbom=false&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Disables Software Bill of Materials generation, which can also cause issues with Lambda deployments.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--no-cache&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Builds the image from scratch, ignoring cached layers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--load&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Loads the image into your local Docker daemon after building.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;-t $IMAGE_NAME&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tags the image with the selected image name.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 4: Tag the image for ECR
&lt;/h4&gt;

&lt;p&gt;Before pushing the image, we need to create a new tag that points to the full &lt;strong&gt;ECR repository URI&lt;/strong&gt;.&lt;br&gt;
Docker requires the image name to match the complete &lt;strong&gt;ECR URI&lt;/strong&gt; before it can push the image to the registry.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker tag &lt;span class="nv"&gt;$IMAGE_NAME&lt;/span&gt;:latest &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;$AWS_ACCOUNT_ID&lt;/span&gt;.dkr.ecr.&lt;span class="nv"&gt;$AWS_DEFAULT_REGION&lt;/span&gt;.amazonaws.com/&lt;span class="nv"&gt;$ECR_REPO_NAME&lt;/span&gt;:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;center&gt;•••••&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 5: Verify that the image exists locally
&lt;/h4&gt;

&lt;p&gt;Before pushing the image to &lt;strong&gt;ECR&lt;/strong&gt;, confirm that it exists in your local Docker environment.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker images
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The image should appear with both tags: the &lt;strong&gt;local tag&lt;/strong&gt; and the &lt;strong&gt;ECR tag&lt;/strong&gt;.&lt;/p&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 6: Push the image to ECR
&lt;/h4&gt;

&lt;p&gt;Now we push the image to your private &lt;strong&gt;ECR repository&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This step may take several minutes because the &lt;strong&gt;Docling image&lt;/strong&gt; is large due to the &lt;code&gt;ML&lt;/code&gt; models included inside the container.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker push &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;$AWS_ACCOUNT_ID&lt;/span&gt;.dkr.ecr.&lt;span class="nv"&gt;$AWS_DEFAULT_REGION&lt;/span&gt;.amazonaws.com/&lt;span class="nv"&gt;$ECR_REPO_NAME&lt;/span&gt;:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;center&gt;•••••&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 7: Update the Lambda function
&lt;/h4&gt;

&lt;p&gt;Run this step only if you need to update an existing &lt;strong&gt;Lambda function&lt;/strong&gt; with a new image version.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda update-function-code &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--function-name&lt;/span&gt; &lt;span class="nv"&gt;$LAMBDA_FUNCTION_NAME&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--image-uri&lt;/span&gt; &lt;span class="nv"&gt;$AWS_ACCOUNT_ID&lt;/span&gt;.dkr.ecr.&lt;span class="nv"&gt;$AWS_DEFAULT_REGION&lt;/span&gt;.amazonaws.com/&lt;span class="nv"&gt;$ECR_REPO_NAME&lt;/span&gt;:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This command tells &lt;strong&gt;AWS Lambda&lt;/strong&gt; to use the new image that you just pushed to &lt;strong&gt;ECR&lt;/strong&gt;.&lt;br&gt;
Lambda will pull the image from &lt;strong&gt;ECR&lt;/strong&gt; and deploy it automatically.&lt;/p&gt;


&lt;h1&gt;
  
  
  7. 🧯 Real problems during the deployment
&lt;/h1&gt;
&lt;h2&gt;
  
  
  What I had to solve to run Docling on Lambda
&lt;/h2&gt;

&lt;p&gt;Up to this point, the flow looks relatively simple: build the image, push it to &lt;strong&gt;ECR&lt;/strong&gt;, and deploy the &lt;strong&gt;Lambda function&lt;/strong&gt;.&lt;br&gt;
However, when working with heavy libraries like &lt;strong&gt;Docling&lt;/strong&gt;, several problems started to appear. These problems were related to the image size, the Lambda &lt;code&gt;runtime&lt;/code&gt;, and the download of models during execution.&lt;br&gt;
This section summarizes some of the real problems I found during the implementation and the solutions I finally applied.&lt;/p&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h2&gt;
  
  
  Reducing the image size
&lt;/h2&gt;

&lt;p&gt;One of the first problems I ran into was related to the &lt;strong&gt;Docker image size&lt;/strong&gt;. When working with libraries like &lt;strong&gt;Docling&lt;/strong&gt;, which include ML models and multiple heavy dependencies, the final image can grow considerably. &lt;br&gt;
To avoid issues during the build and push process, I added a cleanup step inside the &lt;code&gt;Dockerfile&lt;/code&gt; to remove temporary files, &lt;code&gt;__pycache__&lt;/code&gt; folders, and compiled &lt;code&gt;.pyc&lt;/code&gt; files.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clean up temporary files to reduce image size&lt;/span&gt;
RUN find /var/lang/lib/python3.12/site-packages &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-type&lt;/span&gt; d &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"__pycache__"&lt;/span&gt; &lt;span class="nt"&gt;-exec&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; + 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
find /var/lang/lib/python3.12/site-packages &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.pyc"&lt;/span&gt; &lt;span class="nt"&gt;-delete&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Although this may look like a small optimization, this type of cleanup helps reduce the final size of images with many Python dependencies.&lt;/p&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h2&gt;
  
  
  Avoiding timeouts and model downloads at runtime
&lt;/h2&gt;

&lt;p&gt;Another important problem appeared during the first executions of the &lt;strong&gt;Lambda function&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In the version used in this project, &lt;strong&gt;Docling&lt;/strong&gt; tried to automatically download the models at startup if they were not found locally. This caused timeouts and also created another issue: the &lt;strong&gt;Lambda&lt;/strong&gt; filesystem is read only outside the temporary directory, &lt;br&gt;
which means models cannot be downloaded or saved there at runtime.&lt;/p&gt;

&lt;p&gt;To solve this, I decided to predownload the models during the &lt;strong&gt;Docker&lt;/strong&gt; build and store them directly inside the image.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;Dockerfile&lt;/code&gt;, I added the following:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Copy and run model download script&lt;/span&gt;
COPY download_models.py /tmp/download_models.py

RUN &lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /opt/docling-models &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
python3.12 /tmp/download_models.py &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nb"&gt;rm&lt;/span&gt; /tmp/download_models.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The script initializes a &lt;code&gt;DocumentConverter&lt;/code&gt;, which forces the required &lt;strong&gt;Docling&lt;/strong&gt; models to be downloaded during the image build instead of during Lambda execution.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docling.document_converter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DocumentConverter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PdfFormatOption&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docling.datamodel.base_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;InputFormat&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docling.datamodel.pipeline_options&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PdfPipelineOptions&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;

   &lt;span class="n"&gt;artifacts_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/opt/docling-models&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;pipeline_options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PdfPipelineOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="n"&gt;artifacts_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;artifacts_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;do_ocr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;converter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DocumentConverter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="n"&gt;format_options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
           &lt;span class="n"&gt;InputFormat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PDF&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;PdfFormatOption&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
               &lt;span class="n"&gt;pipeline_options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pipeline_options&lt;/span&gt;
           &lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="p"&gt;}&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
   &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✓ Models downloaded successfully&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;With this approach, the models are packaged inside the container and the &lt;strong&gt;Lambda function&lt;/strong&gt; can start much faster, avoiding unnecessary downloads and problems related to the restricted filesystem.&lt;/p&gt;


&lt;h1&gt;
  
  
  8. 🔁 Orchestrated paper processing
&lt;/h1&gt;

&lt;p&gt;The following function corresponds to the &lt;strong&gt;orchestration Lambda&lt;/strong&gt;. Its goal is to list the papers stored in &lt;strong&gt;Amazon S3&lt;/strong&gt; and run the processing by invoking the &lt;code&gt;docling-lambda&lt;/code&gt; function, which contains the &lt;strong&gt;Docker image&lt;/strong&gt; with &lt;strong&gt;Docling&lt;/strong&gt;.&lt;br&gt;
In this case, the processing is done in a distributed way. Each &lt;code&gt;PDF&lt;/code&gt; file is sent individually to the Lambda function responsible for converting the document into &lt;code&gt;Markdown&lt;/code&gt;.&lt;br&gt;
In the repository, you will find an implementation similar to the following:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;botocore.config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Config&lt;/span&gt;

&lt;span class="n"&gt;s3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;lambda_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lambda&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="n"&gt;read_timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;900&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;connect_timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_attempts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;BUCKET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docling-papers-tutorial&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;DOCLING_LAMBDA&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docling-lambda&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

   &lt;span class="c1"&gt;# List PDFs
&lt;/span&gt;   &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_objects_v2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BUCKET&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;pdfs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
       &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
       &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Contents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
       &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="p"&gt;]&lt;/span&gt;

   &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PDFs found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdfs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

   &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pdf_key&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pdfs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

       &lt;span class="n"&gt;s3_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BUCKET&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pdf_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

       &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;s3_url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

       &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lambda_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
           &lt;span class="n"&gt;FunctionName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DOCLING_LAMBDA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;InvocationType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RequestResponse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;Payload&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;s3_url&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
       &lt;span class="p"&gt;)&lt;/span&gt;

       &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Payload&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

       &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;s3_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="p"&gt;})&lt;/span&gt;

       &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✓ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pdf_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; → &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;processed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;Once the function is deployed, we can execute the &lt;strong&gt;Lambda function&lt;/strong&gt; and analyze the results from &lt;strong&gt;CloudWatch&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkzjfc1uinsznqs8gqd3d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkzjfc1uinsznqs8gqd3d.png" alt=" " width="661" height="521"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;In my tests with these 10 papers, the average was approximately 3.8 seconds per page. This can vary significantly depending on document complexity&lt;/p&gt;

&lt;p&gt;This confirms something important: the processing time depends much more on the &lt;strong&gt;complexity of the content&lt;/strong&gt;, such as tables, images, multiple columns, or figures, than on the file size or the number of pages.&lt;/p&gt;


&lt;h1&gt;
  
  
  9. 🟩 Amazon Bedrock Knowledge Base
&lt;/h1&gt;

&lt;p&gt;Before configuring our &lt;strong&gt;Knowledge Base&lt;/strong&gt;, it is worth understanding what this service is inside &lt;strong&gt;Amazon Bedrock&lt;/strong&gt; and why it plays a key role in a &lt;code&gt;RAG&lt;/code&gt; architecture.&lt;/p&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h2&gt;
  
  
  What is a Knowledge Base?
&lt;/h2&gt;

&lt;p&gt;In simple terms, a &lt;strong&gt;Knowledge Base&lt;/strong&gt; is a layer that connects private data with artificial intelligence models, so they can use that information as context to answer questions.&lt;/p&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h2&gt;
  
  
  What is a Knowledge Base in Amazon Bedrock?
&lt;/h2&gt;

&lt;p&gt;In &lt;strong&gt;Amazon Bedrock&lt;/strong&gt;, a &lt;strong&gt;Knowledge Base&lt;/strong&gt; is a fully managed service that allows you to build &lt;code&gt;RAG&lt;/code&gt; systems over your own data.&lt;br&gt;
This means that models can query information stored in a knowledge base to generate more accurate and contextualized answers based on private data.&lt;br&gt;
The following image summarizes the key benefits of using &lt;strong&gt;Amazon Bedrock Knowledge Bases&lt;/strong&gt; in this type of architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw0auzc4jjepyg91fla4j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw0auzc4jjepyg91fla4j.png" alt=" " width="799" height="444"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Also, it includes capabilities such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ &lt;strong&gt;automatic embedding management&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;context management&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;source attribution in the answers&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;direct integration with private data&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h2&gt;
  
  
  Supported data sources
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;Knowledge Base&lt;/strong&gt; in &lt;strong&gt;Amazon Bedrock&lt;/strong&gt; can connect to different data sources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ 🪣 &lt;strong&gt;Amazon S3&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ 🟦 &lt;strong&gt;Confluence&lt;/strong&gt; depending on availability&lt;/li&gt;
&lt;li&gt;〰️ ☁️ &lt;strong&gt;Salesforce&lt;/strong&gt; depending on availability&lt;/li&gt;
&lt;li&gt;〰️ 📑 &lt;strong&gt;Custom data sources&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;〰️ 🕸️ &lt;strong&gt;Web Crawler&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Availability can depend on the AWS Region and account configuration.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In this use case, we are mainly going to work with &lt;strong&gt;Amazon S3&lt;/strong&gt;, where we store the documents processed with &lt;strong&gt;Docling&lt;/strong&gt;.&lt;/p&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h2&gt;
  
  
  Chunking: how information is divided
&lt;/h2&gt;

&lt;p&gt;One of the most important concepts when building a &lt;strong&gt;Knowledge Base&lt;/strong&gt; is &lt;code&gt;chunking&lt;/code&gt;, which is the process of dividing documents into smaller parts called &lt;code&gt;chunks&lt;/code&gt;.&lt;br&gt;
This is necessary because models have context limitations and cannot process very long documents all at once.&lt;/p&gt;

&lt;p&gt;We can understand this from two perspectives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ &lt;strong&gt;Context limit&lt;/strong&gt;: models can only handle a limited number of &lt;code&gt;tokens&lt;/code&gt; &lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;Efficient search&lt;/strong&gt;: dividing the content allows the system to retrieve more precise information faster
In this project, &lt;code&gt;chunking&lt;/code&gt; is key because we are working with scientific papers, where the context between sections is very important, for example: results, methods, and adverse effects.
The following image describes the main &lt;code&gt;chunking&lt;/code&gt; features available in &lt;strong&gt;Amazon Bedrock Knowledge Bases&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhk6h7yxz0blu7s3x059m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhk6h7yxz0blu7s3x059m.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h2&gt;
  
  
  Step by step configuration
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Creating the Knowledge Base in Amazon Bedrock
&lt;/h3&gt;

&lt;p&gt;In this section, we are going to configure the &lt;strong&gt;Knowledge Base&lt;/strong&gt; in &lt;strong&gt;Amazon Bedrock&lt;/strong&gt; using the papers processed with &lt;strong&gt;Docling&lt;/strong&gt; and stored in &lt;strong&gt;Amazon S3&lt;/strong&gt; in &lt;code&gt;Markdown&lt;/code&gt; format.&lt;br&gt;
The &lt;code&gt;chunking&lt;/code&gt; strategy selected for this use case is &lt;strong&gt;Hierarchical Chunking&lt;/strong&gt;, because it allows us to keep the relationship between document sections, for example results, methods, or adverse effects. This is key when working with scientific papers.&lt;/p&gt;

&lt;p&gt;Below, I explain why I did not choose the other strategies and what each one implies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ ❌ &lt;strong&gt;Default&lt;/strong&gt;: uses the default chunking configuration, which may split content without preserving the full document structure.&lt;/li&gt;
&lt;li&gt;〰️ ❌ &lt;strong&gt;Fixed size&lt;/strong&gt;: similar to the default strategy, but configurable. It still has the same problem of losing context.&lt;/li&gt;
&lt;li&gt;〰️ ❌ &lt;strong&gt;Semantic&lt;/strong&gt;: groups content by semantic similarity. It can be useful, but it may add extra processing time and can be less predictable depending on the documents.&lt;/li&gt;
&lt;li&gt;〰️ ❌ &lt;strong&gt;No chunking&lt;/strong&gt;:  useful when documents are already small or manually preprocessed into meaningful units.&lt;/li&gt;
&lt;li&gt;〰️ ✅ &lt;strong&gt;Hierarchical&lt;/strong&gt;: keeps a parent child structure, allowing each chunk to preserve its context inside the document.&lt;/li&gt;
&lt;/ul&gt;

&lt;center&gt;•••••&lt;/center&gt;
&lt;h3&gt;
  
  
  Prerequisites and permissions
&lt;/h3&gt;

&lt;p&gt;To create a &lt;strong&gt;Knowledge Base&lt;/strong&gt;, you need to consider the following permissions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ &lt;strong&gt;IAM&lt;/strong&gt;: create or select roles with the right permissions&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;Bedrock&lt;/strong&gt;: access to Knowledge Bases and embedding models&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;S3&lt;/strong&gt;: access to the bucket where the processed documents are stored&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;KMS&lt;/strong&gt;: optional, for data encryption&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;Lambda&lt;/strong&gt;: optional, for custom data transformations &lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ AWS does not support creating a Knowledge Base using root user credentials — you need an IAM user or role with the right permissions. Permission configuration is usually one of the most delicate parts of this type of architecture.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;center&gt;_______________&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 1: Create the Knowledge Base
&lt;/h4&gt;

&lt;p&gt;In &lt;strong&gt;Amazon Bedrock&lt;/strong&gt;, go to the &lt;strong&gt;Knowledge Bases&lt;/strong&gt; section and select &lt;code&gt;Create knowledge base with vector store&lt;/code&gt;.&lt;br&gt;
Complete the configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name&lt;/strong&gt;: &lt;code&gt;docling-glp1-papers-kb&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Description&lt;/strong&gt;: &lt;code&gt;Knowledge base with GLP-1 papers processed with Docling and Lambda&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IAM Role&lt;/strong&gt;: &lt;code&gt;AmazonBedrockExecutionRoleForKnowledgeBase-docling&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data source&lt;/strong&gt;: &lt;code&gt;Amazon S3&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgt4qegghq1cj0ujwrayv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgt4qegghq1cj0ujwrayv.png" alt=" " width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhaejmc9kj9m7p2vx54e6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhaejmc9kj9m7p2vx54e6.png" alt=" " width="799" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;_______________&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 2: Configure the data source
&lt;/h4&gt;

&lt;p&gt;Configure the data source with the following values:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ &lt;strong&gt;Source name&lt;/strong&gt;: &lt;code&gt;docling-glp1-papers-ds&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;S3 path&lt;/strong&gt;: &lt;code&gt;s3://docling-papers-tutorial/output/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;Parsing strategy&lt;/strong&gt;: &lt;code&gt;Amazon Bedrock default parser&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;Chunking strategy&lt;/strong&gt;: &lt;code&gt;Hierarchical chunking&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygwp16arv2gwnnr5t7qf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygwp16arv2gwnnr5t7qf.png" alt=" " width="800" height="619"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;_______________&lt;/center&gt;
&lt;h4&gt;
  
  
  Step 3: Vector store and embeddings
&lt;/h4&gt;

&lt;p&gt;Here, we are going to select the model that we will use to create the &lt;code&gt;RAG&lt;/code&gt; system and the destination where the information will be stored.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ &lt;strong&gt;Embeddings model&lt;/strong&gt;: &lt;code&gt;Titan Text Embeddings v2&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;Vector store&lt;/strong&gt;: &lt;code&gt;Amazon S3 Vectors&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this case, we use &lt;code&gt;on demand&lt;/code&gt; mode, although other models are available depending on the use case.&lt;br&gt;
After that, we select the &lt;strong&gt;Amazon S3 bucket&lt;/strong&gt; used by &lt;strong&gt;S3 Vectors&lt;/strong&gt; to store the vector index.&lt;/p&gt;

&lt;p&gt;To better understand how this type of storage works, you can check a previous article I wrote:&lt;/p&gt;


&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b" class="crayons-story__hidden-navigation-link"&gt;From Coffee Products to AI Search: Building a Serverless Semantic Search Architecture with Amazon S3 Vectors and Bedrock&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;
          &lt;a class="crayons-logo crayons-logo--l" href="/aws-builders"&gt;
            &lt;img alt="AWS Community Builders  logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F2794%2F88da75b6-aadd-4ea1-8083-ae2dfca8be94.png" class="crayons-logo__image" width="350" height="350"&gt;
          &lt;/a&gt;

          &lt;a href="/r_elena_mendez_escobar" class="crayons-avatar  crayons-avatar--s absolute -right-2 -bottom-2 border-solid border-2 border-base-inverted  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F719582%2F2d700dae-2335-4c2f-9a32-4435184a4f4f.jpeg" alt="r_elena_mendez_escobar profile" class="crayons-avatar__image" width="200" height="200"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/r_elena_mendez_escobar" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Romina Elena Mendez Escobar
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Romina Elena Mendez Escobar
                
              
              &lt;div id="story-author-preview-content-3137085" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/r_elena_mendez_escobar" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F719582%2F2d700dae-2335-4c2f-9a32-4435184a4f4f.jpeg" class="crayons-avatar__image" alt="" width="200" height="200"&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Romina Elena Mendez Escobar&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

            &lt;span&gt;
              &lt;span class="crayons-story__tertiary fw-normal"&gt; for &lt;/span&gt;&lt;a href="/aws-builders" class="crayons-story__secondary fw-medium"&gt;AWS Community Builders &lt;/a&gt;
            &lt;/span&gt;
          &lt;/div&gt;
          &lt;a href="https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Dec 31 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b" id="article-link-3137085"&gt;
          From Coffee Products to AI Search: Building a Serverless Semantic Search Architecture with Amazon S3 Vectors and Bedrock
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/aws"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;aws&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/python"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;python&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/cloud"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;cloud&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/fire-f60e7a582391810302117f987b22a8ef04a2fe0df7e3258a5f49332df1cec71e.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;6&lt;span class="hidden s:inline"&gt;&amp;nbsp;reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              1&lt;span class="hidden s:inline"&gt;&amp;nbsp;comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            21 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;/div&gt;
&lt;br&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnfydh7fty5c1y0o7xpsa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnfydh7fty5c1y0o7xpsa.png" alt=" " width="800" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;_______________&lt;/center&gt;

&lt;h4&gt;
  
  
  Step 4: Data synchronization
&lt;/h4&gt;

&lt;p&gt;Once the &lt;strong&gt;Knowledge Base&lt;/strong&gt; is created, its initial status will be &lt;code&gt;Available&lt;/code&gt;.&lt;br&gt;
To load the documents, you need to run a manual synchronization:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ Go to the &lt;strong&gt;Knowledge Base&lt;/strong&gt; &lt;/li&gt;
&lt;li&gt;〰️ Select the &lt;strong&gt;data source&lt;/strong&gt; &lt;/li&gt;
&lt;li&gt;〰️ Click on &lt;code&gt;Sync&lt;/code&gt;
This synchronization processes the documents, generates the required &lt;code&gt;embeddings&lt;/code&gt;, and makes the content available for natural language queries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcc2z6ec3b5xj9l1agh0p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcc2z6ec3b5xj9l1agh0p.png" alt=" " width="800" height="361"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvtr8f013m0bw2u82fcy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvtr8f013m0bw2u82fcy.png" alt=" " width="800" height="423"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  10. 🟩 BEDROCK: Test the Knowledge Base
&lt;/h1&gt;

&lt;p&gt;Now we return to the point where we left off a few steps ago: testing our &lt;strong&gt;Knowledge Base&lt;/strong&gt; with the processed papers.&lt;/p&gt;

&lt;p&gt;The idea in this stage is to validate whether the system can retrieve relevant information from the &lt;code&gt;10&lt;/code&gt; scientific papers that we previously loaded and processed.&lt;/p&gt;

&lt;p&gt;To do this, we are going to ask some questions focused on clinical analysis and study comparison:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;〰️ &lt;strong&gt;What gastrointestinal adverse effects were reported in semaglutide clinical trials and what were the incidence rates?&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;〰️ &lt;strong&gt;What were the cardiovascular outcomes reported in semaglutide clinical trials and which patient populations benefited most?&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;〰️ &lt;strong&gt;How does semaglutide compare to liraglutide and tirzepatide in terms of weight loss efficacy and adverse effects across the clinical trials?&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These queries allow us to evaluate how the system retrieves specific information across different studies, especially in scenarios where the results are distributed across multiple documents.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpwojcdfi7qb54q1lhbdy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpwojcdfi7qb54q1lhbdy.png" alt=" " width="800" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvd8etexylm9mu75t2b3a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvd8etexylm9mu75t2b3a.png" alt=" " width="800" height="404"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  11. 🎯 Conclusions
&lt;/h1&gt;

&lt;p&gt;This MVP shows that it is possible to build a queryable &lt;strong&gt;knowledge base&lt;/strong&gt; over private scientific documents using &lt;strong&gt;AWS serverless services&lt;/strong&gt; together with open source tools like &lt;strong&gt;Docling&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I learned while building this system:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ The &lt;code&gt;chunking&lt;/code&gt; strategy matters more than it may seem. In the case of scientific papers, &lt;strong&gt;Hierarchical Chunking&lt;/strong&gt; preserves the context between sections such as &lt;code&gt;Results&lt;/code&gt; or &lt;code&gt;Adverse Effects&lt;/code&gt; better than fixed token based strategies.&lt;/li&gt;
&lt;li&gt;〰️ &lt;strong&gt;Docling&lt;/strong&gt; can help reduce the cost and complexity of preprocessing when working with complex &lt;code&gt;PDFs&lt;/code&gt;, especially those with tables, columns, and non linear structures. It allows us to convert these documents into structured information ready to be used in AI systems.
*〰️ &lt;code&gt;Embeddings&lt;/code&gt; are not the same as security. Even though we work with vector representations, research has shown that in some scenarios it is possible to infer or reconstruct sensitive information from embedding vectors. Because of this, treating vector stores as sensitive data and applying access controls and encryption 
is a good practice in real scenarios.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;If we take this to a production environment, three pieces become fundamental:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;〰️ &lt;code&gt;CI/CD&lt;/code&gt; pipelines are necessary to automate processing and system updates as improvements are added.&lt;/li&gt;
&lt;li&gt;〰️ Infrastructure as Code with &lt;strong&gt;Terraform&lt;/strong&gt;, or similar tools, is key to replicate, scale, and maintain the environment consistently across different stages.&lt;/li&gt;
&lt;li&gt;〰️ Any solution that is deployed, especially one that uses AI models, should include observability systems to detect and solve problems in production.
In terms of impact, this type of solution opens a very relevant space in industries such as healthcare and research, where controlled access to large volumes of knowledge can significantly accelerate scientific analysis and decision making.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, beyond the tools used, the most interesting part of this architecture is how it combines different cloud services and generative AI capabilities to solve a very concrete problem: converting unstructured information into accessible, private, and queryable knowledge using natural language.&lt;/p&gt;







&lt;h1&gt;
  
  
  12. 📚 Technical references
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Amazon Web Services. (n.d.). &lt;strong&gt;AWS Lambda Developer Guide&lt;/strong&gt;. AWS Documentation. Retrieved May 26, 2026, from &lt;a href="https://docs.aws.amazon.com/es_es/lambda/latest/dg/welcome.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/es_es/lambda/latest/dg/welcome.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon Web Services. (n.d.). &lt;strong&gt;Create a Lambda function using a container image&lt;/strong&gt;. AWS Documentation. Retrieved May 26, 2026, from &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/images-create.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/lambda/latest/dg/images-create.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon Web Services. (n.d.). &lt;strong&gt;Amazon Bedrock Knowledge Bases&lt;/strong&gt;. Retrieved May 26, 2026, from &lt;a href="https://aws.amazon.com/es/bedrock/knowledge-bases/" rel="noopener noreferrer"&gt;https://aws.amazon.com/es/bedrock/knowledge-bases/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;IBM. (n.d.). &lt;strong&gt;Docling&lt;/strong&gt;. Retrieved May 26, 2026, from &lt;a href="https://www.docling.ai/" rel="noopener noreferrer"&gt;https://www.docling.ai/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Docling Project. (n.d.). &lt;strong&gt;SmolDocling 256M preview&lt;/strong&gt;. Hugging Face. Retrieved May 26, 2026, from &lt;a href="https://huggingface.co/docling-project/SmolDocling-256M-preview" rel="noopener noreferrer"&gt;https://huggingface.co/docling-project/SmolDocling-256M-preview&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;University of Utah Health. (2026). &lt;strong&gt;GLP 1 FAQs answered by weight loss experts&lt;/strong&gt;. Retrieved from &lt;a href="https://healthcare.utah.edu/healthfeed/2026/03/preguntas-frecuentes-sobre-el-glp-1-respondidas-por-expertos-en-perdida-de-peso" rel="noopener noreferrer"&gt;https://healthcare.utah.edu/healthfeed/2026/03/preguntas-frecuentes-sobre-el-glp-1-respondidas-por-expertos-en-perdida-de-peso&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h1&gt;
  
  
  13. 📄 Research papers used in the use case
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Han, S. H., Safeek, R., Ockerman, K., Trieu, N., Mars, P., Klenke, A., Furnas, H., &amp;amp; Sorice Virk, S. (2023). &lt;strong&gt;Public interest in the off label use of glucagon like peptide 1 agonists (Ozempic) for cosmetic weight loss: A Google Trends analysis&lt;/strong&gt;. &lt;em&gt;Aesthetic Surgery Journal&lt;/em&gt;. &lt;a href="https://doi.org/10.1093/asj/sjad211" rel="noopener noreferrer"&gt;https://doi.org/10.1093/asj/sjad211&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ryan, N., &amp;amp; Savulescu, J. (2026). &lt;strong&gt;The ethics of Ozempic and Wegovy&lt;/strong&gt;. &lt;em&gt;Journal of Medical Ethics, 52&lt;/em&gt;(3), 185–193. &lt;a href="https://doi.org/10.1136/jme-2024-110374" rel="noopener noreferrer"&gt;https://doi.org/10.1136/jme-2024-110374&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mailhac, A., Pedersen, L., Pottegård, A., Søndergaard, J., Mogensen, T., Sørensen, H. T., &amp;amp; Thomsen, R. W. (2024). &lt;strong&gt;Semaglutide (Ozempic®) use in Denmark 2018 through 2023: User trends and off label prescribing for weight loss&lt;/strong&gt;. &lt;em&gt;Clinical Epidemiology&lt;/em&gt;. &lt;a href="https://doi.org/10.2147/CLEP.S456170" rel="noopener noreferrer"&gt;https://doi.org/10.2147/CLEP.S456170&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Manoharan, S. V. R. R., &amp;amp; Madan, R. (2024). &lt;strong&gt;GLP 1 agonists can affect mood: A case of worsened depression on Ozempic (Semaglutide)&lt;/strong&gt;. &lt;em&gt;Case Reports in Psychiatry&lt;/em&gt;. &lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC11208009/" rel="noopener noreferrer"&gt;https://pmc.ncbi.nlm.nih.gov/articles/PMC11208009/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Humphrey, C. D., &amp;amp; Lawrence, A. C. (2023). &lt;strong&gt;Implications of Ozempic and other semaglutide medications for facial plastic surgeons&lt;/strong&gt;. &lt;em&gt;Facial Plastic Surgery&lt;/em&gt;. &lt;a href="https://doi.org/10.1055/a-2148-6321" rel="noopener noreferrer"&gt;https://doi.org/10.1055/a-2148-6321&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Pillarisetti, L., &amp;amp; Agrawal, D. K. (2025). &lt;strong&gt;Semaglutide: Double edged sword with risks and benefits&lt;/strong&gt;. &lt;em&gt;Archives of Internal Medicine Research, 8&lt;/em&gt;(1), 1–13. &lt;a href="https://doi.org/10.26502/aimr.0189" rel="noopener noreferrer"&gt;https://doi.org/10.26502/aimr.0189&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fong, S., Carollo, A., Lazuras, L., Corazza, O., &amp;amp; Esposito, G. (2024). &lt;strong&gt;Ozempic (Glucagon like peptide 1 receptor agonist) in social media posts: Unveiling user perspectives through Reddit topic modeling&lt;/strong&gt;. &lt;em&gt;Dialogues in Health&lt;/em&gt;. &lt;a href="https://www.sciencedirect.com/science/article/pii/S2667118224000163" rel="noopener noreferrer"&gt;https://www.sciencedirect.com/science/article/pii/S2667118224000163&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Carboni, A., Woessner, S., Martini, O., Marroquin, N. A., &amp;amp; Waller, J. (2024). &lt;strong&gt;Natural weight loss or “Ozempic Face”: Demystifying a social media phenomenon&lt;/strong&gt;. &lt;em&gt;Journal of Drugs in Dermatology, 23&lt;/em&gt;(1). &lt;a href="https://doi.org/10.36849/JDD.7613" rel="noopener noreferrer"&gt;https://doi.org/10.36849/JDD.7613&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Grech, V. S., Lotsaris, K., Grech, I., &amp;amp; Kefala, V. (2024). &lt;strong&gt;Semaglutide (Ozempic) and obesity: A comprehensive guide for aestheticians&lt;/strong&gt;. &lt;em&gt;Review of Clinical Pharmacology and Pharmacokinetics, 38&lt;/em&gt;(Suppl. 1), 31–35. &lt;a href="https://www.researchgate.net/publication/378300594_Semaglutide_Ozempic_and_obesity_A_comprehensive_guide_for_aestheticians" rel="noopener noreferrer"&gt;https://www.researchgate.net/publication/378300594_Semaglutide_Ozempic_and_obesity_A_comprehensive_guide_for_aestheticians&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Vambe, S. D., Zulu, W., Hough, E., &amp;amp; Luvhimbi, M. J. (2024). &lt;strong&gt;Semaglutide (Ozempic®): A comprehensive review of its pharmacology, efficacy, and safety profile in type 2 diabetes mellitus and weight management&lt;/strong&gt;. &lt;em&gt;SA Pharmaceutical Journal, 91&lt;/em&gt;(6), 31–34. &lt;a href="https://www.researchgate.net/publication/388790459_Semaglutide_Ozempic_R_a_comprehensive_review_of_its_pharmacology_efficacy_and_safety_profile_in_type_2_diabetes_mellitus_and_weight_management" rel="noopener noreferrer"&gt;https://www.researchgate.net/publication/388790459_Semaglutide_Ozempic_R_a_comprehensive_review_of_its_pharmacology_efficacy_and_safety_profile_in_type_2_diabetes_mellitus_and_weight_management&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>opensource</category>
      <category>python</category>
    </item>
    <item>
      <title>Google I/O 2026: What Happens When Everything Connects?</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Sat, 23 May 2026 14:03:06 +0000</pubDate>
      <link>https://dev.to/gdg/google-io-2026-what-happens-when-everything-connects-4gf8</link>
      <guid>https://dev.to/gdg/google-io-2026-what-happens-when-everything-connects-4gf8</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-io-writing-2026-05-19"&gt;Google I/O Writing Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Google I/O 2026 showed us something that goes beyond a list of launches: a vision of where technology is heading.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkw4zvgqlhzhv9mojzja.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkw4zvgqlhzhv9mojzja.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;code&gt;Image source: Image created by the author&lt;/code&gt;&lt;/center&gt;

&lt;center&gt;____________&lt;/center&gt;

&lt;p&gt;Sundar Pichai (Google CEO) opened the presentation sharing some interesting numbers about the *&lt;em&gt;evolution of AI with Google statistics&lt;/em&gt; that you can see in the following &lt;code&gt;infographic&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8ry4llkx539e5i2kkh7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8ry4llkx539e5i2kkh7.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;code&gt;Image source: Infographic created by the author&lt;/code&gt;&lt;/center&gt;

&lt;center&gt;____________&lt;/center&gt;

&lt;p&gt;&lt;strong&gt;But let’s get back to the event…&lt;/strong&gt;&lt;br&gt;
Over two hours, Google announced a wide range of products, updates and platforms. From new AI models to &lt;strong&gt;smart glasses&lt;/strong&gt;, from &lt;strong&gt;music generation tools&lt;/strong&gt; to a &lt;strong&gt;digital twin&lt;/strong&gt; of the entire planet.&lt;br&gt;
What stands out is not any single product on its own, but the way they are designed to work together. Most of them are not meant to exist in isolation but to integrate with each other and with the models Google is deploying across its ecosystem.&lt;/p&gt;

&lt;p&gt;Below you will find all the launches organized by category, with my take on each one and direct links to the exact moment in the keynote where it was announced.&lt;/p&gt;


&lt;h2&gt;
  
  
  🧠 Models &amp;amp; Infrastructure
&lt;/h2&gt;

&lt;p&gt;Behind almost everything presented at &lt;strong&gt;Google I/O 2026&lt;/strong&gt; has the same foundation: more advanced models and an infrastructure designed to scale new forms of interaction between people and systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Firc0l77fk4toyvknoz0z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Firc0l77fk4toyvknoz0z.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;code&gt;Image source: Image created by the author&lt;/code&gt;&lt;/center&gt;

&lt;center&gt;____________&lt;/center&gt;
&lt;h3&gt;
  
  
  Gemini Omni
&lt;/h3&gt;

&lt;p&gt;The first model that allows you to modify videos using natural language, with inputs that can be images, text or video. But it is not just about understanding text, images, audio and video at the same time;  it is about reasoning over all of them together to generate something new.&lt;br&gt;
What sets it apart from any previous video generator is that it combines an intuitive understanding of physics with real knowledge about history, science and cultural context. So you can take a video you recorded and ask it to change what happens in it, edit the action, add characters, transform a moment into something completely unexpected.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Gemini Omni&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=1035s" rel="noopener noreferrer"&gt;17:15&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Gemini 3.5 Flash
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Gemini 3.5 Flash&lt;/strong&gt; is the direct evolution of the main line of large language models (LLM) from Google, optimized to deliver ultra-high speed performance, advanced logical reasoning capabilities and code orchestration. It is an ideal model for building autonomous agents capable of executing complex task flows in the background, writing code, processing large text contexts or powering searches.&lt;br&gt;
All of this while being four times faster than comparable models, which makes it the option specifically designed for agentic tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Gemini 3.5 Flash&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=1423s" rel="noopener noreferrer"&gt;23:43&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  🤖 Agents &amp;amp; Productivity: Your Digital Life, Managed
&lt;/h2&gt;

&lt;p&gt;This is one of the categories I enjoyed the most, featuring several tools that change the way we work and organize our daily tasks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4vtld5dmdo6yj2la756.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4vtld5dmdo6yj2la756.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;code&gt;Image source: Image created by the author&lt;/code&gt;&lt;/center&gt;

&lt;center&gt;____________&lt;/center&gt;
&lt;h3&gt;
  
  
  Gemini Spark
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Gemini Spark&lt;/strong&gt; is a &lt;strong&gt;personal AI agent&lt;/strong&gt; that runs tasks in the background, across all your applications, without you having to supervise every step. With Spark you can organize an event, manage a chain of emails, coordinate a complex task across multiple services.&lt;br&gt;
It also connects with external tools through the open MCP protocol, which extends its reach beyond the Google ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Gemini Spark&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=2118s" rel="noopener noreferrer"&gt;35:18&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Daily Brief
&lt;/h3&gt;

&lt;p&gt;Google has been offering AI summaries for a while, but Daily Brief is something different. Instead of summarizing a document you provide, it reads your chats, your Gmail emails, your calendar context and your pending tasks, and then prioritizes what matters for your day. The difference between a generic summary and one that actually understands your context is significant, and that is exactly what Daily Brief proposes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Daily Brief&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=4506s" rel="noopener noreferrer"&gt;1:15:06&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Ask YouTube
&lt;/h3&gt;

&lt;p&gt;Ask YouTube changes the search bar we know on YouTube into a conversation. You can ask for a summary, a specific recommendation, or request to find exactly what you need at a precise moment without watching the full video.&lt;br&gt;
What I find most interesting is the impact on creators. The algorithm can now understand the deeper context of content and recommend videos that used to stay hidden behind generic titles. For creators, this is an opportunity, and for content consumption in general, it changes the way we interact with the platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Ask YouTube&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=461s" rel="noopener noreferrer"&gt;7:41&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Docs Live
&lt;/h3&gt;

&lt;p&gt;Docs Live changes the way we create documents. While it's already possible to create content using Gemini's voice input options, this solution lets you share your ideas aloud, and Gemini will start creating a document, formatting, structuring, and writing the text in real time.&lt;br&gt;
The key difference from Gemini's existing voice input is that the result isn't a chat response; it's a properly formatted Google Docs document, complete with headings, lists, and a professional structure right from the start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Docs Live&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=554s" rel="noopener noreferrer"&gt;9:14&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  🔍 Search &amp;amp; Commerce: Your Next Purchase Will Be Made by an Agent
&lt;/h2&gt;

&lt;p&gt;AI search is already part of our daily routine, but this year Google took it further, turning search from a simple query into an agent that can look for information and even make purchases on your behalf.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz9v58xfxzmj692e7whs0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz9v58xfxzmj692e7whs0.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;code&gt;Image source: Image created by the author&lt;/code&gt;&lt;/center&gt;

&lt;center&gt;____________&lt;/center&gt;
&lt;h3&gt;
  
  
  AI Search Box
&lt;/h3&gt;

&lt;p&gt;The new search box is no longer limited to text. It now accepts &lt;code&gt;images&lt;/code&gt;, &lt;code&gt;files&lt;/code&gt;, &lt;code&gt;videos&lt;/code&gt; and even &lt;code&gt;Chrome tabs&lt;/code&gt; as input. It may seem like a small change, but it completely transforms the experience of searching for something, powered by the new Gemini 3.5 Flash models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 AI Search Box&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=2765s" rel="noopener noreferrer"&gt;46:05&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Search Agents
&lt;/h3&gt;

&lt;p&gt;Search agents are a feature that, in my opinion, will become essential without many people noticing at first. They are background agents that you can set up to monitor specific topics, such as the value of a stock you are tracking, a flight route for an upcoming trip, or a property you want to rent in a specific neighborhood.&lt;/p&gt;

&lt;p&gt;If the &lt;strong&gt;agent detects changes&lt;/strong&gt;, it can &lt;code&gt;summarize the information&lt;/code&gt; and &lt;code&gt;notify you&lt;/code&gt;. It can pick up updates from blogs, news sites, social media, and real-time data on finance, shopping and sports.&lt;br&gt;
This changes the current experience we have with Google Alerts, which were based only on keywords, since it takes both alerts and information retrieval to a much more advanced level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Search Agents&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=2868s" rel="noopener noreferrer"&gt;47:48&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Generative UI in Search
&lt;/h3&gt;

&lt;p&gt;This announcement is quite interesting because information can already be accessed through agents, chats or other channels, but the main idea is to &lt;strong&gt;provide an interface that is intuitive for users to interpret that information&lt;/strong&gt;. Instead of returning a list of links, Search can now build a personalized interactive interface for complex queries. Here you can get a live comparison table, a dynamic chart or an interactive explanation, all generated in real time for your specific question.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Generative UI&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=3077s" rel="noopener noreferrer"&gt;51:17&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Universal Cart &amp;amp; UCP + AP2
&lt;/h3&gt;

&lt;p&gt;Google is redefining the online shopping experience with two announcements. On one hand, &lt;strong&gt;Universal Cart&lt;/strong&gt; turns the shopping cart into something where you can add products, and the system then works in the background autonomously, monitoring price drops, analyzing price history and notifying you when an item becomes available again.&lt;/p&gt;

&lt;p&gt;On the other hand, &lt;strong&gt;the Universal Commerce Protocol (UCP)&lt;/strong&gt; establishes an open standard that allows all of this to scale beyond Google. It creates a common language for agents and systems to work together across the entire shopping process, from finding a product to post-purchase support, connecting consumer platforms, businesses and payment providers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UCP&lt;/strong&gt; is compatible with other key ecosystem protocols such as &lt;strong&gt;Agent2Agent (A2A)&lt;/strong&gt;, &lt;strong&gt;Agent Payments Protocol (AP2)&lt;/strong&gt; and &lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt;, positioning it not as a Google tool, but as the infrastructure for agentic commerce.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 AP2 Protocol&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=3677s" rel="noopener noreferrer"&gt;1:01:17&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;📍 Universal Cart&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=3793s" rel="noopener noreferrer"&gt;1:03:13&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  🎬 Creative Tools: Can anyone make a movie today?
&lt;/h2&gt;

&lt;p&gt;Content creation was another area where the shift in focus at Google I/O 2026 became very clear. It is not just about generating images, music or video, but about how these tools are starting to integrate into a continuous creative flow, closer to a conversation than to a technical process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fom9uab517rmxxjsiheqb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fom9uab517rmxxjsiheqb.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;code&gt;Image source: Image created by the author&lt;/code&gt;&lt;/center&gt;

&lt;center&gt;____________&lt;/center&gt;
&lt;h3&gt;
  
  
  Google Flow
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Google Flow&lt;/strong&gt; is a creative platform developed by Google that allows users &lt;code&gt;to generate, edit and compose videos, images and music from prompts or images&lt;/code&gt;. In its latest update, it integrates with Gemini Omni to take video editing to a more conversational level, allowing you to change environments, add characters, and generate 16 different camera angles from a single image.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Google Flow&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=5288s" rel="noopener noreferrer"&gt;1:28:08&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Flow Music
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Flow Music&lt;/strong&gt; is a generative tool that &lt;code&gt;allows users to compose songs and create full music videos from prompts&lt;/code&gt;. It also lets you provide a reference recording and build a complete track around it, edit it section by section, reimagine the style of a song while keeping its original melody, or create music videos by directly conversing with the agent.&lt;br&gt;
I think it is an ideal tool for independent artists who develop games, apps or video content and want to create using AI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Flow Music&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=5461s" rel="noopener noreferrer"&gt;1:31:01&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Stitch
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Google Stitch&lt;/strong&gt; is a tool developed by Google Labs that uses AI to &lt;code&gt;design user interfaces (UI/UX)&lt;/code&gt;. These designs can be exported directly to code, Figma, Google Antigravity or Google AI Studio. The design process is driven by instructions that can be given through text or voice, and it is generated in real time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Stitch&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=5136s" rel="noopener noreferrer"&gt;1:25:36&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Google Pics
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Google Pics&lt;/strong&gt; is a &lt;code&gt;new image creation and editing tool based on Nano Banana&lt;/code&gt;, Google’s model for this type of task. It allows users to select and edit specific elements with precision, such as moving objects, changing colors, or transforming one element into another without affecting the rest of the image.&lt;/p&gt;

&lt;p&gt;Without a doubt, it is a very useful tool for content creators, making it easier to modify and edit images using text instructions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Google Pics&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=5060s" rel="noopener noreferrer"&gt;1:24:20&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  ⚙️ Developer Tools &amp;amp; Hardware
&lt;/h3&gt;

&lt;p&gt;This section is perhaps one of the most diverse of the event, as it combines developer tools with consumer hardware that is still in active development. It ranges from systems capable of coordinating code agents at scale to devices that start bringing Gemini interactions directly into the physical world.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwrsrw1zcqi44nhycq1xi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwrsrw1zcqi44nhycq1xi.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;code&gt;Image source: Image created by the author&lt;/code&gt;&lt;/center&gt;

&lt;center&gt;____________&lt;/center&gt;
&lt;h3&gt;
  
  
  Antigravity 2.0
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Antigravity 2.0&lt;/strong&gt; is a native desktop application that works as a central platform for coordinating multiple sub-agents running tasks in parallel. The keynote demo showed one of the most complex examples, building an operating system from scratch, and this tool allows you to create a plan and define how sub-agents should run in parallel in order to achieve the goal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Antigravity 2.0&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=1615s" rel="noopener noreferrer"&gt;26:55&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  CodeMender
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;CodeMender&lt;/strong&gt; is a security tool originally developed by Google DeepMind. The tool &lt;code&gt;scans code&lt;/code&gt;, &lt;code&gt;identifies vulnerabilities autonomously&lt;/code&gt;, &lt;code&gt;recommends fixes&lt;/code&gt;, &lt;code&gt;tests them in a safe environment&lt;/code&gt;, and &lt;code&gt;can apply the necessary patches&lt;/code&gt; with your approval at each step.&lt;br&gt;
&lt;strong&gt;📍 CodeMender&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=6358s" rel="noopener noreferrer"&gt;1:45:58&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Audio Glasses
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Audio glasses&lt;/strong&gt;, developed with Samsung and designed in collaboration with Gentle Monster and Warby Parker, allow you to use Gemini without a screen and without taking your phone out. With these glasses, you can ask about a restaurant you just passed, get step-by-step directions, manage calls and messages, take photos with a voice command, or use the apps installed on your phone.&lt;br&gt;
&lt;strong&gt;📍 Audio Glasses&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=5672s" rel="noopener noreferrer"&gt;1:34:32&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Display Glasses &amp;amp; Android XR
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Display Glasses&lt;/strong&gt; go one step further: they include micro-projectors built into the lenses that overlay useful information on the real world, such as navigation maps or real-time translations on signs, among other features.&lt;br&gt;
Meanwhile, Android XR is the operating system platform that powers these devices, developed with Samsung and Qualcomm. It is still in a trusted testers phase, with a wider rollout expected later this year.&lt;br&gt;
&lt;strong&gt;📍 Android XR / Display Glasses&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=5595s" rel="noopener noreferrer"&gt;1:33:15&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  🔬 Science: What Happens When AI Never Stops Researching?
&lt;/h2&gt;

&lt;p&gt;This was, for me, the most important section of the event. The following initiatives from Google apply AI to problems that go far beyond personal productivity, from accelerating scientific research to modeling the global climate and rethinking the process of drug discovery.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmyvgur5x1prmujyhkw0l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmyvgur5x1prmujyhkw0l.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;code&gt;Image source: Image created by the author&lt;/code&gt;&lt;/center&gt;

&lt;center&gt;____________&lt;/center&gt;
&lt;h3&gt;
  
  
  Gemini for Science
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Gemini for Science&lt;/strong&gt; is a research acceleration platform that allows scientists to stay up to date with newly published papers, turn research goals into executable code, and generate new hypotheses. It is still in a prototype phase in Google Labs, but the concept is what matters: AI is not presented as a replacement for scientific thinking, but as infrastructure that removes friction from the early stages of research, enabling literature search, synthesis of papers, and translation of hypotheses into experiments.&lt;br&gt;
A researcher who can stay updated in real time across their entire field and automatically translate a hypothesis into an experiment is a researcher who can focus on more meaningful work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Gemini for Science&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=6397s" rel="noopener noreferrer"&gt;1:46:37&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  AlphaEarth Foundations &amp;amp; WeatherNext
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;AlphaEarth Foundations&lt;/strong&gt; is a model developed by Google DeepMind that works as an interactive digital twin of the Earth, powered by real-time satellite data, climate sensors, ocean readings, and biodiversity records. Meanwhile, &lt;strong&gt;WeatherNext&lt;/strong&gt; is its atmospheric counterpart, a weather forecasting engine capable of predicting hurricane paths and extreme weather events with greater accuracy and speed than traditional systems.&lt;br&gt;
These models were presented at the conference as examples of how Google’s AI technology can be applied to solve problems that affect billions of people around the world.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 AlphaEarth + WeatherNext&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=6397s" rel="noopener noreferrer"&gt;1:46:37&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  Isomorphic Labs
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Isomorphic Labs&lt;/strong&gt;, the biotechnology company within Alphabet (a sister company of &lt;strong&gt;Google DeepMind&lt;/strong&gt;), continues to build on AlphaFold. This is an AI system developed by DeepMind that enables the prediction of protein structures.&lt;br&gt;
It is another example of &lt;code&gt;how Google’s technology is being applied in the pharmaceutical industry&lt;/code&gt;, helping to significantly accelerate research and molecular design for treatments against cancer and complex immune disorders.&lt;br&gt;
In the keynote, this work was described as “science at digital speed”, where AI acts as a tool to understand biological systems that were previously impossible to model directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 Isomorphic Labs&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=6397s" rel="noopener noreferrer"&gt;1:46:37&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  SynthID
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;SynthID&lt;/strong&gt; is Google’s &lt;code&gt;invisible watermarking system&lt;/code&gt; for AI-generated content. This label is added to images, videos and audio at the moment of creation, allowing anyone or any system to later verify whether something was generated by AI.&lt;br&gt;
This announcement is important not only from a safety perspective, but also because it is being adopted as a standard by companies such as OpenAI and ElevenLabs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📍 SynthID&lt;/strong&gt; → &lt;a href="https://www.youtube.com/watch?v=wYSncx9zLIU&amp;amp;t=1266s" rel="noopener noreferrer"&gt;21:06&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  What I'm Most Excited to Try
&lt;/h2&gt;

&lt;p&gt;If I had to choose the three announcements I will follow most closely:&lt;/p&gt;
&lt;h3&gt;
  
  
  🛒 Universal Cart &amp;amp; UCP + AP2
&lt;/h3&gt;

&lt;p&gt;These two announcements will change the way we shop online. For years, we have designed experiences for human users: visual interfaces, marketplaces, recommendations and conversion funnels. But Google is proposing something different: agents capable of discovering products, evaluating options, monitoring prices and executing purchases on our behalf.&lt;br&gt;
This means ecommerce is no longer only a human platform interaction, but starts to become an ecosystem where agents also participate as consumers.&lt;br&gt;
I do not think these solutions will replace traditional commerce overnight, but they will deeply change how trust is built, how products are presented, and how companies compete for attention, not only from people but also from intelligent agents.&lt;/p&gt;
&lt;h3&gt;
  
  
  🔍 Search Agents
&lt;/h3&gt;

&lt;p&gt;Traditional alerts have always been passive: they depended on exact keywords and often generated more noise than context. This is one of the announcements I will probably follow most closely because it completely changes that logic.&lt;br&gt;
Instead of manually searching for information, you can now delegate the monitoring of a topic to a system that understands intent, relevance and meaningful changes. An agent that continuously tracks the internet in the background.&lt;br&gt;
And the more I think about it, the clearer it becomes that this might be one of the most important features of the keynote, precisely because it will quietly integrate into our daily routine.&lt;/p&gt;

&lt;center&gt;• • • •&lt;/center&gt;
&lt;h3&gt;
  
  
  🧬 Gemini for Science
&lt;/h3&gt;

&lt;p&gt;Of everything announced, this is probably the project with the deepest potential impact.&lt;br&gt;
Modern scientific research has a silent problem: the speed of knowledge has already surpassed human capacity to absorb it. Thousands of papers are published every week, information is fragmented, hypotheses are scattered, and entire weeks are spent just trying to stay updated.&lt;br&gt;
Gemini for Science proposes something fundamentally different: turning AI into infrastructure for research.&lt;br&gt;
The ability to translate scientific literature into actionable hypotheses, generate experimental code, connect discoveries across disciplines and accelerate research processes could completely change the scale at which science progresses.&lt;br&gt;
Because perhaps the most important application of artificial intelligence is not to automate work, but to accelerate human knowledge.&lt;/p&gt;


&lt;h2&gt;
  
  
  🎥 What I Actually Tried: Google Flow
&lt;/h2&gt;

&lt;p&gt;Most of the announcements that caught my attention were related to systems, agents and infrastructure. But beyond the long term vision, I also wanted to understand what it actually feels like to interact with one of these tools in practice.&lt;br&gt;
So instead of just describing them, I decided to try one myself. I opened Google Flow and started experimenting with a short video prompt, and for a moment, I found myself living that childhood idea of creating my own animation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbcnrfbny9u8azw9uw4mc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbcnrfbny9u8azw9uw4mc.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;code&gt;Image source: Image created by google flow&lt;/code&gt;&lt;/center&gt;

&lt;center&gt;____________&lt;/center&gt;

&lt;p&gt;While doing this, I could not help but think about how tools like this could evolve beyond individual experimentation. In the future, they might become part of how schools teach storytelling, narration and creative thinking, helping children express ideas through visual and generative tools. At the same time, it also raises an interesting challenge: how younger generations will learn to use these systems in a way that is both creative and intentional, rather than just consumptive.&lt;/p&gt;


&lt;h4&gt;
  
  
  What the Experience Felt Like
&lt;/h4&gt;

&lt;p&gt;The result was an 8 second clip and honestly, the quality surprised me. With a fairly simple prompt describing a scene, the model generated something that looked cinematic and coherent. The prompt itself was also AI generated, and I am leaving it in the appendix at the end of this article in case you want to replicate the experiment.&lt;br&gt;
A few things stood out from the experience.&lt;br&gt;
The 8 second limit felt a bit frustrating at first. But after thinking about it, I am not sure if that is a limitation of Flow or simply how professional video production works. Scenes in film and TV are often short clips that are later assembled in post production. Flow seems to follow that same logic, where you build a story by connecting multiple clips instead of generating one long video in a single shot.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/fg4-glVbsiM"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;In a previous test, I was able to concatenate two clips, and what impressed me the most was the character consistency between them. The same character appeared in both scenes without drifting, keeping the same look and style. That is actually one of the hardest problems in AI video generation today: maintaining character consistency across different prompts. Flow’s approach of letting you define and save a character that can later be reused across scenes feels like a real step forward. Whether it can maintain that consistency across longer or more complex sequences is something I still want to keep testing.&lt;/p&gt;

&lt;h4&gt;
  
  
  How I Tested It
&lt;/h4&gt;

&lt;p&gt;Under the hood, I used Gemini Omni Flash as the base model, with an 8 second duration, 2x speed and a 16:9 aspect ratio.&lt;br&gt;
Each generation costs 50 credits, which gives a more concrete sense of the cost per clip when planning a longer project. The tool also allows direct publishing to YouTube and lets you select a custom thumbnail, making the end to end workflow surprisingly complete for a creative platform.&lt;/p&gt;


&lt;h2&gt;
  
  
  🔮 How far do we want to go?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Google I/O 2026&lt;/strong&gt; presented a series of solutions, models and use cases where artificial intelligence is becoming more deeply integrated into our daily lives. But while watching each demo, one broader question kept coming to my mind:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How far do we really want to go?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because many of these tools do not only automate tasks, they also start to reorganize how we access information, how we make decisions and how we interact with the digital world.&lt;/p&gt;

&lt;p&gt;What stood out the most was not each announcement on its own, but what happens when you look at them together. Agents that run in the background, search that becomes conversational, interfaces that adapt to context in real time. Not as isolated products, but as parts of a system that is still being built.&lt;/p&gt;

&lt;p&gt;And in any system, what matters is not only each component, but how they connect with each other and what kind of emerging behavior appears when they start interacting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google showed the pieces based on AI.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;What is still being defined is the full system: its direction, its shape, and how it will integrate into our daily lives and work.&lt;br&gt;
And maybe that is the most important part of this new era: it is not only about what technology is capable of building, but about how each of us decides to interpret it, use it, and become part of it.&lt;/p&gt;


&lt;h2&gt;
  
  
  ⚠️ A note on process and transparency
&lt;/h2&gt;

&lt;p&gt;I am an organizer at GDG Barcelona, and I led the event where we brought together 37 people to watch the full Google I/O 2026 keynote live. That experience, I was able to follow the announcements in real time, listening to the reactions in the room ... is what shaped the perspective and opinions in this article.&lt;/p&gt;

&lt;p&gt;This article is also based on the official sources listed in the references section, which I read and consulted directly after the event to verify and expand on each announcement.&lt;/p&gt;

&lt;p&gt;All the visuals in this article were designed by me using Figma Design, because creating my own images is something I genuinely enjoy as part of my writing process. AI tools were used for text correction and translation assistance and all opinions, analysis and perspectives are my own.&lt;/p&gt;


&lt;h2&gt;
  
  
  📚 References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Google. (n.d.). Gemini Omni. Google Blog. &lt;a href="https://blog.google/intl/es-es/productos/presentamos-gemini-omni/" rel="noopener noreferrer"&gt;https://blog.google/intl/es-es/productos/presentamos-gemini-omni/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Gemini Spark. Gemini Overview. &lt;a href="https://gemini.google/overview/agent/spark/" rel="noopener noreferrer"&gt;https://gemini.google/overview/agent/spark/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Daily Brief. Gemini Overview. &lt;a href="https://gemini.google/overview/daily-brief/" rel="noopener noreferrer"&gt;https://gemini.google/overview/daily-brief/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). The next evolution of the Gemini app. Google Blog. &lt;a href="https://blog.google/innovation-and-ai/products/gemini-app/next-evolution-gemini-app/" rel="noopener noreferrer"&gt;https://blog.google/innovation-and-ai/products/gemini-app/next-evolution-gemini-app/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;YouTube. (n.d.). YouTube News: Google I/O 2026. YouTube Blog. &lt;a href="https://blog.youtube/news-and-events/youtube-news-google-io-2026/" rel="noopener noreferrer"&gt;https://blog.youtube/news-and-events/youtube-news-google-io-2026/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). YouTube search updates and AI features. Google Support. &lt;a href="https://support.google.com/youtube/answer/16943763?hl=en" rel="noopener noreferrer"&gt;https://support.google.com/youtube/answer/16943763?hl=en&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Search at Google I/O 2026: AI-powered search updates. Google Blog. &lt;a href="https://blog.google/products-and-platforms/products/search/search-io-2026/" rel="noopener noreferrer"&gt;https://blog.google/products-and-platforms/products/search/search-io-2026/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Universal Cart and shopping updates. Google Blog. &lt;a href="https://blog.google/intl/es-419/actualizaciones-de-producto/informacion/google-shopping-cart/" rel="noopener noreferrer"&gt;https://blog.google/intl/es-419/actualizaciones-de-producto/informacion/google-shopping-cart/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Merchant Center help: Shopping updates. Google Support. &lt;a href="https://support.google.com/merchants/answer/16837055?hl=es" rel="noopener noreferrer"&gt;https://support.google.com/merchants/answer/16837055?hl=es&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Flow updates. Google Blog. &lt;a href="https://blog.google/intl/es-419/actualizaciones-de-producto/flow-updates/" rel="noopener noreferrer"&gt;https://blog.google/intl/es-419/actualizaciones-de-producto/flow-updates/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Stitch updates from Google Labs. Google Blog. &lt;a href="https://blog.google/innovation-and-ai/models-and-research/google-labs/stitch-updates/" rel="noopener noreferrer"&gt;https://blog.google/innovation-and-ai/models-and-research/google-labs/stitch-updates/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Google I/O 2026 developer tools highlights. Google Blog. &lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/google-io-2026-collection/" rel="noopener noreferrer"&gt;https://blog.google/innovation-and-ai/technology/developers-tools/google-io-2026-collection/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Workspace updates at Google I/O 2026. Google Blog. &lt;a href="https://blog.google/products-and-platforms/products/workspace/workspace-updates/" rel="noopener noreferrer"&gt;https://blog.google/products-and-platforms/products/workspace/workspace-updates/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Antigravity 2.0. &lt;a href="https://antigravity.google/product/antigravity-2" rel="noopener noreferrer"&gt;https://antigravity.google/product/antigravity-2&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Developer highlights from Google I/O 2026. Google Blog. &lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/google-io-2026-developer-highlights/" rel="noopener noreferrer"&gt;https://blog.google/innovation-and-ai/technology/developers-tools/google-io-2026-developer-highlights/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google Cloud. (n.d.). Innovations from Google I/O 2026 on Google Cloud. &lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/innovations-from-google-io-26-on-google-cloud" rel="noopener noreferrer"&gt;https://cloud.google.com/blog/products/ai-machine-learning/innovations-from-google-io-26-on-google-cloud&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;Google. (n.d.). Android XR at Google I/O 2026. Google Blog. &lt;a href="https://blog.google/products-and-platforms/platforms/android/android-xr-io-2026/" rel="noopener noreferrer"&gt;https://blog.google/products-and-platforms/platforms/android/android-xr-io-2026/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;li&gt;DeepMind. (n.d.). SynthID. &lt;a href="https://deepmind.google/models/synthid/" rel="noopener noreferrer"&gt;https://deepmind.google/models/synthid/&lt;/a&gt; (Accessed May 22, 2026)&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  📄 Appendix
&lt;/h2&gt;

&lt;p&gt;This is the AI-generated prompt I used to create the video in Google Flow. I am including it here so you can replicate the experiment or explore it further.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A short animated intro video, 15 seconds. Chibi anime art style — soft cel-shading, vibrant neon colors, cinematic lighting, dark moody atmosphere. Think Lo-Fi anime meets cyberpunk gamer aesthetic.
Character: A small chubby panda in chibi style. Oversized black hoodie with hood down while walking, round panda ears visible on top. Serious and unbothered expression. Tiny paws. Soft black and white fur with subtle neon light reflections. This is a developer panda — cool, focused, says nothing.
Scene 1 — 0:00 to 0:05: Wide shot of a dark misty forest at night. A distant neon city skyline glows purple and cyan through the trees. Fog rolls along the ground. The panda walks alone from the right side of frame through a forest path, hands in hoodie pocket, completely unbothered. He approaches a large mossy rock formation — a hidden cave entrance covered by hanging vines with faint bioluminescent blue glow. He pushes the vines aside and steps in. Slow cinematic cut to black.
Scene 2 — 0:05 to 0:10: Interior of the cave — full gamer setup. RGB neon strips in Google colors (blue, red, yellow, green) line the rocky cave walls casting dramatic colored light on everything. A dark stone desk holds a glowing MacBook Pro, mechanical keyboard with RGB backlighting, mouse with neon underglow, and stacked empty energy drink cans. The panda walks to the desk, drops his backpack on the floor. Pulls out the chair and sits down. He opens the MacBook — a burst of white light floods his face and the cave. He slowly reaches to the side, picks up thick black sunglasses and puts them on. Then places large black headphones over his panda ears. He cracks his tiny paw knuckles. Leans forward. The RGB strips pulse once in sync.
Scene 3 — 0:10 to 0:15: Ultra slow cinematic push-in toward the MacBook screen. The cave darkens around it. The screen fills the entire frame glowing bright. Bold text appears: "GOOGLE I/O 2026 — THE AGENTIC ERA IS HERE" in clean white typography on dark background. Neon blue and green light pulses around the text edges. Below it, six glowing color blocks appear one by one with smooth fade-ins: MODELS · AGENTS · SEARCH · CREATIVE · DEV · SCIENCE. Each block in its Google neon color, white bold uppercase text, subtle neon glow border. Final frame holds 2 seconds with all blocks visible, neon pulsing softly. Fade to black.
Lighting &amp;amp; mood throughout: Dark, moody, cinematic. Neon reflections on all surfaces — the panda's fur, the cave walls, the desk. Color palette: deep black backgrounds, cyan #00F5FF, neon green #39FF14, Google blue #4285F4, Google red #EA4335, Google yellow #FBBC04, purple #BF5FFF. Inspired by cyberpunk anime aesthetics — think lo-fi coder vibes meets Akira color palette.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
`&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>googleiochallenge</category>
      <category>ai</category>
      <category>googlecloud</category>
    </item>
    <item>
      <title>From Hype to Product: How AI Is Being Used Today</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Fri, 10 Apr 2026 12:10:27 +0000</pubDate>
      <link>https://dev.to/r_elena_mendez_escobar/from-hype-to-product-how-ai-is-being-used-today-4hj0</link>
      <guid>https://dev.to/r_elena_mendez_escobar/from-hype-to-product-how-ai-is-being-used-today-4hj0</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbq1y9zwrb0wzd3j3mo2l.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbq1y9zwrb0wzd3j3mo2l.webp" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For years, we talked about artificial intelligence as a promise, and over the last two years in the era of LLMs… AI has stopped being the protagonist and has become invisible infrastructure.&lt;/p&gt;

&lt;p&gt;Today we don’t see it as “AI”, it becomes invisible and already forms part of our interactions and experiences without us noticing. This means when an app understands what you need before you search, when a recommendation hits the mark effortlessly, or when a decision happens in real time without an explicit interface.&lt;/p&gt;

&lt;p&gt;In this article, we explore some concrete examples of how artificial intelligence is moving from hype to product, and the patterns that are starting to emerge.&lt;/p&gt;




&lt;h1&gt;
  
  
  💄Sephora Transforms User Experience with AI in ChatGPT
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdb8rh8ljfvbz72iic3qn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdb8rh8ljfvbz72iic3qn.png" alt=" " width="799" height="451"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Image Reference:&lt;/strong&gt; &lt;a href="https://newsroom.sephora.com" rel="noopener noreferrer"&gt;https://newsroom.sephora.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sephora, a global leader in premium beauty products, has launched its app within ChatGPT, currently in a pilot phase in the United States. Participating users can receive personalized beauty product recommendations by linking their Beauty Insider account, explore solutions tailored to their needs, and take advantage of benefits such as free shipping and samples.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 Link:&lt;/strong&gt; &lt;a href="https://newsroom.sephora.com/sephora-app-in-chatgpt-brings-a-new-personalized-beauty-experience/" rel="noopener noreferrer"&gt;https://newsroom.sephora.com/sephora-app-in-chatgpt-brings-a-new-personalized-beauty-experience/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  ☕️ What’s Behind Every Cup of Coffee You Enjoy at Starbucks?
&lt;/h1&gt;

&lt;p&gt;Starbucks, the global coffeehouse chain, is using artificial intelligence to enhance the experience for both customers and partners without replacing human interaction. Tools like Green Dot Assist help baristas get quick answers about recipes, routines, and service standards, while Smart Queue optimizes the order flow across in-store, drive-thru, mobile, and delivery channels.&lt;/p&gt;

&lt;p&gt;Additionally, the Starbucks Ordering Companion, still in development, will guide customers to discover their ideal drink and locate nearby stores, while maintaining the warmth and personalization that define the brand.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 Link:&lt;/strong&gt; &lt;a href="https://about.starbucks.com/press/2026/supporting-the-moments-that-matter-with-artificial-intelligence/" rel="noopener noreferrer"&gt;https://about.starbucks.com/press/2026/supporting-the-moments-that-matter-with-artificial-intelligence/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🖼️ Pinterest AI Turns Your Ads into Scalable Results
&lt;/h1&gt;

&lt;p&gt;Pinterest, the visual discovery and search platform, is integrating artificial intelligence into its Performance+ campaigns, allowing advertisers to optimize results without micromanaging every detail. AI automation helps intelligently combine content, audiences, and placements, making ads more relevant and scalable while advertisers focus on providing content and product signals.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 Link:&lt;/strong&gt; &lt;a href="https://business.pinterest.com/es/blog/pinterest-performance-plus-campaigns/" rel="noopener noreferrer"&gt;https://business.pinterest.com/es/blog/pinterest-performance-plus-campaigns/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  ✈️ Delta Reveals Why We Still Seek Real Experiences When Traveling
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F71pwctwq7n7srxaqs2nh.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F71pwctwq7n7srxaqs2nh.webp" alt=" " width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;
Reference image: &lt;a href="https://news.delta.com/deltas-inaugural-connection-index-finds-why-travelers-are-priorizing-real-wold-experiences" rel="noopener noreferrer"&gt;https://news.delta.com/deltas-inaugural-connection-index-finds-why-travelers-are-priorizing-real-wold-experiences&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A new global trends report from Delta Air Lines, Connection Index: Why We Fly, examines why travelers choose to fly and how these experiences impact their emotions. The study shows that even in an increasingly digital world, travelers prioritize authentic experiences: 84% of international travelers report a strong desire to explore new places and meet people, and 73% have traveled specifically to see in person something they first discovered online.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 Link:&lt;/strong&gt; &lt;a href="https://news.delta.com/deltas-inaugural-connection-index-finds-why-travelers-are-prioritizing-real-world-experiences" rel="noopener noreferrer"&gt;https://news.delta.com/deltas-inaugural-connection-index-finds-why-travelers-are-prioritizing-real-world-experiences&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  📊 Agentic AI in Action: How Amazon Helps Sellers Make Real-Time Decisions
&lt;/h1&gt;

&lt;p&gt;Amazon, a global leader in e-commerce and technology, is integrating generative and agentic AI into its Seller Central platform to improve how sellers manage and scale their businesses. Through a new experience called Canvas, sellers can create interactive visual spaces that combine business data, insights, and recommended actions in real time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ruxfk0wcpcanc18s331.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ruxfk0wcpcanc18s331.webp" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
Reference Image: &lt;a href="https://www.aboutamazon.com/news/innovation-at-amazon" rel="noopener noreferrer"&gt;https://www.aboutamazon.com/news/innovation-at-amazon&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This experience is based on the agentic architecture of Seller Assistant, powered by Amazon Bedrock and models such as Amazon Nova and Anthropic Claude. Sellers can ask questions or select suggestions, and the system automatically builds a personalized dashboard that enables performance analysis, scenario simulations (such as changes in demand or inventory), and optimization of marketing campaigns with concrete projections.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 Link:&lt;/strong&gt; &lt;a href="https://www.aboutamazon.com/news/innovation-at-amazon/amazon-sellers-canvas-artificial-intelligence" rel="noopener noreferrer"&gt;https://www.aboutamazon.com/news/innovation-at-amazon/amazon-sellers-canvas-artificial-intelligence&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  📱Apple Unifies Business Management on a Single Platform
&lt;/h1&gt;

&lt;p&gt;Apple, a leading technology company, has launched Apple Business, an all-in-one platform that integrates device management, collaboration tools, and brand presence into a single environment. The solution allows companies to automatically configure devices through Blueprint projects, manage users and apps, and centralize services such as email, calendar, and directory with their own domains.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 Link:&lt;/strong&gt; &lt;a href="https://www.apple.com/es/newsroom/2026/03/introducing-apple-business/" rel="noopener noreferrer"&gt;https://www.apple.com/es/newsroom/2026/03/introducing-apple-business/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🛒 How Instacart Connects Physical Stores and Real-Time Data with AI
&lt;/h1&gt;

&lt;p&gt;Instacart, a technology platform for retail and supermarkets, is developing a “Physical AI” system that connects real-world data with cloud models to enhance the shopping experience. Through devices like Caper Carts, equipped with sensors, cameras, and edge processing (NVIDIA Jetson), the company captures real-time information about products, cart location, and user behavior within the store. This data is combined with cloud systems that use recommendation models and transformer-based architectures to generate insights and suggestions at the exact moment of purchase.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flcbo1x2i4ma4b1tyhi4i.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flcbo1x2i4ma4b1tyhi4i.webp" alt=" " width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reference: &lt;a href="https://www.instacart.com/company/enterprise-blog" rel="noopener noreferrer"&gt;https://www.instacart.com/company/enterprise-blog&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 Link:&lt;/strong&gt; &lt;a href="https://www.instacart.com/company/enterprise-blog/connecting-stores-from-edge-to-cloud-reinventing-retail-with-physical-ai" rel="noopener noreferrer"&gt;https://www.instacart.com/company/enterprise-blog/connecting-stores-from-edge-to-cloud-reinventing-retail-with-physical-ai&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;💳 How Mastercard Is Building Trust in the Era of AI Agents&lt;br&gt;
Mastercard is a global payments technology company that connects consumers, merchants, and financial institutions, developing infrastructures that enable secure and scalable transactions worldwide.&lt;/p&gt;

&lt;p&gt;In this context, it introduces Verifiable Intent, a new layer of trust for commerce with AI agents, developed in partnership with Google. This system creates a cryptographically verifiable record of what a user authorized before an agent acts on their behalf, connecting identity, intent, and action in a single source of truth. As agents move from assisting to executing purchases, this solution addresses a key challenge for bringing AI into production: ensuring traceability, authorization, and dispute resolution in autonomous transactions, making trust a central component of the product.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 Link:&lt;/strong&gt; &lt;a href="https://www.mastercard.com/global/en/news-and-trends/stories/2026/verifiable-intent.html" rel="noopener noreferrer"&gt;https://www.mastercard.com/global/en/news-and-trends/stories/2026/verifiable-intent.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  Before You Go
&lt;/h1&gt;

&lt;p&gt;As we wrap up this edition, here’s a tool to help you put AI into action in your own projects:&lt;/p&gt;

&lt;h2&gt;
  
  
  🧩 Recommended App
&lt;/h2&gt;

&lt;p&gt;Stitch is an experimental AI tool from Google Labs that lets you quickly turn text prompts into functional app designs for mobile and desktop. It supports interactive prototyping, allows real-time collaboration, and can export your designs to popular platforms like Figma or as HTML code. In short, Stitch makes it easy to transform ideas into high-fidelity UI prototypes without complex setup.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpcpdmepqll2vwce7pyt6.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpcpdmepqll2vwce7pyt6.webp" alt=" " width="800" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference image&lt;/strong&gt;: &lt;a href="https://stitch.withgoogle.com" rel="noopener noreferrer"&gt;https://stitch.withgoogle.com&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📢 Join the Conversation
&lt;/h2&gt;

&lt;p&gt;What do you think about the ways AI is transforming experiences across industries, from retail to travel to design? I’d love to hear your thoughts, ideas, or favorite AI tools.&lt;/p&gt;

&lt;p&gt;Hit reply and share your perspective!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>product</category>
      <category>ux</category>
    </item>
    <item>
      <title>More women in Tech. Fewer women leading</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Tue, 31 Mar 2026 13:41:07 +0000</pubDate>
      <link>https://dev.to/r_elena_mendez_escobar/more-women-in-tech-fewer-women-leading-5f9e</link>
      <guid>https://dev.to/r_elena_mendez_escobar/more-women-in-tech-fewer-women-leading-5f9e</guid>
      <description>&lt;p&gt;Every March invites me to pause, and on a personal level, it’s a moment to acknowledge the progress made toward equality, but also to reflect honestly on the challenges that still remain.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8238lzz0pm8nq5dm01jm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8238lzz0pm8nq5dm01jm.png" alt=" " width="780" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In recent years, we have seen encouraging signs: more women are pursuing careers in technology, science, and data. At the same time, initiatives to promote diversity within organizations have grown, along with conversations around female leadership and inclusion programs across the sector.&lt;/p&gt;

&lt;p&gt;However, when we look at who occupies decision-making roles in technology (who leads teams, defines strategy, or drives innovation) the reality still reflects an uneven path.&lt;/p&gt;

&lt;p&gt;From my experience working in IT, one question keeps coming up: if more women are studying STEM fields (science, technology, engineering, and mathematics) and developing technical skills, why is it still so difficult to see them in technical leadership roles?&lt;/p&gt;

&lt;p&gt;With that question in mind, I reviewed several recent reports and what I found is that there is no single cause, but rather a combination of structural and cultural factors that reinforce one another.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Understanding them together is key to explaining why progress remains so slow.&lt;/strong&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  A persistent gap: the numbers behind the reality
&lt;/h1&gt;

&lt;p&gt;To frame the conversation, it is worth starting with a few recent data points: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;〰️ &lt;strong&gt;(1)&lt;/strong&gt; Globally, women represent around 50% of the working-age population, yet they hold only 40% of total employment and approximately 35.4% of management positions, according to the International Labour Organization &lt;strong&gt;[1]&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;〰️ &lt;strong&gt;(2)&lt;/strong&gt; Within the technology sector, the situation is even more pronounced. In Europe, women account for fewer than one in five tech workers [6], and according to McKinsey’s analysis, their presence in core technical roles has not only failed to improve over time but has actually declined: from 22% in earlier reports to approximately 19% in more recent ones. This suggests that, rather than closing, the gap may in fact be widening &lt;strong&gt;[2]&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;〰️ &lt;strong&gt;(3)&lt;/strong&gt; At the highest levels, the numbers are equally telling. In 2025, women lead just 11% of Fortune 500 companies, compared to 10.4% the previous year [3],  a modest increase that, in perspective, highlights the slow pace of progress.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;〰️ &lt;strong&gt;(4)&lt;/strong&gt; According to the 2025 Women’s Power Gap report, of the 64 new CEOs appointed in the S&amp;amp;P 500 in 2024, only 11 were women (17% of the total), and none were founders of the companies they were set to lead &lt;strong&gt;[4]&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;〰️ &lt;strong&gt;(5)&lt;/strong&gt; The gender pay gap adds another layer to this picture: in the European Union, women earn on average around 12% less than men &lt;strong&gt;[9]&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;〰️ &lt;strong&gt;(6)&lt;/strong&gt; The 82% of the female leaders surveyed say they have had to change companies at least once in order to take the next step in their professional career &lt;strong&gt;[9]&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These figures describe the outcome, but not the process. To understand why this situation persists, we need to look inside organizations and examine the mechanisms shaping women’s career progression.&lt;/p&gt;




&lt;h1&gt;
  
  
  The broken rung: when careers start at a disadvantage
&lt;/h1&gt;

&lt;p&gt;One of the most useful concepts for explaining this gap is called &lt;strong&gt;Broken Rung&lt;/strong&gt;. The image is precise: it is not about a glass ceiling preventing women from reaching the top, but rather a damaged step at the very beginning that makes it harder for many women to take their first step into leadership.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo9isf5azz7typ9w0zhya.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo9isf5azz7typ9w0zhya.png" alt=" " width="780" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;According to a McKinsey study conducted in the United States, for every 100 men promoted to their first management role, only around 80 women achieve the same advancement [2]. At first glance, this may seem like a small difference, but its consequences compound over time. If fewer women reach the first step of leadership, there will also be fewer candidates at the next level, and even fewer at the level above.&lt;/p&gt;

&lt;p&gt;With each promotion, the starting pool shrinks, and female representation gradually diminishes as one moves up the hierarchy.&lt;/p&gt;

&lt;p&gt;This cascading effect largely explains why executive levels in technology companies show such limited representation. The problem is not  the final barrier before reaching roles such as CEO or CTO, it lies in that initial moment when decisions are made about who takes on early management and leadership responsibilities and who does not.&lt;/p&gt;

&lt;p&gt;At this point, it is worth adding another insight highlighted by McKinsey: 49% of women in the European technology sector reported experiencing sexism or bias in the past year, and 82% said they feel the need to prove their competence more than their male peers in order to be recognized [2].&lt;/p&gt;

&lt;p&gt;These are not just individual experiences; they are indicators of an environment where the standards of evaluation are not the same for everyone, and where promotion decisions may be influenced by different expectations based on gender.&lt;/p&gt;




&lt;h1&gt;
  
  
  Invisible work: tasks that consume time without building careers
&lt;/h1&gt;

&lt;p&gt;Alongside the broken rung, there is a second mechanism that operates more quietly but just as effectively: &lt;code&gt;non-promotable work&lt;/code&gt;. This refers to all the tasks that are necessary for the day-to-day functioning of organizations but are not recognized in performance evaluations nor contribute to career advancement.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosxiypkj6fga72mtou83.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosxiypkj6fga72mtou83.png" alt=" " width="780" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The list is familiar to anyone who has worked in an organization: taking meeting notes, organizing team events, coordinating onboarding logistics for new hires, managing recognition initiatives or gifts, or participating in committees that have no direct impact on the business. These tasks are essential, yet they are not reflected in any performance metric and, when it comes to evaluating promotions, they simply do not count.&lt;/p&gt;

&lt;p&gt;The issue is not only that these tasks go unrecognized, but also that they are not distributed equitably. According to an analysis published by The Guardian in 2022 [7], women tend to take on these responsibilities more frequently. This results in less time available for strategic projects and reduced visibility within the organization. In some cases, this difference can amount to nearly a month of work per year spent on tasks that do not contribute to professional growth, compared to their male counterparts.&lt;/p&gt;

&lt;p&gt;Over time, this pattern not only limits individual development but also structurally reinforces the gap in access to leadership roles.&lt;/p&gt;




&lt;h1&gt;
  
  
  Learning to stay relevant: the challenge of continuous upskilling
&lt;/h1&gt;

&lt;p&gt;In this context, one of the most important responses is reskilling: the ability to learn new skills and adapt to ongoing market transformations. Developing capabilities in areas such as AI, data, cloud, infrastructure, cloud computing, DevOps and security will be critical in the coming years for those who want to remain relevant and grow professionally.&lt;/p&gt;

&lt;p&gt;However, technical training, while necessary, is not sufficient on its own. It is equally essential to develop a deep understanding of the industries where technology is applied: understanding the real challenges organizations face, identifying the most appropriate solutions for each context, and being able to design realistic implementation paths. In this sense, training in project management, agile methodologies, and research and development practices is not an optional complement, but a core component of the professional profile the market will demand.&lt;/p&gt;

&lt;p&gt;As Meirav Oren, CEO and co-founder of Versatile, noted during the World Economic Forum:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3j2xr23mvhu7jvu0785.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3j2xr23mvhu7jvu0785.png" alt=" " width="780" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This insight points to a well-documented phenomenon: many women tend to apply for new positions only when they feel they meet all the requirements, whereas men often apply when they meet only part of them. This is not a difference in capability, but rather a reflection of how the environment has shaped confidence and risk perception.&lt;/p&gt;

&lt;p&gt;For this reason, fostering environments where women can take on challenges, learn through the process, and make their work visible is just as important as any technical training program.&lt;/p&gt;




&lt;h1&gt;
  
  
  Systemic barriers in transition: the added impact of AI
&lt;/h1&gt;

&lt;p&gt;When viewed together, what emerges is not a list of isolated issues, but a system of barriers that reinforce one another. The broken rung reduces, from the outset, the number of women who enter leadership, while non-promotable work consumes the time and energy that could otherwise be invested in building visibility and career progression.&lt;/p&gt;

&lt;p&gt;And to this already complex system, we must now add a new and accelerating force: artificial intelligence.&lt;/p&gt;

&lt;p&gt;AI is redefining skills, roles, and organizational dynamics. As new opportunities emerge, others evolve or transform at an increasing pace.&lt;/p&gt;

&lt;p&gt;However, this transformation also presents a specific challenge for women's participation in technology. In many teams, women have historically had stronger representation in areas such as design, user experience, and product management. According to McKinsey, &lt;strong&gt;women represent approximately 53% of design roles and 39% of product management positions [2]&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;These same areas are among those most affected by the adoption of AI-driven tools. In particular, early-career roles are already showing signs of decline, &lt;strong&gt;with a 3% decrease in design and a 2% decrease in product roles between 2024 and 2025 [2]&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This does not mean these roles will disappear, but rather that they are evolving rapidly and demanding new technical and strategic capabilities. Entry-level profiles, in particular, face greater challenges, as they require structured support, continuous learning, and real opportunities to adapt.&lt;/p&gt;

&lt;p&gt;In this context, the risk is not technological but structural: if women do not have equitable access to reskilling, upskilling, and leadership opportunities within these transformations, the gap may widen even further in the coming years.&lt;/p&gt;

&lt;p&gt;None of these dynamics operate in isolation. Rather, it is their combination that explains why, despite the growing number of women entering the technology sector, representation in leadership roles remains so limited.&lt;/p&gt;

&lt;p&gt;And precisely because the problem is systemic, the solutions must be as well.&lt;/p&gt;




&lt;h1&gt;
  
  
  Building the future of technology is also a matter of diversity
&lt;/h1&gt;

&lt;p&gt;Technological progress opens up enormous opportunities for society, but it also raises a question we cannot ignore: who is designing the systems we will use in the future?&lt;/p&gt;

&lt;p&gt;Algorithms, digital platforms, and artificial intelligence systems are not neutral. They are shaped by the decisions, experiences, and contexts of those who build them.&lt;/p&gt;

&lt;p&gt;In software architecture, there is a principle known as &lt;strong&gt;Conway’s Law&lt;/strong&gt;, which states that organizations design systems that mirror their communication structures. Applied to diversity, this means that if technology teams are not diverse — or if communication is hierarchical and limited — those same constraints may be reflected in the solutions we create.&lt;/p&gt;

&lt;p&gt;This is not only a matter of equality, but also of innovation, social impact, and the quality of the technology we bring into the world. Diverse teams make better decisions, consider more perspectives, and ultimately build more robust solutions.&lt;/p&gt;

&lt;p&gt;March 8 serves as a reminder that, although progress has been made, the path toward equitable participation in technology leadership is still ongoing. And this challenge does not belong to a single day or a single sector:  it is part of an ongoing responsibility.&lt;/p&gt;

&lt;p&gt;Promoting inclusion, supporting the professional development of women in technology, and creating real pathways to leadership are not just goals. They are ways of building teams where different perspectives can coexist and enrich the decisions we shape — now more than ever — in technology.&lt;/p&gt;

&lt;p&gt;Because the future of technology will not only be defined by what we build... but by who is given the opportunity to build it.&lt;/p&gt;




&lt;h1&gt;
  
  
  📚References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;[1]&lt;/strong&gt; Deloitte. (n.d.). Women at work: Global outlook. &lt;a href="https://www.deloitte.com/global/en/issues/work/content/women-at-work-global-outlook.html" rel="noopener noreferrer"&gt;https://www.deloitte.com/global/en/issues/work/content/women-at-work-global-outlook.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;[2]&lt;/strong&gt; McKinsey &amp;amp; Company. (n.d.). Women in tech and AI in Europe: Can the region close its gender gap? &lt;a href="https://www.mckinsey.com/capabilities/mckinsey-technology/our-insights/women-in-tech-and-ai-in-europe-can-the-region-close-its-gender-gap#/" rel="noopener noreferrer"&gt;https://www.mckinsey.com/capabilities/mckinsey-technology/our-insights/women-in-tech-and-ai-in-europe-can-the-region-close-its-gender-gap#/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;[3]&lt;/strong&gt; Fortune. (2025, June 2). Fortune 500 female CEOs 2025. &lt;a href="https://fortune.com/2025/06/02/fortune-500-female-ceos-2025/" rel="noopener noreferrer"&gt;https://fortune.com/2025/06/02/fortune-500-female-ceos-2025/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;[4]&lt;/strong&gt; Women’s Power Gap. (2025). CEO report 2025. &lt;a href="https://www.womenspowergap.org/wp-content/uploads/2025/05/WPG_CEO-Report_2025.pdf" rel="noopener noreferrer"&gt;https://www.womenspowergap.org/wp-content/uploads/2025/05/WPG_CEO-Report_2025.pdf&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;[5]&lt;/strong&gt; Council of the European Union. (n.d.). The EU’s gender pay gap: Facts and figures. &lt;a href="https://www.consilium.europa.eu/en/policies/the-eu-s-gender-pay-gap-facts-and-figures/" rel="noopener noreferrer"&gt;https://www.consilium.europa.eu/en/policies/the-eu-s-gender-pay-gap-facts-and-figures/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;[6]&lt;/strong&gt; Euronews. (2026, March 8). Why women are disappearing from Europe’s tech workforce. &lt;a href="https://www.euronews.com/next/2026/03/08/why-women-are-disappearing-from-europes-tech-workforce" rel="noopener noreferrer"&gt;https://www.euronews.com/next/2026/03/08/why-women-are-disappearing-from-europes-tech-workforce&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;[7]&lt;/strong&gt;  The Guardian. (2022, May 9). They feel guilty: Why women should say no to office housework. &lt;a href="https://www.theguardian.com/society/2022/may/09/they-feel-guilty-why-women-should-say-no-to-office-housework" rel="noopener noreferrer"&gt;https://www.theguardian.com/society/2022/may/09/they-feel-guilty-why-women-should-say-no-to-office-housework&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;[8]&lt;/strong&gt; World Economic Forum. (2025, June). What to know about AI and the gender gap. &lt;a href="https://www.weforum.org/stories/2025/06/amnc25-what-to-know-about-ai-and-the-gender-gap/" rel="noopener noreferrer"&gt;https://www.weforum.org/stories/2025/06/amnc25-what-to-know-about-ai-and-the-gender-gap/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;[9]&lt;/strong&gt; KPMG. (2025). Global female leaders outlook 2025. &lt;a href="https://assets.kpmg.com/content/dam/kpmgsites/pt/pdf/kpmg-global-female-leaders-outlook-2025.pdf.coredownload.inline.pdf" rel="noopener noreferrer"&gt;https://assets.kpmg.com/content/dam/kpmgsites/pt/pdf/kpmg-global-female-leaders-outlook-2025.pdf.coredownload.inline.pdf&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>womenintech</category>
      <category>inclusion</category>
      <category>career</category>
    </item>
    <item>
      <title>AI in healthcare: how OpenAI is transforming medical care</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Mon, 19 Jan 2026 10:17:37 +0000</pubDate>
      <link>https://dev.to/r_elena_mendez_escobar/ai-in-healthcare-how-openai-is-transforming-medical-care-ffn</link>
      <guid>https://dev.to/r_elena_mendez_escobar/ai-in-healthcare-how-openai-is-transforming-medical-care-ffn</guid>
      <description>&lt;p&gt;Introduction&lt;br&gt;
Artificial intelligence is increasingly being adopted in highly regulated industries, and &lt;strong&gt;healthcare&lt;/strong&gt; is a clear example of how this technology can improve processes, access to information, and the quality of care.&lt;br&gt;
According to OpenAI’s latest product announcements, more than &lt;strong&gt;230 million people worldwide&lt;/strong&gt; use ChatGPT every week to ask questions related to health and wellbeing. This growing adoption reflects a broader shift in how individuals and professionals seek medical information and support.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspeqkq4wi310dqdf73vf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspeqkq4wi310dqdf73vf.png" alt=" " width="800" height="704"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Healthcare systems face significant challenges: clinical staff are often overwhelmed, medical knowledge is highly fragmented, and administrative complexity continues to grow. &lt;strong&gt;AI is beginning to address these issues&lt;/strong&gt; by supporting decision-making, reducing operational burdens, and making medical information more accessible.&lt;/p&gt;

&lt;p&gt;This month, OpenAI introduced new &lt;strong&gt;healthcare-focused capabilities&lt;/strong&gt; designed to support both medical professionals and patients. These services aim to bring trusted information and care-related workflows closer to people, while prioritizing &lt;strong&gt;security, compliance, and responsible use&lt;/strong&gt; in one of the most sensitive and regulated industries.&lt;/p&gt;




&lt;h1&gt;
  
  
  OpenAI for Healthcare: Operationalizing AI in Healthcare Organizations
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;OpenAI for Healthcare&lt;/strong&gt; is specifically designed for healthcare organizations such as hospitals, research centers, clinic networks, and integrated health systems. Its primary goal is to provide a &lt;strong&gt;secure, enterprise-grade platform&lt;/strong&gt; that enables these institutions to deliver more consistent, high-quality care, while reducing the administrative burden that consumes a significant amount of clinicians’ time.&lt;/p&gt;

&lt;p&gt;One of the platform’s most distinctive capabilities is its &lt;strong&gt;evidence retrieval with clear citations&lt;/strong&gt;. Responses are grounded in trusted medical sources, including millions of peer-reviewed studies, public health guidelines, and up-to-date clinical directives. This allows healthcare professionals to verify information more easily and support clinical decisions with &lt;strong&gt;reliable, evidence-based insights&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Another particularly valuable feature is the use of &lt;strong&gt;reusable templates to streamline workflows&lt;/strong&gt;. These shared templates support common tasks such as drafting discharge summaries, patient instructions, clinical letters, and prior authorization requests. As a result, clinical teams spend less time rewriting repetitive documentation and searching for information, while patients benefit from clearer guidance and smoother transitions of care.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F93kdyc81rgb2gbqbmeyr.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F93kdyc81rgb2gbqbmeyr.webp" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;image source: &lt;a href="https://openai.com/es-419/index/openai-for-healthcare/" rel="noopener noreferrer"&gt;https://openai.com/es-419/index/openai-for-healthcare/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Below is an overview of the main capabilities offered by this solution.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flk2r8aogpvf5pymucp3j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flk2r8aogpvf5pymucp3j.png" alt=" " width="800" height="362"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  ChatGPT Health: A Smarter Way to Understand Your Health
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;ChatGPT Health&lt;/strong&gt; is designed for individual users who want to better understand their own health and navigate a complex healthcare system. Health is already one of the most popular topics on ChatGPT, because every week, users ask questions about health and wellbeing.&lt;/p&gt;

&lt;p&gt;Users can securely &lt;strong&gt;connect personal health data from multiple sources&lt;/strong&gt;, including electronic health records and wellness apps. They can also &lt;strong&gt;upload their own documents or images&lt;/strong&gt;, such as lab results or medical reports. This centralization allows ChatGPT Health to provide &lt;strong&gt;more relevant, personalized responses&lt;/strong&gt;, helping users interpret information, summarize results, and prepare for appointments.&lt;/p&gt;

&lt;p&gt;The tool is designed for practical, everyday use. It can help users review lab results, prepare questions for medical visits, provide guidance on diet, exercise, or wellness routines, and support understanding of insurance options based on personal health habits. It also includes features like voice input, dictation, and advanced search, making the experience more accessible and tailored to individual needs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehz5mbi4t2uzotwpe102.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehz5mbi4t2uzotwpe102.webp" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;image source: &lt;a href="https://openai.com/es-ES/index/introducing-chatgpt-health/" rel="noopener noreferrer"&gt;https://openai.com/es-ES/index/introducing-chatgpt-health/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Below is an overview of the types of data sources users can integrate with ChatGPT Health.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb1wlevgx4jwf3w10z7a5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb1wlevgx4jwf3w10z7a5.png" alt=" " width="799" height="362"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  Comparative Overview
&lt;/h1&gt;

&lt;p&gt;A side-by-side look at how OpenAI for Healthcare and ChatGPT Health support clinical teams and individual users with AI-driven health insights.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fotitpi3t2g888as5cdyy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fotitpi3t2g888as5cdyy.png" alt=" " width="800" height="528"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;The introduction of AI in healthcare is showing &lt;strong&gt;real potential&lt;/strong&gt;, not only as a tool to support clinical workflows, but also as a way to provide reliable information and guidance to people who may not have easy access to specialized care. &lt;strong&gt;OpenAI for Healthcare and ChatGPT Health&lt;/strong&gt; represent a major step forward in applying AI to one of the most regulated and sensitive industries.&lt;/p&gt;

&lt;p&gt;Currently, these tools are &lt;strong&gt;limited in availability&lt;/strong&gt;: OpenAI for Healthcare serves select institutions, and ChatGPT Health operates through a waitlist. How and when these solutions expand to smaller clinics, rural areas, or other countries will be key in determining their ability to truly democratize access to high-quality health support.&lt;/p&gt;

&lt;p&gt;Healthcare is constantly evolving, with &lt;strong&gt;new scientific evidence, clinical guidelines, and regulatory updates&lt;/strong&gt;. AI solutions like these can help by keeping pace with these changes, providing relevant and accurate information over time.&lt;/p&gt;

&lt;p&gt;While AI will not &lt;strong&gt;replace healthcare professionals&lt;/strong&gt;, these tools offer opportunities to reduce administrative burdens, improve efficiency, and empower both clinicians and patients with personalized insights. By making healthcare more accessible, understandable, and responsive, AI can &lt;strong&gt;complement human care&lt;/strong&gt;, helping to achieve better outcomes while supporting professionals rather than replacing them.&lt;/p&gt;




&lt;h1&gt;
  
  
  📚Referencias
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;OpenAI. (2025)&lt;/strong&gt;. OpenAI for Healthcare. OpenAI. &lt;a href="https://openai.com/es-419/index/openai-for-healthcare/" rel="noopener noreferrer"&gt;https://openai.com/es-419/index/openai-for-healthcare/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI. (2025)&lt;/strong&gt;. Introducing ChatGPT Health. OpenAI. &lt;a href="https://openai.com/es-ES/index/introducing-chatgpt-health" rel="noopener noreferrer"&gt;https://openai.com/es-ES/index/introducing-chatgpt-health&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  📌 How to cite this article
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;APA style&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Mendez Escobar, Romina Elena. (2025). &lt;strong&gt;AI in healthcare: how OpenAI is transforming medical care&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
&lt;a href="https://dev.to/r_elena_mendez_escobar/ai-in-healthcare-how-openai-is-transforming-medical-care-ffn"&gt;https://dev.to/r_elena_mendez_escobar/ai-in-healthcare-how-openai-is-transforming-medical-care-ffn&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BibTeX&lt;/strong&gt;&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
text
@article{mendez2025aihealthcare,
  title  = {AI in healthcare: how OpenAI is transforming medical care},
  author = {Mendez Escobar, Romina Elena},
  year   = {2025},
  url    = {https://dev.to/r_elena_mendez_escobar/ai-in-healthcare-how-openai-is-transforming-medical-care-ffn}
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>openai</category>
      <category>ai</category>
      <category>data</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>TOON vs JSON for LLM Prompts: Can We Reduce Token Usage Without Losing Response Quality?</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Mon, 05 Jan 2026 08:41:45 +0000</pubDate>
      <link>https://dev.to/r_elena_mendez_escobar/toon-vs-json-for-llm-prompts-can-we-reduce-token-usage-without-losing-response-quality-59ed</link>
      <guid>https://dev.to/r_elena_mendez_escobar/toon-vs-json-for-llm-prompts-can-we-reduce-token-usage-without-losing-response-quality-59ed</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;Over the past months, I came across several articles claiming that &lt;strong&gt;TOON&lt;/strong&gt; can significantly reduce token usage in LLM prompts compared to traditional &lt;strong&gt;JSON&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flmwoxyu8jbloeqvnasc0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flmwoxyu8jbloeqvnasc0.png" alt=" " width="800" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That raised a few questions for me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does TOON still provide benefits with &lt;strong&gt;real-world API responses?&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;How much does it actually reduce tokens?&lt;/li&gt;
&lt;li&gt;And more importantly: &lt;strong&gt;does changing the format affect how an LLM interprets the data or the quality of the response?&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Answering these questions isn’t simple, and the results can vary depending on the dataset, the structure of the data, and even the LLM itself. It’s also not a simple matter of counting token, different formats may influence how the model understands and processes the information. &lt;/p&gt;

&lt;p&gt;In this article, I aim to run a practical benchmark to explore whether TOON could be useful in production pipelines, in what contexts it performs best, and whether it works well across different types of JSON.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This article walks through the experiment, the results, and the conclusions.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h1&gt;
  
  
  What Is TOON (and How Is It Different from JSON)?
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;TOON (Terse Object-Oriented Notation)&lt;/strong&gt; is a data serialization format designed specifically for LLM prompts. The goal is simple: reduce syntactic overhead while remaining readable for both humans and machines.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5fqktut0uee255noit04.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5fqktut0uee255noit04.png" alt=" " width="800" height="503"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  The Experiment
&lt;/h1&gt;

&lt;p&gt;This experiment evaluates whether alternative data serialization formats can reduce token usage in LLM prompts without degrading response quality.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08j9ix49vdatle7whlcz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08j9ix49vdatle7whlcz.png" alt=" " width="800" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The experiment follows four main stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Dataset Fetching&lt;/strong&gt;: Data is retrieved from public APIs and prepared for downstream processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token Benchmarking&lt;/strong&gt;: Each dataset is encoded in JSON and TOON, and token counts are computed using a tokenizer to measure size differences across formats.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM Interaction:&lt;/strong&gt; The serialized data is sent to an LLM via Amazon Bedrock to generate responses and embeddings under deterministic settings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic Evaluation:&lt;/strong&gt; Outputs generated from JSON and TOON prompts are compared using semantic (cosine similarity) and lexical (ROUGE, BLEU) metrics to assess equivalence.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The goal is not to optimize prompt content, but to isolate the impact of serialization format on token efficiency and response consistency.&lt;/p&gt;




&lt;h2&gt;
  
  
  Datasets
&lt;/h2&gt;

&lt;p&gt;In this experiment, I wanted to test TOON with &lt;strong&gt;realistic, publicly available data&lt;/strong&gt;, rather than small, manually created datasets. Using real API responses allows us to see how token savings and LLM behavior hold up in practical scenarios.&lt;br&gt;
I selected &lt;strong&gt;two public APIs with very different characteristics&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GitHub Events API&lt;/strong&gt;: Returns a &lt;strong&gt;stream of recent public events on GitHub&lt;/strong&gt;, such as pushes, pull requests, issues, and comments.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 URL&lt;/strong&gt;: &lt;a href="https://api.github.com/events" rel="noopener noreferrer"&gt;https://api.github.com/events&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;🧩 Data structure&lt;/strong&gt;: Deeply nested, heterogeneous objects with multiple levels of dictionaries and arrays.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;💡 Why this matters&lt;/strong&gt;: Represents the kind of &lt;strong&gt;complex operational API&lt;/strong&gt; data you might send to an LLM in real projects.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Wikipedia Page Views API&lt;/strong&gt;:Returns the** top-viewed articles on English Wikipedia** for a given day.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔗 URL&lt;/strong&gt;: &lt;a href="https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/all-access/2024/01/01" rel="noopener noreferrer"&gt;https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/all-access/2024/01/01&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;🧩 Data structure&lt;/strong&gt;: Flat, repetitive lists of articles, each with numeric metrics (title, views, category).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;💡 Why this matters&lt;/strong&gt;: Ideal for testing TOON’s efficiency with &lt;strong&gt;flat, repetitive data&lt;/strong&gt;, where token savings are expected to be highest.
Using these two APIs allows us to evaluate TOON in both &lt;strong&gt;complex nested&lt;/strong&gt; and &lt;strong&gt;flat list scenarios&lt;/strong&gt;, giving a more comprehensive view of its performance in real-world LLM prompts.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  Fetching the Data
&lt;/h2&gt;

&lt;p&gt;To extract data from these APIs, we created the following utility class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DatasetFetcher&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Fetch datasets from different sources&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="nd"&gt;@staticmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_github_events&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Fetch recent GitHub events&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.github.com/events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[:&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="nd"&gt;@staticmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_wikipedia_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Fetch popular Wikipedia pages&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User-Agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TOON-Benchmark/1.0 (Research)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/all-access/2024/01/01&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;items&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;articles&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][:&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;This class allows you to quickly fetch &lt;strong&gt;sample datasets&lt;/strong&gt; for testing token efficiency with TOON and JSON formats.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h1&gt;
  
  
  Part 1: Token Reduction
&lt;/h1&gt;
&lt;h2&gt;
  
  
  🏗️ Methodology
&lt;/h2&gt;

&lt;p&gt;To measure token usage, I used &lt;a href="https://pypi.org/project/tiktoken/" rel="noopener noreferrer"&gt;tiktoken&lt;/a&gt;, the same tokenizer employed by many OpenAI-compatible models. This allows us to estimate how many tokens are consumed by the &lt;strong&gt;prompt payload itself&lt;/strong&gt;, independent of the model’s output.&lt;br&gt;
For TOON generation, I used the &lt;strong&gt;toon-format&lt;/strong&gt; library, which converts Python objects into TOON while preserving structure and ordering.&lt;br&gt;
The following &lt;strong&gt;classes&lt;/strong&gt; implement token counting and incremental benchmarking using these libraries:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TokenCounter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Count tokens using tiktoken&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tiktoken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encoding_for_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Count tokens in a text string&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This class allows you to quickly count tokens in any string, whether it’s JSON, TOON, or plain text.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TokenBenchmark&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Benchmark token reduction: JSON vs TOON&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BenchmarkConfig&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TokenCounter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;incremental_benchmark&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;dataset_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        Perform incremental benchmark comparing JSON vs TOON

        Args:
            data_list: List of objects to analyze
            dataset_name: Name of dataset for identification

        Returns:
            DataFrame with benchmark results
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="n"&gt;accum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Encode in both formats
&lt;/span&gt;            &lt;span class="n"&gt;json_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ensure_ascii&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;toon_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;toon_encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Count tokens
&lt;/span&gt;            &lt;span class="n"&gt;json_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;toon_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;toon_prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Calculate reduction
&lt;/span&gt;            &lt;span class="n"&gt;saved&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_tokens&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;toon_tokens&lt;/span&gt;
            &lt;span class="n"&gt;reduction_pct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;saved&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;json_tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;json_tokens&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

            &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;num_items&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;JSON_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TOON_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;toon_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokens_saved&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;saved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reduction_pct&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reduction_pct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dataset&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;dataset_name&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;These &lt;strong&gt;classes&lt;/strong&gt; allow us to &lt;strong&gt;incrementally benchmark token usage&lt;/strong&gt;, providing a detailed view of how much TOON reduces tokens compared to JSON as items accumulate in a prompt.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  🧪 Results: Token Reduction Metrics
&lt;/h2&gt;

&lt;p&gt;In the code available in the repository, you can see the classes used to compute these results.&lt;br&gt;&lt;br&gt;
What stands out, however, is that token reduction is &lt;strong&gt;not uniform across datasets&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;dataset&lt;/th&gt;
&lt;th&gt;mean&lt;/th&gt;
&lt;th&gt;std&lt;/th&gt;
&lt;th&gt;min&lt;/th&gt;
&lt;th&gt;max&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;github_events&lt;/td&gt;
&lt;td&gt;2.77&lt;/td&gt;
&lt;td&gt;0.26&lt;/td&gt;
&lt;td&gt;2.60&lt;/td&gt;
&lt;td&gt;4.02&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;wikipedia_pages&lt;/td&gt;
&lt;td&gt;42.61&lt;/td&gt;
&lt;td&gt;6.66&lt;/td&gt;
&lt;td&gt;13.64&lt;/td&gt;
&lt;td&gt;46.70&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Events (complex, nested data) - Average token reduction:&lt;/strong&gt; ~3%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wikipedia Pages (flat, repetitive data) - Average token reduction:&lt;/strong&gt; ~43%&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  💡 Why the difference?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;For &lt;strong&gt;GitHub Events&lt;/strong&gt;, the reduction is only ~3%, which means that using TOON instead of JSON &lt;strong&gt;does not significantly reduce token usage&lt;/strong&gt;. The reason is that &lt;strong&gt;deep nesting and heterogeneous keys limit how much syntactic overhead can be removed&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;For &lt;strong&gt;Wikipedia Pages&lt;/strong&gt;, the reduction is ~43% because &lt;strong&gt;flat, repetitive lists benefit greatly from removing braces, commas, and repeated field names&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;h1&gt;
  
  
  Part 2: Does Response Quality Stay the Same?
&lt;/h1&gt;

&lt;p&gt;The second experiment focuses on &lt;strong&gt;response quality&lt;/strong&gt;, the goal is to verify whether using the &lt;strong&gt;same prompt&lt;/strong&gt;, but providing the data encoded in &lt;strong&gt;JSON&lt;/strong&gt; versus &lt;strong&gt;TOON&lt;/strong&gt;, produces equivalent outputs from the LLM.&lt;br&gt;
For this experiment, I used the &lt;strong&gt;Wikipedia dataset&lt;/strong&gt;, since it showed the highest token reduction (~45%). This makes it an ideal candidate to evaluate whether aggressive token savings have any negative impact on output quality.&lt;br&gt;
To compare the responses, I generated outputs using both formats and evaluated them using several &lt;strong&gt;text similarity metrics&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  🧪 Results: Evaluation Metrics
&lt;/h2&gt;

&lt;p&gt;To assess output quality, I used the following metrics, each capturing a different aspect of similarity.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy7d5bmoorq3lz77gze66.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy7d5bmoorq3lz77gze66.png" alt=" " width="800" height="643"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  LLM and Embeddings Setup (AWS Bedrock)
&lt;/h2&gt;

&lt;p&gt;All responses and embeddings were generated using AWS Bedrock, Amazon’s fully managed service for accessing foundation models.&lt;br&gt;
The following models were used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;⚡ Amazon Nova Lite (amazon.nova-lite-v1:0)&lt;/strong&gt;: A lightweight, cost-efficient LLM optimized for fast inference.  In this experiment, it was used for prompt completion and response generation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⚡ Amazon Titan Embeddings (amazon.titan-embed-text-v2:0):&lt;/strong&gt; A text embedding model that converts text into high-dimensional vectors. It was used to generate vector representations of the responses for semantic similarity comparison.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Bedrock Client Implementation
&lt;/h2&gt;

&lt;p&gt;The following class encapsulates interaction with &lt;strong&gt;AWS Bedrock&lt;/strong&gt; for both &lt;strong&gt;prompt generation and embedding extraction.&lt;/strong&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  &lt;code&gt;invoke_prompt&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;This method sends a prompt to the LLM and returns the generated response.&lt;br&gt;
It accepts the following parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;💬 prompt&lt;/strong&gt;: The base instruction or question provided to the model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;📄 dataset&lt;/strong&gt;: The data to analyze, encoded either in &lt;strong&gt;JSON&lt;/strong&gt; or &lt;strong&gt;TOON&lt;/strong&gt;, which is appended to the prompt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;🌡️ temperature&lt;/strong&gt;: Controls the randomness of the model’s output.&lt;/li&gt;
&lt;/ul&gt;
&lt;h5&gt;
  
  
  🌡️ Why &lt;code&gt;temperature = 0&lt;/code&gt;?
&lt;/h5&gt;

&lt;p&gt;The temperature parameter with this value is due to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It reduces randomness&lt;/strong&gt; in model outputs&lt;/li&gt;
&lt;li&gt;It makes responses &lt;strong&gt;deterministic&lt;/strong&gt; across multiple runs&lt;/li&gt;
&lt;li&gt;It ensures that any differences in the outputs are due to the &lt;strong&gt;input format (JSON vs TOON)&lt;/strong&gt;, not sampling variability&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Without fixing the temperature, it would be impossible to reliably attribute differences in response quality to the serialization format alone.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4&gt;
  
  
  &lt;code&gt;get_embeddings&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;This method generates &lt;strong&gt;vector embeddings&lt;/strong&gt; for a given text using the embedding model.&lt;br&gt;
The resulting vectors are later used to compute &lt;strong&gt;cosine similarity&lt;/strong&gt;, allowing us to measure &lt;strong&gt;semantic equivalence&lt;/strong&gt; between responses generated from &lt;code&gt;JSON&lt;/code&gt; and &lt;code&gt;TOON&lt;/code&gt; inputs.&lt;/p&gt;

&lt;p&gt;Overall, these parameters allow us to &lt;strong&gt;control model behavior&lt;/strong&gt; and isolate the impact of input serialization, with &lt;strong&gt;temperature&lt;/strong&gt; being the most important variable for this experiment.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AWSBedrockClient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Client to interact with AWS Bedrock&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="n"&gt;aws_access_key_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;aws_secret_access_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model_prompt&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model_embedding&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;aws_access_key_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws_access_key_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;aws_secret_access_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws_secret_access_key&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;invoke_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        Invoke model with prompt

        Args:
            prompt: Base prompt
            dataset: Data to analyze (JSON or TOON encoded)
            temperature: 0 = deterministic, higher = more random
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;prompt_final&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt_final&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inferenceConfig&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_new_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;top_p&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;response_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response_body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error invoking prompt model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Generate embeddings for a text&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputText&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;response_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response_body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error generating embeddings: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Experimental Setup
&lt;/h3&gt;

&lt;p&gt;The experiment is based on the following principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Same prompt structure&lt;/strong&gt;, changing only the data serialization format (&lt;strong&gt;JSON vs TOON&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;25 independent runs per format&lt;/strong&gt; to capture variability and compute robust statistics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temperature = 0&lt;/strong&gt; to minimize randomness and ensure deterministic model behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This setup allows us to isolate the impact of the serialization format on the model’s output.&lt;/p&gt;


&lt;h3&gt;
  
  
  Prompt Design
&lt;/h3&gt;

&lt;p&gt;The following is the prompt we will use for testing. The same prompt will be used in all executions, and we will only modify the data attached to the prompt for testing with toon and json format.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimwknb6os5rkrqerdg4a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimwknb6os5rkrqerdg4a.png" alt=" " width="708" height="354"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By concatenating the dataset directly to the prompt, we ensure that the &lt;strong&gt;instruction remains identical&lt;/strong&gt;, and any differences in the response are attributable solely to the input format.&lt;/p&gt;


&lt;h2&gt;
  
  
  Evaluation Procedure
&lt;/h2&gt;

&lt;p&gt;To assess response equivalence between &lt;code&gt;JSON&lt;/code&gt; and &lt;code&gt;TOON&lt;/code&gt;, the experiment relies on the &lt;strong&gt;SemanticEvaluator&lt;/strong&gt; class, which encapsulates response generation and similarity evaluation.&lt;br&gt;
At the core of the evaluation is the comparison of &lt;strong&gt;two responses per run&lt;/strong&gt;, generated using the same prompt but different data encodings (JSON vs TOON), with temperature fixed at 0 to ensure deterministic behavior.&lt;br&gt;
The evaluation is structured as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;cosine_similarity&lt;/strong&gt; computes semantic similarity between the two responses using embedding vectors generated by Amazon Titan. This metric captures meaning-level equivalence and is insensitive to surface-level wording changes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;evaluate_single_run&lt;/strong&gt; performs a full comparison for one run. It invokes the &lt;strong&gt;LLM&lt;/strong&gt; twice (&lt;strong&gt;JSON&lt;/strong&gt; and &lt;strong&gt;TOON&lt;/strong&gt;), generates embeddings, and computes cosine similarity along with lexical overlap metrics (&lt;strong&gt;ROUGE-1&lt;/strong&gt;, &lt;strong&gt;ROUGE-2&lt;/strong&gt;, &lt;strong&gt;ROUGE-L&lt;/strong&gt;) and BLEU. The output is a consolidated set of similarity scores for that run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;evaluate_multiple_runs&lt;/strong&gt; repeats the single-run evaluation 25 times using the same prompt and dataset. Results from all runs are aggregated into a DataFrame, enabling statistical analysis such as mean values, variance, and stability across runs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This design allows us to determine whether TOON’s token savings preserve response quality, both semantically and lexically, across multiple deterministic evaluations.&lt;/p&gt;


&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;After running &lt;strong&gt;25 deterministic evaluations&lt;/strong&gt; (temperature = 0), the analysis focused exclusively on &lt;strong&gt;response equivalence&lt;/strong&gt;, measuring whether JSON and TOON produce comparable outputs when token savings are significant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semantic Equivalence (Cosine Similarity ≈ 0.991)&lt;/strong&gt;&lt;br&gt;
The most important signal comes from &lt;strong&gt;cosine similarity&lt;/strong&gt;, computed using embeddings generated by Amazon Titan.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;An average score of &lt;strong&gt;0.991&lt;/strong&gt; indicates that, for the LLM, responses generated from TOON-encoded data are &lt;strong&gt;semantically equivalent&lt;/strong&gt; to those generated from JSON.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Despite the removal of structural syntax such as braces, quotes, and repeated field names, the model preserved its ability to reason over the data and extract the same insights.&lt;br&gt;
Across all runs, the meaning of the responses remained consistent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5c0lx4sd6qq687hw6ou.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5c0lx4sd6qq687hw6ou.png" alt=" " width="800" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5c9ute9bf3emm3ckozt0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5c9ute9bf3emm3ckozt0.png" alt=" " width="800" height="583"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Lexical Variability vs. Data Accuracy
&lt;/h2&gt;

&lt;p&gt;Lexical similarity metrics such as ROUGE-1 and BLEU report lower absolute values:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ROUGE-1 F1&lt;/strong&gt; = 0.747&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ROUGE-L F1&lt;/strong&gt; = 0.608&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BLEU&lt;/strong&gt; = 0.563&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These scores indicate a moderate degree of lexical and structural variation between responses generated from JSON and TOON inputs. In particular, ROUGE-1 suggests partial overlap at the word level, while the lower ROUGE-L score highlights differences in sentence structure and ordering, consistent with paraphrasing and reformulation rather than content loss. Similarly, BLEU, which is sensitive to exact n-gram matches and word order, penalizes these variations even when responses remain correct and informative.&lt;/p&gt;

&lt;p&gt;Importantly, these lexical differences do not correspond to a degradation in response quality. When inspecting the actual content of the responses, including rankings, averages, and detected trends, the results were numerically and logically consistent across formats.&lt;/p&gt;



&lt;p&gt;🗂️ Code repository&lt;br&gt;
If you want to analyze my code and see all these experiments performed, you can consult them from my repository, where all the code is available.&lt;br&gt;
If you find this tutorial useful, do not forget to leave a star ⭐️ on the repository and follow me to receive notifications about new articles. Your support helps keep creating valuable technical content for the community 🚀&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/RominaElenaMendezEscobar" rel="noopener noreferrer"&gt;
        RominaElenaMendezEscobar
      &lt;/a&gt; / &lt;a href="https://github.com/RominaElenaMendezEscobar/experiment-toon-vs-json" rel="noopener noreferrer"&gt;
        experiment-toon-vs-json
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      This repository contains a practical benchmark comparing JSON and TOON (Terse Object Oriented Notation) as data serialization formats for LLM prompts.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;p&gt;&lt;a href="https://www.buymeacoffee.com/r0mymendez" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/b96fd4ea89ea15fcec30a4f86382eef0bbd17454aa3a8d4de8c8c5e92b55cf6c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4275792532304d6525323041253230436f666665652d737570706f72742532306d79253230776f726b2d4646444430303f7374796c653d666c6174266c6162656c436f6c6f723d313031303130266c6f676f3d6275792d6d652d612d636f66666565266c6f676f436f6c6f723d7768697465" alt="Buy Me A Coffee"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;TOON vs JSON for LLM Prompts: Can We Reduce Token Usage Without Losing Response Quality?&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;&lt;em&gt;A practical benchmark comparing TOON and JSON formats for LLM prompts&lt;/em&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;|&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;llm&lt;/code&gt;, &lt;code&gt;ai&lt;/code&gt;, &lt;code&gt;optimization&lt;/code&gt;, &lt;code&gt;python&lt;/code&gt;|&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/RominaElenaMendezEscobar/experiment-toon-vs-json/img/preview.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FRominaElenaMendezEscobar%2Fexperiment-toon-vs-json%2FHEAD%2Fimg%2Fpreview.png" alt="img-preview"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Introduction&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;Over the past months, I came across several articles claiming that &lt;strong&gt;TOON&lt;/strong&gt; can significantly reduce token usage in &lt;strong&gt;LLM&lt;/strong&gt; prompts compared to traditional &lt;strong&gt;JSON&lt;/strong&gt;. Most of these examples, however, relied on small or artificial datasets.
That raised a few questions for me:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Does TOON still provide benefits with &lt;strong&gt;real-world API responses&lt;/strong&gt;?&lt;/li&gt;
&lt;li&gt;How much does it actually reduce tokens?&lt;/li&gt;
&lt;li&gt;And more importantly: &lt;strong&gt;does changing the format affect how an LLM interprets the data or the quality of the response&lt;/strong&gt;?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this article, I aim to run a &lt;strong&gt;practical benchmark&lt;/strong&gt; to explore whether TOON could be useful in production pipelines, in what contexts it performs best, and whether it works well across…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/RominaElenaMendezEscobar/experiment-toon-vs-json" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;





&lt;h1&gt;
  
  
  Conclusions
&lt;/h1&gt;

&lt;p&gt;This experiment shows that TOON can significantly reduce token usage while preserving response quality, as long as it is applied to the right type of data. For flat, repetitive structures, TOON acts as an effective form of prompt compression: the LLM retains semantic understanding, and any differences in wording are superficial rather than affecting meaning or correctness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⚠️ &lt;code&gt;Key limitations&lt;/code&gt;:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only a single LLM was tested (Amazon Nova Lite)&lt;/li&gt;
&lt;li&gt;Only specific datasets were used (GitHub Events and Wikipedia Page Views)&lt;/li&gt;
&lt;li&gt;Evaluation was conducted in English only&lt;/li&gt;
&lt;li&gt;Prompts were simple analytical tasks, not complex reasoning scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As in any systems project, &lt;strong&gt;solutions should be carefully evaluated&lt;/strong&gt; to determine whether they are truly optimal for a given use case. Outcomes often depend on &lt;strong&gt;many variables&lt;/strong&gt;, so testing and validation in the specific context are essential before making decisions or implementing at scale.&lt;/p&gt;




&lt;h3&gt;
  
  
  📌 How to cite this article
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;APA style&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Mendez Escobar, Romina Elena. (2025). &lt;strong&gt;TOON vs JSON for LLM Prompts: Can We Reduce Token Usage Without Losing Response Quality?&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
&lt;a href="https://dev.to/r_elena_mendez_escobar/toon-vs-json-for-llm-prompts-can-we-reduce-token-usage-without-losing-response-quality-59ed"&gt;https://dev.to/r_elena_mendez_escobar/toon-vs-json-for-llm-prompts-can-we-reduce-token-usage-without-losing-response-quality-59ed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BibTeX&lt;/strong&gt;&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
text
@article{mendez2025ai,
  title  = {TOON vs JSON for LLM Prompts: Can We Reduce Token Usage Without Losing Response Quality?},
  author = {Mendez Escobar, Romina Elena},
  year   = {2025},
  url    = {https://dev.to/r_elena_mendez_escobar/toon-vs-json-for-llm-prompts-can-we-reduce-token-usage-without-losing-response-quality-59ed}
}



&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>database</category>
      <category>performance</category>
    </item>
    <item>
      <title>From Coffee Products to AI Search: Building a Serverless Semantic Search Architecture with Amazon S3 Vectors and Bedrock</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Wed, 31 Dec 2025 10:11:22 +0000</pubDate>
      <link>https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b</link>
      <guid>https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b</guid>
      <description>&lt;p&gt;In recent months, we have increasingly incorporated artificial intelligence into our solutions, and with it a recurring need has emerged: searching and querying our own data using natural language efficiently.&lt;/p&gt;

&lt;p&gt;Use cases such as semantic search or building solutions based on Retrieval-Augmented Generation (RAG) are no longer optional. Today, we need to understand the meaning of text, combine it with structured filters, and do so in an efficient and scalable way.&lt;br&gt;
In this article, I explore a recent alternative within the AWS ecosystem: Amazon S3 Vectors 🪣, a serverless approach for vector storage and querying that aims to balance scalability, simplicity, and cost.&lt;/p&gt;

&lt;p&gt;To make it more concrete (and a bit more entertaining)...we will work with a dataset of coffee products ☕ and build a complete flow that goes from generating embeddings with Amazon Bedrock 🧠 to an application deployed on AWS with Streamlit ✨, which allows natural language searches combined with filters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqcgqcdvuwd4t6gyeou44.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqcgqcdvuwd4t6gyeou44.png" alt=" " width="800" height="765"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h1&gt;
  
  
  A quick note on embeddings and semantic search
&lt;/h1&gt;

&lt;p&gt;Before diving into the implementation, it is worth briefly clarifying two key concepts used throughout this tutorial:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Embeddings&lt;/strong&gt; are numerical representations of text that capture semantic meaning. Instead of relying on exact word matching, embeddings map text into high-dimensional vector spaces where semantically similar pieces of text are positioned closer together. This representation allows systems to reason about intent and context rather than purely lexical similarity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Semantic search&lt;/strong&gt; builds on top of embeddings by retrieving results based on meaning rather than exact terms. A user query is first transformed into an embedding and then compared against stored vectors using similarity metrics such as cosine or Euclidean distance. This approach enables more flexible, intent-aware searches and can be further refined by combining semantic similarity with structured metadata filters to improve precision and relevance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13nzlac3a3bdqa9klo6t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13nzlac3a3bdqa9klo6t.png" alt=" " width="799" height="485"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h1&gt;
  
  
  What is Amazon S3 Vectors?
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Amazon S3 Vectors&lt;/strong&gt; is a new type of storage within Amazon S3 designed specifically to natively &lt;strong&gt;store and query vectors&lt;/strong&gt;.&lt;br&gt;
 In addition to storing vectors, this type of bucket allows associating &lt;strong&gt;structured metadata&lt;/strong&gt;, which enables queries that combine &lt;strong&gt;semantic search&lt;/strong&gt; with filters on those attributes.&lt;br&gt;
Vector buckets support searches based on distance metrics, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cosine similarity&lt;/strong&gt;: measures how similar two vectors are based on the angle between them, and is very common in text embeddings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Euclidean distance&lt;/strong&gt;: measures the “geometric” distance between two vectors in space.
Unlike traditional vector databases, Amazon S3 Vectors makes it possible to &lt;strong&gt;implement a fully serverless architecture&lt;/strong&gt;, achieving a good balance between &lt;code&gt;scalability&lt;/code&gt;, &lt;code&gt;operational&lt;/code&gt; &lt;code&gt;simplicity&lt;/code&gt;, and &lt;code&gt;cost&lt;/code&gt;.
Below are some of the main benefits of using this functionality:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqta0037jxq5gc1cju7uo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqta0037jxq5gc1cju7uo.png" alt=" " width="800" height="644"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  How do vectors work in Amazon S3?
&lt;/h2&gt;

&lt;p&gt;Amazon S3 Vectors is based on the following main components:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🪣 1. Vector buckets&lt;/strong&gt;&lt;br&gt;
These are specialized buckets optimized for vector storage.&lt;br&gt;
They support encryption and organize data internally through &lt;strong&gt;vector indexes&lt;/strong&gt;, which enables efficient large-scale searches.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧭 2. Vector indexes&lt;/strong&gt;&lt;br&gt;
An index defines how vectors are stored and queried within the bucket.&lt;br&gt;
In addition to the vector, it allows associating &lt;strong&gt;metadata&lt;/strong&gt;, which can later be used in queries through filters with a syntax similar to well-known operators, such as those used in MongoDB.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔍 3. Queries&lt;/strong&gt;&lt;br&gt;
Queries are based on &lt;strong&gt;similarity searches&lt;/strong&gt;, using the distance metric configured when creating the index, such as &lt;strong&gt;cosine&lt;/strong&gt; or &lt;strong&gt;Euclidean&lt;/strong&gt;.&lt;br&gt;
These searches can be combined with metadata filters to refine results and reduce ambiguities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⚙️ 4. API&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Amazon S3 Vectors&lt;/strong&gt; exposes an API that allows querying data through operations such as &lt;code&gt;QueryVectors&lt;/code&gt;.&lt;br&gt;
These queries can be executed using tools like the &lt;strong&gt;AWS CLI&lt;/strong&gt; or &lt;strong&gt;Boto3&lt;/strong&gt;, combining a query vector with metadata-based filters and parameters such as the number of results to return or whether to include the distance between vectors.&lt;/p&gt;


&lt;h1&gt;
  
  
  Process Flow
&lt;/h1&gt;

&lt;p&gt;The previous image shows the complete workflow to implement semantic search using Amazon S3 Vectors, divided into three main stages:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfq3zqssetmuur7kdgx3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfq3zqssetmuur7kdgx3.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  1️⃣ Generate Vector Embeddings
&lt;/h2&gt;

&lt;p&gt;The process starts from the input documents. These documents are sent to an embeddings model, in this case &lt;strong&gt;AWS Titan&lt;/strong&gt; through &lt;strong&gt;Amazon Bedrock&lt;/strong&gt;, which transforms the text into numerical vectors.&lt;br&gt;
At this stage, not only are the vectors generated, but metadata describing each document is also associated.&lt;/p&gt;
&lt;h2&gt;
  
  
  2️⃣ Store Vector Data
&lt;/h2&gt;

&lt;p&gt;The generated vectors, together with their metadata, are stored in an &lt;strong&gt;S3 Vector Bucket&lt;/strong&gt;.&lt;br&gt;
Within the bucket, the data is organized through one or more &lt;strong&gt;vector indexes&lt;/strong&gt;, defined with a specific distance metric.&lt;br&gt;
Being integrated into AWS, this data can be consumed by other services such as &lt;strong&gt;Amazon Bedrock&lt;/strong&gt;, &lt;strong&gt;Amazon SageMaker&lt;/strong&gt;, or &lt;strong&gt;Amazon OpenSearch&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  3️⃣ Semantic Search via Vector Index
&lt;/h2&gt;

&lt;p&gt;To perform a search, a natural language query is transformed again into a vector using the same embeddings model.&lt;br&gt;
This query vector, together with metadata filters and the topK parameter, is used to query the vector index and retrieve the most semantically similar results.&lt;/p&gt;


&lt;h1&gt;
  
  
  Reference Architecture
&lt;/h1&gt;

&lt;p&gt;In this tutorial, the use case is based on processing data initially stored in &lt;strong&gt;JSON&lt;/strong&gt; format, which is transformed into &lt;strong&gt;Parquet&lt;/strong&gt; as part of a data preparation workflow. From this processed data, the &lt;strong&gt;Amazon Titan&lt;/strong&gt; model is invoked through &lt;strong&gt;Amazon Bedrock&lt;/strong&gt; to generate embeddings, which are then stored together with their metadata in an &lt;strong&gt;Amazon S3 Vectors bucket&lt;/strong&gt;, thus enabling semantic queries over the information.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31ha7s4qltdjph27c99p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31ha7s4qltdjph27c99p.png" alt=" " width="800" height="524"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Data processing is carried out through an &lt;strong&gt;Amazon Glue job in Python&lt;/strong&gt;, where a typical clean data stage of any production data pipeline is implemented. In this phase, only the relevant fields are selected, text descriptions are normalized and corrected when necessary, and only after this cleaning is completed is the Titan model invoked. This approach helps optimize costs and performance by avoiding unnecessary model calls on data that will not be used later.&lt;/p&gt;

&lt;p&gt;Finally, the data stored in the vector bucket is consumed by an application developed with &lt;strong&gt;Streamlit&lt;/strong&gt;, which is deployed on &lt;strong&gt;AWS Elastic Beanstalk&lt;/strong&gt; within a VPC. The application allows user queries to be transformed back into embeddings and used to query the vector index, combining semantic search with metadata-based filters, while access to services and system observability are managed through &lt;strong&gt;IAM&lt;/strong&gt; roles and &lt;strong&gt;CloudWatch&lt;/strong&gt; Logs.&lt;/p&gt;


&lt;h1&gt;
  
  
  Amazon Bedrock and Amazon Titan
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Amazon Bedrock&lt;/strong&gt; is a fully managed service that allows developers to build, deploy, and scale applications powered by artificial intelligence without the need to manage infrastructure. Through a unified API, Bedrock provides access to foundation models from different providers, making their integration into cloud architectures simple and secure.&lt;/p&gt;

&lt;p&gt;For this tutorial, we use &lt;strong&gt;Amazon Titan Text Embeddings V2&lt;/strong&gt;, a model available in Bedrock that can process up to &lt;code&gt;8,192 tokens&lt;/code&gt; or &lt;code&gt;50,000 characters&lt;/code&gt; and generate &lt;code&gt;1,024-dimensional vectors&lt;/code&gt;. This model is optimized for information retrieval tasks, semantic search, similarity measurement, and clustering, making it a suitable choice for RAG scenarios and large-scale text analysis.&lt;/p&gt;


&lt;h1&gt;
  
  
  Amazon Elastic Beanstalk
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Amazon Elastic Beanstalk&lt;/strong&gt; is a managed service that allows you to deploy and run web applications without the need to directly manage the underlying infrastructure. It automatically handles resource provisioning, load balancing, scaling, and monitoring, allowing the focus to remain on application development rather than operations.&lt;br&gt;
In this tutorial, we use &lt;strong&gt;Elastic Beanstalk&lt;/strong&gt; to deploy the application developed with &lt;strong&gt;Streamlit&lt;/strong&gt;, taking advantage of its native integration with services such as &lt;strong&gt;EC2&lt;/strong&gt;, &lt;strong&gt;Auto Scaling&lt;/strong&gt;, and &lt;strong&gt;CloudWatch&lt;/strong&gt;, which enables a fast, secure, and scalable deployment.&lt;/p&gt;

&lt;p&gt;Below is a summary of some of the main benefits of using this solution:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfruc4crs9yw9iw6jry7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfruc4crs9yw9iw6jry7.png" alt=" " width="800" height="855"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h1&gt;
  
  
  📊 Dataset
&lt;/h1&gt;

&lt;p&gt;The dataset used in this tutorial was obtained from the &lt;strong&gt;Amazon Reviews 2023 project&lt;/strong&gt;, presented in the paper Bridging Language and Items for Retrieval and Recommendation (Hou et al., 2024). This dataset contains reviews and metadata for Amazon products, including titles, descriptions, categories, stores, and ratings.&lt;br&gt;
For this use case, only the &lt;strong&gt;“Grocery_and_Gourmet_Food”&lt;/strong&gt; category was selected, and within it, products related to coffee were filtered. This allows us to work with rich textual information and structured attributes that are ideal for semantic search scenarios.&lt;br&gt;
The project repository includes both the filtered coffee product datasets and the already processed versions containing vector embeddings, making it easier to reproduce the tutorial and analyze the complete workflow.&lt;/p&gt;


&lt;h1&gt;
  
  
  Use Case
&lt;/h1&gt;

&lt;p&gt;The use case presented in this tutorial starts from a simple but representative scenario: a user who wants to query &lt;strong&gt;coffee products&lt;/strong&gt; using &lt;strong&gt;natural language&lt;/strong&gt;, exploring the available catalog in a more flexible and intuitive way than a traditional search.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7jteavn4iyfvf17nlut.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7jteavn4iyfvf17nlut.png" alt=" " width="800" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To enable this type of query, different textual attributes of the product are used, such as the &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, and &lt;code&gt;category&lt;/code&gt;, which helps better capture user intent. Within the dataset, several coffee-related &lt;strong&gt;categories&lt;/strong&gt; are included, such as Coffee, Instant Coffee, Ground Coffee, Whole Coffee Beans, Single-Serve Capsules &amp;amp; Pods, Iced Coffee &amp;amp; Cold-Brew, among others.&lt;/p&gt;

&lt;p&gt;Based on this, an application is designed in which the user can interact primarily through natural language, while complementing the search with structured filters to reduce ambiguities. These filters include, for example, &lt;strong&gt;product rating&lt;/strong&gt;, &lt;strong&gt;store name&lt;/strong&gt; (a detail that users often do not know or remember precisely), and &lt;strong&gt;price&lt;/strong&gt;, allowing more accurate and relevant results without relying exclusively on a textual query.&lt;/p&gt;


&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;
&lt;h3&gt;
  
  
  (1) 🗂️ Code repository
&lt;/h3&gt;

&lt;p&gt;To follow this tutorial, it is necessary to &lt;strong&gt;clone the project repository&lt;/strong&gt;, where the complete solution code is available.&lt;br&gt;
In the following sections, the most relevant aspects of the implementation and design decisions are highlighted, rather than providing an exhaustive walkthrough of the entire source code.&lt;br&gt;
If you find this tutorial useful, do not forget to leave &lt;strong&gt;a star ⭐️&lt;/strong&gt; on the repository and follow me to receive notifications about new articles. Your support helps keep creating valuable technical content for the community 🚀&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/RominaElenaMendezEscobar" rel="noopener noreferrer"&gt;
        RominaElenaMendezEscobar
      &lt;/a&gt; / &lt;a href="https://github.com/RominaElenaMendezEscobar/s3-vector-coffee-tutorial" rel="noopener noreferrer"&gt;
        s3-vector-coffee-tutorial
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      S3 Vector tutorial using cafe data and creating a Streamlit app deployed on Elastic Beanstalk
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;p&gt;&lt;a href="https://www.buymeacoffee.com/r0mymendez" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/b96fd4ea89ea15fcec30a4f86382eef0bbd17454aa3a8d4de8c8c5e92b55cf6c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4275792532304d6525323041253230436f666665652d737570706f72742532306d79253230776f726b2d4646444430303f7374796c653d666c6174266c6162656c436f6c6f723d313031303130266c6f676f3d6275792d6d652d612d636f66666565266c6f676f436f6c6f723d7768697465" alt="Buy Me A Coffee"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;From Coffee Products to AI Search: Building a Serverless Semantic Search Architecture with Amazon S3 Vectors and Bedrock&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/RominaElenaMendezEscobar/s3-vector-coffee-tutorial/img/1-preview.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FRominaElenaMendezEscobar%2Fs3-vector-coffee-tutorial%2FHEAD%2Fimg%2F1-preview.png" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;In recent months, we have increasingly incorporated artificial intelligence into our solutions, and with it a recurring need has emerged: searching and querying our own data using natural language efficiently.&lt;/p&gt;
&lt;p&gt;Use cases such as semantic search or building solutions based on Retrieval-Augmented Generation (RAG) are no longer optional. Today, we need to understand the meaning of text, combine it with structured filters, and do so in an efficient and scalable way
In this article, I explore a recent alternative within the AWS ecosystem: Amazon S3 Vectors 🪣, a serverless approach for vector storage and querying that aims to balance scalability, simplicity, and cost.&lt;/p&gt;
&lt;p&gt;To make it more concrete (and a bit more entertaining)...we will work with a dataset of coffee products ☕ and build a complete flow that goes from generating embeddings…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/RominaElenaMendezEscobar/s3-vector-coffee-tutorial" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;





&lt;h3&gt;
  
  
  (2) 🪣 Create Amazon S3 buckets
&lt;/h3&gt;

&lt;p&gt;As part of this workflow, we need two Amazon S3 buckets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A standard bucket&lt;/strong&gt; to store raw and processed data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An Amazon S3 Vectors bucket&lt;/strong&gt; to store vectors and their metadata.
In this tutorial, the following names are used as references:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;AWS_BUCKET_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;coffee-products-tutorial-full-data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;AWS_BUCKET_VECTOR_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;coffee-products-tutorial&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;AWS_INDEX_VECTOR_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idx-coffee-products&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  (2.1) 🪣 Creating the S3 Vectors bucket
&lt;/h4&gt;

&lt;p&gt;The first step is to create the &lt;strong&gt;vector bucket&lt;/strong&gt; from the Amazon S3 console, in the Vector buckets section, select Create vector bucket and define a unique name for the bucket.&lt;br&gt;
In the encryption configuration, you can use Amazon S3–managed encryption (SSE-S3), which is sufficient for this use case. It is worth noting that this setting cannot be modified later, so it is important to define it correctly from the beginning.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv0733eqqqeacx544zlg7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv0733eqqqeacx544zlg7.png" alt=" " width="800" height="772"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  (2.2) 🧭 Creating the vector index
&lt;/h4&gt;

&lt;p&gt;Once the bucket is created, the next step is to define a &lt;strong&gt;vector index&lt;/strong&gt;, which will be responsible for organizing and querying the vectors efficiently.&lt;/p&gt;

&lt;p&gt;During this configuration, three key aspects must be specified:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Index name&lt;/strong&gt;, which must be unique within the bucket.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector dimension&lt;/strong&gt;, which must match the output of the embeddings model (in this case, 1,024 dimensions for Amazon Titan).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Distance metric&lt;/strong&gt;, where you can choose between cosine or Euclidean. For text embeddings, cosine similarity is usually the most commonly used option.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Like the bucket, the index also inherits the encryption configuration, and this cannot be modified once it has been created.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5k6vwaia33hbyx0l6eyh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5k6vwaia33hbyx0l6eyh.png" alt=" " width="800" height="569"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h4&gt;
  
  
  (3) 🔐 Policies
&lt;/h4&gt;

&lt;p&gt;To work on this project, it is necessary to configure a set of &lt;strong&gt;IAM policies&lt;/strong&gt; that allow access to the different services involved in the workflow.&lt;/p&gt;

&lt;p&gt;In particular, the following are required:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Titan policy:&lt;/strong&gt; allows invoking the Amazon Titan embeddings model through Amazon Bedrock to generate vectors from text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon S3 policy:&lt;/strong&gt; enables reading and writing data in the Amazon S3 bucket used to store raw and processed data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon S3 Vectors policy:&lt;/strong&gt; allows writing and querying vectors, along with their metadata, in the Amazon S3 Vectors bucket.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, these policies are attached to an &lt;strong&gt;IAM role&lt;/strong&gt; that is used by the application deployed on &lt;strong&gt;AWS Elastic Beanstalk&lt;/strong&gt;, ensuring controlled and secure access to the required resources.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;All the policies mentioned are available in the &lt;strong&gt;project repository&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h1&gt;
  
  
  🛠️ Implementation Guide
&lt;/h1&gt;
&lt;h2&gt;
  
  
  ✅ Step 1: Dataset
&lt;/h2&gt;

&lt;p&gt;As mentioned earlier, we start from a dataset in &lt;strong&gt;JSON&lt;/strong&gt; format, which we download and then process into &lt;strong&gt;Parquet&lt;/strong&gt;, since this format is more efficient for reading, storage, and processing in data pipelines.&lt;br&gt;
The dataset used in this tutorial is available in my repository, inside the &lt;code&gt;data/&lt;/code&gt; folder.&lt;/p&gt;


&lt;h2&gt;
  
  
  ⚙️ Step 2: Process data (embedding generation)
&lt;/h2&gt;

&lt;p&gt;To generate the &lt;strong&gt;embeddings&lt;/strong&gt;, we use a class that I created to simplify the code and encapsulate the interaction with &lt;strong&gt;Amazon Bedrock&lt;/strong&gt;. By default, the class uses the amazon.titan-embed-text-v2:0 model, although the design allows it to be easily changed if you want to try another model.&lt;/p&gt;

&lt;p&gt;This class includes three main methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;create_client():&lt;/strong&gt; creates the Bedrock Runtime client with Boto3, using region and credentials.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;get_embeddings(text):&lt;/strong&gt; invokes the Titan model by sending the text and returns the generated vector.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;generate_embeddings_batch(texts):&lt;/strong&gt; generates embeddings in batches by iterating over a list of texts and showing progress with tqdm.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingsGenerator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;amazon.titan-embed-text-v2:0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;AWS_REGION&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;
                &lt;span class="p"&gt;):&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MODEL_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_REGION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_REGION&lt;/span&gt;


   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
       &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
               &lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_REGION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;aws_access_key_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;aws_secret_access_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;
           &lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;

   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
       &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
       &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
           &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
               &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputText&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;
           &lt;span class="p"&gt;})&lt;/span&gt;
       &lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="n"&gt;response_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
       &lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response_body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;

   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_embeddings_batch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
       &lt;span class="n"&gt;embeddings_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
       &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;tqdm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
           &lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
           &lt;span class="n"&gt;embeddings_list&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;embeddings_list&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;To run it locally, you need a &lt;code&gt;.env&lt;/code&gt; file with your credentials and region:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;AWS_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;YOUR_ACCESS_KEY
&lt;span class="nv"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;YOUR_AWS_SECRET_ACCESS_KEY
&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;YOUR_AWS_REGION
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And a minimal usage example would be the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;


&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;AWS_ACCESS_KEY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;AWS_REGION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;AWS_REGION&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;emb_generator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingsGenerator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_REGION&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;input_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;instant coffee sweet creamy vanilla flavor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;emb_generator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;input_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🪣 Step 3: Store data (S3 + S3 Vectors)
&lt;/h2&gt;

&lt;p&gt;To simplify data ingestion, I created an &lt;strong&gt;S3&lt;/strong&gt; class that encapsulates access to both the standard S3 bucket and the &lt;strong&gt;Amazon S3 Vectors bucket&lt;/strong&gt;. The idea is to keep the code clean and reusable, separating connection logic from write logic.&lt;/p&gt;

&lt;p&gt;This class includes three main methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;create_client()&lt;/strong&gt;: creates a Boto3 client for the specified service (&lt;strong&gt;s3&lt;/strong&gt; or &lt;strong&gt;s3vectors&lt;/strong&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;upload_file()&lt;/strong&gt;: uploads files to the standard S3 bucket (useful for raw and processed data).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;upload_vector_data()&lt;/strong&gt;: loads vectors into S3 Vectors using * &lt;strong&gt;put_vectors&lt;/strong&gt;, sending them in batches to respect the per-request limit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;query_embedding()&lt;/strong&gt;: enables semantic search by querying the vector index using an embedding and optional metadata filters, returning the most relevant results ranked by similarity.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;lass&lt;/span&gt; &lt;span class="n"&gt;S3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
   &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Class to handle S3 operations including uploading files and vector data&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;AWS_REGION&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;AWS_BUCKET_NAME&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;AWS_BUCKET_VECTOR_NAME&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;AWS_INDEX_VECTOR_NAME&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;
                &lt;span class="p"&gt;):&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_REGION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_REGION&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_BUCKET_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_BUCKET_NAME&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_BUCKET_VECTOR_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_BUCKET_VECTOR_NAME&lt;/span&gt;
       &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_INDEX_VECTOR_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_INDEX_VECTOR_NAME&lt;/span&gt;


   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
       &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
       Create a boto3 client for the specified AWS service.
       &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
       &lt;span class="n"&gt;s3_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
           &lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_REGION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;aws_access_key_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;aws_secret_access_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;
       &lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s3_client&lt;/span&gt;


   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;upload_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;object_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
       &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
       Upload a file to an S3 bucket.
       &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
       &lt;span class="n"&gt;s3_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
       &lt;span class="n"&gt;s3_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_BUCKET_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;object_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; uploaded to bucket &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_BUCKET_NAME&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; as &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;object_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;upload_vector_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
       &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
       Upload vector data to S3 Vectors in batches with tqdm for progress tracking.
       batchsize: it is the number of vectors per batch to avoid exceeding maximum size.
       &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
       &lt;span class="n"&gt;s3_vector_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3vectors&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


       &lt;span class="c1"&gt;# Helper for chunking data into batches
&lt;/span&gt;       &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chunked&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
           &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
               &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;


       &lt;span class="n"&gt;batches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chunked&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


       &lt;span class="c1"&gt;# see the progress of the upload
&lt;/span&gt;       &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tqdm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batches&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Uploading batches&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
           &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
               &lt;span class="n"&gt;s3_vector_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_vectors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                   &lt;span class="n"&gt;vectorBucketName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_BUCKET_VECTOR_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;indexName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_INDEX_VECTOR_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;
               &lt;span class="p"&gt;)&lt;/span&gt;
           &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
               &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error uploading batch &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="n"&gt;filter_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
       &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Perform complete search with text and filters&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
       &lt;span class="n"&gt;s3_vector_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3vectors&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

       &lt;span class="c1"&gt;# Prepare base parameters
&lt;/span&gt;       &lt;span class="n"&gt;query_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vectorBucketName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_BUCKET_VECTOR_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;indexName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AWS_INDEX_VECTOR_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queryVector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;float32&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topK&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;returnDistance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;returnMetadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
       &lt;span class="p"&gt;}&lt;/span&gt;

       &lt;span class="c1"&gt;# Only add filter if exists
&lt;/span&gt;       &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;filter_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
           &lt;span class="n"&gt;query_params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;filter_data&lt;/span&gt;

       &lt;span class="c1"&gt;# Execute search
&lt;/span&gt;       &lt;span class="n"&gt;query_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s3_vector_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query_vectors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;query_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;query_result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;vectors&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To upload vectors to S3 Vectors, we first need to build the structure expected by &lt;strong&gt;put_vectors&lt;/strong&gt;. Each item must include a &lt;strong&gt;key&lt;/strong&gt; (a unique identifier in string format), the vector in data.float32, and a &lt;strong&gt;metadata&lt;/strong&gt; object with the attributes that we will later use as filters in queries.&lt;br&gt;
In addition, since &lt;strong&gt;no more than 100 vectors can be sent per request&lt;/strong&gt;, the upload is performed in batches controlled by the &lt;strong&gt;batch_size&lt;/strong&gt; parameter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;vector_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;


&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_coffee_filter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
   &lt;span class="n"&gt;vector_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_coffee_filter&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;  &lt;span class="c1"&gt;# always need to be string
&lt;/span&gt;       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;float32&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data_coffee_filter&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;embeddings&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
       &lt;span class="p"&gt;},&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;average&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_coffee_filter&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;average_rating&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rating_number&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_coffee_filter&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rating_number&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_coffee_filter&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shop_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_coffee_filter&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;shop_name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
       &lt;span class="p"&gt;}&lt;/span&gt;
   &lt;span class="p"&gt;})&lt;/span&gt;


&lt;span class="n"&gt;s3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;S3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_REGION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;AWS_BUCKET_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_BUCKET_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;AWS_BUCKET_VECTOR_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_BUCKET_VECTOR_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;AWS_INDEX_VECTOR_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_INDEX_VECTOR_NAME&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_vector_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🔍 Step 4: Retrieve (QueryVectors + filters)
&lt;/h2&gt;

&lt;p&gt;To retrieve results from Amazon S3 Vectors, the flow is always the same. First, we convert a natural language query into an embedding (vector) using the same model that was used during indexing. Then, we execute &lt;strong&gt;query_vectors&lt;/strong&gt;, passing that vector as &lt;strong&gt;queryVector&lt;/strong&gt;. From there, the service returns the &lt;strong&gt;top K&lt;/strong&gt; most similar vectors according to the distance metric configured in the index (&lt;code&gt;Cosine&lt;/code&gt; or &lt;code&gt;Euclidean&lt;/code&gt;) and optionally, we can apply metadata filters to reduce ambiguity and improve precision.&lt;/p&gt;

&lt;p&gt;The most important query_vectors parameters are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;queryVector&lt;/strong&gt;: the embedding of the search text (in the format {"float32": [...]}).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;topK&lt;/strong&gt;: how many results we want to retrieve.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;filter&lt;/strong&gt;: filters based on the metadata stored together with the vector (for example shop_name, average, price).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;returnDistance&lt;/strong&gt;: whether to return the distance or similarity for each result. This is useful for applying a &lt;strong&gt;threshold&lt;/strong&gt; and discarding results that are close but not very relevant.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;returnMetadata&lt;/strong&gt;: whether to also return the metadata associated with the vector, to display information in the app or apply additional logic.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;To reduce the complexity of query implementation, a helper method is provided and encapsulated within the S3 utility class. This abstraction centralizes the interaction with Amazon S3 Vectors, simplifying semantic search execution and making the codebase cleaner, more reusable, and easier to maintain.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Amazon S3 Vectors, simplifying semantic search execution and making the codebase cleaner, more reusable, and easier to maintain.&lt;/p&gt;




&lt;h3&gt;
  
  
  Query Examples with Metadata Filters
&lt;/h3&gt;

&lt;h4&gt;
  
  
  🔎 Query by Single Metadata Field (Exact Match)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Example: filter by &lt;code&gt;shop_name&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;filter_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shop_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nescafé&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;response&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;distance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.41610199213027954&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;de46725d-ef52-47ca-80e2-f1ba82c0353d&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;11.48&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;shop_name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;nescafé&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;average&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;4.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rating_number&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;248&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
 &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;distance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.47703248262405396&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;03915b9f-e592-40ec-b806-bd06b4213d90&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;13.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;average&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;3.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;shop_name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;nescafé&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rating_number&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;471&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
 &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;distance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.514411211013794&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;5037ea28-b789-427a-9b1f-d825ad68dd2d&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rating_number&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3052&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;shop_name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;nescafé&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;average&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;4.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;17.75&lt;/span&gt;&lt;span class="p"&gt;}}]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  🔢 Query Using Comparison Operators
&lt;/h4&gt;

&lt;p&gt;In filters, you can use comparison operators, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;$gt&lt;/strong&gt;: greater than&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$gte&lt;/strong&gt;: greater than or equal&lt;/li&gt;
&lt;li&gt;(and others such as $lt, $lte, $eq, $ne depending on the case)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here you can find more information about the commands you can use:&lt;br&gt;
&lt;a href="https://docs.aws.amazon.com/es_es/AmazonS3/latest/userguide/s3-vectors-metadata-filtering.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/es_es/AmazonS3/latest/userguide/s3-vectors-metadata-filtering.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example: average rating greater than or equal to 4.2&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;filter_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;average&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$gte&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;4.2&lt;/span&gt;&lt;span class="p"&gt;}})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  🔗 Query with Combined Conditions
&lt;/h4&gt;

&lt;p&gt;When you need more than one condition, you can combine filters with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;$and&lt;/strong&gt;: logical AND between multiple conditions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$or&lt;/strong&gt;: logical OR between multiple conditions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example: average rating ≥ 4.2 AND price ≤ 20&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;filter_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$and&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
           &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;average&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$gte&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;4.2&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
           &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$lte&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;20.0&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
       &lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🖥️ Step 5: App (Streamlit)
&lt;/h2&gt;

&lt;p&gt;While developing this tutorial, I realized that although it is possible to run the entire flow directly from Python code, it is &lt;strong&gt;not the most comfortable approach for an end user&lt;/strong&gt;. For this reason, I decided to build &lt;strong&gt;a web application using Streamlit&lt;/strong&gt;, a framework that allows you to create interactive interfaces in Python with very few lines of code.&lt;/p&gt;

&lt;p&gt;In the repository, you will find a single file called &lt;strong&gt;app.py&lt;/strong&gt;, which contains all the application logic. This makes it easy to clearly see how embedding generation, querying Amazon S3 Vectors, and result visualization are integrated, while keeping the focus on a simple and straightforward flow.&lt;/p&gt;

&lt;p&gt;Streamlit provides an API with many interactive components such as text inputs, selectors, sliders, and chat-oriented elements. These components are ideal for this use case. For more details about the available components, you can check the official documentation:&lt;br&gt;
&lt;a href="https://docs.streamlit.io/develop/api-reference" rel="noopener noreferrer"&gt;https://docs.streamlit.io/develop/api-reference&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr6jjg0ahs6eu5814pifs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr6jjg0ahs6eu5814pifs.png" alt=" " width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft2k85u2pampxjwplwdhn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft2k85u2pampxjwplwdhn.png" alt=" " width="799" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zye4vjcd74rjkh92lk1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zye4vjcd74rjkh92lk1.png" alt=" " width="800" height="236"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvec10fp201kgzhzlnc80.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvec10fp201kgzhzlnc80.png" alt=" " width="799" height="648"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  🚀 Step 6: Configure the project to deploy the app (Elastic Beanstalk)
&lt;/h2&gt;

&lt;p&gt;To deploy the application on &lt;strong&gt;AWS Elastic Beanstalk&lt;/strong&gt;, we will package the project into a &lt;code&gt;.zip&lt;/code&gt; with a specific structure. &lt;strong&gt;Beanstalk&lt;/strong&gt; uses these files to configure the environment, install dependencies, and define how the app is executed when the instance starts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    app.zip
    |__ 📂.ebextensions/
    |    |__ 📄 iam-role.config
    |    |__ 📄 securitygroup.config
    |__ 📂img/
    |    |__🏞️ preview_app.png
    |__ 📄 .ebignore
    |__ 📄 app.py
    |__ 📄 Procfile
    |__ 📄requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  📁 .ebextensions/iam-role.config (instance IAM role)
&lt;/h3&gt;

&lt;p&gt;This file configures which &lt;strong&gt;IAM Instance Profile&lt;/strong&gt; the &lt;strong&gt;Elastic Beanstalk&lt;/strong&gt; instance will use. It is key because that role is what allows your app to have permissions to invoke Bedrock and query S3 and S3 Vectors (based on the policies you defined).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;option_settings:
  aws:autoscaling:launchconfiguration:
    IamInstanceProfile: ElasticBeanstalk-CoffeeApp-Role
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🔒 .ebextensions/securitygroup.config (restrict access by IP)
&lt;/h3&gt;

&lt;p&gt;By default, the app is publicly accessible (depending on how the environment is configured). In this case, this configuration restricts access to the application only to your IP by adding inbound rules to the Beanstalk security group for HTTP (80) and HTTPS (443). This is useful in test environments or demos to prevent unwanted access.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tip: you can get your public IP by searching “what is my ip” and replace &lt;strong&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Resources:
  httpSecurityGroupIngress: 
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"Fn::GetAtt"&lt;/span&gt; : &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"AWSEBSecurityGroup"&lt;/span&gt;, &lt;span class="s2"&gt;"GroupId"&lt;/span&gt;&lt;span class="o"&gt;]}&lt;/span&gt;
      IpProtocol: tcp
      ToPort: 80
      FromPort: 80
      CidrIp: &amp;lt;your_ip&amp;gt;/32

  httpsSecurityGroupIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"Fn::GetAtt"&lt;/span&gt; : &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"AWSEBSecurityGroup"&lt;/span&gt;, &lt;span class="s2"&gt;"GroupId"&lt;/span&gt;&lt;span class="o"&gt;]}&lt;/span&gt;
      IpProtocol: tcp
      ToPort: 443
      FromPort: 443
      CidrIp: &amp;lt;your_ip&amp;gt;/32
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🚫 .ebignore
&lt;/h3&gt;

&lt;p&gt;This file works like a .gitignore, but for deployment. It indicates which files should not be uploaded to Elastic Beanstalk. This helps avoid including credentials, system junk, or unnecessary files that increase the package size.&lt;/p&gt;




&lt;h3&gt;
  
  
  🖥️ app.py (Streamlit application)
&lt;/h3&gt;

&lt;p&gt;This is the main application file, where the Streamlit interface and the logic to generate &lt;code&gt;embeddings&lt;/code&gt;, query &lt;strong&gt;S3 Vectors&lt;/strong&gt;, and display results are defined. In this tutorial, the entire app lives in this single file to keep it simple and easy to follow.&lt;/p&gt;

&lt;h3&gt;
  
  
  🧾 Procfile (startup command)
&lt;/h3&gt;

&lt;p&gt;Elastic Beanstalk needs to know which command to run to start your application. The &lt;strong&gt;Procfile&lt;/strong&gt; defines that entrypoint. In this case, we start &lt;strong&gt;Streamlit&lt;/strong&gt; listening on &lt;code&gt;0.0.0.0&lt;/code&gt; to accept external traffic, and using a port defined for the environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;web: streamlit run app.py &lt;span class="nt"&gt;--server&lt;/span&gt;.port&lt;span class="o"&gt;=&lt;/span&gt;8000 &lt;span class="nt"&gt;--server&lt;/span&gt;.address&lt;span class="o"&gt;=&lt;/span&gt;0.0.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  📦 requirements.txt (dependencies)
&lt;/h3&gt;

&lt;p&gt;This file lists the libraries required for the app to run. Beanstalk installs them automatically during deployment.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 Step 7: Deploy the solution
&lt;/h2&gt;

&lt;h3&gt;
  
  
  (1) Create a new application
&lt;/h3&gt;

&lt;p&gt;In this step, a new application is created in &lt;strong&gt;AWS Elastic Beanstalk&lt;/strong&gt;, which acts as the logical container for the project.&lt;br&gt;
You only need to define an &lt;strong&gt;application&lt;/strong&gt; &lt;strong&gt;name&lt;/strong&gt; and, optionally, a short &lt;strong&gt;description&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  (2) Environment
&lt;/h3&gt;

&lt;p&gt;In this step, &lt;strong&gt;the environment&lt;/strong&gt; where the application will be deployed is configured. For this use case, a &lt;strong&gt;Web server environment&lt;/strong&gt; is selected, since it is a web application built with Streamlit that exposes an HTTP interface for users.&lt;/p&gt;

&lt;p&gt;By default, Elastic Beanstalk suggests an &lt;strong&gt;environment name&lt;/strong&gt; based on the application name, which is sufficient for this tutorial. This environment will be responsible for running the app, handling traffic, and applying scaling and monitoring configurations in the following steps.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8nxxcvn4qxgwiwrqfjzo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8nxxcvn4qxgwiwrqfjzo.png" alt=" " width="800" height="414"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  (2) Environment – Step 1: Configure environment
&lt;/h4&gt;

&lt;p&gt;In this step, the basic environment parameters are defined:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Environment tier:&lt;/strong&gt; select Web server environment, since the application exposes a web interface over HTTP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application name:&lt;/strong&gt; automatically filled with the name defined in the previously created application.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment name:&lt;/strong&gt; name of the environment; the default suggested value can be used.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain:&lt;/strong&gt;  can be left empty so that Elastic Beanstalk automatically generates the subdomain.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Platform:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Platform: Python&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Platform branch&lt;/code&gt;: Python 3.11 running on 64bit Amazon Linux 2023&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Platform version&lt;/code&gt;: leave the default recommended version.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Application code:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select Upload your code.&lt;/li&gt;
&lt;li&gt;Upload the &lt;code&gt;.zip&lt;/code&gt; file generated previously.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Presets:&lt;/strong&gt; Select Single instance (free tier eligible) for this tutorial.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;--&lt;/p&gt;

&lt;h4&gt;
  
  
  (2) Environment – Step 2: Configure service access
&lt;/h4&gt;

&lt;p&gt;In this step, the &lt;strong&gt;IAM roles&lt;/strong&gt; that allow Elastic Beanstalk and EC2 instances to access AWS resources are configured:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Service role&lt;/strong&gt;: the role that Elastic Beanstalk uses to create and manage the environment (Auto Scaling, Load Balancer, logs, etc.).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EC2 instance profile:&lt;/strong&gt; the role used by the EC2 instances where the application runs.This role must include the necessary policies to access Amazon Bedrock, Amazon S3, and Amazon S3 Vectors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EC2 key pair (optional):&lt;/strong&gt; can be omitted if SSH access to the instances is not required.
With this configuration, the application is correctly authorized to interact with AWS services in a secure manner.&lt;/li&gt;
&lt;/ul&gt;




&lt;h4&gt;
  
  
  (2) Environment – Step 3: Set up networking, database, and tags (optional)
&lt;/h4&gt;

&lt;p&gt;In this step, the network where the environment will run is configured. For this tutorial, the default VPC values are used, making only the following adjustments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;VPC&lt;/strong&gt;: select the account’s default VPC to simplify the configuration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Public IP address:&lt;/strong&gt; enable it so the application is accessible from the Internet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instance subnets:&lt;/strong&gt; select two subnets in different Availability Zones, as shown in the image.
Selecting more than one subnet allows Elastic Beanstalk to distribute instances across multiple Availability Zones, improving resilience and fault tolerance, even when using a simple deployment for tests or demos.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The remaining options (database and tags) can be left unconfigured for this use case.&lt;/p&gt;




&lt;h4&gt;
  
  
  (2) Environment – Step 4: Configure instance traffic and scaling
&lt;/h4&gt;

&lt;p&gt;In this step, how the application runs and what type of resources it uses are defined:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Environment type&lt;/strong&gt;: select Single instance, which is sufficient for this tutorial and helps reduce costs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fleet composition&lt;/strong&gt;: use On-Demand instance, avoiding the complexity of Spot instances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture&lt;/strong&gt;: choose x86_64 to ensure compatibility with all Python dependencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instance type&lt;/strong&gt;: select a lightweight type such as t3.small, suitable for running a low-consumption Streamlit application.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring and metadata&lt;/strong&gt;: keep the default values, enabling CloudWatch metrics and using IMDSv2.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This configuration allows the application to be deployed in a simple, stable, and cost-effective way, ideal for tests, demos, and development environments.&lt;/p&gt;




&lt;h4&gt;
  
  
  (2) Environment – Step 5: Configure updates, monitoring, and logging
&lt;/h4&gt;

&lt;p&gt;In this step, monitoring, update, and observability options for the environment are configured:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring:&lt;/strong&gt; enable basic or enhanced monitoring so Elastic Beanstalk reports instance metrics to CloudWatch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Health reporting:&lt;/strong&gt; allows you to visualize the application status and detect failures early.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Managed platform updates:&lt;/strong&gt; automatic environment updates (minor and patch) can be enabled during a defined weekly window.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email notifications:&lt;/strong&gt; allows configuring an email address to receive notifications about relevant environment events.&lt;/li&gt;
&lt;li&gt;**Rolling updates and deployments: **defines how deployments and configuration changes are applied (for this tutorial, default values can be used).&lt;/li&gt;
&lt;li&gt;**Logs: **enable sending instance logs to CloudWatch Logs to facilitate debugging and observability.&lt;/li&gt;
&lt;li&gt;*&lt;em&gt;Environment properties: *&lt;/em&gt; here you can define environment variables required by the application (for example AWS region, bucket names, or other configuration values the app needs).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With this configuration, the environment is prepared to operate in a stable and observable way, with controlled updates and no additional adjustments required for this use case.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 Step 7: Validate the application deployment (Elastic Beanstalk)
&lt;/h2&gt;

&lt;p&gt;Once the application is deployed, it is important to validate that everything is working correctly:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmvjnbtwqppluq8jhhmn1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmvjnbtwqppluq8jhhmn1.png" alt=" " width="800" height="513"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  (1) Environment status
&lt;/h3&gt;

&lt;p&gt;The first step is to verify that the environment status is &lt;strong&gt;Health: OK&lt;/strong&gt;. This indicates that Elastic Beanstalk was able to start the application correctly and that no critical errors were detected during deployment.&lt;/p&gt;

&lt;h3&gt;
  
  
  (2) Application access
&lt;/h3&gt;

&lt;p&gt;If the status is correct, you can click on the** environment domain** to access the application from the browser and confirm that the Streamlit interface loads correctly.&lt;/p&gt;

&lt;h3&gt;
  
  
  (3) Log review
&lt;/h3&gt;

&lt;p&gt;If the application does not work as expected or the status is not OK, go to the Logs tab. From there, you can &lt;strong&gt;request logs&lt;/strong&gt;, and it is recommended to download &lt;strong&gt;the last 100 records&lt;/strong&gt; to make error analysis easier.&lt;/p&gt;

&lt;h3&gt;
  
  
  (4) Deploy a new version
&lt;/h3&gt;

&lt;p&gt;If an issue is detected in the logs and the code needs to be fixed, you can deploy a new version using the &lt;strong&gt;Upload and deploy&lt;/strong&gt; button. In this step, you only need to upload the updated &lt;code&gt;.zip&lt;/code&gt; file and assign a new application version.&lt;/p&gt;




&lt;h1&gt;
  
  
  🧩 Conclusions
&lt;/h1&gt;

&lt;p&gt;This tutorial presents a complete workflow for processing and querying data through semantic search, where it is essential not to lose sight of &lt;strong&gt;best practices&lt;/strong&gt; in data cleaning and the correct definition of metadata. &lt;strong&gt;Metadata&lt;/strong&gt; plays a fundamental role in guiding searches, reducing the amount of information queried, and significantly improving the relevance of results.&lt;/p&gt;

&lt;p&gt;During the tests performed,** query performance** was notably fast, to the point that in some cases the spinner implemented in the application barely had time to appear. This shows that &lt;strong&gt;Amazon S3 Vectors&lt;/strong&gt; can deliver suitable performance even for interactive, end-user–oriented scenarios.&lt;/p&gt;

&lt;p&gt;When exploring the &lt;strong&gt;Boto3 API&lt;/strong&gt;, it becomes apparent that some features commonly found in traditional databases are still missing, such as aggregated statistics or an equivalent of &lt;strong&gt;count(*)&lt;/strong&gt;. Currently, to determine the number of stored vectors, it is necessary to use operations like &lt;strong&gt;list_vectors&lt;/strong&gt; with pagination. This suggests that, as a relatively new feature, there are clear opportunities for improvement in future versions of the service.&lt;/p&gt;

&lt;p&gt;On the other hand, &lt;strong&gt;AWS Elastic Beanstalk&lt;/strong&gt; proves to be a very good solution for deploying this type of application quickly and easily. However, in production scenarios, combining it with tools such as &lt;strong&gt;Terraform&lt;/strong&gt; and &lt;strong&gt;CI/CD&lt;/strong&gt; pipelines would allow deployments to be automated and manual intervention to be further reduced. In this tutorial, a console-based deployment was chosen to keep complexity under control and focus on the main use case.&lt;/p&gt;

&lt;p&gt;Finally, this approach demonstrates how unstructured &lt;strong&gt;text analysis&lt;/strong&gt; use cases, combined with structured data, offer a very compelling balance. In particular, building a chat-like interface that does not rely exclusively on &lt;strong&gt;natural language&lt;/strong&gt;, but also incorporates explicit filters, makes it possible to create a hybrid model that improves precision, reduces ambiguity, and enriches the search experience.&lt;/p&gt;




&lt;h1&gt;
  
  
  📚 References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Web Services.&lt;/strong&gt; (s.f.). Amazon S3 Vectors: Revolutionizing AI data storage with use cases. AWS re:Post.
&lt;a href="https://repost.aws/articles/ARY9EKiGFISfisAyvigDX3lQ/amazon-s3-vectors-revolutionizing-ai-data-storage-with-use-cases" rel="noopener noreferrer"&gt;https://repost.aws/articles/ARY9EKiGFISfisAyvigDX3lQ/amazon-s3-vectors-revolutionizing-ai-data-storage-with-use-cases&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Web Services.&lt;/strong&gt; (s.f.). Amazon S3 Vectors.
&lt;a href="https://aws.amazon.com/es/s3/features/vectors/" rel="noopener noreferrer"&gt;https://aws.amazon.com/es/s3/features/vectors/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Web Services.&lt;/strong&gt; (s.f.). Vector buckets for Amazon S3.
&lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-buckets-details.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-buckets-details.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Web Services.&lt;/strong&gt; (s.f.). Metadata filtering for Amazon S3 Vectors.
&lt;a href="https://docs.aws.amazon.com/es_es/AmazonS3/latest/userguide/s3-vectors-metadata-filtering.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/es_es/AmazonS3/latest/userguide/s3-vectors-metadata-filtering.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hou, Y., Li, J., He, Z., Yan, A., Chen, X., &amp;amp; McAuley, J.&lt;/strong&gt; (2024). Bridging language and items for retrieval and recommendation.
&lt;a href="https://amazon-reviews-2023.github.io/" rel="noopener noreferrer"&gt;https://amazon-reviews-2023.github.io/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streamlit Inc. (s.f.).&lt;/strong&gt; Streamlit API reference.
&lt;a href="https://docs.streamlit.io/develop/api-reference" rel="noopener noreferrer"&gt;https://docs.streamlit.io/develop/api-reference&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Web Services. (s.f.).&lt;/strong&gt; Amazon S3 Vectors.
&lt;a href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3vectors.html" rel="noopener noreferrer"&gt;https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3vectors.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;







&lt;h3&gt;
  
  
  📌 How to cite this article
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;APA style&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Mendez Escobar, Romina Elena. (2025). &lt;strong&gt;From Coffee Products to AI Search: Building a Serverless Semantic Search Architecture with Amazon S3 Vectors and Bedrock&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
&lt;a href="https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b"&gt;https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BibTeX&lt;/strong&gt;&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
text
@article{mendez2025aiawscoffee,
  title  = {From Coffee Products to AI Search: Building a Serverless Semantic Search Architecture with Amazon S3 Vectors and Bedrock},
  author = {Mendez Escobar, Romina Elena},
  year   = {2025},
  url    = {https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b}
}



&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>ai</category>
      <category>aws</category>
      <category>python</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Data-Driven Project Analysis: Analyzing Trello Kanban Projects with AI on AWS Bedrock</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Tue, 23 Dec 2025 11:22:10 +0000</pubDate>
      <link>https://dev.to/aws-builders/data-driven-project-analysis-analyzing-trello-kanban-projects-with-ai-on-aws-bedrock-15f4</link>
      <guid>https://dev.to/aws-builders/data-driven-project-analysis-analyzing-trello-kanban-projects-with-ai-on-aws-bedrock-15f4</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;Modern software projects often involve multiple distributed teams working on high-complexity initiatives, with frequent releases and ongoing production fixes. While tools like Kanban boards help organize tasks, epics, and workflows, they also generate large volumes of unstructured data in the form of comments, status changes, and timelines.&lt;br&gt;
As the number of interdependent tasks and contributors grows, understanding the real state of a project, and identifying early risks or bottlenecks, becomes increasingly difficult. As a result, manual analysis is time-consuming and often subjective, limiting timely and objective decision-making.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fesn1w1ecuclllif54qcf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fesn1w1ecuclllif54qcf.png" alt=" " width="800" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article, I present a practical use case that leverages AWS services and generative AI to enhance project analysis and interpretation. By analyzing task metadata and detecting semantic patterns in comments (such as ambiguity, implicit dependencies, missing definitions, or scope creep) AI enables more objective insights, early warnings, and data-driven decision-making&lt;/p&gt;


&lt;h1&gt;
  
  
  Understanding Kanban Board and Trello
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Kanban&lt;/strong&gt; is a visual project management methodology that originated in Toyota’s manufacturing system. It focuses on limiting work in progress and enabling continuous delivery by representing work items across different stages of a workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trello&lt;/strong&gt; is a widely used web-based project management tool that implements &lt;strong&gt;Kanban principles&lt;/strong&gt; through &lt;code&gt;boards&lt;/code&gt;, &lt;code&gt;lists&lt;/code&gt;, and &lt;code&gt;cards&lt;/code&gt;. Each card typically represents a task, feature, or user story, and includes not only a status but also descriptive text, comments, and historical changes over time.&lt;br&gt;
While Kanban boards are primarily designed for human collaboration, they also generate a rich source of textual and contextual data that can be analyzed programmatically.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fatd9nn8wtka3bhxine6k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fatd9nn8wtka3bhxine6k.png" alt=" " width="800" height="785"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  User Stories as a Data Structure
&lt;/h2&gt;

&lt;p&gt;A well-defined user story usually follows a consistent structure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Who&lt;/strong&gt;: the requester (As a…)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What&lt;/strong&gt;: the objective (I want to…)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why&lt;/strong&gt;: the purpose (So that…)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Acceptance Criteria&lt;/strong&gt;: explicit conditions for completion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7qgp0qi40sysunylpl5v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7qgp0qi40sysunylpl5v.png" alt=" " width="372" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This structure is not only useful for aligning teams, it also provides a clear semantic pattern that can be leveraged by AI models. When tasks are written consistently, the model can more easily understand intent, scope, dependencies, and completion expectations.&lt;br&gt;
In other words, &lt;strong&gt;writing better user stories improves both human understanding and machine interpretation&lt;/strong&gt;, making it a best practice for data-driven project analysis.&lt;/p&gt;


&lt;h1&gt;
  
  
  AWS Bedrock and Amazon Nova
&lt;/h1&gt;

&lt;p&gt;For this tutorial, we leverage Amazon’s generative AI services, which provide a variety of pre-trained foundation models accessible through a single, unified platform.&lt;br&gt;
&lt;strong&gt;AWS Bedrock&lt;/strong&gt; is a fully managed service that allows developers to build, deploy, and scale AI-powered applications without the overhead of managing infrastructure. It provides seamless access to state-of-the-art foundation models from leading AI providers, all through a simple API.&lt;br&gt;
For our implementation, we use &lt;strong&gt;Amazon Nova&lt;/strong&gt;, AWS’s family of foundation models designed for tasks such as text generation, analysis, and summarization. In particular, &lt;strong&gt;Nova Lite offers&lt;/strong&gt; a balanced combination of ⚡️performance and 💰cost-efficiency, making it ideal for analyzing project data and generating actionable insights.&lt;br&gt;
In the following sections, we will demonstrate how to implement this service in Python, showing how AI can be applied to extract meaningful insights from Kanban project data.&lt;/p&gt;


&lt;h1&gt;
  
  
  Reference Architecture
&lt;/h1&gt;

&lt;p&gt;Before diving into the implementation details, it is useful to understand the overall architecture that supports this use case. The following reference architecture illustrates how project data flows from Trello through AWS services and into an AI-powered analysis pipeline.&lt;/p&gt;

&lt;p&gt;The entire process is executed through an AWS Glue job implemented in Python, which orchestrates data extraction, transformation, AI inference, and report generation in a scalable and automated manner. &lt;/p&gt;

&lt;p&gt;At a high level, the architecture ingests Kanban project data from Trello, enriches it with temporal and contextual metadata, applies semantic analysis using generative AI models on AWS Bedrock, and produces structured, human-readable reports for project stakeholders.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxzb03plv9bn7xa7xh1je.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxzb03plv9bn7xa7xh1je.png" alt=" " width="799" height="656"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Core Components
&lt;/h2&gt;
&lt;h4&gt;
  
  
  (1). 📋Trello Integration Class
&lt;/h4&gt;

&lt;p&gt;Connects to Trello boards via the Trello API&lt;br&gt;
Retrieves boards, lists, and cards with enriched metadata&lt;br&gt;
Calculates time-based metrics (e.g., days until due date)&lt;br&gt;
Exports structured data to Amazon S3 in JSON format&lt;/p&gt;
&lt;h4&gt;
  
  
  (2). ✨AWS Bedrock Integration
&lt;/h4&gt;

&lt;p&gt;Invokes the Amazon Nova model using custom prompts&lt;br&gt;
Processes project datasets to generate semantic insights&lt;br&gt;
Uses configurable inference parameters to balance cost and accuracy&lt;/p&gt;
&lt;h4&gt;
  
  
  (3).📊 Report Generation (MarkdownPDFReport)
&lt;/h4&gt;

&lt;p&gt;Converts AI-generated markdown into professional PDF reports&lt;br&gt;
Applies custom styling for readability and consistency&lt;br&gt;
Supports tables, lists, and structured summaries&lt;/p&gt;
&lt;h4&gt;
  
  
  (4). Supporting Services
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;🔐 &lt;strong&gt;AWS Secrets Manager&lt;/strong&gt;: securely stores Trello API credentials&lt;/li&gt;
&lt;li&gt;🪣 &lt;strong&gt;Amazon S3&lt;/strong&gt;: stores datasets, prompts, and generated reports&lt;/li&gt;
&lt;li&gt;📩 &lt;strong&gt;Amazon SES&lt;/strong&gt;: distributes automated reports via email&lt;/li&gt;
&lt;/ul&gt;


&lt;h1&gt;
  
  
  Implementation Guide
&lt;/h1&gt;

&lt;p&gt;The use case presented in this guide is based on a simulated Trello board representing a e-commerce software project. The board includes typical development activities such as feature implementation, backlog items, in-progress tasks, and delivery milestones, closely mirroring how Kanban is used in production environments.&lt;br&gt;
This example is intentionally designed to resemble a realistic project scenario, allowing us to analyze both structured data (task metadata, statuses, due dates) and unstructured data (descriptions and comments). The following diagram illustrates the initial project setup and serves as the input for the implementation steps described in the next sections.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fstp0kct4qtjyy6606xco.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fstp0kct4qtjyy6606xco.png" alt=" " width="799" height="405"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before running the solution, a few &lt;strong&gt;AWS&lt;/strong&gt; and &lt;strong&gt;Trello&lt;/strong&gt; prerequisites must be in place. These prerequisites ensure secure access to project data, proper execution of the Glue job, and automated report delivery.&lt;/p&gt;
&lt;h3&gt;
  
  
  (1). 🔑 Trello API credentials
&lt;/h3&gt;

&lt;p&gt;To access Trello boards and cards programmatically, you need valid Trello API credentials, consisting of an API key and an access token.&lt;/p&gt;
&lt;h5&gt;
  
  
  Step 1: Obtain the API key
&lt;/h5&gt;

&lt;p&gt;The API key can be generated from the Trello Power-Ups administration page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;https://trello.com/power-ups/admin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h5&gt;
  
  
  Step 2: Generate the access token
&lt;/h5&gt;

&lt;p&gt;Once you have the API key, you must authorize your application and generate a token using the following endpoint (replace {API_KEY} with your own key):&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;https://trello.com/1/authorize?expiration&lt;span class="o"&gt;=&lt;/span&gt;never&amp;amp;name&lt;span class="o"&gt;=&lt;/span&gt;MyApp&amp;amp;scope&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt;,write&amp;amp;response_type&lt;span class="o"&gt;=&lt;/span&gt;token&amp;amp;key&lt;span class="o"&gt;={&lt;/span&gt;API_KEY&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;This authorization flow grants read and write access to Trello resources and returns a token that will be used by the application to query boards, lists, cards, and comments. Both the API key and token should be treated as sensitive credentials.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  (2). ⚙️ AWS IAM role
&lt;/h3&gt;

&lt;p&gt;On the AWS side, an IAM role is required to execute the AWS Glue job and interact with the supporting services used in this solution.&lt;br&gt;
The role must include permissions for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;AWS Glue&lt;/code&gt; (job execution)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Amazon S3&lt;/code&gt; (data storage and retrieval)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AWS Secrets Manager&lt;/code&gt; (secure storage of Trello credentials)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Amazon Bedrock&lt;/code&gt; (AI model)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Amazon SES&lt;/code&gt; (email delivery)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;A complete example IAM policy with the required permissions is provided in the project repository. You can attach this policy to the IAM role used by the Glue job to ensure the pipeline runs end to end without permission issues.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  (3). 📩 Amazon SES configuration
&lt;/h3&gt;

&lt;p&gt;Finally, Amazon Simple Email Service (SES) must be configured to enable automated report delivery.&lt;br&gt;
This includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;☑️ Verifying at least one sender email address or domain (SES identities)&lt;/li&gt;
&lt;li&gt;☑️ Ensuring your AWS account has sufficient sending limits&lt;/li&gt;
&lt;li&gt;☑️ Confirming the SES region matches the region used by the Glue job&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once configured, SES will be used to send the generated PDF reports to stakeholders automatically as part of the pipeline execution.&lt;/p&gt;


&lt;h1&gt;
  
  
  Implementation Steps
&lt;/h1&gt;

&lt;p&gt;The following steps describe the end-to-end implementation of the solution, from secure credential management to AI-driven analysis and automated report distribution.&lt;/p&gt;
&lt;h2&gt;
  
  
  🔐 Step 1: Configure Secrets Manager
&lt;/h2&gt;

&lt;p&gt;Store your Trello credentials securely in AWS Secrets Manager and this avoids hardcoding sensitive information and follows AWS security best practices. For this reason the secret should contain the Trello API key and token in JSON format.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1pbsy6o14tivnrx5yji.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1pbsy6o14tivnrx5yji.png" alt=" " width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  ⚙️ Step 2: Set Up the AWS Glue Environment
&lt;/h2&gt;

&lt;p&gt;For this tutorial, the solution is implemented using an AWS Glue Python notebook, which provides a fully managed, serverless environment for running data processing jobs. Therefore, the complete source code is available in the project repository, because in the following sections shighlights the most relevant implementation details and design decisions rather than providing a full code walkthrough.&lt;/p&gt;

&lt;p&gt;If you find this tutorial helpful, feel free to leave a star ⭐️ and follow me to get notified about new articles. Your support helps me grow within the tech community and create more valuable content! 🚀&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/RominaElenaMendezEscobar" rel="noopener noreferrer"&gt;
        RominaElenaMendezEscobar
      &lt;/a&gt; / &lt;a href="https://github.com/RominaElenaMendezEscobar/aws-trello-ai-tutorial" rel="noopener noreferrer"&gt;
        aws-trello-ai-tutorial
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      End-to-end AWS Glue pipeline for extracting Trello Kanban data, analyzing it with Amazon Bedrock, and generating automated PDF reports.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;p&gt;&lt;a href="https://www.buymeacoffee.com/r0mymendez" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/b96fd4ea89ea15fcec30a4f86382eef0bbd17454aa3a8d4de8c8c5e92b55cf6c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4275792532304d6525323041253230436f666665652d737570706f72742532306d79253230776f726b2d4646444430303f7374796c653d666c6174266c6162656c436f6c6f723d313031303130266c6f676f3d6275792d6d652d612d636f66666565266c6f676f436f6c6f723d7768697465" alt="Buy Me A Coffee"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;🏷️ Data-Driven Project Analysis: Analyzing Trello Kanban Projects with AI on AWS Bedrock&lt;/h1&gt;
&lt;/div&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Introduction&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;Modern software projects often involve multiple distributed teams working on high-complexity initiatives, with frequent releases and ongoing production fixes. While tools like Kanban boards help organize tasks, epics, and workflows, they also generate large volumes of unstructured data in the form of comments, status changes, and timelines
As the number of interdependent tasks and contributors grows, understanding the real state of a project, and identifying early risks or bottlenecks, becomes increasingly difficult. Manual analysis is time-consuming and often subjective.&lt;/p&gt;

&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/RominaElenaMendezEscobar/aws-trello-ai-tutorial/img/trello-aws-preview.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FRominaElenaMendezEscobar%2Faws-trello-ai-tutorial%2FHEAD%2Fimg%2Ftrello-aws-preview.png" alt="preview"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this repository, I present a practical use case that leverages AWS services and generative AI to enhance project analysis and interpretation. By analyzing task metadata and detecting semantic patterns in comments (such as ambiguity, implicit dependencies, missing definitions, or scope creep) AI enables more objective insights, early warnings, and data-driven decision-making&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;🗂️ Folder Structure&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;The repository…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/RominaElenaMendezEscobar/aws-trello-ai-tutorial" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;





&lt;h3&gt;
  
  
  📦 Step 2.1: Installing Additional Python Packages
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;AWS Glue&lt;/strong&gt; comes with a predefined Python environment, but this solution requires additional libraries to interact with AWS services, process text, and generate reports.&lt;/p&gt;

&lt;p&gt;The following directive installs the required dependencies at runtime:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Install required Python packages&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%additional_python_modules boto3==1.34.34,botocore==1.34.34,markdown==3.5.2,beautifulsoup4==4.12.3,reportlab==4.0.8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;These packages are used for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;boto3 / botocore&lt;/strong&gt;: AWS SDK for Python, used to interact with services such as S3, Secrets Manager, Bedrock, and SES&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;markdown&lt;/strong&gt;: Converts AI-generated Markdown into HTML&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;beautifulsoup4&lt;/strong&gt;: Parses and transforms HTML content before PDF generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;reportlab&lt;/strong&gt;: Generates styled PDF documents programmatically
Installing only the required dependencies helps keep the Glue job lightweight and efficient.&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  📋 Step 2.2: Trello Data Extraction Class
&lt;/h3&gt;

&lt;p&gt;The Trello class encapsulates all interactions with the Trello REST API and is responsible for retrieving, enriching, and preparing project data for AI analysis.&lt;/p&gt;
&lt;h4&gt;
  
  
  Key input parameters
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;BUCKET_NAME&lt;/strong&gt;: Target S3 bucket for exporting processed data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API_KEY / API_TOKEN&lt;/strong&gt;: Trello credentials retrieved securely from Secrets Manager&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S3&lt;/strong&gt;: Helper class instance used to write data to Amazon S3&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  Dataset design considerations
&lt;/h4&gt;

&lt;p&gt;Although Trello provides a large number of fields, the implementation intentionally selects a &lt;strong&gt;minimal but meaningful subset&lt;/strong&gt; of columns:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;self.DATAFRAME_COLUMNS = [
    'id', 'dueComplete', 'desc', 'listName', 'name',
    'start', 'checkItems', 'checkItemsChecked', 'due', 'time_to_due']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This design choice offers several benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduces token usage during AI inference (lower cost)&lt;/li&gt;
&lt;li&gt;Avoids passing empty or unused fields&lt;/li&gt;
&lt;li&gt;Improves model focus and processing efficiency&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  Temporal enrichment
&lt;/h4&gt;

&lt;p&gt;The class automatically calculates the number of days remaining until each task’s due date (time_to_due). This temporal context helps the AI model reason about urgency, delays, and potential risks.&lt;br&gt;
Finally, the data can be exported to Amazon S3 in CSV format or returned as filtered JSON, typically limited to tasks in To Do and Doing states.&lt;/p&gt;


&lt;h3&gt;
  
  
  🧩 Step 2.3: AWS Helper Classes (boto3 Abstractions)
&lt;/h3&gt;

&lt;p&gt;To keep the AWS Glue notebook readable, modular, and maintainable, all AWS service interactions are encapsulated into small helper classes built on top of boto3.&lt;/p&gt;
&lt;h4&gt;
  
  
  aws_s3
&lt;/h4&gt;

&lt;p&gt;Handles all Amazon S3 operations, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reading prompt templates and input files&lt;/li&gt;
&lt;li&gt;Writing intermediate datasets&lt;/li&gt;
&lt;li&gt;Persisting generated PDF reports&lt;/li&gt;
&lt;li&gt;Automatically partitioning outputs by execution date&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  aws_secrets_manager
&lt;/h4&gt;

&lt;p&gt;Responsible for securely retrieving sensitive configuration from AWS Secrets Manager, in our use case is the Trello API credentials.&lt;/p&gt;
&lt;h4&gt;
  
  
  aws_ses
&lt;/h4&gt;

&lt;p&gt;Manages email delivery workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reads the generated PDF report from S3&lt;/li&gt;
&lt;li&gt;Renders an HTML email body (template stored in the repository)&lt;/li&gt;
&lt;li&gt;Attaches the PDF report&lt;/li&gt;
&lt;li&gt;Sends emails to configured recipients&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  🧠 Step 2.4: AWS Bedrock Integration and Inference Strategy
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;AWSBedrock&lt;/code&gt; class manages the interaction with &lt;strong&gt;Amazon Bedrock&lt;/strong&gt;, invoking the &lt;strong&gt;Amazon Nova&lt;/strong&gt; Lite model to analyze Trello project data.&lt;/p&gt;
&lt;h4&gt;
  
  
  Model inputs
&lt;/h4&gt;

&lt;p&gt;The model receives:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A filtered dataset (JSON) containing only relevant tasks and fields&lt;/li&gt;
&lt;li&gt;A custom prompt defining the analysis objectives, expected insights, and report structure&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Both the dataset and the prompt can be adjusted to fit different team practices or project types. The prompt used in this tutorial is provided in the repository as a reference example.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AWSBedrock&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;PROMPT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;DATASET&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;REGION&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-east-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amazon.nova-lite-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PROMPT&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DATASET&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompt_final&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;REGION&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODEL_ID&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_bedrock_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;bedrock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bedrock&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_payload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompt_final&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inferenceConfig&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_new_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;top_p&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;bedrock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_bedrock_client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_payload&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;response_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;response_body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
             &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  Inference configuration
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inferenceConfig&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_new_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;top_p&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;max_new_tokens (5000)&lt;/strong&gt;: Allows the model to generate detailed, structured reports&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;temperature (0.4)&lt;/strong&gt;: Ensures consistent and reliable analysis while preserving enough flexibility to detect patterns and nuances&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;top_p (0.9)&lt;/strong&gt;: Enables controlled diversity in model responses&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;A temperature of &lt;strong&gt;0.4&lt;/strong&gt; was selected after iterative testing, as higher values introduced unnecessary variability, while lower values reduced the model’s ability to surface implicit risks and insights.&lt;br&gt;
Before finalizing this configuration, multiple test runs were performed, refining both the dataset and the prompt to ensure the output aligned with the intended project analysis goals.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you want to learn more about how these parameters work, I've included this article.&lt;/p&gt;


&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/r_elena_mendez_escobar/genai-foundations-chapter-2-prompt-engineering-in-action-unlocking-better-ai-responses-l28" class="crayons-story__hidden-navigation-link"&gt;GenAI Foundations – Chapter 2: Prompt Engineering in Action – Unlocking Better AI Responses&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/r_elena_mendez_escobar" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F719582%2F2d700dae-2335-4c2f-9a32-4435184a4f4f.jpeg" alt="r_elena_mendez_escobar profile" class="crayons-avatar__image" width="200" height="200"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/r_elena_mendez_escobar" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Romina Elena Mendez Escobar
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Romina Elena Mendez Escobar
                
              
              &lt;div id="story-author-preview-content-2828216" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/r_elena_mendez_escobar" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F719582%2F2d700dae-2335-4c2f-9a32-4435184a4f4f.jpeg" class="crayons-avatar__image" alt="" width="200" height="200"&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Romina Elena Mendez Escobar&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/r_elena_mendez_escobar/genai-foundations-chapter-2-prompt-engineering-in-action-unlocking-better-ai-responses-l28" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Sep 9 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/r_elena_mendez_escobar/genai-foundations-chapter-2-prompt-engineering-in-action-unlocking-better-ai-responses-l28" id="article-link-2828216"&gt;
          GenAI Foundations – Chapter 2: Prompt Engineering in Action – Unlocking Better AI Responses
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/openai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;openai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/data"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;data&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
            &lt;a href="https://dev.to/r_elena_mendez_escobar/genai-foundations-chapter-2-prompt-engineering-in-action-unlocking-better-ai-responses-l28#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              &lt;span class="hidden s:inline"&gt;Add&amp;nbsp;Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            16 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;/div&gt;
&lt;br&gt;





&lt;h3&gt;
  
  
  📄 Step 2.5: Report Generation and Distribution
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;MarkdownPDFReport&lt;/strong&gt; class converts AI-generated Markdown into a professional, styled PDF document.&lt;/p&gt;

&lt;h4&gt;
  
  
  &amp;nbsp;Input parameters
&lt;/h4&gt;

&lt;p&gt;The class requires only:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Markdown text generated by the AI model&lt;/li&gt;
&lt;li&gt;An optional output path (in-memory or file-based)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Key features
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Custom heading hierarchies and typography&lt;/li&gt;
&lt;li&gt;Styled tables and lists&lt;/li&gt;
&lt;li&gt;Emoji-to-symbol mapping for visual status indicators&lt;/li&gt;
&lt;li&gt;Fully customizable styles defined in internal methods&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All visual styles are centralized and can be easily adapted to match organizational branding or reporting standards.&lt;/p&gt;

&lt;p&gt;Once generated, the PDF is stored in &lt;strong&gt;🪣 Amazon S3&lt;/strong&gt;* and sent via 📩 email using the previously described SES class, the email HTML template used for embedding the report is also available in the repository and can be modified as needed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjlaldv4v7fl6zqbce07n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjlaldv4v7fl6zqbce07n.png" alt=" " width="800" height="760"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📄 Example Output: Email and Report Preview
&lt;/h2&gt;

&lt;p&gt;Below is an example of the report generated by the solution. The complete output consists of a six-page PDF, but for illustration purposes, the following screenshots show the cover page and a selection of summary tables used to highlight key project insights.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsovka6pvhhzwtorm0qi5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsovka6pvhhzwtorm0qi5.png" alt=" " width="800" height="695"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  Conclusions
&lt;/h1&gt;

&lt;p&gt;This article demonstrates how combining Kanban project data with generative AI can significantly enhance the way teams understand, communicate, and manage complex software projects. Beyond the technical implementation, several key insights and lessons emerged from this use case.&lt;/p&gt;

&lt;h3&gt;
  
  
  📉 Reducing Bias and Improving Decision-Making
&lt;/h3&gt;

&lt;p&gt;One of the main benefits of this approach is the ability to reduce subjective bias in project analysis. By evaluating task metadata, timelines, and written communication through AI-driven semantic analysis, teams gain a more objective view of project status, risks, and bottlenecks.&lt;br&gt;
This enables more focused stakeholder discussions and allows follow-up meetings to be based on concrete, data-driven insights rather than individual perceptions.&lt;/p&gt;

&lt;h3&gt;
  
  
  🗣️ Enhancing Stakeholder Communication
&lt;/h3&gt;

&lt;p&gt;In projects with a large number of tasks and contributors, explaining delays or risks can be challenging. Automatically generated reports help translate complex project data into clear, structured summaries, making it easier to communicate issues, dependencies, and priorities to non-technical stakeholders and leadership teams.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔄 Dataset and Tooling Flexibility
&lt;/h3&gt;

&lt;p&gt;Although this example is based on Trello, the same approach can be applied to other project management tools such as Jira, Azure DevOps, Odoo, or similar platforms. By adapting the data extraction layer, teams can reuse the same analysis and reporting pipeline across different tools and project types.&lt;br&gt;
Selecting only relevant fields remains critical, as passing unnecessary or empty data increases token usage without improving insight quality.&lt;/p&gt;

&lt;h3&gt;
  
  
  💬 Prompt Design as a Key Success Factor
&lt;/h3&gt;

&lt;p&gt;Prompt engineering plays a central role in the quality of the generated insights. Providing better context—such as project goals, roadmap expectations, risks, or delivery constraints—helps the model produce more accurate and actionable conclusions.&lt;br&gt;
During experimentation, iterative prompt refinement proved essential. In some cases, enforcing a strict output format (such as JSON) reduced the depth of the analysis, whereas allowing freer, unstructured responses resulted in richer conclusions. This highlights the importance of testing different prompt strategies rather than assuming a single optimal format.&lt;/p&gt;

&lt;h3&gt;
  
  
  📑 Output Formats and Performance Considerations
&lt;/h3&gt;

&lt;p&gt;While this solution generates Markdown and converts it into a PDF report, alternative output formats such as JSON can also be produced. However, structured formats may negatively impact model performance if they overly constrain the response. Choosing the right output format depends on the downstream use case—human consumption, system integration, or further automation.&lt;/p&gt;

&lt;h3&gt;
  
  
  🧩 Model Selection Matters
&lt;/h3&gt;

&lt;p&gt;Model choice significantly affects the quality of insights. Initial experiments using Amazon Titan did not produce sufficiently meaningful conclusions for this use case. After evaluating multiple options, Amazon Nova proved to be the best fit, offering a better balance between contextual understanding, analytical depth, and consistency.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;AI should not replace project management practices, but it can act as a powerful decision-support layer, helping teams identify risks earlier, communicate more effectively, and focus discussions on what truly matters. With careful dataset selection, prompt design, and model evaluation, this approach can be adapted to a wide range of project environments and organizational needs.&lt;/p&gt;




&lt;p&gt;📚References&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Web Services&lt;/strong&gt;. (n.d.). AWS Glue documentation. &lt;a href="https://docs.aws.amazon.com/glue/" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/glue/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Web Services&lt;/strong&gt;. (n.d.). AWS Bedrock.
&lt;a href="https://aws.amazon.com/en/bedrock/" rel="noopener noreferrer"&gt;https://aws.amazon.com/en/bedrock/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Web Services&lt;/strong&gt;. (n.d.). Amazon Nova: Generative AI models.
&lt;a href="https://aws.amazon.com/es/ai/generative-ai/nova/" rel="noopener noreferrer"&gt;https://aws.amazon.com/es/ai/generative-ai/nova/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Asana. (n.d.)&lt;/strong&gt;. What is Kanban?.
&lt;a href="https://asana.com/es/resources/what-is-kanban" rel="noopener noreferrer"&gt;https://asana.com/es/resources/what-is-kanban&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kanban Tool&lt;/strong&gt;. (n.d.). Kanban history and evolution.
&lt;a href="https://kanbantool.com/kanban-guide/kanban-history" rel="noopener noreferrer"&gt;https://kanbantool.com/kanban-guide/kanban-history&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📌 How to cite this article
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;APA style&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Mendez Escobar, Romina Elena. (2025). &lt;strong&gt;Data-Driven Project Analysis: Analyzing Trello Kanban Projects with AI on AWS Bedrock&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
&lt;a href="https://dev.to/aws-builders/data-driven-project-analysis-analyzing-trello-kanban-projects-with-ai-on-aws-bedrock-15f4"&gt;https://dev.to/aws-builders/data-driven-project-analysis-analyzing-trello-kanban-projects-with-ai-on-aws-bedrock-15f4&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BibTeX&lt;/strong&gt;&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
text
@article{mendez2025aiawstrello,
  title  = {Data-Driven Project Analysis: Analyzing Trello Kanban Projects with AI on AWS Bedrock},
  author = {Mendez Escobar, Romina Elena},
  year   = {2025},
  url    = {https://dev.to/aws-builders/data-driven-project-analysis-analyzing-trello-kanban-projects-with-ai-on-aws-bedrock-15f4}
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>ai</category>
      <category>aws</category>
      <category>machinelearning</category>
      <category>python</category>
    </item>
    <item>
      <title>From Raw Clinical Data to AI: Building a Modern Healthcare Data Platform on AWS</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Tue, 09 Dec 2025 10:04:02 +0000</pubDate>
      <link>https://dev.to/aws-builders/from-raw-clinical-data-to-ai-building-a-modern-healthcare-data-platform-on-aws-1mi7</link>
      <guid>https://dev.to/aws-builders/from-raw-clinical-data-to-ai-building-a-modern-healthcare-data-platform-on-aws-1mi7</guid>
      <description>&lt;p&gt;The &lt;strong&gt;OMOP&lt;/strong&gt; Common Data Model (&lt;code&gt;CDM&lt;/code&gt;) is a standard for observational health data that allows the analysis of clinical data in a consistent and reproducible way. Implementing &lt;strong&gt;OMOP CDM&lt;/strong&gt; in &lt;strong&gt;AWS&lt;/strong&gt; requires a &lt;code&gt;robust architecture&lt;/code&gt; that handles everything from data ingestion to advanced AI analysis, maintaining the highest standards of security and regulatory compliance, especially &lt;code&gt;HIPAA&lt;/code&gt; for health data.&lt;/p&gt;

&lt;p&gt;This guide describes a set of components in an architecture within AWS, and these do not define the only possible solution, I am only presenting a proposal of a series of components that you can use among the many services that this platform has available.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbbqidpxxcryscuszrm0k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbbqidpxxcryscuszrm0k.png" alt=" " width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;────────────────────────────────&lt;/p&gt;

&lt;h1&gt;
  
  
  🗂️ What is OMOP CDM?
&lt;/h1&gt;

&lt;p&gt;The &lt;strong&gt;OMOP Common Data Model (CDM)&lt;/strong&gt; is a standard designed by the OHDSI community to represent observational health data in a uniform way. Its main objective is to enable the &lt;strong&gt;standardization of medical data&lt;/strong&gt; where different institutions, clinical systems and databases speak the same “language,” in order to facilitate reproducible analysis, cohort comparisons and multicenter studies.&lt;br&gt;
The model is based on a set of normalized tables, standardized vocabularies and modeling conventions that define how patients, diagnoses, procedures, medication, clinical measurements, visits and temporal events should be represented.&lt;/p&gt;

&lt;p&gt;────────────────────────────────&lt;/p&gt;
&lt;h1&gt;
  
  
  👤 Model Structure: Patient as Central Entity
&lt;/h1&gt;

&lt;p&gt;OMOP organizes the information around &lt;strong&gt;the patient&lt;/strong&gt;, who acts as the central unit of the model, and this structure allows the reconstruction of the patient’s clinical timeline and the analysis of their events in a temporal way.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1m175vfadtsnyajpou9w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1m175vfadtsnyajpou9w.png" alt=" " width="800" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;────────────────────────────────&lt;/p&gt;
&lt;h1&gt;
  
  
  ❤️ Standardized Vocabularies: the semantic heart of OMOP
&lt;/h1&gt;

&lt;p&gt;One of the most important strengths of the CDM is the use of standardized vocabularies, which replace the diversity of ways of writing the same text with numeric IDs. These IDs allow the representation of clinical concepts in a consistent, interoperable and computable way.&lt;br&gt;
In addition, the vocabularies have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hierarchies (for example, “type 2 diabetes mellitus” is a subconcept of “endocrine and metabolic diseases”),&lt;/li&gt;
&lt;li&gt;Semantic relationships,&lt;/li&gt;
&lt;li&gt;Standard and non-standard concepts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thanks to these hierarchies, an analyst can perform broad studies without knowing all the specific codes. For example, to analyze metabolic diseases, they can query the higher category and automatically include all subclasses (including different types of diabetes)&lt;/p&gt;

&lt;p&gt;────────────────────────────────&lt;/p&gt;
&lt;h1&gt;
  
  
  ☁️ OMOP in AWS
&lt;/h1&gt;

&lt;p&gt;The architecture of the &lt;strong&gt;OMOP Common Data Model&lt;/strong&gt; can be implemented in multiple environments (on-premise, hybrid or in different cloud providers). However, AWS offers a particularly robust ecosystem to address the challenges of standardization, integration, governance and advanced clinical data analysis.&lt;/p&gt;

&lt;p&gt;In this section, we explore how to combine &lt;strong&gt;AWS services&lt;/strong&gt; to build a complete pipeline that allows ingesting, transforming, standardizing and analyzing health data under the OMOP standard, maintaining high levels of security, regulatory compliance and operational efficiency.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fivvbq6obrbkwo7fv17on.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fivvbq6obrbkwo7fv17on.png" alt=" " width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ This approach is not intended to be the only way to implement OMOP, but a practical and modular guide that will allow you to understand which AWS services can help you in each phase of the process.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  OMOP in AWS: Services by section
&lt;/h2&gt;
&lt;h3&gt;
  
  
  (1) 📄 Data: Clinical Sources, APIs and Personal Devices
&lt;/h3&gt;

&lt;p&gt;In a modern health ecosystem, data no longer comes only from a hospital’s internal systems. Today, clinical information is distributed across multiple platforms, technologies and devices, requiring architectures capable of integrating, unifying and standardizing heterogeneous sources.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc48ok9rbor9pwo1aseh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc48ok9rbor9pwo1aseh.png" alt=" " width="800" height="597"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  (2) 🔧 Pipeline Services: Data Ingestion and Initial Processing
&lt;/h3&gt;

&lt;p&gt;To build a robust pipeline that enables the standardization of clinical data toward OMOP, it is essential to define how the data is extracted, ingested and prepared before transformation.&lt;br&gt;
In this stage, the main objective is to capture the data from different sources and store them in raw format in &lt;strong&gt;Amazon S3&lt;/strong&gt;, always preserving traceability and the original state of the information.&lt;br&gt;
Below are the key services used in this phase:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon MWAA (Managed Workflows for Apache Airflow)&lt;/strong&gt;&lt;br&gt;
Amazon MWAA allows running Apache Airflow DAGs without managing the underlying infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Kinesis&lt;/strong&gt;&lt;br&gt;
Hospitals and health devices generate more and more real-time data; for these scenarios, Amazon Kinesis offers a highly scalable streaming solution.&lt;br&gt;
The combined use of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Kinesis Data Streams&lt;/strong&gt; (real-time ingestion)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kinesis Data Firehose&lt;/strong&gt; (automated delivery to S3) allows capturing data streams without additional infrastructure and storing them directly in the raw bucket, ready to be processed by Airflow or other services.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AWS Lambda&lt;/strong&gt;&lt;br&gt;
This service allows executing serverless functions without provisioning servers, which makes it ideal for small tasks and specific events within the pipeline.&lt;br&gt;
In this context, it is used for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lightweight pre-validation or normalization processes before sending files to S3.&lt;/li&gt;
&lt;li&gt;Moving or restructuring files when new data arrives.&lt;/li&gt;
&lt;li&gt;Automatic triggers when new objects are detected in S3 (for example, activating notifications).&lt;/li&gt;
&lt;/ul&gt;


&lt;h4&gt;
  
  
  (3) 🗂️ RAW Storage
&lt;/h4&gt;

&lt;p&gt;Once extracted, all data will be stored initially in Amazon S3, which will act as the RAW zone of the data lake. This layer preserves the data in its original format, without transformations, to guarantee traceability, auditing and reprocessing capability.&lt;br&gt;
Storage in S3 must be complemented with a set of key practices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IAM + S3 Bucket Policies ensure role-based access.&lt;/li&gt;
&lt;li&gt;Tags help automate governance and classification.&lt;/li&gt;
&lt;li&gt;Lake Formation adds granular control at table/column level.&lt;/li&gt;
&lt;li&gt;Lifecycle policies ensure retention and cost efficiency.&lt;/li&gt;
&lt;/ul&gt;


&lt;h4&gt;
  
  
  (4) 📌 Orchestration
&lt;/h4&gt;

&lt;p&gt;In this section we describe the key DAGs we need to coordinate the different stages of the pipeline. Orchestration is essential to ensure that the extractions, transformations and loads are executed consistently, auditable and scalable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7omjg590d2uv4922bgnp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7omjg590d2uv4922bgnp.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  (5) 🧠 AI &amp;amp; Unstructured Data
&lt;/h3&gt;

&lt;p&gt;To process clinical notes and other unstructured data, we need to incorporate NLP techniques that allow extracting entities, mapping clinical concepts and automatically encoding information.&lt;br&gt;
For this type of processing, we can rely on the following AWS services:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon SageMaker&lt;/strong&gt;&lt;br&gt;
Allows training, tuning and deploying custom NLP models, from classic models to advanced transformer-based ones. It is ideal when full control of the ML pipeline, preprocessing, fine-tuning and integration with other system components is needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Comprehend Medical&lt;/strong&gt;&lt;br&gt;
Managed service that extracts clinical entities, relationships and conditions directly from medical text.&lt;br&gt;
Important: Comprehend Medical supports a limited set of languages, so it is necessary to validate documentation before integrating it into the project.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In the following article you can find a complete implementation of a batch process using this service&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/r_elena_mendez_escobar/employing-aws-comprehend-medical-for-medical-data-extraction-in-healthcare-analytics-2dd8" class="crayons-story__hidden-navigation-link"&gt;Employing AWS Comprehend Medical for Medical Data Extraction in Healthcare Analytics&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/r_elena_mendez_escobar" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F719582%2F2d700dae-2335-4c2f-9a32-4435184a4f4f.jpeg" alt="r_elena_mendez_escobar profile" class="crayons-avatar__image" width="200" height="200"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/r_elena_mendez_escobar" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Romina Elena Mendez Escobar
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Romina Elena Mendez Escobar
                
              
              &lt;div id="story-author-preview-content-1947809" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/r_elena_mendez_escobar" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F719582%2F2d700dae-2335-4c2f-9a32-4435184a4f4f.jpeg" class="crayons-avatar__image" alt="" width="200" height="200"&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Romina Elena Mendez Escobar&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/r_elena_mendez_escobar/employing-aws-comprehend-medical-for-medical-data-extraction-in-healthcare-analytics-2dd8" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Aug 7 '24&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/r_elena_mendez_escobar/employing-aws-comprehend-medical-for-medical-data-extraction-in-healthcare-analytics-2dd8" id="article-link-1947809"&gt;
          Employing AWS Comprehend Medical for Medical Data Extraction in Healthcare Analytics
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/aws"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;aws&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/python"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;python&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/datascience"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;datascience&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/nlp"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;nlp&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/r_elena_mendez_escobar/employing-aws-comprehend-medical-for-medical-data-extraction-in-healthcare-analytics-2dd8" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;4&lt;span class="hidden s:inline"&gt;&amp;nbsp;reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/r_elena_mendez_escobar/employing-aws-comprehend-medical-for-medical-data-extraction-in-healthcare-analytics-2dd8#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              &lt;span class="hidden s:inline"&gt;Add&amp;nbsp;Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            13 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;



&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Amazon Bedrock integrated with SageMaker&lt;/strong&gt;&lt;br&gt;
Although Bedrock is a separate service, it can be integrated into ML flows in SageMaker. Its main contribution is enabling foundational models and generative AI capabilities, opening the door to new use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic classification of clinical text.&lt;/li&gt;
&lt;li&gt;Concept normalization assisted by generative models.&lt;/li&gt;
&lt;li&gt;Semantic searches and context retrieval through vector databases (for example, to enrich mapping results or suggest probable clinical codes).&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;(6) 🩺 OMOP CDM&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;All processing stages converge in the implementation of the &lt;strong&gt;OMOP Common Data Model (CDM)&lt;/strong&gt;, stored in a relational database optimized for analytical and mixed workloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Aurora PostgreSQL&lt;/strong&gt;&lt;br&gt;
The recommended engine for hosting the CDM is &lt;strong&gt;Amazon Aurora PostgreSQL&lt;/strong&gt;, because it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maintains full SQL compatibility and supports OHDSI ecosystem tools.&lt;/li&gt;
&lt;li&gt;Provides high availability, automatic replication, and fast recovery.&lt;/li&gt;
&lt;li&gt;Scales horizontally with read replicas, ideal for analytical and concurrent workloads.&lt;/li&gt;
&lt;li&gt;Integrates seamlessly with ETL/ELT pipelines across AWS services.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Depending on the use case, Aurora can be complemented with additional analytics-oriented services.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Redshift&lt;/strong&gt;&lt;br&gt;
For advanced analytics over large datasets derived from the CDM, &lt;strong&gt;Amazon Redshift&lt;/strong&gt; offers a distributed, high-performance environment for complex analytical queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Athena&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Amazon Athena&lt;/strong&gt; enables querying raw data stored in S3 without loading it into a database. It is especially useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quick validations before loading data into the CDM.&lt;/li&gt;
&lt;li&gt;Debugging and data quality checks using SQL.&lt;/li&gt;
&lt;li&gt;Exploring semi-structured files (CSV, JSON, Parquet).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Amazon ElastiCache&lt;/strong&gt;&lt;br&gt;
When the solution requires high-frequency or computationally expensive queries on the OMOP model, adding a cache layer with &lt;strong&gt;Redis&lt;/strong&gt; or &lt;strong&gt;Memcached&lt;/strong&gt; helps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduce latency for repeated queries.&lt;/li&gt;
&lt;li&gt;Store results of heavy computations (e.g., cohort definitions, vocabulary lookups).&lt;/li&gt;
&lt;li&gt;Improve performance for dashboards and clinical applications that require fast responses.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  (7) 📊 Data Visualization
&lt;/h3&gt;

&lt;p&gt;Data visualization is essential not only to consume information but also to analyze, monitor and validate each stage of the pipeline. As we process clinical data, vocabularies, transformations and AI results, we need tools that make the quality, behavior and evolution of the data evident.&lt;/p&gt;

&lt;p&gt;Below are various options depending on the use case:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Amazon QuickSight&lt;/strong&gt;: It enables fast, interactive dashboards connected to Aurora, Redshift, Athena or S3. Its in-memory SPICE engine accelerates visualizations at scale while reducing load on source databases, making it ideal for data quality tracking and clinical monitoring.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon SageMaker Model Dashboard&lt;/strong&gt;: The SageMaker Model Dashboard centralizes observability for ML workflows, displaying metrics such as precision, recall and F1-score, along with model versions, drift indicators and execution history. This makes it easier to detect degradation early and maintain reliable NLP or predictive models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Fargate / Amazon EKS&lt;/strong&gt;: When fully custom dashboards are required—such as advanced visualizations, semantic comparisons or interactive analytics—Fargate and EKS provide the compute layer to run applications built with tools like Plotly, Dash, Streamlit or React-based libraries. This allows teams to create&lt;/li&gt;
&lt;/ul&gt;




&lt;h4&gt;
  
  
  (8) 🧭 Data Governance
&lt;/h4&gt;

&lt;p&gt;Data governance is critical when working with sensitive health information, ensuring that data remains cataloged, documented and protected throughout every stage of the pipeline. &lt;strong&gt;A strong governance layer enforces access policies&lt;/strong&gt;, allowing only authorized users to interact with clinical datasets under strict regulatory requirements. &lt;strong&gt;It also guarantees full traceability&lt;/strong&gt;, enabling auditing of how data is accessed, transformed and shared across environments. &lt;strong&gt;Finally, governance provides controlled discoverability&lt;/strong&gt;, ensuring that curated datasets can be safely searched and consumed while maintaining consistent metadata.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsrq87wstyvwunkxfbnls.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsrq87wstyvwunkxfbnls.png" alt=" " width="800" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS Lake Formation&lt;/strong&gt;&lt;br&gt;
AWS Lake Formation centralizes governance for data stored in S3, offering fine-grained permissions at the table, column or row level, enforcing traceability and integrating tightly with the Glue Data Catalog to maintain consistent metadata.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon DataZone&lt;/strong&gt;&lt;br&gt;
Amazon DataZone supports the organized publication and controlled sharing of datasets across the organization, enabling teams to work within structured data domains—such as Clinical, NLP, OMOP or Research—while unifying cataloging, governance and collaboration in one environment.&lt;/p&gt;




&lt;h3&gt;
  
  
  (9) 🔐 Security and Networking
&lt;/h3&gt;

&lt;p&gt;Security and connectivity are fundamental pillars in any health data architecture, especially to comply with regulations such as HIPAA. In AWS, there are multiple services that protect both data and infrastructure. Below we describe the main components and their role within our OMOP CDM architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1580554hlnuonqvqp8vd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1580554hlnuonqvqp8vd.png" alt=" " width="800" height="679"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;(10) 🎚️ Monitoring and Billing&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Monitoring and cost control are essential in health data architectures, especially when processing large clinical datasets or running AI workloads where training and inference can be resource-intensive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔍 Monitoring&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;AWS CloudWatch&lt;/strong&gt; provides centralized metrics, logs and events from all AWS services, enabling teams to track infrastructure health, Airflow DAG execution and the behavior of ETL/ELT pipelines while receiving alerts for anomalies. For deeper inspection, &lt;strong&gt;AWS X-Ray&lt;/strong&gt; traces requests across distributed systems—such as containerized services on &lt;strong&gt;ECS/EKS&lt;/strong&gt; or &lt;strong&gt;APIs&lt;/strong&gt; that expose OMOP data—making it easier to detect bottlenecks and debug complex data flows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧾 Billing&lt;/strong&gt;&lt;br&gt;
To maintain financial visibility and prevent cost overruns, &lt;strong&gt;AWS Cost Explorer&lt;/strong&gt; offers detailed insights into usage patterns across services, including AI and data-intensive components. Complementing this, &lt;strong&gt;AWS Budgets&lt;/strong&gt; allows setting custom spending limits and automated alerts, ensuring that project costs remain predictable and aligned with operational goals.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;(11)🧱 Code &amp;amp; Deployment&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Managing code and deploying infrastructure is essential to guarantee reproducibility, traceability and security in cloud-based health projects. This includes not only provisioning resources, but also maintaining reliable pipelines, consistent environments and well-governed ML assets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔧 Infrastructure as Code&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;Terraform&lt;/code&gt; allows defining the entire AWS architecture in a declarative way, ensuring that environments remain consistent and reproducible across development, staging and production. It supports provisioning core components such as S3 buckets, VPCs, databases and IAM roles while enforcing infrastructure governance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🗂️ Versioning &amp;amp; CI/CD&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;GitHub&lt;/code&gt; serves as the central platform for code collaboration, offering pull requests, reviews and issue management. With &lt;code&gt;GitHub Advanced Security&lt;/code&gt;, teams &lt;strong&gt;can catch vulnerabilities&lt;/strong&gt; early through dependency scanning and code analysis. &lt;br&gt;
&lt;code&gt;GitHub Actions&lt;/code&gt; complements this by automating &lt;strong&gt;CI/CD pipelines&lt;/strong&gt; building containers, validating data quality, deploying Airflow DAGs or updating infrastructure definitions—ensuring that each change is tested and safely promoted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🏷️ Models &amp;amp; Containers&lt;/strong&gt;&lt;br&gt;
For containerized workloads, &lt;code&gt;Amazon ECR&lt;/code&gt; provides a secure and scalable registry for images used in &lt;code&gt;ECS&lt;/code&gt;, &lt;code&gt;EKS&lt;/code&gt; or &lt;code&gt;Fargate&lt;/code&gt;, ensuring consistency across environments. In parallel, the &lt;code&gt;Amazon SageMaker Model Registry&lt;/code&gt; manages &lt;strong&gt;ML model versions&lt;/strong&gt;, capturing lineage, approvals and metadata so that each model deployed into production remains auditable and reproducible.&lt;/p&gt;




&lt;h3&gt;
  
  
  (12) 🚀 AI Consume
&lt;/h3&gt;

&lt;p&gt;Once the data is standardized and loaded into the OMOP CDM, it becomes the foundation for advanced analytics, AI-driven insights and secure data consumption. This unlocks opportunities for clinical research, decision support and the development of intelligent health applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;☁️ Data Consumption through APIs&lt;/strong&gt;&lt;br&gt;
Standardized OMOP data can be exposed through secure API layers, enabling internal and external systems to retrieve curated clinical information. Services such as Amazon API Gateway combined with AWS Lambda provide scalable, low-latency endpoints that support both real-time and batch consumption.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📊 Advanced Analysis and Machine Learning&lt;/strong&gt;&lt;br&gt;
Amazon SageMaker enables training, evaluating and deploying Machine Learning models directly on top of OMOP data. This supports use cases such as predicting clinical risks, classifying patients by comorbidities or analyzing treatment response patterns, all while integrating seamlessly with the existing data pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧩 Vector Search with Aurora and pgvector&lt;/strong&gt;&lt;br&gt;
By storing patient feature vectors in Aurora PostgreSQL using pgvector, the system can perform semantic similarity searches between patients or clinical cases. This capability enhances cohort discovery and enables personalized recommendation workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧠 Generative AI with Amazon Bedrock&lt;/strong&gt;&lt;br&gt;
Amazon Bedrock provides access to foundational models that can summarize clinical notes, extract information from unstructured text or augment concept mapping processes, expanding analytical depth through generative AI.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Researchers can query patients with similar disease profiles using pgvector, deploy readmission prediction models in SageMaker or generate automated insights from clinical notes using Bedrock-powered NLP.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h1&gt;
  
  
  📚 Conclusions
&lt;/h1&gt;

&lt;p&gt;This guide presents a compact proposal for implementing OMOP CDM on AWS, showing how its services can support secure, scalable and efficient clinical data processing. The architecture is flexible and can be adapted to different project needs.&lt;/p&gt;

&lt;p&gt;AWS provides an ecosystem that covers the entire data lifecycle, allowing integration with open-source tools and containerized workloads while maintaining control over performance and costs. This balance is especially important in health and AI-driven environments.&lt;/p&gt;

&lt;p&gt;Building on strong governance and security practices, the proposed approach demonstrates that AWS enables compliant and reliable data workflows. With the right configuration, clinical data can be transformed into meaningful insights for research, analytics and innovation.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aws</category>
      <category>cloud</category>
      <category>architecture</category>
    </item>
    <item>
      <title>AWS re:Invent 2025: Updates in Infrastructure, Security, and Compute + Learning Path Summary</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Mon, 08 Dec 2025 09:52:24 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-reinvent-2025-updates-in-infrastructure-security-and-compute-learning-path-summary-3i72</link>
      <guid>https://dev.to/aws-builders/aws-reinvent-2025-updates-in-infrastructure-security-and-compute-learning-path-summary-3i72</guid>
      <description>&lt;h1&gt;
  
  
  📖 Introduction
&lt;/h1&gt;

&lt;p&gt;At &lt;code&gt;re:Invent 2025&lt;/code&gt;, AWS placed &lt;strong&gt;Generative AI&lt;/strong&gt; at the center, moving from simple chats to agents that understand context, execute tasks, and integrate natively with infrastructure, security, and data services. Within this approach, AWS launched in &lt;a href="https://skillbuilder.aws/" rel="noopener noreferrer"&gt;Skill Builder&lt;/a&gt; a learning path with &lt;strong&gt;33 courses&lt;/strong&gt; and more than 60 hours to learn these new services, from fundamental to advanced level.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6kogvjso94nros4kahdo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6kogvjso94nros4kahdo.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  🔍 Why is this re:Invent a turning point?
&lt;/h1&gt;

&lt;p&gt;The big novelty this year is how generative AI stops being an isolated component and becomes a central engine that drives automation, security, infrastructure, and operations. We are entering a stage where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🤖 &lt;strong&gt;Agents&lt;/strong&gt; not only process language: they execute real actions in AWS.&lt;/li&gt;
&lt;li&gt;🔧 &lt;strong&gt;IaC&lt;/strong&gt; automation is complemented by intelligent flows that detect, decide, and act.&lt;/li&gt;
&lt;li&gt;🔓 &lt;strong&gt;Securit&lt;/strong&gt; y is transformed thanks to the ability to analyze large volumes of logs in seconds, where every minute is critical.&lt;/li&gt;
&lt;li&gt;🗂️ &lt;strong&gt;Data engineering&lt;/strong&gt; and &lt;strong&gt;observability&lt;/strong&gt; are rewritten with agents that contextualize, correlate, and recommend.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To support this technological leap, AWS launched new services (some very recent) and updated others, which motivated the design of an integrated learning path to learn them in a structured way.&lt;/p&gt;

&lt;h3&gt;
  
  
  🛠️Learning path details
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;33 total courses and more than 60 hours of content.&lt;/li&gt;
&lt;li&gt;26 fundamental-level courses, 4 intermediate, and 3 advanced, combining updates of existing services with completely new launches.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  📘 Service Overviews &amp;amp; Course Levels
&lt;/h1&gt;

&lt;p&gt;The learning path organizes 33 courses by technical depth to help learners navigate new AWS services efficiently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link&lt;/strong&gt; 👉 &lt;a href="https://skillbuilder.aws/learning-plan/JZQY2Z8DG4/aws-reinvent-2025-announcements-learning-plan/VWQU3VK65K" rel="noopener noreferrer"&gt;https://skillbuilder.aws/learning-plan/JZQY2Z8DG4/aws-reinvent-2025-announcements-learning-plan/VWQU3VK65K&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz1jgiuw770usz5nesvs6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz1jgiuw770usz5nesvs6.png" alt=" " width="800" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Course Levels:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🟢 &lt;strong&gt;Beginner&lt;/strong&gt; (&lt;code&gt;26 courses&lt;/code&gt;): Introduces core services and fundamental concepts.&lt;/li&gt;
&lt;li&gt;🟡 &lt;strong&gt;Intermediate&lt;/strong&gt; (&lt;code&gt;4 courses&lt;/code&gt;): Covers integration, automation, and real-world deployments.&lt;/li&gt;
&lt;li&gt;🔴 &lt;strong&gt;Advanced&lt;/strong&gt; (&lt;code&gt;3 courses&lt;/code&gt;): Focuses on autonomous agents, high-performance compute, and advanced security.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Kiro
&lt;/h3&gt;

&lt;p&gt;It is a development environment (IDE) with AI agents that start from a written specification and generate code, tests, and documentation, helping to design and maintain applications more quickly and consistently.&lt;br&gt;
⏱ 3:30 hours | 📚 3 courses&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Kiro Getting Started&lt;/li&gt;
&lt;li&gt;🟢 Introduction to Kiro powers (Update)&lt;/li&gt;
&lt;li&gt;🟡 Spec-Driven Development with Kiro&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Amazon Nova 2
&lt;/h3&gt;

&lt;p&gt;It is a family of multimodal generative AI models (text, image, audio, video) designed for advanced reasoning, conversational assistants, and content generation in enterprise applications.&lt;br&gt;
⏱ 04:15 hours | 📚 4 courses&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Amazon Nova 2: Understanding Models (New)&lt;/li&gt;
&lt;li&gt;🟢 Amazon Nova 2 Sonic: Next-Generation Conversational AI (Update)&lt;/li&gt;
&lt;li&gt;🟢 Introduction to Amazon Nova Forge (New)&lt;/li&gt;
&lt;li&gt;🟡 Extended Thinking with Amazon Nova (Update)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Amazon Quick Suite
&lt;/h3&gt;

&lt;p&gt;An integrated analytics and business intelligence platform powered by generative AI that unifies agents for research, data visualization, and workflow automation, accessible via chat and embedded in tools like browser, Slack, or Office.&lt;br&gt;
⏱ 03:10 hours | 📚 3 courses&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Introduction to Amazon Quick Suite&lt;/li&gt;
&lt;li&gt;🟢 Getting Started with Administering Amazon Quick Suite&lt;/li&gt;
&lt;li&gt;🟡 Amazon Quick Automate – Building Intelligent Workflows (Update)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  AWS DevOps Agent
&lt;/h3&gt;

&lt;p&gt;AI agent for operations that analyzes events and metrics, automates incident response, assists with root cause analysis, and suggests preventive actions to improve reliability.&lt;br&gt;
⏱ 1:00 hour | 📚 1 course&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Introduction to AWS DevOps Agent (New)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  AWS AI Factories
&lt;/h3&gt;

&lt;p&gt;It is a dedicated AI infrastructure solution deployed in the customer’s data center, with specialized hardware to train and run models while maintaining data sovereignty.&lt;br&gt;
⏱ 00:30 minutes | 📚 1 course&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Introduction to AWS AI Factories (New)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Amazon SageMaker
&lt;/h3&gt;

&lt;p&gt;It is AWS’s managed machine learning platform that offers notebooks, data preparation tools, model training, and model deployment, now with more serverless options and a focus on foundation models. In this latest update, it includes a set of “SageMaker AI” capabilities such as serverless notebooks, simplified customization of foundation models, and elastic training with HyperPod to scale without managing infrastructure.&lt;br&gt;
⏱ 03:30 hours | 📚 4 courses&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Introduction to Amazon SageMaker Notebooks (Update)&lt;/li&gt;
&lt;li&gt;🟢 Introduction to Model Customization in Amazon SageMaker AI (Update)&lt;/li&gt;
&lt;li&gt;🔴 Elastic Training on Amazon SageMaker HyperPod (New)&lt;/li&gt;
&lt;li&gt;🔴 Checkpointless Training on Amazon SageMaker HyperPod (New)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  AWS Security Agent
&lt;/h3&gt;

&lt;p&gt;Security agent that reviews from code to production environment, automates configuration assessments and penetration tests, and generates recommendations to reduce risk throughout the development lifecycle.&lt;br&gt;
⏱ 00:30 minutes | 📚 1 course&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Introduction to AWS Security Agent (Tech Preview) (Update)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Amazon Bedrock
&lt;/h3&gt;

&lt;p&gt;It is the service that allows building and operating AI agents based on foundation models, with security controls, continuous evaluation, and policies to govern their behavior.&lt;br&gt;
⏱ 02:10 hours | 📚 2 courses&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 AgentCore Evaluation on Amazon Bedrock (New)&lt;/li&gt;
&lt;li&gt;🟢 AgentCore Policy on Amazon Bedrock (New)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Amazon EC2
&lt;/h3&gt;

&lt;p&gt;This service has new compute instances with next-generation GPUs designed to train and serve large AI models with high performance. The new instances are optimized for frontier model training, combining next-generation GPUs with network and storage improvements to offer several times more performance than previous generations.&lt;br&gt;
⏱ 02:30 hours | 📚 5 courses&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Introduction to Amazon EC2 P6e-GB300 UltraServers (Update)&lt;/li&gt;
&lt;li&gt;🟢 Introduction to Capacity Manager for Amazon EC2 (Update)&lt;/li&gt;
&lt;li&gt;🟢 Introduction to Amazon EC2 Instance Attestation (New)&lt;/li&gt;
&lt;li&gt;🟢 Introduction to Amazon EC2 P6-B300 Instances (New)&lt;/li&gt;
&lt;li&gt;🟢 Introduction to Capacity Manager for Amazon EC2 (New)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Amazon S3 Vectors
&lt;/h3&gt;

&lt;p&gt;It is an S3 capability to store vectors (embeddings) and perform semantic and similarity searches on documents, images, or other objects.&lt;br&gt;
⏱ 1:00 hour | 📚 1 course&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Amazon S3 Vectors Getting Started (Update)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Amazon FSx for NetApp ONTAP
&lt;/h3&gt;

&lt;p&gt;Fully managed service that provides ONTAP file systems with enterprise features (snapshots, clones, replication) and the elasticity and pay-as-you-go model of AWS cloud.&lt;br&gt;
⏱ 1:15 hours | 📚 1 course&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Amazon FSx for NetApp ONTAP Primer (Update)&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Amazon Aurora PostgreSQL&lt;br&gt;
It is a relational database compatible with PostgreSQL that adds policies to hide or transform sensitive data. This new functionality allows defining dynamic masking policies so that sensitive data is displayed differently depending on the user’s role, reinforcing access control at column and row level.&lt;br&gt;
⏱ 1:30 hours | 📚 1 course&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔴 Dynamic Data Masking in Aurora PostgreSQL (New)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AWS Transform
&lt;/h3&gt;

&lt;p&gt;It is a suite of AI-powered tools to modernize .NET applications, full-stack Windows, and custom code, automating analysis, refactoring, and migration to accelerate legacy modernization toward cloud-native architectures.&lt;br&gt;
⏱ 03:00 hours | 📚 3 courses&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 AWS Transform for .NET Getting Started (Update)&lt;/li&gt;
&lt;li&gt;🟢 AWS Transform Custom (New)&lt;/li&gt;
&lt;li&gt;🟢 AWS Transform Full-Stack Windows (New)&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🚀 Conclusion: AI as the Engine of the Cloud Ecosystem
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;AWS re:Invent 2025&lt;/strong&gt; marks a decisive turning point: &lt;code&gt;Generative AI&lt;/code&gt; has moved beyond being an isolated tool to become the central engine that drives the transformation of the cloud ecosystem.&lt;/p&gt;

&lt;p&gt;This learning path of &lt;strong&gt;33 courses&lt;/strong&gt; is not just a set of trainings but a strategic roadmap showing how &lt;code&gt;infrastructure&lt;/code&gt;, &lt;code&gt;security&lt;/code&gt;, and &lt;code&gt;operations&lt;/code&gt; converge with &lt;code&gt;AI&lt;/code&gt; to enable a new generation of solutions.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;incorporation of agents&lt;/strong&gt;, along with the evolution of compute and security improvements, is creating environments that are much more autonomous, efficient, and prepared for new use cases.&lt;/p&gt;

&lt;p&gt;Specialized infrastructure plays a key role, where AWS AI Factories ensure data sovereignty in regulated industries, while the new EC2 instances optimized for AI increase performance for model training and deployment at scale. In this set of updates, it is clear that foundation models are becoming more powerful and are a fundamental part of decision-making, intelligent automation, and the creation of AI-powered products, generating real competitive advantage for organizations that can combine AI + infrastructure + security as a single strategy.&lt;/p&gt;

&lt;p&gt;Therefore, this learning path is the ideal starting point to learn the new features, prepare your skills, and put them into practice in your next project within the AWS ecosystem.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>devops</category>
      <category>security</category>
    </item>
    <item>
      <title>From Search to Story: Using Gemini API to Automate Brand Content Analysis with Python</title>
      <dc:creator>Romina Elena Mendez Escobar</dc:creator>
      <pubDate>Mon, 20 Oct 2025 17:54:15 +0000</pubDate>
      <link>https://dev.to/r_elena_mendez_escobar/from-search-to-story-using-gemini-api-to-automate-brand-content-analysis-with-python-2i1a</link>
      <guid>https://dev.to/r_elena_mendez_escobar/from-search-to-story-using-gemini-api-to-automate-brand-content-analysis-with-python-2i1a</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;In a hyperconnected world, every post, comment, or interaction contributes to building a brand's reputation. Therefore, identifying what people are talking about and turning it into stories that inform, inspire, and connect is essential for any modern communication strategy.&lt;/p&gt;

&lt;p&gt;This article was born from a concrete question: &lt;strong&gt;how can Generative AI be used to discover what is being said about a company and transform that information into relevant stories?&lt;/strong&gt; Stories that reflect real experiences and concerns, turning them into inspiring narratives that strengthen brand identity.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiinwni9w2ltd5h22jjqh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiinwni9w2ltd5h22jjqh.png" alt=" " width="799" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this tutorial, you will learn how to use Google Gemini to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔍 Search for information&lt;/strong&gt; using generative AI integrated with Google Search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;✍️ Transform findings&lt;/strong&gt; into structured journalistic narratives&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;📊 Generate visual reports&lt;/strong&gt; with graphics and automated storytelling&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  &amp;nbsp;What is Brand Journalism?
&lt;/h1&gt;

&lt;p&gt;According to an article by &lt;a href="https://nytlicensing.com/latest/marketing/brand-journalism-and-why-it-matters/" rel="noopener noreferrer"&gt;The New York Times Licensing Group&lt;/a&gt;, readers experience significant content fatigue: there are more than 1.8 billion websites and over 70 million blogs published each month.&lt;/p&gt;

&lt;p&gt;Brand Journalism is a communication strategy where brands adopt journalistic techniques to tell relevant and engaging stories. Instead of direct advertising messages, content is created with a narrative, informative, and value-added approach, similar to traditional media.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cr9w8fpz7gddlzqu3tq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cr9w8fpz7gddlzqu3tq.png" alt=" " width="799" height="337"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Journalistic techniques:&lt;/strong&gt; Application of rigorous journalistic methods to create credible and well-structured content.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audience interests:&lt;/strong&gt; Focus on the real interests of the audience, not just the messages the brand wants to convey.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quality and useful information&lt;/strong&gt;: Content that educates, informs, or solves concrete problems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use of different formats:&lt;/strong&gt; Variety of formats (reports, interviews, analyses, infographics, videos) to maintain engagement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storytelling:&lt;/strong&gt; Narratives that connect emotionally with values, experiences, and social impact.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1905tnjax72ha9hxju1r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1905tnjax72ha9hxju1r.png" alt=" " width="800" height="407"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Benefits
&lt;/h2&gt;

&lt;p&gt;The benefits we can identify based on this are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Brand Positioning:&lt;/strong&gt; Establish yourself as a thought leader in your industry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audience Loyalty:&lt;/strong&gt; Build authentic and lasting relationships with your audience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Differentiation against the Competition:&lt;/strong&gt; Stand out from competitors through higher-quality editorial content.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Greater Organic Reach:&lt;/strong&gt; Valuable content is naturally shared, amplifying reach without direct advertising investment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vyxb7bqplqzjr1marfn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vyxb7bqplqzjr1marfn.png" alt=" " width="800" height="306"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  What is Generative AI?
&lt;/h1&gt;

&lt;p&gt;Generative AI is a branch of artificial intelligence focused on creating new and original content: text, images, audio, video, or synthetic data. Its development has been possible thanks to deep learning, especially through advanced architectures such as transformers, which process information in parallel and capture complex relationships in large data volumes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Additional Resources on GenAI
&lt;/h2&gt;

&lt;p&gt;I have written a series of articles on the fundamentals of generative AI&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkkmxi7r8nyuejaornk27.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkkmxi7r8nyuejaornk27.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://dev.to/r0mymendez/genai-foundations-chapter-1-prompt-basics-from-theory-to-practice-1a5"&gt;GenAI Foundations – Chapter 1: Prompt Basics: From Theory to Practice&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/r0mymendez/genai-foundations-chapter-2-prompt-engineering-in-action-unlocking-better-ai-responses-l28"&gt;GenAI Foundations – Chapter 2: Prompt Engineering in Action – Unlocking Better AI Responses&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/r0mymendez/genai-foundations-chapter-3-rag-patterns-and-best-practices-cpc"&gt;GenAI Foundations – Chapter 3: RAG Patterns and Best Practices&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/r0mymendez/genai-foundations-chapter-4-model-customization-evaluation-can-we-trust-the-outputs-i21"&gt;GenAI Foundations – Chapter 4: Model Customization &amp;amp; Evaluation – Can We Trust the Outputs?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/r0mymendez/genai-foundations-chapter-5-project-planning-with-the-generative-ai-canvas-2o73"&gt;GenAI Foundations – Chapter 5: Project Planning with the Generative AI Canvas&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Gemini
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Gemini&lt;/strong&gt; is a family of multimodal AI models developed by Google DeepMind. It integrates into multiple Google products and can process text, images, and other data types simultaneously.&lt;/p&gt;

&lt;h3&gt;
  
  
  Grounding with Google Search
&lt;/h3&gt;

&lt;p&gt;For this use case, we will use the Grounding with Google Search functionality, which connects the model directly to Google to perform searches and obtain up-to-date information.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdazimmkpdxzdnybj6ej.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdazimmkpdxzdnybj6ej.png" alt=" " width="799" height="389"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Main Advantages:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;📏Increased Accuracy:&lt;/strong&gt; Reduces model hallucinations by accessing verifiable information.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⚡️Real-Time Information:&lt;/strong&gt; Access to current data, reducing uncertainty about the model's knowledge.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;📚Citations and References:&lt;/strong&gt; Retrieves source links and provides control over consulted data sources.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  Use Case
&lt;/h1&gt;

&lt;p&gt;Brand Journalism is a strategic tool for companies to communicate their values from an authentic perspective. However, we often need to find topics that might interest our target audience, so it is essential to search for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mentions of the company on different sites&lt;/li&gt;
&lt;li&gt;Reputation and notable aspects&lt;/li&gt;
&lt;li&gt;Trends and relevant conversations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This starting point helps those who write articles or create storytelling based not only on what the company wants to show but also on the external perspective others have of it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Practical Example: 📱iPhone 17
&lt;/h2&gt;

&lt;p&gt;Using the latest iPhone launch as an example, we will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Search for recently published articles&lt;/li&gt;
&lt;li&gt;Classify and analyze these documents&lt;/li&gt;
&lt;li&gt;Generate a report with visualizations, conclusions, and structured narratives&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;Next, we will see how to implement this strategy through an automated workflow that integrates AI and data analysis.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Implementation Process
&lt;/h2&gt;

&lt;p&gt;The following diagram illustrates how our automated analysis system works.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fptgtujbp39akqmt6szbg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fptgtujbp39akqmt6szbg.png" alt=" " width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1️⃣ Search with Google Search
&lt;/h3&gt;

&lt;p&gt;We use &lt;strong&gt;Grounding with Google Search&lt;/strong&gt; to find relevant articles and request output in JSON format using this structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; 
   &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"full article title"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"source_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"media name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"publication date"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"article link"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"site_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"website name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2-4 line summary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"sentiment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"positive/negative/neutral"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rumor/analysis/comparison/market/technical"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"sentiment_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1-10 score"&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  2️⃣ Storytelling Generation
&lt;/h3&gt;

&lt;p&gt;We use another prompt to generate different types of narratives based on the articles found:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Analytical Insights:&lt;/strong&gt; Compact analytical summary with concrete data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storytelling Narrative:&lt;/strong&gt; Engaging mini-narrative based on dataset evidence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tone Variants (A/B/C):&lt;/strong&gt; Three versions with different focuses: objective, emotional, and strategic.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  3️⃣ Report Creation
&lt;/h3&gt;

&lt;p&gt;We generate a PDF report including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Charts created with Seaborn and Matplotlib&lt;/li&gt;
&lt;li&gt;Visual trend analyses&lt;/li&gt;
&lt;li&gt;Narrative conclusions based on generated storytelling&lt;/li&gt;
&lt;li&gt;Customizing the layout using ReportLab&lt;/li&gt;
&lt;/ul&gt;


&lt;h1&gt;
  
  
  Tutorial
&lt;/h1&gt;
&lt;h2&gt;
  
  
  How Does Gemini Work with Google Search?
&lt;/h2&gt;

&lt;p&gt;When performing a query, Gemini not only relies on its internal knowledge but also actively searches updated information on Google Search. This grounding capability allows the model to access real-time data, verify facts, and provide responses based on concrete sources, reducing hallucination risk and ensuring relevance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxhe427glwz3m59rfbvto.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxhe427glwz3m59rfbvto.png" alt=" " width="799" height="469"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Pre-requisite: Access to Gemini API
&lt;/h2&gt;

&lt;p&gt;Before starting, you need to get access to the Gemini API:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create an account in &lt;a href="https://aistudio.google.com/" rel="noopener noreferrer"&gt;Google AI Studio&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Create or log in with your Google account&lt;/li&gt;
&lt;li&gt;Generate your API key from the control panel&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: You can use Gemini's free tier to test this project.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvm22ktvs6q2my515n4ai.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvm22ktvs6q2my515n4ai.png" alt=" " width="800" height="691"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you have your API key, configure it in a .env file:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;API_KEY &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"tu_api_key_de_gemini"&lt;/span&gt;
MODEL_ID &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"gemini-2.5-flash"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;We use Gemini 2.5 Flash because it is the most cost-efficient model optimized for frequent, low-cost tasks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Repository Structure
&lt;/h2&gt;

&lt;p&gt;For this tutorial you must clone the following repository and you can get the complete code from this tutorial.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/RominaElenaMendezEscobar" rel="noopener noreferrer"&gt;
        RominaElenaMendezEscobar
      &lt;/a&gt; / &lt;a href="https://github.com/RominaElenaMendezEscobar/brand-journalism-gemini" rel="noopener noreferrer"&gt;
        brand-journalism-gemini
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Tutorial about Brand Journalism Code Using Google Gemini
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;p&gt;&lt;a href="https://www.buymeacoffee.com/r0mymendez" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/b96fd4ea89ea15fcec30a4f86382eef0bbd17454aa3a8d4de8c8c5e92b55cf6c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4275792532304d6525323041253230436f666665652d737570706f72742532306d79253230776f726b2d4646444430303f7374796c653d666c6174266c6162656c436f6c6f723d313031303130266c6f676f3d6275792d6d652d612d636f66666565266c6f676f436f6c6f723d7768697465" alt="Buy Me A Coffee"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;How to Use AI in Brand Journalism with Gemini to Transform Digital Information into Strategic Editorial Content?&lt;/h1&gt;
&lt;/div&gt;

&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Introduction&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;In a hyperconnected world, every post, comment, or interaction contributes to building a brand's reputation. Therefore, identifying what people are talking about and turning it into stories that inform, inspire, and connect is essential for any modern communication strategy.&lt;/p&gt;

&lt;p&gt;This repository was born from a concrete question: &lt;strong&gt;how can Generative AI be used to discover what is being said about a company and transform that information into relevant stories?&lt;/strong&gt; Stories that reflect real experiences and concerns, turning them into inspiring narratives that strengthen brand identity.&lt;/p&gt;

&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/RominaElenaMendezEscobar/brand-journalism-gemini/img/readme/1.google-search.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FRominaElenaMendezEscobar%2Fbrand-journalism-gemini%2FHEAD%2Fimg%2Freadme%2F1.google-search.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;In this tutorial, you will learn how to use Google Gemini to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🔍 Search for information&lt;/strong&gt; using generative AI integrated with Google Search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;✍️ Transform findings&lt;/strong&gt; into structured journalistic narratives&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;📊 Generate visual reports&lt;/strong&gt; with graphics and automated storytelling&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;#&amp;nbsp;What is Brand Journalism
According to an…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/RominaElenaMendezEscobar/brand-journalism-gemini" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;If you find this tutorial helpful, feel free to leave a star ⭐️ and follow me to get notified about new articles. Your support helps me grow within the tech community and create more valuable content! 🚀&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;project/
   ├── img/                    &lt;span class="c"&gt;# Generated graphics&lt;/span&gt;
   ├── prompt/
   │   ├── prompt_search.txt       &lt;span class="c"&gt;# Search Prompt&lt;/span&gt;
   │   └── prompt_storytelling.txt &lt;span class="c"&gt;# Prompt for narrative&lt;/span&gt;
   ├── report/                &lt;span class="c"&gt;# PDFs generated&lt;/span&gt;
   ├── brand_journalist_analyzer.py
   ├── report_plots.py
   ├── report_analysis.py
   └── main.py
   └── .env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Main Files
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1️⃣ Prompts (/prompt)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🗒&lt;code&gt;prompt_search.txt&lt;/code&gt;&lt;/strong&gt;: Here we define how to perform the search in Google Search and structure the results in JSON. This prompt instructs the model to return structured information with fields such as the article's title, source, date, URL, summary, sentiment, and category.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🗒&lt;code&gt;prompt_storytelling.txt&lt;/code&gt;&lt;/strong&gt;: In this file, we define how to generate conclusions and storytelling based on the articles found. It requests different types of outputs, including objective analysis, immersive narratives, and three tone variants (objective, emotional, and emotional).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;2️⃣ Brand Journalism Analyzer&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🗒&lt;code&gt;brand_journalist_analyzer.py&lt;/code&gt;&lt;/strong&gt;: This class is the core of the application and handles all interaction with the Gemini API. It implements three main functionalities: news retrieval using Google Search, structured storytelling generation, and analytical insights extraction. 
The most important method is &lt;strong&gt;search_news()&lt;/strong&gt;, which executes real-time searches and returns structured data in JSON format. To use integrated Google Search, simply set &lt;code&gt;config={"tools": [{"google_search": {}}]}&lt;/code&gt; in the API call.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_news&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create_dataframe&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Search for news on a topic using Google Search.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search_prompt&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;google_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}}]}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Process and clean JSON response
&lt;/span&gt;    &lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
    &lt;span class="n"&gt;clean_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_clean_json_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clean_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;3️⃣ Visualization Generator&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;report_plots.py:&lt;/code&gt; This class creates all the report visualizations using Seaborn and Matplotlib. It generates three essential chart types: a bar chart showing which media outlets publish the most on the topic, a timeline visualizing the evolution of publications over time, and a heatmap that cross-references sentiment with content categories. 
All visual aspects are customizable: color palette, titles, axis labels, and save paths. The methods first prepare the data with Pandas aggregations and then generate the visualizations, automatically saving them as PNG files.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;4️⃣ PDF Report Generator&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;report_analysis.py&lt;/code&gt;: This class assembles the final report in professional PDF format using ReportLab. It combines multiple elements: a customizable logo, corporate-style headers, informative tables about the analyzed dataset, pre-generated visualizations, formatted narratives with full Markdown support (including headings, lists, code, and emphasis), and conclusions and storytelling sections with different tone variations. &lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🎯Process Orchestration
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;main.py&lt;/code&gt; file constitutes the application's main entry point, orchestrating the entire Brand Journalism pipeline. This script coordinates the interaction between all the developed classes, managing the flow from real-time information retrieval to the generation of the final document, ensuring that each component executes in the correct order and with the necessary dependencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🐍main.py&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;brand_journalist_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BrandJournalistAnalyzer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;report_analysis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ReportAnalysis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;report_plots&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataVisualizer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Cargar variables de entorno
&lt;/span&gt;    &lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;MODEL_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MODEL_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Generar ruta con timestamp
&lt;/span&gt;    &lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;%Y%m%d%H%M%S&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;output_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;report/news_report_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# Inicializar analizador
&lt;/span&gt;    &lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BrandJournalistAnalyzer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Buscar o cargar noticias (usa caché si existe)
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_load_or_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;force_refresh&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;search_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Generar storytelling y conclusiones
&lt;/span&gt;    &lt;span class="n"&gt;storytelling&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_storytelling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conclusion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_conclusion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Crear visualizaciones
&lt;/span&gt;    &lt;span class="n"&gt;visualizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataVisualizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;search_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;visualizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot_news_by_source&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;visualizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot_news_over_time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;visualizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot_sentiment_category_heatmap&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Generar reporte PDF
&lt;/span&gt;    &lt;span class="n"&gt;report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ReportAnalysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;search_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;conclusion&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;conclusion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;storytelling&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;storytelling&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_report&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🗒 Report Generation
&lt;/h3&gt;

&lt;p&gt;The system automatically generates a professional PDF report using Seaborn/Matplotlib for visuals and ReportLab for document layout. It includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Media coverage charts&lt;/li&gt;
&lt;li&gt;Temporal trends&lt;/li&gt;
&lt;li&gt;Heatmap crossing content categories with sentiment&lt;/li&gt;
&lt;li&gt;Structured storytelling and analytical conclusions&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Final Report Structure
&lt;/h3&gt;

&lt;p&gt;In this use case, we generated a four-page PDF report that provides a comprehensive overview of the analysis, starting with complete details of the websites and media outlets where relevant news stories on the researched topic were found.&lt;/p&gt;

&lt;p&gt;The document includes graphical visualizations specifically designed to analyze temporal publishing trends, allowing for the identification of patterns of interest over time, as well as categorical classifications based on the criteria identified by the AI ​​model following the instructions defined in the search prompt.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpl8wktg9u3u4z7htzjok.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpl8wktg9u3u4z7htzjok.png" alt=" " width="800" height="245"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The final section of the report presents analytical conclusions based on quantitative data and storytelling narratives structured in different tones, providing multiple perspectives on the same information.&lt;/p&gt;




&lt;h1&gt;
  
  
  💡 Conclusions
&lt;/h1&gt;

&lt;p&gt;AI can be a powerful tool for optimizing research and analysis processes, but I still believe that authentic company communication requires the perspective, sensitivity, and values ​​that only people can provide.&lt;/p&gt;

&lt;p&gt;This tutorial offers an automated &lt;strong&gt;starting point&lt;/strong&gt; that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Collects and structures scattered information&lt;/li&gt;
&lt;li&gt;Identifies patterns and trends in large data volumes&lt;/li&gt;
&lt;li&gt;Generates evidence-based insights&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, Brand Journalism work should remain in the hands of professionals who can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Interpret data within the organizational context&lt;/li&gt;
&lt;li&gt;Align narratives with real corporate values&lt;/li&gt;
&lt;li&gt;Add nuances, experiences, and internal perspectives&lt;/li&gt;
&lt;li&gt;Ensure the message genuinely reflects brand identity&lt;/li&gt;
&lt;li&gt;Humanize content with empathy and authentic connection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AI&lt;/strong&gt; provides the knowledge foundation, but people create the true connection with the audience. Therefore, effective storytelling emerges from combining automated analysis with human narrative craftsmanship.&lt;/p&gt;




&lt;h1&gt;
  
  
  📚 References:
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What Is Brand Journalism — and Why It Matters.&lt;br&gt;
The New York Times Licensing Group.&lt;/strong&gt;&lt;br&gt;
Retrieved from &lt;a href="https://nytlicensing.com/latest/marketing/brand-journalism-and-why-it-matters/" rel="noopener noreferrer"&gt;https://nytlicensing.com/latest/marketing/brand-journalism-and-why-it-matters/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Gemini About.&lt;/strong&gt;&lt;br&gt;
Google.&lt;br&gt;
Retrieved from &lt;a href="https://gemini.google/about/" rel="noopener noreferrer"&gt;https://gemini.google/about/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pichai, S., &amp;amp; Hassabis, D. (2023, December 6). Introducing Gemini: Our largest and most capable AI model.&lt;/strong&gt;&lt;br&gt;
Google Blog.&lt;br&gt;
Retrieved from &lt;a href="https://blog.google/technology/ai/google-gemini-ai/#sundar-note" rel="noopener noreferrer"&gt;https://blog.google/technology/ai/google-gemini-ai/#sundar-note&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Grounding with Google Search.&lt;/strong&gt;&lt;br&gt;
Google AI Documentation.&lt;br&gt;
Retrieved from &lt;a href="https://ai.google.dev/gemini-api/docs/google-search?hl=es-419" rel="noopener noreferrer"&gt;https://ai.google.dev/gemini-api/docs/google-search?hl=es-419&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Do you have any other thoughts or suggestions?&lt;/strong&gt; Leave them in the comments.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>cloud</category>
      <category>gemini</category>
      <category>ai</category>
      <category>python</category>
    </item>
  </channel>
</rss>
