DEV Community: Romina Elena Mendez Escobar

From Hype to Product: How AI Is Being Used Today

Romina Elena Mendez Escobar — Fri, 10 Apr 2026 12:10:27 +0000

For years, we talked about artificial intelligence as a promise. Over the last two years, in the era of LLMs, AI has stopped being the protagonist and has become invisible infrastructure.

Today we no longer see it as “AI”: it is already part of our interactions and experiences without us noticing. It is there when an app understands what you need before you search, when a recommendation hits the mark effortlessly, or when a decision happens in real time without an explicit interface.

In this article, we explore some concrete examples of how artificial intelligence is moving from hype to product, and the patterns that are starting to emerge.

💄 Sephora Transforms User Experience with AI in ChatGPT

Image Reference: https://newsroom.sephora.com

Sephora, a global leader in premium beauty products, has launched its app within ChatGPT, currently in a pilot phase in the United States. Participating users can receive personalized beauty product recommendations by linking their Beauty Insider account, explore solutions tailored to their needs, and take advantage of benefits such as free shipping and samples.

🔗 Link: https://newsroom.sephora.com/sephora-app-in-chatgpt-brings-a-new-personalized-beauty-experience/

☕️ What’s Behind Every Cup of Coffee You Enjoy at Starbucks?

Starbucks, the global coffeehouse chain, is using artificial intelligence to enhance the experience for both customers and partners without replacing human interaction. Tools like Green Dot Assist help baristas get quick answers about recipes, routines, and service standards, while Smart Queue optimizes the order flow across in-store, drive-thru, mobile, and delivery channels.

Additionally, the Starbucks Ordering Companion, still in development, will guide customers to discover their ideal drink and locate nearby stores, while maintaining the warmth and personalization that define the brand.

🔗 Link: https://about.starbucks.com/press/2026/supporting-the-moments-that-matter-with-artificial-intelligence/

🖼️ Pinterest AI Turns Your Ads into Scalable Results

Pinterest, the visual discovery and search platform, is integrating artificial intelligence into its Performance+ campaigns, allowing advertisers to optimize results without micromanaging every detail. AI automation helps intelligently combine content, audiences, and placements, making ads more relevant and scalable while advertisers focus on providing content and product signals.

🔗 Link: https://business.pinterest.com/es/blog/pinterest-performance-plus-campaigns/

✈️ Delta Reveals Why We Still Seek Real Experiences When Traveling

Reference image: https://news.delta.com/deltas-inaugural-connection-index-finds-why-travelers-are-priorizing-real-wold-experiences

A new global trends report from Delta Air Lines, Connection Index: Why We Fly, examines why travelers choose to fly and how these experiences impact their emotions. The study shows that even in an increasingly digital world, travelers prioritize authentic experiences: 84% of international travelers report a strong desire to explore new places and meet people, and 73% have traveled specifically to see in person something they first discovered online.

🔗 Link: https://news.delta.com/deltas-inaugural-connection-index-finds-why-travelers-are-prioritizing-real-world-experiences

📊 Agentic AI in Action: How Amazon Helps Sellers Make Real-Time Decisions

Amazon, a global leader in e-commerce and technology, is integrating generative and agentic AI into its Seller Central platform to improve how sellers manage and scale their businesses. Through a new experience called Canvas, sellers can create interactive visual spaces that combine business data, insights, and recommended actions in real time.

Reference Image: https://www.aboutamazon.com/news/innovation-at-amazon

This experience is based on the agentic architecture of Seller Assistant, powered by Amazon Bedrock and models such as Amazon Nova and Anthropic Claude. Sellers can ask questions or select suggestions, and the system automatically builds a personalized dashboard that enables performance analysis, scenario simulations (such as changes in demand or inventory), and optimization of marketing campaigns with concrete projections.

🔗 Link: https://www.aboutamazon.com/news/innovation-at-amazon/amazon-sellers-canvas-artificial-intelligence

📱 Apple Unifies Business Management on a Single Platform

Apple, a leading technology company, has launched Apple Business, an all-in-one platform that integrates device management, collaboration tools, and brand presence into a single environment. The solution allows companies to automatically configure devices through Blueprint projects, manage users and apps, and centralize services such as email, calendar, and directory with their own domains.

🔗 Link: https://www.apple.com/es/newsroom/2026/03/introducing-apple-business/

🛒 How Instacart Connects Physical Stores and Real-Time Data with AI

Instacart, a technology platform for retail and supermarkets, is developing a “Physical AI” system that connects real-world data with cloud models to enhance the shopping experience. Through devices like Caper Carts, equipped with sensors, cameras, and edge processing (NVIDIA Jetson), the company captures real-time information about products, cart location, and user behavior within the store. This data is combined with cloud systems that use recommendation models and transformer-based architectures to generate insights and suggestions at the exact moment of purchase.

Reference: https://www.instacart.com/company/enterprise-blog

🔗 Link: https://www.instacart.com/company/enterprise-blog/connecting-stores-from-edge-to-cloud-reinventing-retail-with-physical-ai

💳 How Mastercard Is Building Trust in the Era of AI Agents

Mastercard is a global payments technology company that connects consumers, merchants, and financial institutions, developing infrastructures that enable secure and scalable transactions worldwide.

In this context, it introduces Verifiable Intent, a new layer of trust for commerce with AI agents, developed in partnership with Google. This system creates a cryptographically verifiable record of what a user authorized before an agent acts on their behalf, connecting identity, intent, and action in a single source of truth. As agents move from assisting to executing purchases, this solution addresses a key challenge for bringing AI into production: ensuring traceability, authorization, and dispute resolution in autonomous transactions, making trust a central component of the product.

🔗 Link: https://www.mastercard.com/global/en/news-and-trends/stories/2026/verifiable-intent.html

Before You Go

As we wrap up this edition, here’s a tool to help you put AI into action in your own projects:

🧩 Recommended App

Stitch is an experimental AI tool from Google Labs that lets you quickly turn text prompts into functional app designs for mobile and desktop. It supports interactive prototyping, allows real-time collaboration, and can export your designs to popular platforms like Figma or as HTML code. In short, Stitch makes it easy to transform ideas into high-fidelity UI prototypes without complex setup.

Reference image: https://stitch.withgoogle.com

📢 Join the Conversation

What do you think about the ways AI is transforming experiences across industries, from retail to travel to design? I’d love to hear your thoughts, ideas, or favorite AI tools.

Hit reply and share your perspective!

More women in Tech. Fewer women leading

Romina Elena Mendez Escobar — Tue, 31 Mar 2026 13:41:07 +0000

Every March invites me to pause. On a personal level, it is a moment to acknowledge the progress made toward equality, but also to reflect honestly on the challenges that remain.

In recent years, we have seen encouraging signs: more women are pursuing careers in technology, science, and data. At the same time, initiatives to promote diversity within organizations have grown, along with conversations around female leadership and inclusion programs across the sector.

However, when we look at who occupies decision-making roles in technology (who leads teams, defines strategy, or drives innovation), the reality still reflects an uneven path.

From my experience working in IT, one question keeps coming up: if more women are studying STEM fields (science, technology, engineering, and mathematics) and developing technical skills, why is it still so difficult to see them in technical leadership roles?

With that question in mind, I reviewed several recent reports and what I found is that there is no single cause, but rather a combination of structural and cultural factors that reinforce one another.

Understanding them together is key to explaining why progress remains so slow.

A persistent gap: the numbers behind the reality

To frame the conversation, it is worth starting with a few recent data points:

〰️ (1) Globally, women represent around 50% of the working-age population, yet they account for only 40% of total employment and hold approximately 35.4% of management positions, according to the International Labour Organization [1].
〰️ (2) Within the technology sector, the situation is even more pronounced. In Europe, women account for fewer than one in five tech workers [6], and according to McKinsey’s analysis, their presence in core technical roles has not only failed to improve over time but has actually declined: from 22% in earlier reports to approximately 19% in more recent ones. This suggests that, rather than closing, the gap may in fact be widening [2].
〰️ (3) At the highest levels, the numbers are equally telling. In 2025, women lead just 11% of Fortune 500 companies, compared to 10.4% the previous year [3], a modest increase that, in perspective, highlights the slow pace of progress.
〰️ (4) According to the 2025 Women’s Power Gap report, of the 64 new CEOs appointed in the S&P 500 in 2024, only 11 were women (17% of the total), and none were founders of the companies they were set to lead [4].
〰️ (5) The gender pay gap adds another layer to this picture: in the European Union, women earn on average around 12% less than men [5].
〰️ (6) 82% of the female leaders surveyed say they have had to change companies at least once in order to take the next step in their professional career [9].

These figures describe the outcome, but not the process. To understand why this situation persists, we need to look inside organizations and examine the mechanisms shaping women’s career progression.

The broken rung: when careers start at a disadvantage

One of the most useful concepts for explaining this gap is the “broken rung”. The image is precise: it is not about a glass ceiling preventing women from reaching the top, but rather a damaged step at the very beginning that makes it harder for many women to take their first step into leadership.

According to a McKinsey study conducted in the United States, for every 100 men promoted to their first management role, only around 80 women achieve the same advancement [2]. At first glance, this may seem like a small difference, but its consequences compound over time. If fewer women reach the first step of leadership, there will also be fewer candidates at the next level, and even fewer at the level above.

With each promotion, the starting pool shrinks, and female representation gradually diminishes as one moves up the hierarchy.
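The compounding effect can be illustrated with a few lines of Python. This is a rough sketch under a strong assumption: it applies the same 80-to-100 promotion ratio at every level, whereas the McKinsey figure refers only to the first step into management. The function name, the 50% starting share, and the number of levels are hypothetical choices made for illustration.

```python
def women_share_by_level(start_share: float, promo_ratio: float, levels: int) -> list[float]:
    """Share of women at each level when women are promoted at
    `promo_ratio` times the rate of men (assumed constant per level)."""
    women, men = start_share, 1.0 - start_share
    shares = [start_share]
    for _ in range(levels):
        # Men advance at a reference rate of 1.0; women at promo_ratio.
        women *= promo_ratio
        shares.append(women / (women + men))
    return shares


# Hypothetical scenario: equal starting pool, 80 women promoted per 100 men.
shares = women_share_by_level(start_share=0.5, promo_ratio=0.8, levels=3)
for level, share in enumerate(shares):
    print(f"level {level}: {share:.1%} women")
```

Under these assumptions, the share of women drops from 50% to roughly 44%, 39%, and 34% over three promotion steps, even though the gap at any single step looks modest.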

This cascading effect largely explains why executive levels in technology companies show such limited representation. The problem is not the final barrier before reaching roles such as CEO or CTO; it lies in that initial moment when decisions are made about who takes on early management and leadership responsibilities and who does not.

At this point, it is worth adding another insight highlighted by McKinsey: 49% of women in the European technology sector reported experiencing sexism or bias in the past year, and 82% said they feel the need to prove their competence more than their male peers in order to be recognized [2].

These are not just individual experiences; they are indicators of an environment where the standards of evaluation are not the same for everyone, and where promotion decisions may be influenced by different expectations based on gender.

Invisible work: tasks that consume time without building careers

Alongside the broken rung, there is a second mechanism that operates more quietly but just as effectively: non-promotable work. This refers to all the tasks that are necessary for the day-to-day functioning of organizations but are not recognized in performance evaluations and do not contribute to career advancement.

The list is familiar to anyone who has worked in an organization: taking meeting notes, organizing team events, coordinating onboarding logistics for new hires, managing recognition initiatives or gifts, or participating in committees that have no direct impact on the business. These tasks are essential, yet they are not reflected in any performance metric and, when it comes to evaluating promotions, they simply do not count.

The issue is not only that these tasks go unrecognized, but also that they are not distributed equitably. According to an analysis published by The Guardian in 2022 [7], women tend to take on these responsibilities more frequently. This results in less time available for strategic projects and reduced visibility within the organization. In some cases, women can spend nearly a month more of work per year than their male counterparts on tasks that do not contribute to professional growth.

Over time, this pattern not only limits individual development but also structurally reinforces the gap in access to leadership roles.

Learning to stay relevant: the challenge of continuous upskilling

In this context, one of the most important responses is reskilling: the ability to learn new skills and adapt to ongoing market transformations. Developing capabilities in areas such as AI, data, cloud computing, infrastructure, DevOps, and security will be critical in the coming years for those who want to remain relevant and grow professionally.

However, technical training, while necessary, is not sufficient on its own. It is equally essential to develop a deep understanding of the industries where technology is applied: understanding the real challenges organizations face, identifying the most appropriate solutions for each context, and being able to design realistic implementation paths. In this sense, training in project management, agile methodologies, and research and development practices is not an optional complement, but a core component of the professional profile the market will demand.

As Meirav Oren, CEO and co-founder of Versatile, noted during the World Economic Forum:

This insight points to a well-documented phenomenon: many women tend to apply for new positions only when they feel they meet all the requirements, whereas men often apply when they meet only part of them. This is not a difference in capability, but rather a reflection of how the environment has shaped confidence and risk perception.

For this reason, fostering environments where women can take on challenges, learn through the process, and make their work visible is just as important as any technical training program.

Systemic barriers in transition: the added impact of AI

When viewed together, what emerges is not a list of isolated issues, but a system of barriers that reinforce one another. The broken rung reduces, from the outset, the number of women who enter leadership, while non-promotable work consumes the time and energy that could otherwise be invested in building visibility and career progression.

And to this already complex system, we must now add a new and accelerating force: artificial intelligence.

AI is redefining skills, roles, and organizational dynamics. As new opportunities emerge, others evolve or transform at an increasing pace.

However, this transformation also presents a specific challenge for women's participation in technology. In many teams, women have historically had stronger representation in areas such as design, user experience, and product management. According to McKinsey, women represent approximately 53% of design roles and 39% of product management positions [2].

These same areas are among those most affected by the adoption of AI-driven tools. In particular, early-career roles are already showing signs of decline, with a 3% decrease in design and a 2% decrease in product roles between 2024 and 2025 [2].

This does not mean these roles will disappear, but rather that they are evolving rapidly and demanding new technical and strategic capabilities. Entry-level profiles, in particular, face greater challenges, as they require structured support, continuous learning, and real opportunities to adapt.

In this context, the risk is not technological but structural: if women do not have equitable access to reskilling, upskilling, and leadership opportunities within these transformations, the gap may widen even further in the coming years.

None of these dynamics operate in isolation. Rather, it is their combination that explains why, despite the growing number of women entering the technology sector, representation in leadership roles remains so limited.

And precisely because the problem is systemic, the solutions must be as well.

Building the future of technology is also a matter of diversity

Technological progress opens up enormous opportunities for society, but it also raises a question we cannot ignore: who is designing the systems we will use in the future?

Algorithms, digital platforms, and artificial intelligence systems are not neutral. They are shaped by the decisions, experiences, and contexts of those who build them.

In software architecture, there is a principle known as Conway’s Law, which states that organizations design systems that mirror their communication structures. Applied to diversity, this means that if technology teams are not diverse — or if communication is hierarchical and limited — those same constraints may be reflected in the solutions we create.

This is not only a matter of equality, but also of innovation, social impact, and the quality of the technology we bring into the world. Diverse teams make better decisions, consider more perspectives, and ultimately build more robust solutions.

March 8 serves as a reminder that, although progress has been made, the path toward equitable participation in technology leadership is still ongoing. And this challenge does not belong to a single day or a single sector: it is part of an ongoing responsibility.

Promoting inclusion, supporting the professional development of women in technology, and creating real pathways to leadership are not just goals. They are ways of building teams where different perspectives can coexist and enrich the decisions we shape — now more than ever — in technology.

Because the future of technology will not only be defined by what we build... but by who is given the opportunity to build it.

📚 References

[1] Deloitte. (n.d.). Women at work: Global outlook. https://www.deloitte.com/global/en/issues/work/content/women-at-work-global-outlook.html
[2] McKinsey & Company. (n.d.). Women in tech and AI in Europe: Can the region close its gender gap? https://www.mckinsey.com/capabilities/mckinsey-technology/our-insights/women-in-tech-and-ai-in-europe-can-the-region-close-its-gender-gap#/
[3] Fortune. (2025, June 2). Fortune 500 female CEOs 2025. https://fortune.com/2025/06/02/fortune-500-female-ceos-2025/
[4] Women’s Power Gap. (2025). CEO report 2025. https://www.womenspowergap.org/wp-content/uploads/2025/05/WPG_CEO-Report_2025.pdf
[5] Council of the European Union. (n.d.). The EU’s gender pay gap: Facts and figures. https://www.consilium.europa.eu/en/policies/the-eu-s-gender-pay-gap-facts-and-figures/
[6] Euronews. (2026, March 8). Why women are disappearing from Europe’s tech workforce. https://www.euronews.com/next/2026/03/08/why-women-are-disappearing-from-europes-tech-workforce
[7] The Guardian. (2022, May 9). They feel guilty: Why women should say no to office housework. https://www.theguardian.com/society/2022/may/09/they-feel-guilty-why-women-should-say-no-to-office-housework
[8] World Economic Forum. (2025, June). What to know about AI and the gender gap. https://www.weforum.org/stories/2025/06/amnc25-what-to-know-about-ai-and-the-gender-gap/
[9] KPMG. (2025). Global female leaders outlook 2025. https://assets.kpmg.com/content/dam/kpmgsites/pt/pdf/kpmg-global-female-leaders-outlook-2025.pdf.coredownload.inline.pdf

AI in healthcare: how OpenAI is transforming medical care

Romina Elena Mendez Escobar — Mon, 19 Jan 2026 10:17:37 +0000

Introduction
Artificial intelligence is increasingly being adopted in highly regulated industries, and healthcare is a clear example of how this technology can improve processes, access to information, and the quality of care.
According to OpenAI’s latest product announcements, more than 230 million people worldwide use ChatGPT every week to ask questions related to health and wellbeing. This growing adoption reflects a broader shift in how individuals and professionals seek medical information and support.

Healthcare systems face significant challenges: clinical staff are often overwhelmed, medical knowledge is highly fragmented, and administrative complexity continues to grow. AI is beginning to address these issues by supporting decision-making, reducing operational burdens, and making medical information more accessible.

This month, OpenAI introduced new healthcare-focused capabilities designed to support both medical professionals and patients. These services aim to bring trusted information and care-related workflows closer to people, while prioritizing security, compliance, and responsible use in one of the most sensitive and regulated industries.

OpenAI for Healthcare: Operationalizing AI in Healthcare Organizations

OpenAI for Healthcare is specifically designed for healthcare organizations such as hospitals, research centers, clinic networks, and integrated health systems. Its primary goal is to provide a secure, enterprise-grade platform that enables these institutions to deliver more consistent, high-quality care, while reducing the administrative burden that consumes a significant amount of clinicians’ time.

One of the platform’s most distinctive capabilities is its evidence retrieval with clear citations. Responses are grounded in trusted medical sources, including millions of peer-reviewed studies, public health guidelines, and up-to-date clinical directives. This allows healthcare professionals to verify information more easily and support clinical decisions with reliable, evidence-based insights.

Another particularly valuable feature is the use of reusable templates to streamline workflows. These shared templates support common tasks such as drafting discharge summaries, patient instructions, clinical letters, and prior authorization requests. As a result, clinical teams spend less time rewriting repetitive documentation and searching for information, while patients benefit from clearer guidance and smoother transitions of care.

image source: https://openai.com/es-419/index/openai-for-healthcare/

Below is an overview of the main capabilities offered by this solution.

ChatGPT Health: A Smarter Way to Understand Your Health

ChatGPT Health is designed for individual users who want to better understand their own health and navigate a complex healthcare system. Health is already one of the most popular topics on ChatGPT: every week, more than 230 million people ask questions about health and wellbeing.

Users can securely connect personal health data from multiple sources, including electronic health records and wellness apps. They can also upload their own documents or images, such as lab results or medical reports. This centralization allows ChatGPT Health to provide more relevant, personalized responses, helping users interpret information, summarize results, and prepare for appointments.

The tool is designed for practical, everyday use. It can help users review lab results, prepare questions for medical visits, provide guidance on diet, exercise, or wellness routines, and support understanding of insurance options based on personal health habits. It also includes features like voice input, dictation, and advanced search, making the experience more accessible and tailored to individual needs.

image source: https://openai.com/es-ES/index/introducing-chatgpt-health/

Below is an overview of the types of data sources users can integrate with ChatGPT Health.

Comparative Overview

A side-by-side look at how OpenAI for Healthcare and ChatGPT Health support clinical teams and individual users with AI-driven health insights.

Conclusion

The introduction of AI in healthcare is showing real potential, not only as a tool to support clinical workflows, but also as a way to provide reliable information and guidance to people who may not have easy access to specialized care. OpenAI for Healthcare and ChatGPT Health represent a major step forward in applying AI to one of the most regulated and sensitive industries.

Currently, these tools are limited in availability: OpenAI for Healthcare serves select institutions, and ChatGPT Health operates through a waitlist. How and when these solutions expand to smaller clinics, rural areas, or other countries will be key in determining their ability to truly democratize access to high-quality health support.

Healthcare is constantly evolving, with new scientific evidence, clinical guidelines, and regulatory updates. AI solutions like these can help by keeping pace with these changes, providing relevant and accurate information over time.

AI will not replace healthcare professionals, but these tools offer opportunities to reduce administrative burdens, improve efficiency, and empower both clinicians and patients with personalized insights. By making healthcare more accessible, understandable, and responsive, AI can complement human care and help achieve better outcomes.

📚 References

OpenAI. (2025). OpenAI for Healthcare. OpenAI. https://openai.com/es-419/index/openai-for-healthcare/
OpenAI. (2025). Introducing ChatGPT Health. OpenAI. https://openai.com/es-ES/index/introducing-chatgpt-health

📌 How to cite this article

APA style

Mendez Escobar, Romina Elena. (2025). AI in healthcare: how OpenAI is transforming medical care.

https://dev.to/r_elena_mendez_escobar/ai-in-healthcare-how-openai-is-transforming-medical-care-ffn

BibTeX


@article{mendez2025aihealthcare,
  title  = {AI in healthcare: how OpenAI is transforming medical care},
  author = {Mendez Escobar, Romina Elena},
  year   = {2025},
  url    = {https://dev.to/r_elena_mendez_escobar/ai-in-healthcare-how-openai-is-transforming-medical-care-ffn}
}

TOON vs JSON for LLM Prompts: Can We Reduce Token Usage Without Losing Response Quality?

Romina Elena Mendez Escobar — Mon, 05 Jan 2026 08:41:45 +0000

Introduction

Over the past few months, I came across several articles claiming that TOON can significantly reduce token usage in LLM prompts compared to traditional JSON.

That raised a few questions for me:

Does TOON still provide benefits with real-world API responses?
How much does it actually reduce tokens?
And more importantly: does changing the format affect how an LLM interprets the data or the quality of the response?

Answering these questions isn’t simple, and the results can vary depending on the dataset, the structure of the data, and even the LLM itself. It’s also not just a matter of counting tokens: different formats may influence how the model understands and processes the information.

In this article, I aim to run a practical benchmark to explore whether TOON could be useful in production pipelines, in what contexts it performs best, and whether it works well across different types of JSON.

This article walks through the experiment, the results, and the conclusions.

What Is TOON (and How Is It Different from JSON)?

TOON (Token-Oriented Object Notation) is a data serialization format designed specifically for LLM prompts. The goal is simple: reduce syntactic overhead while remaining readable for both humans and machines.
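To make the idea of "syntactic overhead" concrete, here is a minimal sketch that writes a list of uniform, flat records in a TOON-like tabular layout (a header with the row count and field names, then one comma-separated line per record) and compares its length with the JSON equivalent. It is a simplified illustration, not the TOON specification and not the toon-format library: the function `encode_uniform_rows` and the sample records are hypothetical, and quoting, escaping, and nested values are deliberately ignored.

```python
import json


def encode_uniform_rows(name: str, rows: list[dict]) -> str:
    """Encode flat dicts that share the same keys in a TOON-like tabular layout.

    Simplified for illustration: no quoting, escaping, or nested values.
    """
    if not rows:
        return f"{name}[0]:"
    fields = list(rows[0].keys())
    if any(list(row.keys()) != fields for row in rows):
        raise ValueError("all rows must share the same keys in the same order")
    # Field names are declared once in the header instead of once per record.
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    lines = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


# Hypothetical sample records
rows = [
    {"article": "Main_Page", "views": 100, "rank": 1},
    {"article": "Python", "views": 50, "rank": 2},
]

as_json = json.dumps({"articles": rows})
as_tabular = encode_uniform_rows("articles", rows)

print(as_tabular)
print(f"JSON characters: {len(as_json)}, tabular characters: {len(as_tabular)}")
```

The saving comes from stating the keys once rather than repeating them, with braces and quotes, for every record. Character counts are only a proxy here; the benchmark in this article measures tokens.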

The Experiment

This experiment evaluates whether alternative data serialization formats can reduce token usage in LLM prompts without degrading response quality.

The experiment follows four main stages:

Dataset Fetching: Data is retrieved from public APIs and prepared for downstream processing.
Token Benchmarking: Each dataset is encoded in JSON and TOON, and token counts are computed using a tokenizer to measure size differences across formats.
LLM Interaction: The serialized data is sent to an LLM via Amazon Bedrock to generate responses and embeddings under deterministic settings.
Semantic Evaluation: Outputs generated from JSON and TOON prompts are compared using semantic (cosine similarity) and lexical (ROUGE, BLEU) metrics to assess equivalence.

The goal is not to optimize prompt content, but to isolate the impact of serialization format on token efficiency and response consistency.
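The semantic comparison in the last stage relies on cosine similarity between embedding vectors. As a minimal sketch of that metric, assuming the embeddings are available as plain lists of floats (the function name and the toy vectors are illustrative, not taken from the experiment's code):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Similarity is undefined for a zero vector; return 0.0 by convention.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for the embeddings of two model responses
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

A value close to 1 means the two responses point in nearly the same direction in embedding space, which is the signal used to judge whether the JSON and TOON prompts led to equivalent answers.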

Datasets

In this experiment, I wanted to test TOON with realistic, publicly available data, rather than small, manually created datasets. Using real API responses allows us to see how token savings and LLM behavior hold up in practical scenarios.
I selected two public APIs with very different characteristics:

GitHub Events API: Returns a stream of recent public events on GitHub, such as pushes, pull requests, issues, and comments.
- 🔗 URL: https://api.github.com/events
- 🧩 Data structure: Deeply nested, heterogeneous objects with multiple levels of dictionaries and arrays.
- 💡 Why this matters: Represents the kind of complex operational API data you might send to an LLM in real projects.
Wikipedia Page Views API: Returns the top-viewed articles on English Wikipedia for a given day.
- 🔗 URL: https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/all-access/2024/01/01
- 🧩 Data structure: Flat, repetitive list of articles, each with the same small set of fields (article, views, rank).
- 💡 Why this matters: Ideal for testing TOON’s efficiency with flat, repetitive data, where token savings are expected to be highest.

Using these two APIs allows us to evaluate TOON in both complex nested and flat list scenarios, giving a more comprehensive view of its performance in real-world LLM prompts.
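One simple way to quantify the difference between the two shapes is to measure how deeply each record is nested. The helper below is a sketch written for this purpose; `max_depth` and both sample records are hypothetical and only mimic the general shape of the two APIs' responses.

```python
def max_depth(value) -> int:
    """Nesting depth of a JSON-like value: scalars are 0, each dict or list adds 1."""
    if isinstance(value, dict):
        return 1 + max((max_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((max_depth(v) for v in value), default=0)
    return 0


# Hypothetical records shaped like a nested event and a flat page-view entry
nested_event = {
    "type": "PushEvent",
    "actor": {"login": "octocat"},
    "payload": {"commits": [{"author": {"name": "octocat"}}]},
}
flat_article = {"article": "Main_Page", "views": 100, "rank": 1}

print(max_depth(nested_event), max_depth(flat_article))
```

A record with depth 1 maps naturally onto a single row of values, while deeper records need additional structure in any format, which is why the two datasets are expected to behave differently.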

Fetching the Data

To extract data from these APIs, I created the following utility class:

from typing import Dict, List

import requests


class DatasetFetcher:
    """Fetch datasets from different sources"""

    @staticmethod
    def fetch_github_events(limit: int = 30) -> List[Dict]:
        """Fetch recent GitHub events"""
        url = "https://api.github.com/events"
        # A timeout keeps the benchmark from hanging on a slow connection
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        return response.json()[:limit]

    @staticmethod
    def fetch_wikipedia_pages(limit: int = 30) -> List[Dict]:
        """Fetch popular Wikipedia pages"""
        # Identify the client with a descriptive User-Agent
        headers = {
            "User-Agent": "TOON-Benchmark/1.0 (Research)"
        }
        url = "https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/all-access/2024/01/01"
        response = requests.get(url, headers=headers, timeout=30)
        response.raise_for_status()

        data = response.json()
        # The article list sits inside the first element of "items"
        articles = data["items"][0]["articles"][:limit]
        return articles

This class allows you to quickly fetch sample datasets for testing token efficiency with TOON and JSON formats.

Part 1: Token Reduction

🏗️ Methodology

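To measure token usage, I used tiktoken, OpenAI’s open-source tokenizer library. This allows us to estimate how many tokens are consumed by the prompt payload itself, independent of the model’s output. Models with a different tokenizer will report different absolute counts, so the numbers are most useful as a relative comparison between formats.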
For TOON generation, I used the toon-format library, which converts Python objects into TOON while preserving structure and ordering.
The following classes implement token counting and incremental benchmarking using these libraries:

import tiktoken


class TokenCounter:
    """Count tokens using tiktoken"""

    def __init__(self, model: str = "gpt-4"):
        self.encoder = tiktoken.encoding_for_model(model)

    def count(self, text: str) -> int:
        """Count tokens in a text string"""
        return len(self.encoder.encode(text))

This class allows you to quickly count tokens in any string, whether it’s JSON, TOON, or plain text.

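import json
from typing import Dict, List

import pandas as pd

# Note: toon_encode (the encoder from the toon-format library) and
# BenchmarkConfig are assumed to be imported or defined as in the repository.


class TokenBenchmark: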
    """Benchmark token reduction: JSON vs TOON"""

    def __init__(self, config: BenchmarkConfig):
        self.config = config
        self.counter = TokenCounter(config.model)

    def incremental_benchmark(self, data_list: List[Dict], dataset_name: str) -> pd.DataFrame:
        """
        Perform incremental benchmark comparing JSON vs TOON

        Args:
            data_list: List of objects to analyze
            dataset_name: Name of dataset for identification

        Returns:
            DataFrame with benchmark results
        """
        results = []
        accum = []

        for idx, item in enumerate(data_list, start=1):
            accum.append(item)

            # Encode in both formats
            json_prompt = json.dumps(accum, ensure_ascii=False)
            toon_prompt = toon_encode(accum)

            # Count tokens
            json_tokens = self.counter.count(json_prompt)
            toon_tokens = self.counter.count(toon_prompt)

            # Calculate reduction
            saved = json_tokens - toon_tokens
            reduction_pct = (saved / json_tokens) * 100 if json_tokens else 0

            results.append({
                "num_items": idx,
                "JSON_tokens": json_tokens,
                "TOON_tokens": toon_tokens,
                "tokens_saved": saved,
                "reduction_pct": round(reduction_pct, 2),
                "dataset": dataset_name
            })

        return pd.DataFrame(results)

These classes allow us to incrementally benchmark token usage, providing a detailed view of how much TOON reduces tokens compared to JSON as items accumulate in a prompt.

🧪 Results: Token Reduction Metrics

In the code available in the repository, you can see the classes used to compute these results.

What stands out, however, is that token reduction is not uniform across datasets.

dataset	mean (%)	std (%)	min (%)	max (%)
github_events	2.77	0.26	2.60	4.02
wikipedia_pages	42.61	6.66	13.64	46.70

GitHub Events (complex, nested data) - Average token reduction: ~3%
Wikipedia Pages (flat, repetitive data) - Average token reduction: ~43%

💡 Why the difference?

For GitHub Events, the reduction is only ~3%, which means that using TOON instead of JSON does not significantly reduce token usage. The reason is that deep nesting and heterogeneous keys limit how much syntactic overhead can be removed.
For Wikipedia Pages, the reduction is ~43% because flat, repetitive lists benefit greatly from removing braces, commas, and repeated field names.
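
To make the mechanism concrete, here is a minimal sketch that serializes the same flat rows as JSON and as a TOON-style table. The `to_tabular` helper is hand-rolled for illustration only: it is not the toon-format library, it assumes every row has the same keys and that values contain no commas or newlines, and the sample rows are made up. Character counts are used as a rough stand-in for tokens.

```python
import json

# Made-up sample rows with the flat, uniform shape of the Wikipedia dataset.
rows = [
    {"article": "Main_Page", "views": 4500000, "rank": 1},
    {"article": "Special:Search", "views": 1200000, "rank": 2},
    {"article": "Python_(programming_language)", "views": 90000, "rank": 3},
]


def to_tabular(items):
    """Hand-rolled TOON-style table: field names once, then one row per item.

    Illustrative only; assumes uniform keys and values without commas or newlines.
    """
    fields = list(items[0].keys())
    header = f"[{len(items)}]{{{','.join(fields)}}}:"
    lines = ["  " + ",".join(str(item[f]) for f in fields) for item in items]
    return "\n".join([header] + lines)


json_text = json.dumps(rows, ensure_ascii=False)
tabular_text = to_tabular(rows)

print(tabular_text)
print(f"JSON chars: {len(json_text)}, tabular chars: {len(tabular_text)}")
```

In the JSON version every field name, quote, and brace is repeated for each row, while the tabular version states the field names once. Nested, heterogeneous objects such as GitHub events cannot be collapsed into a single header like this, which is consistent with the much smaller savings observed there.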

Part 2: Does Response Quality Stay the Same?

The second experiment focuses on response quality. The goal is to verify whether the same prompt, with the data encoded in JSON versus TOON, produces equivalent outputs from the LLM.
For this experiment, I used the Wikipedia dataset, since it showed the highest token reduction (~43% on average). This makes it an ideal candidate to evaluate whether aggressive token savings have any negative impact on output quality.
To compare the responses, I generated outputs using both formats and evaluated them using several text similarity metrics.

🧪 Results: Evaluation Metrics

To assess output quality, I used cosine similarity over embeddings together with ROUGE-1, ROUGE-2, ROUGE-L, and BLEU, each capturing a different aspect of similarity.

LLM and Embeddings Setup (AWS Bedrock)

All responses and embeddings were generated using AWS Bedrock, Amazon’s fully managed service for accessing foundation models.
The following models were used:

⚡ Amazon Nova Lite (amazon.nova-lite-v1:0): A lightweight, cost-efficient LLM optimized for fast inference. In this experiment, it was used for prompt completion and response generation.
⚡ Amazon Titan Embeddings (amazon.titan-embed-text-v2:0): A text embedding model that converts text into high-dimensional vectors. It was used to generate vector representations of the responses for semantic similarity comparison.

Bedrock Client Implementation

The following class encapsulates interaction with AWS Bedrock for both prompt generation and embedding extraction.

`invoke_prompt`

This method sends a prompt to the LLM and returns the generated response.
It accepts the following parameters:

💬 prompt: The base instruction or question provided to the model.
📄 dataset: The data to analyze, encoded either in JSON or TOON, which is appended to the prompt.
🌡️ temperature: Controls the randomness of the model’s output.

🌡️ Why `temperature = 0`?

The temperature is fixed at 0 for the following reasons:

It reduces randomness in model outputs
It makes responses deterministic across multiple runs
It ensures that any differences in the outputs are due to the input format (JSON vs TOON), not sampling variability

Without fixing the temperature, it would be impossible to reliably attribute differences in response quality to the serialization format alone.

`get_embeddings`

This method generates vector embeddings for a given text using the embedding model.
The resulting vectors are later used to compute cosine similarity, allowing us to measure semantic equivalence between responses generated from JSON and TOON inputs.

Overall, these parameters allow us to control model behavior and isolate the impact of input serialization, with temperature being the most important variable for this experiment.

import json
from typing import List, Optional

import boto3


class AWSBedrockClient:
    """Client to interact with AWS Bedrock"""

    def __init__(self, region: str, model_prompt: str, model_embedding: str,
                 aws_access_key_id: Optional[str] = None,
                 aws_secret_access_key: Optional[str] = None):
        self.region = region
        self.model_prompt = model_prompt
        self.model_embedding = model_embedding
        self.client = boto3.client(
            service_name='bedrock-runtime',
            region_name=region,
            aws_access_key_id=aws_access_key_id,
            aws_secret_access_key=aws_secret_access_key
        )

    def invoke_prompt(self, prompt: str, dataset: str = "", temperature: float = 0.0) -> str:
        """
        Invoke model with prompt

        Args:
            prompt: Base prompt
            dataset: Data to analyze (JSON or TOON encoded)
            temperature: 0 = deterministic, higher = more random
        """
        prompt_final = f"{prompt} {dataset}".strip()

        payload = {
            "messages": [
                {
                    "role": "user",
                    "content": [{"text": prompt_final}]
                }
            ],
            "inferenceConfig": {
                "max_new_tokens": 5000,
                "temperature": temperature,
                "top_p": 0.9
            }
        }

        try:
            response = self.client.invoke_model(
                modelId=self.model_prompt,
                body=json.dumps(payload)
            )
            response_body = json.loads(response['body'].read())
            return response_body['output']['message']['content'][0]['text']
        except Exception as e:
            # Chain the original exception so the root cause stays in the traceback
            raise RuntimeError(f"Error invoking prompt model: {e}") from e

    def get_embeddings(self, text: str) -> List[float]:
        """Generate embeddings for a text"""
        payload = {"inputText": text}

        try:
            response = self.client.invoke_model(
                modelId=self.model_embedding,
                body=json.dumps(payload)
            )
            response_body = json.loads(response['body'].read())
            return response_body['embedding']
        except Exception as e:
            # Chain the original exception so the root cause stays in the traceback
            raise RuntimeError(f"Error generating embeddings: {e}") from e

Experimental Setup

The experiment is based on the following principles:

Same prompt structure, changing only the data serialization format (JSON vs TOON)
25 independent runs per format to capture variability and compute robust statistics
Temperature = 0 to minimize randomness and ensure deterministic model behavior

This setup allows us to isolate the impact of the serialization format on the model’s output.

Prompt Design

The same prompt is used in all executions; only the data attached to it changes, encoded as either TOON or JSON.

By concatenating the dataset directly to the prompt, we ensure that the instruction remains identical, and any differences in the response are attributable solely to the input format.

Evaluation Procedure

To assess response equivalence between JSON and TOON, the experiment relies on the SemanticEvaluator class, which encapsulates response generation and similarity evaluation.
At the core of the evaluation is the comparison of two responses per run, generated using the same prompt but different data encodings (JSON vs TOON), with temperature fixed at 0 to ensure deterministic behavior.
The evaluation is structured as follows:

cosine_similarity computes semantic similarity between the two responses using embedding vectors generated by Amazon Titan. This metric captures meaning-level equivalence and is insensitive to surface-level wording changes.
evaluate_single_run performs a full comparison for one run. It invokes the LLM twice (JSON and TOON), generates embeddings, and computes cosine similarity along with lexical overlap metrics (ROUGE-1, ROUGE-2, ROUGE-L) and BLEU. The output is a consolidated set of similarity scores for that run.
evaluate_multiple_runs repeats the single-run evaluation 25 times using the same prompt and dataset. Results from all runs are aggregated into a DataFrame, enabling statistical analysis such as mean values, variance, and stability across runs.

This design allows us to determine whether TOON’s token savings preserve response quality, both semantically and lexically, across multiple deterministic evaluations.
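
To illustrate what the lexical metrics measure, the sketch below implements simplified versions of ROUGE-1 and ROUGE-L from scratch. It is not the code used by SemanticEvaluator: tokenization is plain whitespace splitting with no stemming, and the two example sentences are made up. It shows how a reordered answer can keep full word overlap while scoring lower on the order-sensitive metric.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 (simplified ROUGE-1: whitespace tokens, no stemming)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    overlap = sum((Counter(ref) & Counter(cand)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence, via dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """LCS-based F1 (simplified ROUGE-L), sensitive to word order."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Made-up answers: same words, different order.
json_answer = "views rose sharply on monday"
toon_answer = "on monday views rose sharply"

print("ROUGE-1 F1:", rouge1_f1(json_answer, toon_answer))
print("ROUGE-L F1:", rouge_l_f1(json_answer, toon_answer))
```

Both sentences carry the same information, yet only the unigram metric treats them as identical. This is the kind of gap to keep in mind when reading lexical scores next to cosine similarity.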

Results

After running 25 deterministic evaluations (temperature = 0), the analysis focused exclusively on response equivalence, measuring whether JSON and TOON produce comparable outputs when token savings are significant.

Semantic Equivalence (Cosine Similarity ≈ 0.991)
The most important signal comes from cosine similarity, computed using embeddings generated by Amazon Titan.

An average score of 0.991 indicates that responses generated from TOON-encoded data are semantically equivalent to those generated from JSON.

Despite the removal of structural syntax such as braces, quotes, and repeated field names, the model preserved its ability to reason over the data and extract the same insights.
Across all runs, the meaning of the responses remained consistent.

Lexical Variability vs. Data Accuracy

Lexical similarity metrics such as ROUGE-1 and BLEU report lower absolute values:

ROUGE-1 F1 = 0.747
ROUGE-L F1 = 0.608
BLEU = 0.563

These scores indicate a moderate degree of lexical and structural variation between responses generated from JSON and TOON inputs. In particular, ROUGE-1 suggests partial overlap at the word level, while the lower ROUGE-L score highlights differences in sentence structure and ordering, consistent with paraphrasing and reformulation rather than content loss. Similarly, BLEU, which is sensitive to exact n-gram matches and word order, penalizes these variations even when responses remain correct and informative.

Importantly, these lexical differences do not correspond to a degradation in response quality. When inspecting the actual content of the responses, including rankings, averages, and detected trends, the results were numerically and logically consistent across formats.

🗂️ Code repository
If you want to review the code and reproduce these experiments, everything is available in my repository.
If you find this tutorial useful, do not forget to leave a star ⭐️ on the repository and follow me to receive notifications about new articles. Your support helps me keep creating valuable technical content for the community 🚀

RominaElenaMendezEscobar / experiment-toon-vs-json

This repository contains a practical benchmark comparing JSON and TOON (Token-Oriented Object Notation) as data serialization formats for LLM prompts.

Conclusions

This experiment shows that TOON can significantly reduce token usage while preserving response quality, as long as it is applied to the right type of data. For flat, repetitive structures, TOON acts as an effective form of prompt compression: the LLM retains semantic understanding, and any differences in wording are superficial rather than affecting meaning or correctness.

⚠️ Key limitations:

Only a single LLM was tested (Amazon Nova Lite)
Only specific datasets were used (GitHub Events and Wikipedia Page Views)
Evaluation was conducted in English only
Prompts were simple analytical tasks, not complex reasoning scenarios

As in any systems project, solutions should be carefully evaluated to determine whether they are truly optimal for a given use case. Outcomes often depend on many variables, so testing and validation in the specific context are essential before making decisions or implementing at scale.

📌 How to cite this article

APA style

Mendez Escobar, Romina Elena. (2025). TOON vs JSON for LLM Prompts: Can We Reduce Token Usage Without Losing Response Quality?.

https://dev.to/r_elena_mendez_escobar/toon-vs-json-for-llm-prompts-can-we-reduce-token-usage-without-losing-response-quality-59ed

BibTeX


@article{mendez2025ai,
  title  = {TOON vs JSON for LLM Prompts: Can We Reduce Token Usage Without Losing Response Quality?},
  author = {Mendez Escobar, Romina Elena},
  year   = {2025},
  url    = {https://dev.to/r_elena_mendez_escobar/toon-vs-json-for-llm-prompts-can-we-reduce-token-usage-without-losing-response-quality-59ed}
}

From Coffee Products to AI Search: Building a Serverless Semantic Search Architecture with Amazon S3 Vectors and Bedrock

Romina Elena Mendez Escobar — Wed, 31 Dec 2025 10:11:22 +0000

In recent months, we have increasingly incorporated artificial intelligence into our solutions, and with it a recurring need has emerged: searching and querying our own data using natural language efficiently.

Use cases such as semantic search or building solutions based on Retrieval-Augmented Generation (RAG) are no longer optional. Today, we need to understand the meaning of text, combine it with structured filters, and do so in an efficient and scalable way.
In this article, I explore a recent alternative within the AWS ecosystem: Amazon S3 Vectors 🪣, a serverless approach for vector storage and querying that aims to balance scalability, simplicity, and cost.

To make it more concrete (and a bit more entertaining), we will work with a dataset of coffee products ☕ and build a complete flow that goes from generating embeddings with Amazon Bedrock 🧠 to an application deployed on AWS with Streamlit ✨, which allows natural language searches combined with filters.

A quick note on embeddings and semantic search

Before diving into the implementation, it is worth briefly clarifying two key concepts used throughout this tutorial:

Embeddings are numerical representations of text that capture semantic meaning. Instead of relying on exact word matching, embeddings map text into high-dimensional vector spaces where semantically similar pieces of text are positioned closer together. This representation allows systems to reason about intent and context rather than purely lexical similarity.
Semantic search builds on top of embeddings by retrieving results based on meaning rather than exact terms. A user query is first transformed into an embedding and then compared against stored vectors using similarity metrics such as cosine or Euclidean distance. This approach enables more flexible, intent-aware searches and can be further refined by combining semantic similarity with structured metadata filters to improve precision and relevance.
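
As a small worked example of these two ideas, the sketch below ranks a toy catalog against a query using cosine similarity and also reports the Euclidean distance. The three-dimensional vectors and product names are made up for illustration; real embeddings from Amazon Titan have far more dimensions and would come from the model, not from hand-written numbers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy "embeddings" (made up for this sketch).
catalog = {
    "vanilla instant coffee": [0.9, 0.1, 0.0],
    "dark roast whole beans": [0.1, 0.9, 0.1],
    "ceramic coffee mug": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "sweet creamy instant coffee"

# Higher cosine similarity means a closer match, so sort in descending order.
ranked = sorted(
    catalog,
    key=lambda name: cosine_similarity(query, catalog[name]),
    reverse=True,
)

for name in ranked:
    cos = cosine_similarity(query, catalog[name])
    dist = euclidean_distance(query, catalog[name])
    print(f"{name}: cosine={cos:.3f}, euclidean={dist:.3f}")
```

Note the opposite directions of the two metrics: a good match has a high cosine similarity but a low Euclidean distance.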

What is Amazon S3 Vectors?

Amazon S3 Vectors is a new type of storage within Amazon S3 designed specifically to natively store and query vectors.
In addition to storing vectors, this type of bucket allows associating structured metadata, which enables queries that combine semantic search with filters on those attributes.
Vector buckets support searches based on distance metrics, such as:

Cosine similarity: measures how similar two vectors are based on the angle between them, and is very common in text embeddings.
Euclidean distance: measures the “geometric” distance between two vectors in space.

Unlike traditional vector databases, Amazon S3 Vectors makes it possible to implement a fully serverless architecture, achieving a good balance between scalability, operational simplicity, and cost.

How do vectors work in Amazon S3?

Amazon S3 Vectors is based on the following main components:

🪣 1. Vector buckets
These are specialized buckets optimized for vector storage.
They support encryption and organize data internally through vector indexes, which enables efficient large-scale searches.

🧭 2. Vector indexes
An index defines how vectors are stored and queried within the bucket.
In addition to the vector, it allows associating metadata, which can later be used in queries through filters with a syntax similar to well-known operators, such as those used in MongoDB.

🔍 3. Queries
Queries are based on similarity searches, using the distance metric configured when creating the index, such as cosine or Euclidean.
These searches can be combined with metadata filters to refine results and reduce ambiguities.

⚙️ 4. API
Amazon S3 Vectors exposes an API that allows querying data through operations such as QueryVectors.
These queries can be executed using tools like the AWS CLI or Boto3, combining a query vector with metadata-based filters and parameters such as the number of results to return or whether to include the distance between vectors.
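
The pieces described above can be sketched as a small helper that assembles a QueryVectors request. This is an illustrative sketch, not the repository's implementation: the bucket and index defaults are the reference names used in this tutorial, the metadata keys in the filter (`average_rating`, `price`) are hypothetical, and the commented-out Boto3 call assumes credentials and a region are already configured.

```python
from typing import Any, Dict, List, Optional


def build_query_request(
    query_vector: List[float],
    top_k: int = 5,
    metadata_filter: Optional[Dict[str, Any]] = None,
    bucket: str = "coffee-products-tutorial",
    index: str = "idx-coffee-products",
) -> Dict[str, Any]:
    """Assemble keyword arguments for the S3 Vectors QueryVectors operation."""
    request: Dict[str, Any] = {
        "vectorBucketName": bucket,
        "indexName": index,
        "queryVector": {"float32": query_vector},
        "topK": top_k,
        "returnDistance": True,   # include the distance for each match
        "returnMetadata": True,   # include the stored metadata for each match
    }
    if metadata_filter:
        request["filter"] = metadata_filter
    return request


# Hypothetical metadata keys; use whatever keys were stored with the vectors.
example_filter = {
    "$and": [
        {"average_rating": {"$gte": 4.0}},
        {"price": {"$lte": 20}},
    ]
}
request = build_query_request([0.1] * 1024, top_k=3, metadata_filter=example_filter)

# With credentials configured, the request could be sent like this:
# import boto3
# client = boto3.client("s3vectors", region_name="us-east-1")
# response = client.query_vectors(**request)
# for match in response["vectors"]:
#     print(match["key"], match.get("distance"), match.get("metadata"))
```

Keeping the request construction separate from the network call makes the filter logic easy to inspect and test without touching AWS.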

Process Flow

The previous image shows the complete workflow to implement semantic search using Amazon S3 Vectors, divided into three main stages:

1️⃣ Generate Vector Embeddings

The process starts from the input documents. These documents are sent to an embeddings model, in this case Amazon Titan through Amazon Bedrock, which transforms the text into numerical vectors.
At this stage, not only are the vectors generated, but metadata describing each document is also associated.

2️⃣ Store Vector Data

The generated vectors, together with their metadata, are stored in an S3 Vector Bucket.
Within the bucket, the data is organized through one or more vector indexes, defined with a specific distance metric.
Being integrated into AWS, this data can be consumed by other services such as Amazon Bedrock, Amazon SageMaker, or Amazon OpenSearch.

3️⃣ Semantic Search via Vector Index

To perform a search, a natural language query is transformed again into a vector using the same embeddings model.
This query vector, together with metadata filters and the topK parameter, is used to query the vector index and retrieve the most semantically similar results.

Reference Architecture

In this tutorial, the use case is based on processing data initially stored in JSON format, which is transformed into Parquet as part of a data preparation workflow. From this processed data, the Amazon Titan model is invoked through Amazon Bedrock to generate embeddings, which are then stored together with their metadata in an Amazon S3 Vectors bucket, thus enabling semantic queries over the information.

Data processing is carried out through an AWS Glue job in Python, which implements the data-cleaning stage typical of any production data pipeline. In this phase, only the relevant fields are selected, text descriptions are normalized and corrected when necessary, and only after this cleaning is completed is the Titan model invoked. This approach helps optimize costs and performance by avoiding unnecessary model calls on data that will not be used later.

Finally, the data stored in the vector bucket is consumed by an application developed with Streamlit, which is deployed on AWS Elastic Beanstalk within a VPC. The application allows user queries to be transformed back into embeddings and used to query the vector index, combining semantic search with metadata-based filters, while access to services and system observability are managed through IAM roles and CloudWatch Logs.

Amazon Bedrock and Amazon Titan

Amazon Bedrock is a fully managed service that allows developers to build, deploy, and scale applications powered by artificial intelligence without the need to manage infrastructure. Through a unified API, Bedrock provides access to foundation models from different providers, making their integration into cloud architectures simple and secure.

For this tutorial, we use Amazon Titan Text Embeddings V2, a model available in Bedrock that can process up to 8,192 tokens or 50,000 characters and generate 1,024-dimensional vectors. This model is optimized for information retrieval tasks, semantic search, similarity measurement, and clustering, making it a suitable choice for RAG scenarios and large-scale text analysis.

AWS Elastic Beanstalk

AWS Elastic Beanstalk is a managed service that allows you to deploy and run web applications without the need to directly manage the underlying infrastructure. It automatically handles resource provisioning, load balancing, scaling, and monitoring, allowing the focus to remain on application development rather than operations.
In this tutorial, we use Elastic Beanstalk to deploy the application developed with Streamlit, taking advantage of its native integration with services such as EC2, Auto Scaling, and CloudWatch, which enables a fast, secure, and scalable deployment.

📊 Dataset

The dataset used in this tutorial was obtained from the Amazon Reviews 2023 project, presented in the paper Bridging Language and Items for Retrieval and Recommendation (Hou et al., 2024). This dataset contains reviews and metadata for Amazon products, including titles, descriptions, categories, stores, and ratings.
For this use case, only the “Grocery_and_Gourmet_Food” category was selected, and within it, products related to coffee were filtered. This allows us to work with rich textual information and structured attributes that are ideal for semantic search scenarios.
The project repository includes both the filtered coffee product datasets and the already processed versions containing vector embeddings, making it easier to reproduce the tutorial and analyze the complete workflow.

Use Case

The use case presented in this tutorial starts from a simple but representative scenario: a user who wants to query coffee products using natural language, exploring the available catalog in a more flexible and intuitive way than a traditional search.

To enable this type of query, different textual attributes of the product are used, such as the title, description, and category, which helps better capture user intent. Within the dataset, several coffee-related categories are included, such as Coffee, Instant Coffee, Ground Coffee, Whole Coffee Beans, Single-Serve Capsules & Pods, Iced Coffee & Cold-Brew, among others.

Based on this, an application is designed in which the user can interact primarily through natural language, while complementing the search with structured filters to reduce ambiguities. These filters include, for example, product rating, store name (a detail that users often do not know or remember precisely), and price, allowing more accurate and relevant results without relying exclusively on a textual query.

Prerequisites

(1) 🗂️ Code repository

To follow this tutorial, it is necessary to clone the project repository, where the complete solution code is available.
In the following sections, the most relevant aspects of the implementation and design decisions are highlighted, rather than providing an exhaustive walkthrough of the entire source code.
If you find this tutorial useful, do not forget to leave a star ⭐️ on the repository and follow me to receive notifications about new articles. Your support helps me keep creating valuable technical content for the community 🚀

RominaElenaMendezEscobar / s3-vector-coffee-tutorial

S3 Vector tutorial using cafe data and creating a Streamlit app deployed on Elastic Beanstalk

(2) 🪣 Create Amazon S3 buckets

As part of this workflow, we need two Amazon S3 buckets:

A standard bucket to store raw and processed data.
An Amazon S3 Vectors bucket to store vectors and their metadata.

In this tutorial, the following names are used as references:

AWS_BUCKET_NAME = "coffee-products-tutorial-full-data"
AWS_BUCKET_VECTOR_NAME = "coffee-products-tutorial"
AWS_INDEX_VECTOR_NAME = "idx-coffee-products"

(2.1) 🪣 Creating the S3 Vectors bucket

The first step is to create the vector bucket from the Amazon S3 console: in the Vector buckets section, select Create vector bucket and define a unique name for the bucket.
In the encryption configuration, you can use Amazon S3–managed encryption (SSE-S3), which is sufficient for this use case. It is worth noting that this setting cannot be modified later, so it is important to define it correctly from the beginning.

(2.2) 🧭 Creating the vector index

Once the bucket is created, the next step is to define a vector index, which will be responsible for organizing and querying the vectors efficiently.

During this configuration, three key aspects must be specified:

Index name, which must be unique within the bucket.
Vector dimension, which must match the output of the embeddings model (in this case, 1,024 dimensions for Amazon Titan).
Distance metric, where you can choose between cosine or Euclidean. For text embeddings, cosine similarity is usually the most commonly used option.

Like the bucket, the index also inherits the encryption configuration, and this cannot be modified once it has been created.
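
The same two steps can also be scripted instead of using the console. The following is a minimal sketch, assuming a Boto3 `s3vectors` client and the default encryption settings; the `create_vector_store` helper is a hypothetical name introduced here, not part of the repository.

```python
ALLOWED_METRICS = {"cosine", "euclidean"}


def create_vector_store(client, bucket, index, dimension=1024, metric="cosine"):
    """Create a vector bucket and one index inside it.

    `client` is expected to be a Boto3 "s3vectors" client (or any object
    exposing the same two methods).
    """
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"Unsupported distance metric: {metric}")

    client.create_vector_bucket(vectorBucketName=bucket)
    client.create_index(
        vectorBucketName=bucket,
        indexName=index,
        dataType="float32",
        dimension=dimension,      # must match the embeddings model output
        distanceMetric=metric,
    )


# With credentials configured, it could be used like this:
# import boto3
# client = boto3.client("s3vectors", region_name="us-east-1")
# create_vector_store(client, "coffee-products-tutorial", "idx-coffee-products")
```

Because the dimension and distance metric are fixed when the index is created, validating them in code before the call is a cheap way to avoid recreating the index later.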

(3) 🔐 Policies

To work on this project, it is necessary to configure a set of IAM policies that allow access to the different services involved in the workflow.

In particular, the following are required:

Amazon Titan policy: allows invoking the Amazon Titan embeddings model through Amazon Bedrock to generate vectors from text.
Amazon S3 policy: enables reading and writing data in the Amazon S3 bucket used to store raw and processed data.
Amazon S3 Vectors policy: allows writing and querying vectors, along with their metadata, in the Amazon S3 Vectors bucket.

Finally, these policies are attached to an IAM role that is used by the application deployed on AWS Elastic Beanstalk, ensuring controlled and secure access to the required resources.

All the policies mentioned are available in the project repository.
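
For orientation only, the sketch below builds a single policy document with one statement per service. It is not a copy of the repository's policies: the action lists and ARN shapes are assumptions that should be checked against the repository and the AWS documentation, and the account ID is a placeholder.

```python
import json


def build_policy(region, account_id, data_bucket, vector_bucket, index):
    """Build an illustrative least-privilege policy as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeTitanEmbeddings",
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel"],
                "Resource": (
                    f"arn:aws:bedrock:{region}::foundation-model/"
                    "amazon.titan-embed-text-v2:0"
                ),
            },
            {
                "Sid": "ReadWriteDataBucket",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{data_bucket}/*",
            },
            {
                "Sid": "ReadWriteVectors",
                "Effect": "Allow",
                "Action": [
                    "s3vectors:PutVectors",
                    "s3vectors:GetVectors",
                    "s3vectors:QueryVectors",
                ],
                # Assumed ARN shape for a vector index; verify before use.
                "Resource": (
                    f"arn:aws:s3vectors:{region}:{account_id}:"
                    f"bucket/{vector_bucket}/index/{index}"
                ),
            },
        ],
    }


policy = build_policy(
    region="us-east-1",
    account_id="123456789012",  # placeholder account ID
    data_bucket="coffee-products-tutorial-full-data",
    vector_bucket="coffee-products-tutorial",
    index="idx-coffee-products",
)
print(json.dumps(policy, indent=2))
```

Scoping each statement to a specific model, bucket, and index keeps the role used by the application limited to the resources of this tutorial.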

🛠️ Implementation Guide

✅ Step 1: Dataset

As mentioned earlier, we start from a dataset in JSON format, which we download and then process into Parquet, since this format is more efficient for reading, storage, and processing in data pipelines.
The dataset used in this tutorial is available in my repository, inside the data/ folder.
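
As a rough idea of what the cleaning stage before the Parquet conversion can look like, here is a minimal standard-library sketch. It assumes the input holds one JSON object per line, and the field names in `KEEP_FIELDS` are assumptions about the dataset schema; the actual job in the repository may select and normalize fields differently.

```python
import json
import re

# Assumed field names; adjust to the fields actually present in the dataset.
KEEP_FIELDS = ("title", "description", "categories", "store", "average_rating", "price")


def clean_record(raw: dict) -> dict:
    """Keep only the relevant fields and normalize the description text."""
    record = {field: raw.get(field) for field in KEEP_FIELDS}
    description = record["description"]
    if isinstance(description, list):
        # Some sources store the description as a list of fragments.
        description = " ".join(str(part) for part in description)
    # Collapse runs of whitespace (including newlines) into single spaces.
    record["description"] = re.sub(r"\s+", " ", description or "").strip()
    return record


def load_clean_records(json_lines: str) -> list:
    """Parse JSON Lines text and drop records without a title."""
    records = [
        clean_record(json.loads(line))
        for line in json_lines.splitlines()
        if line.strip()
    ]
    return [record for record in records if record["title"]]


# Made-up sample input in JSON Lines form.
sample_lines = "\n".join(
    json.dumps(item)
    for item in [
        {
            "title": "Vanilla Instant Coffee",
            "description": ["Sweet  and", "creamy.\n"],
            "store": "Acme Coffee",
            "average_rating": 4.5,
            "price": 9.99,
            "images": [],
        },
        {"title": None, "description": []},
    ]
)

records = load_clean_records(sample_lines)
print(records)
```

From here, the cleaned records could be written to Parquet with, for example, `pandas.DataFrame(records).to_parquet(path)`, which requires a Parquet engine such as pyarrow to be installed.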

⚙️ Step 2: Process data (embedding generation)

To generate the embeddings, we use a class that I created to simplify the code and encapsulate the interaction with Amazon Bedrock. By default, the class uses the amazon.titan-embed-text-v2:0 model, although the design allows it to be easily changed if you want to try another model.

This class includes three main methods:

create_client(): creates the Bedrock Runtime client with Boto3, using region and credentials.
get_embeddings(text): invokes the Titan model by sending the text and returns the generated vector.
generate_embeddings_batch(texts): generates embeddings in batches by iterating over a list of texts and showing progress with tqdm.

import json

import boto3
from tqdm import tqdm


class EmbeddingsGenerator:
    """Generate text embeddings with an Amazon Bedrock embeddings model."""

    def __init__(self,
                 MODEL_NAME: str = 'amazon.titan-embed-text-v2:0',
                 AWS_ACCESS_KEY_ID: str = '',
                 AWS_SECRET_ACCESS_KEY: str = '',
                 AWS_REGION: str = ''
                 ):
        self.MODEL_NAME = MODEL_NAME
        self.AWS_ACCESS_KEY_ID = AWS_ACCESS_KEY_ID
        self.AWS_SECRET_ACCESS_KEY = AWS_SECRET_ACCESS_KEY
        self.AWS_REGION = AWS_REGION
        self._client = None  # created lazily and reused across calls

    def create_client(self):
        """Create the Bedrock Runtime client with Boto3."""
        client = boto3.client(
            service_name='bedrock-runtime',
            region_name=self.AWS_REGION,
            aws_access_key_id=self.AWS_ACCESS_KEY_ID,
            aws_secret_access_key=self.AWS_SECRET_ACCESS_KEY
        )
        return client

    def get_embeddings(self, text: str):
        """Invoke the model for one text and return its embedding vector."""
        # Reuse one client instead of creating a new one for every text
        if self._client is None:
            self._client = self.create_client()
        response = self._client.invoke_model(
            modelId=self.MODEL_NAME,
            body=json.dumps({
                "inputText": text
            })
        )
        response_body = json.loads(response['body'].read())
        embeddings = response_body['embedding']
        return embeddings

    def generate_embeddings_batch(self, texts: list):
        """Generate embeddings for a list of texts, showing progress with tqdm."""
        embeddings_list = []
        for text in tqdm(texts):
            embeddings = self.get_embeddings(text)
            embeddings_list.append(embeddings)
        return embeddings_list

To run it locally, you need a .env file with your credentials and region:

AWS_ACCESS_KEY=YOUR_ACCESS_KEY
AWS_SECRET_ACCESS_KEY=YOUR_AWS_SECRET_ACCESS_KEY
AWS_REGION=YOUR_AWS_REGION

And a minimal usage example would be the following:

import os
import boto3
from dotenv import load_dotenv


load_dotenv()


AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY')
AWS_SECRET_ACCESS_KEY = os.getenv('AWS_SECRET_ACCESS_KEY')
AWS_REGION = os.getenv('AWS_REGION')


emb_generator = EmbeddingsGenerator(
    AWS_ACCESS_KEY_ID=AWS_ACCESS_KEY_ID,
    AWS_SECRET_ACCESS_KEY=AWS_SECRET_ACCESS_KEY,
    AWS_REGION=AWS_REGION
)


input_text = "instant coffee sweet creamy vanilla flavor"
query_embedding = emb_generator.get_embeddings(text=input_text)

🪣 Step 3: Store data (S3 + S3 Vectors)

To simplify data ingestion, I created an S3 class that encapsulates access to both the standard S3 bucket and the Amazon S3 Vectors bucket. The idea is to keep the code clean and reusable, separating connection logic from write logic.

This class includes four main methods:

create_client(): creates a Boto3 client for the specified service (s3 or s3vectors).
upload_file(): uploads files to the standard S3 bucket (useful for raw and processed data).
upload_vector_data(): loads vectors into S3 Vectors using put_vectors, sending them in batches to respect the per-request limit.
query_embedding(): enables semantic search by querying the vector index using an embedding and optional metadata filters, returning the most relevant results ranked by similarity.

class S3:
   """Class to handle S3 operations including uploading files and vector data"""
   def __init__(self,
                AWS_ACCESS_KEY_ID:str='',
                AWS_SECRET_ACCESS_KEY:str='',
                AWS_REGION:str='',
                AWS_BUCKET_NAME:str='',
                AWS_BUCKET_VECTOR_NAME:str='',
                AWS_INDEX_VECTOR_NAME:str=''
                ):
       self.AWS_ACCESS_KEY_ID = AWS_ACCESS_KEY_ID
       self.AWS_SECRET_ACCESS_KEY = AWS_SECRET_ACCESS_KEY
       self.AWS_REGION = AWS_REGION
       self.AWS_BUCKET_NAME = AWS_BUCKET_NAME
       self.AWS_BUCKET_VECTOR_NAME = AWS_BUCKET_VECTOR_NAME
       self.AWS_INDEX_VECTOR_NAME = AWS_INDEX_VECTOR_NAME


   def create_client(self, service_name:str='s3'):
       """
       Create a boto3 client for the specified AWS service.
       """
       s3_client = boto3.client(
           service_name=service_name,
           region_name=self.AWS_REGION,
           aws_access_key_id=self.AWS_ACCESS_KEY_ID,
           aws_secret_access_key=self.AWS_SECRET_ACCESS_KEY
       )
       return s3_client


   def upload_file(self, file_name:str, object_name:str):
       """
       Upload a file to an S3 bucket.
       """
       s3_client = self.create_client()
       s3_client.upload_file(Filename=file_name, Bucket=self.AWS_BUCKET_NAME, Key=object_name)
       print(f"File {file_name} uploaded to bucket {self.AWS_BUCKET_NAME} as {object_name}")


   def upload_vector_data(self, data:list, batch_size:int=100):
       """
       Upload vector data to S3 Vectors in batches with tqdm for progress tracking.
       batch_size: the number of vectors per batch, to avoid exceeding the maximum request size.
       """
       s3_vector_client = self.create_client(service_name='s3vectors')


       # Helper for chunking data into batches
       def chunked(lst, size):
           for i in range(0, len(lst), size):
               yield lst[i:i + size]


       batches = list(chunked(data, batch_size))


       # see the progress of the upload
       for i, batch in enumerate(tqdm(batches, desc="Uploading batches"), start=1):
           try:
               s3_vector_client.put_vectors(
                   vectorBucketName=self.AWS_BUCKET_VECTOR_NAME,
                   indexName=self.AWS_INDEX_VECTOR_NAME,
                   vectors=batch
               )
           except Exception as e:
               print(f"Error uploading batch {i}: {e}")

   def query_embedding(self,
                       query_embedding:list,
                       filter_data:dict=None,
                       top_k:int=3):
       """Run a similarity search for an embedding, with an optional metadata filter"""
       s3_vector_client = self.create_client(service_name='s3vectors')

       # Prepare base parameters
       query_params = {
           "vectorBucketName": self.AWS_BUCKET_VECTOR_NAME,
           "indexName": self.AWS_INDEX_VECTOR_NAME,
           "queryVector": {"float32": query_embedding},
           "topK": top_k,
           "returnDistance": True,
           "returnMetadata": True
       }

       # Only add filter if exists
       if filter_data:
           query_params["filter"] = filter_data

       # Execute search
       query_result = s3_vector_client.query_vectors(**query_params)
       return query_result['vectors']

To upload vectors to S3 Vectors, we first need to build the structure expected by put_vectors. Each item must include a key (a unique identifier in string format), the vector in data.float32, and a metadata object with the attributes that we will later use as filters in queries.
In addition, since no more than 100 vectors can be sent per request, the upload is performed in batches controlled by the batch_size parameter.

vector_data = []


for i in range(data_coffee_filter.shape[0]):
   vector_data.append({
       "key": str(data_coffee_filter['id'][i]),  # the key must always be a string
       "data": {
           "float32": data_coffee_filter['embeddings'][i]
       },
       "metadata": {
           "average": float(data_coffee_filter['average_rating'][i]),
           "rating_number": int(data_coffee_filter['rating_number'][i]),
           "price": float(data_coffee_filter['price'][i]),
           "shop_name": str(data_coffee_filter['shop_name'][i])
       }
   })


s3 = S3(
   AWS_ACCESS_KEY_ID=AWS_ACCESS_KEY_ID,
   AWS_SECRET_ACCESS_KEY=AWS_SECRET_ACCESS_KEY,
   AWS_REGION=AWS_REGION,
   AWS_BUCKET_NAME=AWS_BUCKET_NAME,
   AWS_BUCKET_VECTOR_NAME=AWS_BUCKET_VECTOR_NAME,
   AWS_INDEX_VECTOR_NAME=AWS_INDEX_VECTOR_NAME
)


s3.upload_vector_data(vector_data)
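
Before uploading, it can help to check each record against the structure described above. The following is a minimal sketch, not part of the original repository: `validate_vector_record` and `expected_dim` are hypothetical names, and the default of 1024 assumes the index was created with the default output size of Titan Text Embeddings V2.

```python
from numbers import Real


def validate_vector_record(record: dict, expected_dim: int = 1024) -> list:
    """Return a list of problems found in one put_vectors record (empty if none)."""
    problems = []

    # The key must be a non-empty string
    key = record.get("key")
    if not isinstance(key, str) or not key:
        problems.append("key must be a non-empty string")

    # The vector lives under data.float32 and must match the index dimension
    data = record.get("data")
    values = data.get("float32") if isinstance(data, dict) else None
    if not isinstance(values, list):
        problems.append("data.float32 must be a list of numbers")
    else:
        if len(values) != expected_dim:
            problems.append(f"expected {expected_dim} values, got {len(values)}")
        # bool is a subclass of int, so it is excluded explicitly
        if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
            problems.append("data.float32 contains non-numeric values")

    # Metadata is treated as optional here, but when present it must be a dictionary
    if not isinstance(record.get("metadata", {}), dict):
        problems.append("metadata must be a dictionary")

    return problems
```

Running this over `vector_data` and printing the records that return a non-empty list makes it easier to spot bad rows before a whole batch is rejected.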

🔍 Step 4: Retrieve (QueryVectors + filters)

To retrieve results from Amazon S3 Vectors, the flow is always the same. First, we convert a natural language query into an embedding (vector) using the same model that was used during indexing. Then, we execute query_vectors, passing that vector as queryVector. The service returns the top K most similar vectors according to the distance metric configured in the index (Cosine or Euclidean). Optionally, we can apply metadata filters to reduce ambiguity and improve precision.
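
To make the two distance metrics concrete, here is a small pure-Python sketch of their textbook definitions. It is only an illustration: the service computes distances on its side, and the function names below are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: 0 means same direction, larger means less similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of the same length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine distance ignores vector length and only compares direction, which is why it is a common choice for text embeddings.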

The most important query_vectors parameters are:

queryVector: the embedding of the search text (in the format {"float32": [...]}).
topK: how many results we want to retrieve.
filter: filters based on the metadata stored together with the vector (for example shop_name, average, price).
returnDistance: whether to return the distance or similarity for each result. This is useful for applying a threshold and discarding results that are close but not very relevant.
returnMetadata: whether to also return the metadata associated with the vector, to display information in the app or apply additional logic.
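
The threshold idea mentioned for returnDistance can be sketched as a small post-processing step. This assumes the result items are dictionaries with a `distance` field, as in the responses shown in this article; `filter_by_distance` and the 0.5 cutoff are hypothetical and should be tuned on your own data.

```python
def filter_by_distance(results, max_distance):
    """Keep only results whose distance is at or below max_distance."""
    return [r for r in results if r.get("distance", float("inf")) <= max_distance]


# Sample items shaped like a query_vectors response (metadata omitted)
sample_results = [
    {"key": "a", "distance": 0.416},
    {"key": "b", "distance": 0.477},
    {"key": "c", "distance": 0.514},
]

relevant = filter_by_distance(sample_results, max_distance=0.5)
print([item["key"] for item in relevant])
```

Items without a distance are dropped, since there is no way to judge their relevance.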

To reduce the complexity of query implementation, a helper method is provided and encapsulated within the S3 utility class. This abstraction centralizes the interaction with Amazon S3 Vectors, simplifying semantic search execution and making the codebase cleaner, more reusable, and easier to maintain.

Query Examples with Metadata Filters

🔎 Query by Single Metadata Field (Exact Match)

Example: filter by shop_name

s3.query_embedding( query_embedding=query_embedding,
                  filter_data={"shop_name": "nescafé"})

response

[{'distance': 0.41610199213027954,
  'key': 'de46725d-ef52-47ca-80e2-f1ba82c0353d',
  'metadata': {'price': 11.48,
   'shop_name': 'nescafé',
   'average': 4.4,
   'rating_number': 248}},
 {'distance': 0.47703248262405396,
  'key': '03915b9f-e592-40ec-b806-bd06b4213d90',
  'metadata': {'price': 13.4,
   'average': 3.6,
   'shop_name': 'nescafé',
   'rating_number': 471}},
 {'distance': 0.514411211013794,
  'key': '5037ea28-b789-427a-9b1f-d825ad68dd2d',
  'metadata': {'rating_number': 3052,
   'shop_name': 'nescafé',
   'average': 4.4,
   'price': 17.75}}]

🔢 Query Using Comparison Operators

In filters, you can use comparison operators, for example:

$gt: greater than
$gte: greater than or equal
(and others such as $lt, $lte, $eq, $ne depending on the case)

Here you can find more information about the operators you can use:
https://docs.aws.amazon.com/es_es/AmazonS3/latest/userguide/s3-vectors-metadata-filtering.html

Example: average rating greater than or equal to 4.2

s3.query_embedding( query_embedding=query_embedding,
                  filter_data={"average": {"$gte": 4.2}})

🔗 Query with Combined Conditions

When you need more than one condition, you can combine filters with:

$and: logical AND between multiple conditions
$or: logical OR between multiple conditions

Example: average rating ≥ 4.2 AND price ≤ 20

s3.query_embedding( query_embedding=query_embedding,
                  filter_data={
       "$and": [
           {"average": {"$gte": 4.2}},
           {"price": {"$lte": 20.0}}
       ]
   })
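
When the filter values come from optional inputs (for example, widgets in the app), the filter dictionary can be assembled dynamically. The following is a minimal sketch using the metadata fields from this article; `build_filter` and its parameters are hypothetical names, not part of the original repository.

```python
from typing import Optional


def build_filter(shop_name: Optional[str] = None,
                 min_average: Optional[float] = None,
                 max_price: Optional[float] = None) -> Optional[dict]:
    """Build a metadata filter from whichever values were provided."""
    conditions = []
    if shop_name:
        conditions.append({"shop_name": shop_name})
    if min_average is not None:
        conditions.append({"average": {"$gte": min_average}})
    if max_price is not None:
        conditions.append({"price": {"$lte": max_price}})

    if not conditions:
        return None           # no filter at all
    if len(conditions) == 1:
        return conditions[0]  # a single condition needs no $and wrapper
    return {"$and": conditions}
```

The result can be passed as filter_data to s3.query_embedding; when it is None, the helper method simply skips the filter parameter.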

🖥️ Step 5: App (Streamlit)

While developing this tutorial, I realized that although it is possible to run the entire flow directly from Python code, it is not the most comfortable approach for an end user. For this reason, I decided to build a web application using Streamlit, a framework that allows you to create interactive interfaces in Python with very few lines of code.

In the repository, you will find a single file called app.py, which contains all the application logic. This makes it easy to clearly see how embedding generation, querying Amazon S3 Vectors, and result visualization are integrated, while keeping the focus on a simple and straightforward flow.

Streamlit provides an API with many interactive components such as text inputs, selectors, sliders, and chat-oriented elements. These components are ideal for this use case. For more details about the available components, you can check the official documentation:
https://docs.streamlit.io/develop/api-reference

🚀 Step 6: Configure the project to deploy the app (Elastic Beanstalk)

To deploy the application on AWS Elastic Beanstalk, we will package the project into a .zip with a specific structure. Beanstalk uses these files to configure the environment, install dependencies, and define how the app is executed when the instance starts.

    app.zip
    |__ 📂.ebextensions/
    |    |__ 📄 iam-role.config
    |    |__ 📄 securitygroup.config
    |__ 📂img/
    |    |__🏞️ preview_app.png
    |__ 📄 .ebignore
    |__ 📄 app.py
    |__ 📄 Procfile
    |__ 📄requirements.txt
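
One way to produce this package is with a short script. The sketch below is an assumption about how you might build it, not the method used in the repository: `build_bundle` and `EXCLUDED_NAMES` are hypothetical, and the exclusion list should mirror your own .ebignore. The important detail is that the files sit at the root of the archive rather than inside a parent folder.

```python
import zipfile
from pathlib import Path

# Hypothetical exclusion list; adjust it to mirror your own .ebignore
EXCLUDED_NAMES = {".env", ".git", "__pycache__", ".DS_Store"}


def build_bundle(project_dir: str, zip_path: str) -> list:
    """Zip project_dir so that its files sit at the root of the archive."""
    root = Path(project_dir)
    target = Path(zip_path).resolve()
    added = []
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as bundle:
        for path in sorted(root.rglob("*")):
            relative = path.relative_to(root)
            # Skip directories and the archive itself if it lives inside the project
            if path.is_dir() or path.resolve() == target:
                continue
            # Skip anything located under an excluded file or folder name
            if any(part in EXCLUDED_NAMES for part in relative.parts):
                continue
            # Zip entries always use forward slashes
            bundle.write(path, relative.as_posix())
            added.append(relative.as_posix())
    return added
```

Calling `build_bundle(".", "app.zip")` from the project folder returns the list of files that were included, which is a quick way to confirm that no credentials slipped into the package.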

📁 .ebextensions/iam-role.config (instance IAM role)

This file configures which IAM Instance Profile the Elastic Beanstalk instance will use. It is key because that role is what allows your app to have permissions to invoke Bedrock and query S3 and S3 Vectors (based on the policies you defined).

option_settings:
  aws:autoscaling:launchconfiguration:
    IamInstanceProfile: ElasticBeanstalk-CoffeeApp-Role

🔒 .ebextensions/securitygroup.config (restrict access by IP)

By default, the app is publicly accessible (depending on how the environment is configured). In this case, this configuration restricts access to the application only to your IP by adding inbound rules to the Beanstalk security group for HTTP (80) and HTTPS (443). This is useful in test environments or demos to prevent unwanted access.

Tip: you can get your public IP by searching “what is my ip”, then replace <your_ip> in the configuration below with that value.

Resources:
  httpSecurityGroupIngress: 
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: {"Fn::GetAtt" : ["AWSEBSecurityGroup", "GroupId"]}
      IpProtocol: tcp
      ToPort: 80
      FromPort: 80
      CidrIp: <your_ip>/32

  httpsSecurityGroupIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: {"Fn::GetAtt" : ["AWSEBSecurityGroup", "GroupId"]}
      IpProtocol: tcp
      ToPort: 443
      FromPort: 443
      CidrIp: <your_ip>/32
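
A typo in the CidrIp value will make the deployment fail, so it can be worth validating the address before pasting it in. This is a small sketch using only the standard library; `to_single_host_cidr` is a hypothetical helper name.

```python
import ipaddress


def to_single_host_cidr(ip: str) -> str:
    """Validate an IPv4 address and return it in /32 CIDR notation."""
    address = ipaddress.IPv4Address(ip.strip())  # raises ValueError if invalid
    return f"{address}/32"
```

The /32 suffix means the rule matches exactly one address, which is what restricts access to your own IP.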

🚫 .ebignore

This file works like a .gitignore, but for deployment. It indicates which files should not be uploaded to Elastic Beanstalk. This helps avoid including credentials, system junk, or unnecessary files that increase the package size.

🖥️ app.py (Streamlit application)

This is the main application file, where the Streamlit interface and the logic to generate embeddings, query S3 Vectors, and display results are defined. In this tutorial, the entire app lives in this single file to keep it simple and easy to follow.

🧾 Procfile (startup command)

Elastic Beanstalk needs to know which command to run to start your application. The Procfile defines that entrypoint. In this case, we start Streamlit listening on 0.0.0.0 to accept external traffic, and using a port defined for the environment.

web: streamlit run app.py --server.port=8000 --server.address=0.0.0.0

📦 requirements.txt (dependencies)

This file lists the libraries required for the app to run. Beanstalk installs them automatically during deployment.

🚀 Step 7: Deploy the solution

(1) Create a new application

In this step, a new application is created in AWS Elastic Beanstalk, which acts as the logical container for the project.
You only need to define an application name and, optionally, a short description.

(2) Environment

In this step, the environment where the application will be deployed is configured. For this use case, a Web server environment is selected, since it is a web application built with Streamlit that exposes an HTTP interface for users.

By default, Elastic Beanstalk suggests an environment name based on the application name, which is sufficient for this tutorial. This environment will be responsible for running the app, handling traffic, and applying scaling and monitoring configurations in the following steps.

(2) Environment – Step 1: Configure environment

In this step, the basic environment parameters are defined:

Environment tier: select Web server environment, since the application exposes a web interface over HTTP.
Application name: automatically filled with the name defined in the previously created application.
Environment name: name of the environment; the default suggested value can be used.
Domain: can be left empty so that Elastic Beanstalk automatically generates the subdomain.
Platform:
- Platform: Python
- Platform branch: Python 3.11 running on 64bit Amazon Linux 2023
- Platform version: leave the default recommended version.
Application code:
- Select Upload your code.
- Upload the .zip file generated previously.
Presets: Select Single instance (free tier eligible) for this tutorial.

(2) Environment – Step 2: Configure service access

In this step, the IAM roles that allow Elastic Beanstalk and EC2 instances to access AWS resources are configured:

Service role: the role that Elastic Beanstalk uses to create and manage the environment (Auto Scaling, Load Balancer, logs, etc.).
EC2 instance profile: the role used by the EC2 instances where the application runs. This role must include the necessary policies to access Amazon Bedrock, Amazon S3, and Amazon S3 Vectors.
EC2 key pair (optional): can be omitted if SSH access to the instances is not required.

With this configuration, the application is correctly authorized to interact with AWS services in a secure manner.

(2) Environment – Step 3: Set up networking, database, and tags (optional)

In this step, the network where the environment will run is configured. For this tutorial, the default VPC values are used, making only the following adjustments:

VPC: select the account’s default VPC to simplify the configuration.
Public IP address: enable it so the application is accessible from the Internet.
Instance subnets: select two subnets in different Availability Zones, as shown in the image. Selecting more than one subnet allows Elastic Beanstalk to distribute instances across multiple Availability Zones, improving resilience and fault tolerance, even when using a simple deployment for tests or demos.

The remaining options (database and tags) can be left unconfigured for this use case.

(2) Environment – Step 4: Configure instance traffic and scaling

In this step, how the application runs and what type of resources it uses are defined:

Environment type: select Single instance, which is sufficient for this tutorial and helps reduce costs.
Fleet composition: use On-Demand instance, avoiding the complexity of Spot instances.
Architecture: choose x86_64 to ensure compatibility with all Python dependencies.
Instance type: select a lightweight type such as t3.small, suitable for running a low-consumption Streamlit application.
Monitoring and metadata: keep the default values, enabling CloudWatch metrics and using IMDSv2.

This configuration allows the application to be deployed in a simple, stable, and cost-effective way, ideal for tests, demos, and development environments.

(2) Environment – Step 5: Configure updates, monitoring, and logging

In this step, monitoring, update, and observability options for the environment are configured:

Monitoring: enable basic or enhanced monitoring so Elastic Beanstalk reports instance metrics to CloudWatch.
Health reporting: allows you to visualize the application status and detect failures early.
Managed platform updates: automatic environment updates (minor and patch) can be enabled during a defined weekly window.
Email notifications: allows configuring an email address to receive notifications about relevant environment events.
Rolling updates and deployments: defines how deployments and configuration changes are applied (for this tutorial, default values can be used).
Logs: enable sending instance logs to CloudWatch Logs to facilitate debugging and observability.
Environment properties: here you can define environment variables required by the application (for example AWS region, bucket names, or other configuration values the app needs).

With this configuration, the environment is prepared to operate in a stable and observable way, with controlled updates and no additional adjustments required for this use case.
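
On the application side, the environment properties defined above can be read at startup. The following is a minimal sketch: the variable names AWS_REGION, AWS_BUCKET_VECTOR_NAME, and AWS_INDEX_VECTOR_NAME are assumptions borrowed from the Python names used earlier in this tutorial, and `AppSettings` and `load_settings` are hypothetical.

```python
import os
from dataclasses import dataclass

# Assumed environment property names; use whatever you defined in the console
REQUIRED_VARIABLES = ("AWS_REGION", "AWS_BUCKET_VECTOR_NAME", "AWS_INDEX_VECTOR_NAME")


@dataclass(frozen=True)
class AppSettings:
    region: str
    vector_bucket: str
    vector_index: str


def load_settings(env=None) -> AppSettings:
    """Read the required settings, failing early with a clear message."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment properties: {', '.join(missing)}")
    return AppSettings(
        region=env["AWS_REGION"],
        vector_bucket=env["AWS_BUCKET_VECTOR_NAME"],
        vector_index=env["AWS_INDEX_VECTOR_NAME"],
    )
```

Failing at startup with the list of missing names tends to be easier to diagnose in the Beanstalk logs than an error raised later inside a query.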

🚀 Step 8: Validate the application deployment (Elastic Beanstalk)

Once the application is deployed, it is important to validate that everything is working correctly:

(1) Environment status

The first step is to verify that the environment status is Health: OK. This indicates that Elastic Beanstalk was able to start the application correctly and that no critical errors were detected during deployment.

(2) Application access

If the status is correct, you can click on the environment domain to access the application from the browser and confirm that the Streamlit interface loads correctly.

(3) Log review

If the application does not work as expected or the status is not OK, go to the Logs tab. From there, you can request logs, and it is recommended to download the last 100 records to make error analysis easier.

(4) Deploy a new version

If an issue is detected in the logs and the code needs to be fixed, you can deploy a new version using the Upload and deploy button. In this step, you only need to upload the updated .zip file and assign a new application version.

🧩 Conclusions

This tutorial presents a complete workflow for processing and querying data through semantic search, where it is essential not to lose sight of best practices in data cleaning and the correct definition of metadata. Metadata plays a fundamental role in guiding searches, reducing the amount of information queried, and significantly improving the relevance of results.

During the tests performed, query performance was notably fast, to the point that in some cases the spinner implemented in the application barely had time to appear. This shows that Amazon S3 Vectors can deliver suitable performance even for interactive, end-user–oriented scenarios.

When exploring the Boto3 API, it becomes apparent that some features commonly found in traditional databases are still missing, such as aggregated statistics or an equivalent of count(*). Currently, to determine the number of stored vectors, it is necessary to use operations like list_vectors with pagination. This suggests that, as a relatively new feature, there are clear opportunities for improvement in future versions of the service.
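
The counting workaround mentioned above could look like the following sketch. It assumes the list_vectors request and response fields (vectorBucketName, indexName, maxResults, nextToken, vectors) as I understand them from the Boto3 reference, so check them against the current documentation; `count_vectors` and the page size of 500 are hypothetical choices.

```python
def count_vectors(client, vector_bucket_name: str, index_name: str,
                  page_size: int = 500) -> int:
    """Count the vectors in an index by paging through list_vectors."""
    total = 0
    next_token = None
    while True:
        params = {
            "vectorBucketName": vector_bucket_name,
            "indexName": index_name,
            "maxResults": page_size,
        }
        if next_token:
            params["nextToken"] = next_token

        page = client.list_vectors(**params)
        total += len(page.get("vectors", []))

        # The absence of nextToken means this was the last page
        next_token = page.get("nextToken")
        if not next_token:
            return total
```

Here `client` would be the object returned by create_client(service_name='s3vectors'). Since every page is a separate request, the cost of this count grows with the size of the index.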

On the other hand, AWS Elastic Beanstalk proves to be a very good solution for deploying this type of application quickly and easily. However, in production scenarios, combining it with tools such as Terraform and CI/CD pipelines would allow deployments to be automated and manual intervention to be further reduced. In this tutorial, a console-based deployment was chosen to keep complexity under control and focus on the main use case.

Finally, this approach demonstrates how unstructured text analysis use cases, combined with structured data, offer a very compelling balance. In particular, building a chat-like interface that does not rely exclusively on natural language, but also incorporates explicit filters, makes it possible to create a hybrid model that improves precision, reduces ambiguity, and enriches the search experience.

📚 References

Amazon Web Services. (n.d.). Amazon S3 Vectors: Revolutionizing AI data storage with use cases. AWS re:Post. https://repost.aws/articles/ARY9EKiGFISfisAyvigDX3lQ/amazon-s3-vectors-revolutionizing-ai-data-storage-with-use-cases
Amazon Web Services. (n.d.). Amazon S3 Vectors. https://aws.amazon.com/es/s3/features/vectors/
Amazon Web Services. (n.d.). Vector buckets for Amazon S3. https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-buckets-details.html
Amazon Web Services. (n.d.). Metadata filtering for Amazon S3 Vectors. https://docs.aws.amazon.com/es_es/AmazonS3/latest/userguide/s3-vectors-metadata-filtering.html
Hou, Y., Li, J., He, Z., Yan, A., Chen, X., & McAuley, J. (2024). Bridging language and items for retrieval and recommendation. https://amazon-reviews-2023.github.io/
Streamlit Inc. (n.d.). Streamlit API reference. https://docs.streamlit.io/develop/api-reference
Amazon Web Services. (n.d.). Amazon S3 Vectors. https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3vectors.html

📌 How to cite this article

APA style

Mendez Escobar, Romina Elena. (2025). From Coffee Products to AI Search: Building a Serverless Semantic Search Architecture with Amazon S3 Vectors and Bedrock.

https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b

BibTeX


@article{mendez2025aiawscoffee,
  title  = {From Coffee Products to AI Search: Building a Serverless Semantic Search Architecture with Amazon S3 Vectors and Bedrock},
  author = {Mendez Escobar, Romina Elena},
  year   = {2025},
  url    = {https://dev.to/aws-builders/from-coffee-products-to-ai-search-building-a-serverless-semantic-search-architecture-with-amazon-5g5b}
}

Data-Driven Project Analysis: Analyzing Trello Kanban Projects with AI on AWS Bedrock

Romina Elena Mendez Escobar — Tue, 23 Dec 2025 11:22:10 +0000

Introduction

Modern software projects often involve multiple distributed teams working on high-complexity initiatives, with frequent releases and ongoing production fixes. While tools like Kanban boards help organize tasks, epics, and workflows, they also generate large volumes of unstructured data in the form of comments, status changes, and timelines.
As the number of interdependent tasks and contributors grows, understanding the real state of a project, and identifying early risks or bottlenecks, becomes increasingly difficult. As a result, manual analysis is time-consuming and often subjective, limiting timely and objective decision-making.

In this article, I present a practical use case that leverages AWS services and generative AI to enhance project analysis and interpretation. By analyzing task metadata and detecting semantic patterns in comments (such as ambiguity, implicit dependencies, missing definitions, or scope creep), AI enables more objective insights, early warnings, and data-driven decision-making.

Understanding Kanban Board and Trello

Kanban is a visual project management methodology that originated in Toyota’s manufacturing system. It focuses on limiting work in progress and enabling continuous delivery by representing work items across different stages of a workflow.

Trello is a widely used web-based project management tool that implements Kanban principles through boards, lists, and cards. Each card typically represents a task, feature, or user story, and includes not only a status but also descriptive text, comments, and historical changes over time.
While Kanban boards are primarily designed for human collaboration, they also generate a rich source of textual and contextual data that can be analyzed programmatically.

User Stories as a Data Structure

A well-defined user story usually follows a consistent structure:

Who: the requester (As a…)
What: the objective (I want to…)
Why: the purpose (So that…)
Acceptance Criteria: explicit conditions for completion

This structure is not only useful for aligning teams, it also provides a clear semantic pattern that can be leveraged by AI models. When tasks are written consistently, the model can more easily understand intent, scope, dependencies, and completion expectations.
In other words, writing better user stories improves both human understanding and machine interpretation, making it a best practice for data-driven project analysis.
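
To illustrate why a consistent structure is machine-friendly, here is a small sketch that extracts the who, what, and why from a story written in this format. The regular expression and the `parse_user_story` helper are hypothetical and only cover the simple single-sentence case.

```python
import re

# Matches "As a <who>, I want (to) <what>, so that <why>."
STORY_PATTERN = re.compile(
    r"\bas an?\s+(?P<who>.+?),?\s+i want(?: to)?\s+(?P<what>.+?),?\s+so that\s+(?P<why>.+?)\.?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_user_story(text: str):
    """Return the who/what/why parts of a user story, or None if it does not match."""
    match = STORY_PATTERN.search(text.strip())
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}


story = "As a shopper, I want to filter products by price, so that I can find items within my budget."
print(parse_user_story(story))
```

Cards for which the function returns None are a cheap first signal that a task description may be missing its requester, objective, or purpose.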

AWS Bedrock and Amazon Nova

For this tutorial, we leverage Amazon’s generative AI services, which provide a variety of pre-trained foundation models accessible through a single, unified platform.
AWS Bedrock is a fully managed service that allows developers to build, deploy, and scale AI-powered applications without the overhead of managing infrastructure. It provides seamless access to state-of-the-art foundation models from leading AI providers, all through a simple API.
For our implementation, we use Amazon Nova, AWS’s family of foundation models designed for tasks such as text generation, analysis, and summarization. In particular, Nova Lite offers a balanced combination of ⚡️performance and 💰cost-efficiency, making it ideal for analyzing project data and generating actionable insights.
In the following sections, we will demonstrate how to implement this service in Python, showing how AI can be applied to extract meaningful insights from Kanban project data.
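
As a first orientation, a call to Nova Lite can be sketched with the Bedrock Converse API. This is a minimal sketch under assumptions, not necessarily how the repository invokes the model: the model ID "amazon.nova-lite-v1:0" may need a regional inference profile prefix in your account, the inference values are placeholders, and both function names are hypothetical.

```python
def build_converse_request(prompt: str,
                           model_id: str = "amazon.nova-lite-v1:0",  # assumed model ID
                           max_tokens: int = 1024,
                           temperature: float = 0.2,
                           top_p: float = 0.9) -> dict:
    """Build the keyword arguments for a single-turn Converse call."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {
            "maxTokens": max_tokens,
            "temperature": temperature,
            "topP": top_p,
        },
    }


def analyze_with_nova(client, prompt: str, **options) -> str:
    """Send the prompt and return the text of the first content block."""
    response = client.converse(**build_converse_request(prompt, **options))
    return response["output"]["message"]["content"][0]["text"]


# Usage (requires AWS credentials and model access):
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   print(analyze_with_nova(client, "Summarize the risks in this board: ..."))
```

Keeping the request construction separate from the call makes it easy to log or unit-test the inference parameters without touching the network.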

Reference Architecture

Before diving into the implementation details, it is useful to understand the overall architecture that supports this use case. The following reference architecture illustrates how project data flows from Trello through AWS services and into an AI-powered analysis pipeline.

The entire process is executed through an AWS Glue job implemented in Python, which orchestrates data extraction, transformation, AI inference, and report generation in a scalable and automated manner.

At a high level, the architecture ingests Kanban project data from Trello, enriches it with temporal and contextual metadata, applies semantic analysis using generative AI models on AWS Bedrock, and produces structured, human-readable reports for project stakeholders.

Core Components

(1). 📋Trello Integration Class

Connects to Trello boards via the Trello API
Retrieves boards, lists, and cards with enriched metadata
Calculates time-based metrics (e.g., days until due date)
Exports structured data to Amazon S3 in JSON format

(2). ✨AWS Bedrock Integration

Invokes the Amazon Nova model using custom prompts
Processes project datasets to generate semantic insights
Uses configurable inference parameters to balance cost and accuracy

(3).📊 Report Generation (MarkdownPDFReport)

Converts AI-generated markdown into professional PDF reports
Applies custom styling for readability and consistency
Supports tables, lists, and structured summaries

(4). Supporting Services

🔐 AWS Secrets Manager: securely stores Trello API credentials
🪣 Amazon S3: stores datasets, prompts, and generated reports
📩 Amazon SES: distributes automated reports via email
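
The time-based metric mentioned for the Trello integration class (days until due date) can be sketched as follows. It assumes Trello's ISO 8601 timestamps such as "2025-12-31T10:00:00.000Z" and counts calendar days in UTC; `days_until_due` is a hypothetical name, and the repository may compute the metric differently.

```python
from datetime import datetime, timezone


def days_until_due(due, now=None):
    """Calendar days (UTC) until the due date; negative when overdue, None if unset."""
    if not due:
        return None
    # fromisoformat does not accept a trailing "Z" on older Python versions
    due_at = datetime.fromisoformat(due.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (due_at.astimezone(timezone.utc).date() - now.astimezone(timezone.utc).date()).days
```

A derived field like this gives the model an explicit number to reason about instead of asking it to do date arithmetic from raw timestamps.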

Implementation Guide

The use case presented in this guide is based on a simulated Trello board representing an e-commerce software project. The board includes typical development activities such as feature implementation, backlog items, in-progress tasks, and delivery milestones, closely mirroring how Kanban is used in production environments.
This example is intentionally designed to resemble a realistic project scenario, allowing us to analyze both structured data (task metadata, statuses, due dates) and unstructured data (descriptions and comments). The following diagram illustrates the initial project setup and serves as the input for the implementation steps described in the next sections.

Prerequisites

Before running the solution, a few AWS and Trello prerequisites must be in place. These prerequisites ensure secure access to project data, proper execution of the Glue job, and automated report delivery.

(1). 🔑 Trello API credentials

To access Trello boards and cards programmatically, you need valid Trello API credentials, consisting of an API key and an access token.

Step 1: Obtain the API key

The API key can be generated from the Trello Power-Ups administration page:

https://trello.com/power-ups/admin

Step 2: Generate the access token

Once you have the API key, you must authorize your application and generate a token using the following endpoint (replace {API_KEY} with your own key):

https://trello.com/1/authorize?expiration=never&name=MyApp&scope=read,write&response_type=token&key={API_KEY}

This authorization flow grants read and write access to Trello resources and returns a token that will be used by the application to query boards, lists, cards, and comments. Both the API key and token should be treated as sensitive credentials.
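
Once both values are available, every Trello REST call carries them as query parameters. The sketch below only builds the request URL; `trello_url` is a hypothetical helper, and the endpoint shown (the boards of the authenticated member) is just one example.

```python
from urllib.parse import urlencode

TRELLO_API_BASE = "https://api.trello.com/1"


def trello_url(path: str, api_key: str, token: str, **params) -> str:
    """Build a Trello REST URL with the credentials as query parameters."""
    query = urlencode({"key": api_key, "token": token, **params})
    return f"{TRELLO_API_BASE}/{path.lstrip('/')}?{query}"


# Placeholder credentials, for illustration only
url = trello_url("members/me/boards", "YOUR_API_KEY", "YOUR_TOKEN", fields="name,url")
print(url)
```

Because the credentials end up in the URL, avoid printing or logging real request URLs outside of local experiments.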

(2). ⚙️ AWS IAM role

On the AWS side, an IAM role is required to execute the AWS Glue job and interact with the supporting services used in this solution.
The role must include permissions for:

AWS Glue (job execution)
Amazon S3 (data storage and retrieval)
AWS Secrets Manager (secure storage of Trello credentials)
Amazon Bedrock (AI model)
Amazon SES (email delivery)

A complete example IAM policy with the required permissions is provided in the project repository. You can attach this policy to the IAM role used by the Glue job to ensure the pipeline runs end to end without permission issues.

(3). 📩 Amazon SES configuration

Finally, Amazon Simple Email Service (SES) must be configured to enable automated report delivery.
This includes:

☑️ Verifying at least one sender email address or domain (SES identities)
☑️ Ensuring your AWS account has sufficient sending limits
☑️ Confirming the SES region matches the region used by the Glue job

Once configured, SES will be used to send the generated PDF reports to stakeholders automatically as part of the pipeline execution.
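
Sending a PDF as an attachment through SES typically means building a MIME message and passing it to send_raw_email. The following is a minimal sketch under that assumption; the function names and the default file name are hypothetical, and the sender must be a verified SES identity.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_report_email(sender, recipient, subject, body_text, pdf_bytes,
                       filename="report.pdf"):
    """Build a multipart message with a plain-text body and a PDF attachment."""
    message = MIMEMultipart()
    message["Subject"] = subject
    message["From"] = sender
    message["To"] = recipient
    message.attach(MIMEText(body_text, "plain"))

    attachment = MIMEApplication(pdf_bytes, _subtype="pdf")
    attachment.add_header("Content-Disposition", "attachment", filename=filename)
    message.attach(attachment)
    return message


def send_report(ses_client, message):
    """Send the message with the SES send_raw_email operation."""
    return ses_client.send_raw_email(
        Source=message["From"],
        Destinations=[message["To"]],
        RawMessage={"Data": message.as_string()},
    )
```

Here `ses_client` would be created with boto3.client("ses", region_name=...) in the same region where the identity was verified.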

Implementation Steps

The following steps describe the end-to-end implementation of the solution, from secure credential management to AI-driven analysis and automated report distribution.

🔐 Step 1: Configure Secrets Manager

Store your Trello credentials securely in AWS Secrets Manager. This avoids hardcoding sensitive information and follows AWS security best practices. The secret should contain the Trello API key and token in JSON format.
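
Reading the secret back at runtime can be sketched as follows. The field names "api_key" and "token" are assumptions about how the JSON was written, and `load_trello_credentials` is a hypothetical helper; adapt both to the secret you actually created.

```python
import json


def load_trello_credentials(secrets_client, secret_id: str):
    """Fetch the secret and return the (api_key, token) pair."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    # "api_key" and "token" are assumed field names inside the JSON secret
    return secret["api_key"], secret["token"]
```

Here `secrets_client` would be created with boto3.client("secretsmanager", region_name=...), and the role running the Glue job needs permission to read that specific secret.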

⚙️ Step 2: Set Up the AWS Glue Environment

For this tutorial, the solution is implemented using an AWS Glue Python notebook, which provides a fully managed, serverless environment for running data processing jobs. The complete source code is available in the project repository, so the following sections highlight the most relevant implementation details and design decisions rather than providing a full code walkthrough.

If you find this tutorial helpful, feel free to leave a star ⭐️ and follow me to get notified about new articles. Your support helps me grow within the tech community and create more valuable content! 🚀

RominaElenaMendezEscobar / aws-trello-ai-tutorial

End-to-end AWS Glue pipeline for extracting Trello Kanban data, analyzing it with Amazon Bedrock, and generating automated PDF reports.


📦 Step 2.1: Installing Additional Python Packages

AWS Glue comes with a predefined Python environment, but this solution requires additional libraries to interact with AWS services, process text, and generate reports.

The following directive installs the required dependencies at runtime:

Install required Python packages

%additional_python_modules boto3==1.34.34,botocore==1.34.34,markdown==3.5.2,beautifulsoup4==4.12.3,reportlab==4.0.8

These packages are used for:

boto3 / botocore: AWS SDK for Python, used to interact with services such as S3, Secrets Manager, Bedrock, and SES
markdown: Converts AI-generated Markdown into HTML
beautifulsoup4: Parses and transforms HTML content before PDF generation
reportlab: Generates styled PDF documents programmatically

Installing only the required dependencies helps keep the Glue job lightweight and efficient.

📋 Step 2.2: Trello Data Extraction Class

The Trello class encapsulates all interactions with the Trello REST API and is responsible for retrieving, enriching, and preparing project data for AI analysis.

Key input parameters

BUCKET_NAME: Target S3 bucket for exporting processed data
API_KEY / API_TOKEN: Trello credentials retrieved securely from Secrets Manager
S3: Helper class instance used to write data to Amazon S3

Dataset design considerations

Although Trello provides a large number of fields, the implementation intentionally selects a minimal but meaningful subset of columns:

self.DATAFRAME_COLUMNS = [
    'id', 'dueComplete', 'desc', 'listName', 'name',
    'start', 'checkItems', 'checkItemsChecked', 'due', 'time_to_due']

This design choice offers several benefits:

Reduces token usage during AI inference (lower cost)
Avoids passing empty or unused fields
Improves model focus and processing efficiency
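To make the effect of column selection concrete, here is a small sketch that projects a raw card onto the selected columns and compares the serialized size. The `project_card` helper and the sample card values are hypothetical; only the column list comes from the class above.

```python
import json

DATAFRAME_COLUMNS = [
    "id", "dueComplete", "desc", "listName", "name",
    "start", "checkItems", "checkItemsChecked", "due", "time_to_due",
]


def project_card(card, columns=DATAFRAME_COLUMNS):
    """Keep only the selected columns that are present and non-empty."""
    return {c: card[c] for c in columns if card.get(c) not in (None, "", [])}


# Made-up card with a few extra fields that the analysis does not need.
raw_card = {
    "id": "abc123",
    "name": "Implement login",
    "desc": "OAuth flow",
    "listName": "Doing",
    "due": "2025-02-01T10:00:00.000Z",
    "dueComplete": False,
    "start": None,
    "checkItems": 4,
    "checkItemsChecked": 1,
    "idBoard": "board42",
    "shortUrl": "https://trello.com/c/example",
    "labels": [],
    "badges": {"votes": 0, "attachments": 2},
}

slim_card = project_card(raw_card)
raw_size = len(json.dumps(raw_card))
slim_size = len(json.dumps(slim_card))
```

Character count is only a rough proxy for tokens, but it shows the direction: fewer fields sent to the model means a smaller prompt.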

Temporal enrichment

The class automatically calculates the number of days remaining until each task’s due date (time_to_due). This temporal context helps the AI model reason about urgency, delays, and potential risks.
Finally, the data can be exported to Amazon S3 in CSV format or returned as filtered JSON, typically limited to tasks in To Do and Doing states.
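The enrichment and filtering described above can be sketched as follows. This is an illustration, not the repository code: the function names, the list names in `ACTIVE_LISTS`, and the sample cards are assumptions.

```python
from datetime import date, datetime

ACTIVE_LISTS = {"To Do", "Doing"}  # assumed list names on the board


def days_until_due(due_iso, today):
    """Return whole days from `today` to the due date, or None if there is no due date."""
    if not due_iso:
        return None
    # Trello returns ISO 8601 timestamps ending in "Z"; normalize for fromisoformat.
    due = datetime.fromisoformat(due_iso.replace("Z", "+00:00")).date()
    return (due - today).days


def prepare_cards(cards, today):
    """Keep cards in active lists and add the time_to_due field."""
    prepared = []
    for card in cards:
        if card.get("listName") not in ACTIVE_LISTS:
            continue
        enriched = dict(card)
        enriched["time_to_due"] = days_until_due(card.get("due"), today)
        prepared.append(enriched)
    return prepared


cards = [
    {"id": "1", "name": "Design API", "listName": "Doing", "due": "2025-01-20T12:00:00.000Z"},
    {"id": "2", "name": "Write docs", "listName": "Done", "due": "2025-01-05T12:00:00.000Z"},
    {"id": "3", "name": "Plan sprint", "listName": "To Do", "due": None},
]
result = prepare_cards(cards, today=date(2025, 1, 10))
```

A negative `time_to_due` means the task is already overdue, which gives the model an explicit signal instead of asking it to compare dates itself.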

🧩 Step 2.3: AWS Helper Classes (boto3 Abstractions)

To keep the AWS Glue notebook readable, modular, and maintainable, all AWS service interactions are encapsulated into small helper classes built on top of boto3.

aws_s3

Handles all Amazon S3 operations, including:

Reading prompt templates and input files
Writing intermediate datasets
Persisting generated PDF reports
Automatically partitioning outputs by execution date
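Partitioning by execution date can be as simple as building the object key from the run date. The sketch below assumes a Hive-style `year=/month=/day=` layout and a hypothetical `partitioned_key` helper; the repository may organize keys differently.

```python
from datetime import date


def partitioned_key(prefix, filename, run_date):
    """Build an S3 object key partitioned by execution date (assumed layout)."""
    return (
        f"{prefix.strip('/')}/"
        f"year={run_date:%Y}/month={run_date:%m}/day={run_date:%d}/"
        f"{filename}"
    )


key = partitioned_key("reports/", "trello_report.pdf", date(2025, 3, 7))
```

With this layout each run writes to its own prefix, so earlier reports are never overwritten and can be listed by date.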

aws_secrets_manager

Responsible for securely retrieving sensitive configuration from AWS Secrets Manager; in this use case, the Trello API credentials.

aws_ses

Manages email delivery workflows:

Reads the generated PDF report from S3
Renders an HTML email body (template stored in the repository)
Attaches the PDF report
Sends emails to configured recipients
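The attachment step is the least obvious part of this workflow, so here is a hedged sketch that builds the MIME message with the standard library only. The addresses, subject, and the `build_report_email` helper are placeholders, and the PDF bytes are a dummy value rather than a real report.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_report_email(sender, recipients, subject, html_body, pdf_bytes, pdf_name):
    """Assemble a multipart email with an HTML body and one PDF attachment."""
    message = MIMEMultipart("mixed")
    message["Subject"] = subject
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message.attach(MIMEText(html_body, "html", "utf-8"))

    attachment = MIMEApplication(pdf_bytes, _subtype="pdf")
    attachment.add_header("Content-Disposition", "attachment", filename=pdf_name)
    message.attach(attachment)
    return message


email_message = build_report_email(
    sender="reports@example.com",          # placeholder, must be a verified SES identity
    recipients=["team@example.com"],       # placeholder
    subject="Weekly Trello report",
    html_body="<p>The report is attached.</p>",
    pdf_bytes=b"%PDF-1.4 placeholder",     # dummy bytes instead of the real PDF
    pdf_name="trello_report.pdf",
)
raw_bytes = email_message.as_bytes()
```

With boto3, a message built this way can be passed to the SES client's `send_raw_email` call as `RawMessage={"Data": raw_bytes}`, together with `Source` and `Destinations`.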

🧠 Step 2.4: AWS Bedrock Integration and Inference Strategy

The AWSBedrock class manages the interaction with Amazon Bedrock, invoking the Amazon Nova Lite model to analyze Trello project data.

Model inputs

The model receives:

A filtered dataset (JSON) containing only relevant tasks and fields
A custom prompt defining the analysis objectives, expected insights, and report structure

Both the dataset and the prompt can be adjusted to fit different team practices or project types. The prompt used in this tutorial is provided in the repository as a reference example.

import json

import boto3


class AWSBedrock:
    """Sends a prompt plus dataset to an Amazon Bedrock model and returns the text answer."""

    def __init__(self,
                 PROMPT: str,
                 DATASET: str,
                 REGION: str = "us-east-1",
                 MODEL_ID: str = "amazon.nova-lite-v1:0"):
        self.prompt = PROMPT
        self.dataset = DATASET
        # The dataset is appended to the prompt as plain text.
        self.prompt_final = f"{self.prompt} {self.dataset}"
        self.region = REGION
        self.model_id = MODEL_ID

    def create_bedrock_client(self):
        # Runtime client used for model invocation.
        return boto3.client(service_name='bedrock-runtime', region_name=self.region)

    def get_payload(self):
        # Request body: a single user message plus the inference settings.
        payload = {
            "messages": [
                {
                    "role": "user",
                    "content": [{"text": self.prompt_final}]
                }
            ],
            "inferenceConfig": {
                "max_new_tokens": 5000,
                "temperature": 0.4,
                "top_p": 0.9
            }
        }
        return payload

    def invoke_model(self):
        try:
            bedrock = self.create_bedrock_client()
            payload = self.get_payload()

            response = bedrock.invoke_model(
                modelId=self.model_id,
                body=json.dumps(payload)
            )

            # The response body is a stream; read it and extract the generated text.
            response_body = json.loads(response['body'].read())
            data = response_body['output']['message']['content'][0]['text']
            return data
        except Exception as e:
            # Errors are printed and the method returns None.
            print(f"Error: {e}")
            return None

Inference configuration

"inferenceConfig": {
    "max_new_tokens": 5000,
    "temperature": 0.4,
    "top_p": 0.9
}

max_new_tokens (5000): Allows the model to generate detailed, structured reports
temperature (0.4): Ensures consistent and reliable analysis while preserving enough flexibility to detect patterns and nuances
top_p (0.9): Enables controlled diversity in model responses

A temperature of 0.4 was selected after iterative testing, as higher values introduced unnecessary variability, while lower values reduced the model’s ability to surface implicit risks and insights.
Before finalizing this configuration, multiple test runs were performed, refining both the dataset and the prompt to ensure the output aligned with the intended project analysis goals.

If you want to learn more about how these parameters work, see my earlier article:

GenAI Foundations – Chapter 2: Prompt Engineering in Action – Unlocking Better AI Responses (Romina Elena Mendez Escobar, Sep 9 '25)

📄 Step 2.5: Report Generation and Distribution

The MarkdownPDFReport class converts AI-generated Markdown into a professional, styled PDF document.

Input parameters

The class requires only:

Markdown text generated by the AI model
An optional output path (in-memory or file-based)

Key features

Custom heading hierarchies and typography
Styled tables and lists
Emoji-to-symbol mapping for visual status indicators
Fully customizable styles defined in internal methods

All visual styles are centralized and can be easily adapted to match organizational branding or reporting standards.
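As an illustration of the emoji-to-symbol idea, the sketch below replaces status emoji with plain-text labels before rendering, which is useful because standard PDF fonts usually have no emoji glyphs. The mapping and the `replace_emoji` helper are hypothetical; the actual class may use different symbols.

```python
# Hypothetical mapping from status emoji to printable labels.
EMOJI_TO_SYMBOL = {
    "✅": "[OK]",
    "🔴": "[HIGH]",
    "🟡": "[MEDIUM]",
    "🟢": "[LOW]",
}


def replace_emoji(text, mapping=EMOJI_TO_SYMBOL):
    """Replace every mapped emoji in `text` with its printable label."""
    for emoji, symbol in mapping.items():
        text = text.replace(emoji, symbol)
    return text


line = replace_emoji("🔴 Payment API blocked | ✅ Login flow done")
```

Running the replacement on the Markdown before conversion keeps the status indicators visible in the final PDF.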

Once generated, the PDF is stored in 🪣 Amazon S3 and sent via 📩 email using the previously described SES class. The email HTML template used for embedding the report is also available in the repository and can be modified as needed.

📄 Example Output: Email and Report Preview

Below is an example of the report generated by the solution. The complete output consists of a six-page PDF, but for illustration purposes, the following screenshots show the cover page and a selection of summary tables used to highlight key project insights.

Conclusions

This article demonstrates how combining Kanban project data with generative AI can significantly enhance the way teams understand, communicate, and manage complex software projects. Beyond the technical implementation, several key insights and lessons emerged from this use case.

📉 Reducing Bias and Improving Decision-Making

One of the main benefits of this approach is the ability to reduce subjective bias in project analysis. By evaluating task metadata, timelines, and written communication through AI-driven semantic analysis, teams gain a more objective view of project status, risks, and bottlenecks.
This enables more focused stakeholder discussions and allows follow-up meetings to be based on concrete, data-driven insights rather than individual perceptions.

🗣️ Enhancing Stakeholder Communication

In projects with a large number of tasks and contributors, explaining delays or risks can be challenging. Automatically generated reports help translate complex project data into clear, structured summaries, making it easier to communicate issues, dependencies, and priorities to non-technical stakeholders and leadership teams.

🔄 Dataset and Tooling Flexibility

Although this example is based on Trello, the same approach can be applied to other project management tools such as Jira, Azure DevOps, Odoo, or similar platforms. By adapting the data extraction layer, teams can reuse the same analysis and reporting pipeline across different tools and project types.
Selecting only relevant fields remains critical, as passing unnecessary or empty data increases token usage without improving insight quality.

💬 Prompt Design as a Key Success Factor

Prompt engineering plays a central role in the quality of the generated insights. Providing better context—such as project goals, roadmap expectations, risks, or delivery constraints—helps the model produce more accurate and actionable conclusions.
During experimentation, iterative prompt refinement proved essential. In some cases, enforcing a strict output format (such as JSON) reduced the depth of the analysis, whereas allowing freer, unstructured responses resulted in richer conclusions. This highlights the importance of testing different prompt strategies rather than assuming a single optimal format.

📑 Output Formats and Performance Considerations

While this solution generates Markdown and converts it into a PDF report, alternative output formats such as JSON can also be produced. However, structured formats may negatively impact model performance if they overly constrain the response. Choosing the right output format depends on the downstream use case—human consumption, system integration, or further automation.

🧩 Model Selection Matters

Model choice significantly affects the quality of insights. Initial experiments using Amazon Titan did not produce sufficiently meaningful conclusions for this use case. After evaluating multiple options, Amazon Nova proved to be the best fit, offering a better balance between contextual understanding, analytical depth, and consistency.

Final Thoughts

AI should not replace project management practices, but it can act as a powerful decision-support layer, helping teams identify risks earlier, communicate more effectively, and focus discussions on what truly matters. With careful dataset selection, prompt design, and model evaluation, this approach can be adapted to a wide range of project environments and organizational needs.

📌 How to cite this article

APA style

Mendez Escobar, Romina Elena. (2025). Data-Driven Project Analysis: Analyzing Trello Kanban Projects with AI on AWS Bedrock.

https://dev.to/aws-builders/data-driven-project-analysis-analyzing-trello-kanban-projects-with-ai-on-aws-bedrock-15f4

BibTeX


@article{mendez2025aiawstrello,
  title  = {Data-Driven Project Analysis: Analyzing Trello Kanban Projects with AI on AWS Bedrock},
  author = {Mendez Escobar, Romina Elena},
  year   = {2025},
  url    = {https://dev.to/aws-builders/data-driven-project-analysis-analyzing-trello-kanban-projects-with-ai-on-aws-bedrock-15f4}
}

From Raw Clinical Data to AI: Building a Modern Healthcare Data Platform on AWS

Romina Elena Mendez Escobar — Tue, 09 Dec 2025 10:04:02 +0000

The OMOP Common Data Model (CDM) is a standard for observational health data that allows the analysis of clinical data in a consistent and reproducible way. Implementing OMOP CDM in AWS requires a robust architecture that handles everything from data ingestion to advanced AI analysis, maintaining the highest standards of security and regulatory compliance, especially HIPAA for health data.

This guide describes a set of components for an OMOP architecture on AWS. It is not the only possible solution: it is a proposal built from a selection of the many services the platform offers.

────────────────────────────────

🗂️ What is OMOP CDM?

The OMOP Common Data Model (CDM) is a standard designed by the OHDSI community to represent observational health data in a uniform way. Its main objective is to standardize medical data so that different institutions, clinical systems and databases speak the same “language,” which facilitates reproducible analysis, cohort comparisons and multicenter studies.
The model is based on a set of normalized tables, standardized vocabularies and modeling conventions that define how patients, diagnoses, procedures, medication, clinical measurements, visits and temporal events should be represented.

────────────────────────────────

👤 Model Structure: Patient as Central Entity

OMOP organizes information around the patient, who acts as the central unit of the model. This structure makes it possible to reconstruct the patient’s clinical timeline and analyze their events over time.

────────────────────────────────

❤️ Standardized Vocabularies: the semantic heart of OMOP

One of the most important strengths of the CDM is the use of standardized vocabularies, which replace the diversity of ways of writing the same text with numeric IDs. These IDs allow the representation of clinical concepts in a consistent, interoperable and computable way.
In addition, the vocabularies have:

Hierarchies (for example, “type 2 diabetes mellitus” is a subconcept of “endocrine and metabolic diseases”),
Semantic relationships,
Standard and non-standard concepts.

Thanks to these hierarchies, an analyst can perform broad studies without knowing all the specific codes. For example, to analyze metabolic diseases, they can query the higher category and automatically include all subclasses (including different types of diabetes).
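The hierarchy lookup can be sketched with an in-memory SQLite database. The table and column names follow the OMOP CDM (`concept`, `concept_ancestor`, `condition_occurrence`), but the tables are trimmed to a few columns, and the concept IDs, names and patients are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_name TEXT);
    CREATE TABLE concept_ancestor (ancestor_concept_id INTEGER, descendant_concept_id INTEGER);
    CREATE TABLE condition_occurrence (person_id INTEGER, condition_concept_id INTEGER);

    -- Made-up concept IDs, not real OMOP vocabulary identifiers.
    INSERT INTO concept VALUES
        (1, 'Metabolic disease'), (2, 'Type 1 diabetes mellitus'),
        (3, 'Type 2 diabetes mellitus'), (4, 'Fracture of arm');
    -- Each concept is also listed as its own ancestor.
    INSERT INTO concept_ancestor VALUES (1, 1), (1, 2), (1, 3), (2, 2), (3, 3), (4, 4);
    INSERT INTO condition_occurrence VALUES (101, 2), (102, 3), (103, 4);
    """
)

# Query by the higher-level concept; all descendants are included through the join.
rows = conn.execute(
    """
    SELECT co.person_id, c.concept_name
    FROM condition_occurrence AS co
    JOIN concept_ancestor AS ca ON ca.descendant_concept_id = co.condition_concept_id
    JOIN concept AS c ON c.concept_id = co.condition_concept_id
    WHERE ca.ancestor_concept_id = ?
    ORDER BY co.person_id
    """,
    (1,),
).fetchall()
```

The analyst only supplies the ancestor concept; the join against `concept_ancestor` brings in both diabetes subtypes and leaves the unrelated fracture out.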

────────────────────────────────

☁️ OMOP in AWS

The architecture of the OMOP Common Data Model can be implemented in multiple environments (on-premise, hybrid or in different cloud providers). However, AWS offers a particularly robust ecosystem to address the challenges of standardization, integration, governance and advanced clinical data analysis.

In this section, we explore how to combine AWS services to build a complete pipeline that allows ingesting, transforming, standardizing and analyzing health data under the OMOP standard, maintaining high levels of security, regulatory compliance and operational efficiency.

⚠️ This approach is not intended to be the only way to implement OMOP, but a practical and modular guide that will allow you to understand which AWS services can help you in each phase of the process.

OMOP in AWS: Services by section

(1) 📄 Data: Clinical Sources, APIs and Personal Devices

In a modern health ecosystem, data no longer comes only from a hospital’s internal systems. Today, clinical information is distributed across multiple platforms, technologies and devices, requiring architectures capable of integrating, unifying and standardizing heterogeneous sources.

(2) 🔧 Pipeline Services: Data Ingestion and Initial Processing

To build a robust pipeline that enables the standardization of clinical data toward OMOP, it is essential to define how the data is extracted, ingested and prepared before transformation.
In this stage, the main objective is to capture the data from different sources and store them in raw format in Amazon S3, always preserving traceability and the original state of the information.
Below are the key services used in this phase:

Amazon MWAA (Managed Workflows for Apache Airflow)
Amazon MWAA allows running Apache Airflow DAGs without managing the underlying infrastructure.

Amazon Kinesis
Hospitals and health devices generate more and more real-time data; for these scenarios, Amazon Kinesis offers a highly scalable streaming solution.
Combining the two services below allows capturing data streams without additional infrastructure and storing them directly in the raw bucket, ready to be processed by Airflow or other services:

Kinesis Data Streams (real-time ingestion)
Kinesis Data Firehose (automated delivery to S3)

AWS Lambda
This service allows executing serverless functions without provisioning servers, which makes it ideal for small tasks and specific events within the pipeline.
In this context, it is used for:

Lightweight pre-validation or normalization processes before sending files to S3.
Moving or restructuring files when new data arrives.
Automatic triggers when new objects are detected in S3 (for example, activating notifications).
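A trigger of this kind usually starts by reading the bucket and key from the S3 notification event. The handler below is a minimal sketch under that assumption; the bucket name, object key and return shape are illustrative, and a real function would go on to validate, move or announce the file.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Collect bucket/key pairs from an S3 object-created notification event."""
    new_objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        new_objects.append({"bucket": bucket, "key": key})
    return {"new_objects": new_objects}


# Trimmed sample event with made-up bucket and key names.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "omop-raw-zone"},
                "object": {"key": "ehr/2025/visit+notes.csv"},
            }
        }
    ]
}
result = handler(sample_event, None)
```

Decoding the key matters because spaces and special characters in file names are encoded in the event, and using the raw value would point at an object that does not exist.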

(3) 🗂️ RAW Storage

Once extracted, all data will be stored initially in Amazon S3, which will act as the RAW zone of the data lake. This layer preserves the data in its original format, without transformations, to guarantee traceability, auditing and reprocessing capability.
Storage in S3 must be complemented with a set of key practices:

IAM + S3 Bucket Policies ensure role-based access.
Tags help automate governance and classification.
Lake Formation adds granular control at table/column level.
Lifecycle policies ensure retention and cost efficiency.

(4) 📌 Orchestration

In this section we describe the key DAGs needed to coordinate the different stages of the pipeline. Orchestration is essential to ensure that extractions, transformations and loads run in a consistent, auditable and scalable way.

(5) 🧠 AI & Unstructured Data

To process clinical notes and other unstructured data, we need to incorporate NLP techniques that allow extracting entities, mapping clinical concepts and automatically encoding information.
For this type of processing, we can rely on the following AWS services:

Amazon SageMaker
Allows training, tuning and deploying custom NLP models, from classic models to advanced transformer-based ones. It is ideal when full control of the ML pipeline, preprocessing, fine-tuning and integration with other system components is needed.

Amazon Comprehend Medical
Managed service that extracts clinical entities, relationships and conditions directly from medical text.
Important: Comprehend Medical supports a limited set of languages, so check the official documentation before integrating it into the project.

In the following article you can find a complete implementation of a batch process using this service:

Employing AWS Comprehend Medical for Medical Data Extraction in Healthcare Analytics (Romina Elena Mendez Escobar, Aug 7 '24)

Amazon Bedrock integrated with SageMaker
Although Bedrock is a separate service, it can be integrated into ML flows in SageMaker. Its main contribution is enabling foundational models and generative AI capabilities, opening the door to new use cases:

Automatic classification of clinical text.
Concept normalization assisted by generative models.
Semantic searches and context retrieval through vector databases (for example, to enrich mapping results or suggest probable clinical codes).

(6) 🩺 OMOP CDM

All processing stages converge in the implementation of the OMOP Common Data Model (CDM), stored in a relational database optimized for analytical and mixed workloads.

Amazon Aurora PostgreSQL
The recommended engine for hosting the CDM is Amazon Aurora PostgreSQL, because it:

Maintains full SQL compatibility and supports OHDSI ecosystem tools.
Provides high availability, automatic replication, and fast recovery.
Scales horizontally with read replicas, ideal for analytical and concurrent workloads.
Integrates seamlessly with ETL/ELT pipelines across AWS services.

Depending on the use case, Aurora can be complemented with additional analytics-oriented services.

Amazon Redshift
For advanced analytics over large datasets derived from the CDM, Amazon Redshift offers a distributed, high-performance environment for complex analytical queries.

Amazon Athena
Amazon Athena enables querying raw data stored in S3 without loading it into a database. It is especially useful for:

Quick validations before loading data into the CDM.
Debugging and data quality checks using SQL.
Exploring semi-structured files (CSV, JSON, Parquet).

Amazon ElastiCache
When the solution requires high-frequency or computationally expensive queries on the OMOP model, adding a cache layer with Redis or Memcached helps:

Reduce latency for repeated queries.
Store results of heavy computations (e.g., cohort definitions, vocabulary lookups).
Improve performance for dashboards and clinical applications that require fast responses.
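The pattern behind these benefits is often called cache-aside: look in the cache first, and only run the expensive query on a miss. The sketch below uses a small in-memory class as a stand-in for a Redis or Memcached client so it runs anywhere; the class, the cache key and the cohort count are all hypothetical.

```python
import time


class TTLCache:
    """Minimal in-memory stand-in for a Redis/Memcached client (illustrative only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries are dropped on read
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def cached_lookup(cache, key, compute, ttl_seconds=300):
    """Cache-aside: return the cached value, or compute, store and return it."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl_seconds)
    return value


calls = []


def expensive_cohort_count():
    calls.append(1)  # track how often the "database" is hit
    return 1250      # made-up result


cache = TTLCache()
first = cached_lookup(cache, "cohort:diabetes:count", expensive_cohort_count)
second = cached_lookup(cache, "cohort:diabetes:count", expensive_cohort_count)
```

The second lookup is served from the cache, so the expensive function runs only once. The TTL is the main design choice: it should be short enough that refreshed OMOP data shows up in dashboards within an acceptable delay.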

(7) 📊 Data Visualization

Data visualization is essential not only to consume information but also to analyze, monitor and validate each stage of the pipeline. As we process clinical data, vocabularies, transformations and AI results, we need tools that make the quality, behavior and evolution of the data evident.

Below are various options depending on the use case:

Amazon QuickSight: It enables fast, interactive dashboards connected to Aurora, Redshift, Athena or S3. Its in-memory SPICE engine accelerates visualizations at scale while reducing load on source databases, making it ideal for data quality tracking and clinical monitoring.
Amazon SageMaker Model Dashboard: The SageMaker Model Dashboard centralizes observability for ML workflows, displaying metrics such as precision, recall and F1-score, along with model versions, drift indicators and execution history. This makes it easier to detect degradation early and maintain reliable NLP or predictive models.
Amazon Fargate / Amazon EKS: When fully custom dashboards are required—such as advanced visualizations, semantic comparisons or interactive analytics—Fargate and EKS provide the compute layer to run applications built with tools like Plotly, Dash, Streamlit or React-based libraries. This allows teams to create

(8) 🧭 Data Governance

Data governance is critical when working with sensitive health information, ensuring that data remains cataloged, documented and protected throughout every stage of the pipeline. A strong governance layer enforces access policies, allowing only authorized users to interact with clinical datasets under strict regulatory requirements. It also guarantees full traceability, enabling auditing of how data is accessed, transformed and shared across environments. Finally, governance provides controlled discoverability, ensuring that curated datasets can be safely searched and consumed while maintaining consistent metadata.

AWS Lake Formation
AWS Lake Formation centralizes governance for data stored in S3, offering fine-grained permissions at the table, column or row level, enforcing traceability and integrating tightly with the Glue Data Catalog to maintain consistent metadata.

Amazon DataZone
Amazon DataZone supports the organized publication and controlled sharing of datasets across the organization, enabling teams to work within structured data domains—such as Clinical, NLP, OMOP or Research—while unifying cataloging, governance and collaboration in one environment.

(9) 🔐 Security and Networking

Security and connectivity are fundamental pillars in any health data architecture, especially to comply with regulations such as HIPAA. In AWS, there are multiple services that protect both data and infrastructure. Below we describe the main components and their role within our OMOP CDM architecture.

(10) 🎚️ Monitoring and Billing

Monitoring and cost control are essential in health data architectures, especially when processing large clinical datasets or running AI workloads where training and inference can be resource-intensive.

🔍 Monitoring
AWS CloudWatch provides centralized metrics, logs and events from all AWS services, enabling teams to track infrastructure health, Airflow DAG execution and the behavior of ETL/ELT pipelines while receiving alerts for anomalies. For deeper inspection, AWS X-Ray traces requests across distributed systems—such as containerized services on ECS/EKS or APIs that expose OMOP data—making it easier to detect bottlenecks and debug complex data flows.

🧾 Billing
To maintain financial visibility and prevent cost overruns, AWS Cost Explorer offers detailed insights into usage patterns across services, including AI and data-intensive components. Complementing this, AWS Budgets allows setting custom spending limits and automated alerts, ensuring that project costs remain predictable and aligned with operational goals.

(11) 🧱 Code & Deployment

Managing code and deploying infrastructure is essential to guarantee reproducibility, traceability and security in cloud-based health projects. This includes not only provisioning resources, but also maintaining reliable pipelines, consistent environments and well-governed ML assets.

🔧 Infrastructure as Code
Terraform allows defining the entire AWS architecture in a declarative way, ensuring that environments remain consistent and reproducible across development, staging and production. It supports provisioning core components such as S3 buckets, VPCs, databases and IAM roles while enforcing infrastructure governance.

🗂️ Versioning & CI/CD
GitHub serves as the central platform for code collaboration, offering pull requests, reviews and issue management. With GitHub Advanced Security, teams can catch vulnerabilities early through dependency scanning and code analysis.
GitHub Actions complements this by automating CI/CD pipelines (building containers, validating data quality, deploying Airflow DAGs or updating infrastructure definitions), ensuring that each change is tested and safely promoted.

🏷️ Models & Containers
For containerized workloads, Amazon ECR provides a secure and scalable registry for images used in ECS, EKS or Fargate, ensuring consistency across environments. In parallel, the Amazon SageMaker Model Registry manages ML model versions, capturing lineage, approvals and metadata so that each model deployed into production remains auditable and reproducible.

(12) 🚀 AI Consume

Once the data is standardized and loaded into the OMOP CDM, it becomes the foundation for advanced analytics, AI-driven insights and secure data consumption. This unlocks opportunities for clinical research, decision support and the development of intelligent health applications.

☁️ Data Consumption through APIs
Standardized OMOP data can be exposed through secure API layers, enabling internal and external systems to retrieve curated clinical information. Services such as Amazon API Gateway combined with AWS Lambda provide scalable, low-latency endpoints that support both real-time and batch consumption.

📊 Advanced Analysis and Machine Learning
Amazon SageMaker enables training, evaluating and deploying Machine Learning models directly on top of OMOP data. This supports use cases such as predicting clinical risks, classifying patients by comorbidities or analyzing treatment response patterns, all while integrating seamlessly with the existing data pipeline.

🧩 Vector Search with Aurora and pgvector
By storing patient feature vectors in Aurora PostgreSQL using pgvector, the system can perform semantic similarity searches between patients or clinical cases. This capability enhances cohort discovery and enables personalized recommendation workflows.
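To show what a similarity search does, the sketch below ranks patients by cosine distance in plain Python. In pgvector the same ranking would be expressed in SQL with the `<=>` cosine distance operator; the three-dimensional vectors and patient IDs here are made up, and real feature vectors would be much larger.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def most_similar(query, vectors, top_k=2):
    """Return the IDs of the `top_k` vectors closest to `query`."""
    ranked = sorted(vectors.items(), key=lambda item: cosine_distance(query, item[1]))
    return [patient_id for patient_id, _ in ranked[:top_k]]


# Made-up patient feature vectors.
patients = {
    "p1": [1.0, 0.0, 0.0],
    "p2": [0.9, 0.1, 0.0],
    "p3": [0.0, 1.0, 0.0],
    "p4": [0.0, 0.0, 1.0],
}
nearest = most_similar([1.0, 0.0, 0.0], patients, top_k=2)
```

Doing this inside Aurora instead of in application code keeps the vectors next to the OMOP tables, so a similarity result can be joined directly with cohort or visit data.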

🧠 Generative AI with Amazon Bedrock
Amazon Bedrock provides access to foundational models that can summarize clinical notes, extract information from unstructured text or augment concept mapping processes, expanding analytical depth through generative AI.

Researchers can query patients with similar disease profiles using pgvector, deploy readmission prediction models in SageMaker or generate automated insights from clinical notes using Bedrock-powered NLP.

📚 Conclusions

This guide presents a compact proposal for implementing OMOP CDM on AWS, showing how its services can support secure, scalable and efficient clinical data processing. The architecture is flexible and can be adapted to different project needs.

AWS provides an ecosystem that covers the entire data lifecycle, allowing integration with open-source tools and containerized workloads while maintaining control over performance and costs. This balance is especially important in health and AI-driven environments.

Building on strong governance and security practices, the proposed approach demonstrates that AWS enables compliant and reliable data workflows. With the right configuration, clinical data can be transformed into meaningful insights for research, analytics and innovation.

AWS re:Invent 2025: Updates in Infrastructure, Security, and Compute + Learning Path Summary

Romina Elena Mendez Escobar — Mon, 08 Dec 2025 09:52:24 +0000

📖 Introduction

At re:Invent 2025, AWS placed Generative AI at the center, moving from simple chats to agents that understand context, execute tasks, and integrate natively with infrastructure, security, and data services. As part of this approach, AWS launched a learning path in Skill Builder with 33 courses and more than 60 hours of content covering these new services, from fundamental to advanced level.

🔍 Why is this re:Invent a turning point?

The big novelty this year is how generative AI stops being an isolated component and becomes a central engine that drives automation, security, infrastructure, and operations. We are entering a stage where:

🤖 Agents not only process language: they execute real actions in AWS.
🔧 IaC automation is complemented by intelligent flows that detect, decide, and act.
🔓 Security is transformed thanks to the ability to analyze large volumes of logs in seconds, where every minute is critical.
🗂️ Data engineering and observability are rewritten with agents that contextualize, correlate, and recommend.

To support this technological leap, AWS launched new services (some very recent) and updated others, which motivated the design of an integrated learning path to learn them in a structured way.

🛠️Learning path details

33 total courses and more than 60 hours of content.
26 fundamental-level courses, 4 intermediate, and 3 advanced, combining updates of existing services with completely new launches.

📘 Service Overviews & Course Levels

The learning path organizes 33 courses by technical depth to help learners navigate new AWS services efficiently.

Link 👉 https://skillbuilder.aws/learning-plan/JZQY2Z8DG4/aws-reinvent-2025-announcements-learning-plan/VWQU3VK65K

Course Levels:

🟢 Beginner (26 courses): Introduces core services and fundamental concepts.
🟡 Intermediate (4 courses): Covers integration, automation, and real-world deployments.
🔴 Advanced (3 courses): Focuses on autonomous agents, high-performance compute, and advanced security.

Kiro

It is a development environment (IDE) with AI agents that start from a written specification and generate code, tests, and documentation, helping to design and maintain applications more quickly and consistently.
⏱ 3:30 hours | 📚 3 courses

🟢 Kiro Getting Started
🟢 Introduction to Kiro powers (Update)
🟡 Spec-Driven Development with Kiro

Amazon Nova 2

It is a family of multimodal generative AI models (text, image, audio, video) designed for advanced reasoning, conversational assistants, and content generation in enterprise applications.
⏱ 04:15 hours | 📚 4 courses

🟢 Amazon Nova 2: Understanding Models (New)
🟢 Amazon Nova 2 Sonic: Next-Generation Conversational AI (Update)
🟢 Introduction to Amazon Nova Forge (New)
🟡 Extended Thinking with Amazon Nova (Update)

Amazon Quick Suite

An integrated analytics and business intelligence platform powered by generative AI that unifies agents for research, data visualization, and workflow automation, accessible via chat and embedded in tools like browser, Slack, or Office.
⏱ 03:10 hours | 📚 3 courses

🟢 Introduction to Amazon Quick Suite
🟢 Getting Started with Administering Amazon Quick Suite
🟡 Amazon Quick Automate – Building Intelligent Workflows (Update)

AWS DevOps Agent

AI agent for operations that analyzes events and metrics, automates incident response, assists with root cause analysis, and suggests preventive actions to improve reliability.
⏱ 1:00 hour | 📚 1 course

🟢 Introduction to AWS DevOps Agent (New)

AWS AI Factories

It is a dedicated AI infrastructure solution deployed in the customer’s data center, with specialized hardware to train and run models while maintaining data sovereignty.
⏱ 30 minutes | 📚 1 course

🟢 Introduction to AWS AI Factories (New)

Amazon SageMaker

It is AWS’s managed machine learning platform that offers notebooks, data preparation tools, model training, and model deployment, now with more serverless options and a focus on foundation models. In this latest update, it includes a set of “SageMaker AI” capabilities such as serverless notebooks, simplified customization of foundation models, and elastic training with HyperPod to scale without managing infrastructure.
⏱ 03:30 hours | 📚 4 courses

🟢 Introduction to Amazon SageMaker Notebooks (Update)
🟢 Introduction to Model Customization in Amazon SageMaker AI (Update)
🔴 Elastic Training on Amazon SageMaker HyperPod (New)
🔴 Checkpointless Training on Amazon SageMaker HyperPod (New)

AWS Security Agent

Security agent that reviews from code to production environment, automates configuration assessments and penetration tests, and generates recommendations to reduce risk throughout the development lifecycle.
⏱ 30 minutes | 📚 1 course

🟢 Introduction to AWS Security Agent (Tech Preview) (Update)

Amazon Bedrock

It is the service that allows building and operating AI agents based on foundation models, with security controls, continuous evaluation, and policies to govern their behavior.
⏱ 02:10 hours | 📚 2 courses

🟢 AgentCore Evaluation on Amazon Bedrock (New)
🟢 AgentCore Policy on Amazon Bedrock (New)

Amazon EC2

This service has new compute instances with next-generation GPUs designed to train and serve large AI models with high performance. The new instances are optimized for frontier model training, combining next-generation GPUs with network and storage improvements to offer several times more performance than previous generations.
⏱ 02:30 hours | 📚 5 courses

🟢 Introduction to Amazon EC2 P6e-GB300 UltraServers (Update)
🟢 Introduction to Capacity Manager for Amazon EC2 (Update)
🟢 Introduction to Amazon EC2 Instance Attestation (New)
🟢 Introduction to Amazon EC2 P6-B300 Instances (New)
🟢 Introduction to Capacity Manager for Amazon EC2 (New)

Amazon S3 Vectors

It is an S3 capability to store vectors (embeddings) and perform semantic and similarity searches on documents, images, or other objects.
⏱ 1:00 hour | 📚 1 course

🟢 Amazon S3 Vectors Getting Started (Update)
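
To make "similarity search" concrete, here is a minimal pure-Python sketch of the idea behind vector search: rank stored embeddings by cosine similarity to a query vector. It does not call the S3 Vectors API. The tiny three-dimensional vectors and the names `cosine_similarity` and `top_k` are illustrative assumptions, not part of any AWS SDK.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Return the k keys whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]

# Toy "embeddings" (hypothetical values, not produced by a real model).
index = {
    "doc-coffee": [0.9, 0.1, 0.0],
    "doc-tea": [0.8, 0.3, 0.1],
    "doc-gpu": [0.0, 0.2, 0.9],
}

nearest = top_k([1.0, 0.0, 0.0], index, k=2)
```

In a real system the vectors would come from an embedding model and the ranking would be done by the vector store rather than in application code.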

Amazon FSx for NetApp ONTAP

Fully managed service that provides ONTAP file systems with enterprise features (snapshots, clones, replication) and the elasticity and pay-as-you-go model of AWS cloud.
⏱ 1:15 hours | 📚 1 course

🟢 Amazon FSx for NetApp ONTAP Primer (Update)

Amazon Aurora PostgreSQL

It is a relational database compatible with PostgreSQL that adds policies to hide or transform sensitive data. This new functionality allows defining dynamic masking policies so that sensitive data is displayed differently depending on the user’s role, reinforcing access control at column and row level.
⏱ 1:30 hours | 📚 1 course

🔴 Dynamic Data Masking in Aurora PostgreSQL (New)

AWS Transform

It is a suite of AI-powered tools to modernize .NET applications, full-stack Windows, and custom code, automating analysis, refactoring, and migration to accelerate legacy modernization toward cloud-native architectures.
⏱ 03:00 hours | 📚 3 courses

🟢 AWS Transform for .NET Getting Started (Update)
🟢 AWS Transform Custom (New)
🟢 AWS Transform Full-Stack Windows (New)

🚀 Conclusion: AI as the Engine of the Cloud Ecosystem

AWS re:Invent 2025 marks a decisive turning point: Generative AI has moved beyond being an isolated tool to become the central engine that drives the transformation of the cloud ecosystem.

This learning path of 33 courses is not just a set of trainings but a strategic roadmap showing how infrastructure, security, and operations converge with AI to enable a new generation of solutions.

The incorporation of agents, along with the evolution of compute and security improvements, is creating environments that are much more autonomous, efficient, and prepared for new use cases.

Specialized infrastructure plays a key role, where AWS AI Factories ensure data sovereignty in regulated industries, while the new EC2 instances optimized for AI increase performance for model training and deployment at scale. In this set of updates, it is clear that foundation models are becoming more powerful and are a fundamental part of decision-making, intelligent automation, and the creation of AI-powered products, generating real competitive advantage for organizations that can combine AI + infrastructure + security as a single strategy.

Therefore, this learning path is the ideal starting point to learn the new features, prepare your skills, and put them into practice in your next project within the AWS ecosystem.

From Search to Story: Using Gemini API to Automate Brand Content Analysis with Python

Romina Elena Mendez Escobar — Mon, 20 Oct 2025 17:54:15 +0000

Introduction

In a hyperconnected world, every post, comment, or interaction contributes to building a brand's reputation. Therefore, identifying what people are talking about and turning it into stories that inform, inspire, and connect is essential for any modern communication strategy.

This article was born from a concrete question: how can Generative AI be used to discover what is being said about a company and transform that information into relevant stories? Stories that reflect real experiences and concerns, turning them into inspiring narratives that strengthen brand identity.

In this tutorial, you will learn how to use Google Gemini to:

🔍 Search for information using generative AI integrated with Google Search
✍️ Transform findings into structured journalistic narratives
📊 Generate visual reports with graphics and automated storytelling

What is Brand Journalism?

According to an article by The New York Times Licensing Group, readers experience significant content fatigue: there are more than 1.8 billion websites and over 70 million blogs published each month.

Brand Journalism is a communication strategy where brands adopt journalistic techniques to tell relevant and engaging stories. Instead of direct advertising messages, content is created with a narrative, informative, and value-added approach, similar to traditional media.

Key Features

Journalistic techniques: Application of rigorous journalistic methods to create credible and well-structured content.
Audience interests: Focus on the real interests of the audience, not just the messages the brand wants to convey.
Quality and useful information: Content that educates, informs, or solves concrete problems.
Use of different formats: Variety of formats (reports, interviews, analyses, infographics, videos) to maintain engagement.
Storytelling: Narratives that connect emotionally with values, experiences, and social impact.

Benefits

Based on these features, the main benefits are:

Brand Positioning: Establish yourself as a thought leader in your industry.
Audience Loyalty: Build authentic and lasting relationships with your audience.
Differentiation against the Competition: Stand out from competitors through higher-quality editorial content.
Greater Organic Reach: Valuable content is naturally shared, amplifying reach without direct advertising investment.

What is Generative AI?

Generative AI is a branch of artificial intelligence focused on creating new and original content: text, images, audio, video, or synthetic data. Its development has been possible thanks to deep learning, especially through advanced architectures such as transformers, which process information in parallel and capture complex relationships in large data volumes.

Additional Resources on GenAI

I have written a series of articles on the fundamentals of generative AI

Gemini

Gemini is a family of multimodal AI models developed by Google DeepMind. It integrates into multiple Google products and can process text, images, and other data types simultaneously.

Grounding with Google Search

For this use case, we will use the Grounding with Google Search functionality, which connects the model directly to Google to perform searches and obtain up-to-date information.

Main Advantages:

📏Increased Accuracy: Reduces model hallucinations by accessing verifiable information.
⚡️Real-Time Information: Access to current data, reducing uncertainty about the model's knowledge.
📚Citations and References: Retrieves source links and provides control over consulted data sources.

Use Case

Brand Journalism is a strategic tool for companies to communicate their values from an authentic perspective. However, we often need to find topics that might interest our target audience, so it is essential to search for:

Mentions of the company on different sites
Reputation and notable aspects
Trends and relevant conversations

This starting point helps those who write articles or create storytelling based not only on what the company wants to show but also on the external perspective others have of it.

Practical Example: 📱iPhone 17

Using the latest iPhone launch as an example, we will:

Search for recently published articles
Classify and analyze these documents
Generate a report with visualizations, conclusions, and structured narratives

Next, we will see how to implement this strategy through an automated workflow that integrates AI and data analysis.

Implementation Process

The following diagram illustrates how our automated analysis system works.

1️⃣ Search with Google Search

We use Grounding with Google Search to find relevant articles and request output in JSON format using this structure:

{ 
   "title": "full article title",
   "source_name": "media name",
   "date": "publication date",
   "url": "article link",
   "site_name": "website name",
   "summary": "2-4 line summary",
   "sentiment": "positive/negative/neutral",
   "category": "rumor/analysis/comparison/market/technical",
   "sentiment_score": "1-10 score"
 }
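
Because the model returns free text, it helps to check each record against this structure before analyzing it. The following is a minimal sketch of such a check, assuming the field names and allowed values shown above. The function name `validate_article` and the sample record are hypothetical and are not taken from the repository.

```python
import json

REQUIRED_FIELDS = {
    "title", "source_name", "date", "url", "site_name",
    "summary", "sentiment", "category", "sentiment_score",
}
SENTIMENTS = {"positive", "negative", "neutral"}
CATEGORIES = {"rumor", "analysis", "comparison", "market", "technical"}

def validate_article(record):
    """Return a list of problems found in one article record (empty if valid)."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if record.get("sentiment") not in SENTIMENTS:
        problems.append(f"unexpected sentiment: {record.get('sentiment')!r}")
    if record.get("category") not in CATEGORIES:
        problems.append(f"unexpected category: {record.get('category')!r}")
    try:
        score = int(record.get("sentiment_score"))
    except (TypeError, ValueError):
        problems.append("sentiment_score is not a number")
    else:
        if not 1 <= score <= 10:
            problems.append("sentiment_score is outside 1-10")
    return problems

# Hypothetical record used only to exercise the validator.
sample = json.loads("""
{
  "title": "Example headline",
  "source_name": "Example Outlet",
  "date": "2025-09-10",
  "url": "https://example.com/article",
  "site_name": "example.com",
  "summary": "Short summary of the article.",
  "sentiment": "positive",
  "category": "analysis",
  "sentiment_score": "7"
}
""")

issues = validate_article(sample)
```

Records with a non-empty problem list can be dropped or sent back to the model for correction before the charts are built.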

2️⃣ Storytelling Generation

We use another prompt to generate different types of narratives based on the articles found:

Analytical Insights: Compact analytical summary with concrete data.
Storytelling Narrative: Engaging mini-narrative based on dataset evidence.
Tone Variants (A/B/C): Three versions with different focuses: objective, emotional, and strategic.

3️⃣ Report Creation

We generate a PDF report including:

Charts created with Seaborn and Matplotlib
Visual trend analyses
Narrative conclusions based on generated storytelling
Layout customized with ReportLab

Tutorial

How Does Gemini Work with Google Search?

When performing a query, Gemini not only relies on its internal knowledge but also actively searches updated information on Google Search. This grounding capability allows the model to access real-time data, verify facts, and provide responses based on concrete sources, reducing hallucination risk and ensuring relevance.
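
A grounded response also carries metadata about the pages the model consulted. The sketch below shows one way to collect those sources, assuming the response has been converted to a plain dictionary that uses the REST field names `groundingMetadata`, `groundingChunks`, `web`, `uri`, and `title`. Check those names against the current Gemini documentation before relying on them. The function `extract_sources` and the sample payload are hypothetical.

```python
def extract_sources(response_json):
    """Collect unique (title, uri) pairs from the first candidate's grounding chunks."""
    candidates = response_json.get("candidates") or [{}]
    metadata = candidates[0].get("groundingMetadata", {})
    sources = []
    seen = set()
    for chunk in metadata.get("groundingChunks", []):
        web = chunk.get("web", {})
        uri = web.get("uri")
        if uri and uri not in seen:  # skip chunks without a link and repeated links
            seen.add(uri)
            sources.append((web.get("title", ""), uri))
    return sources

# Hypothetical payload shaped like a grounded response.
example_response = {
    "candidates": [{
        "groundingMetadata": {
            "groundingChunks": [
                {"web": {"uri": "https://example.com/a", "title": "example.com"}},
                {"web": {"uri": "https://example.com/a", "title": "example.com"}},
                {"web": {"uri": "https://example.org/b", "title": "example.org"}},
            ]
        }
    }]
}

sources = extract_sources(example_response)
```

Keeping these links next to each generated narrative makes it easier for an editor to verify claims before publishing.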

Pre-requisite: Access to Gemini API

Before starting, you need to get access to the Gemini API:

Open Google AI Studio
Sign in with your Google account, or create one if you don't have it
Generate your API key from the control panel

Note: You can use Gemini's free tier to test this project.

Once you have your API key, configure it in a .env file:

API_KEY = "your_gemini_api_key"
MODEL_ID = "gemini-2.5-flash"

We use Gemini 2.5 Flash because it is a cost-efficient model well suited to frequent, high-volume tasks.
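
A missing key is easier to diagnose if it fails early with a clear message. Here is a minimal sketch of reading both settings once the `.env` values are available as environment variables. The helper `load_settings` and its fallback to `gemini-2.5-flash` are assumptions for illustration and are not part of the repository.

```python
import os

DEFAULT_MODEL_ID = "gemini-2.5-flash"

def load_settings(env=None):
    """Read API_KEY and MODEL_ID from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = (env.get("API_KEY") or "").strip()
    if not api_key:
        raise RuntimeError("API_KEY is not set; add it to your .env file")
    # Fall back to the default model when MODEL_ID is absent or empty.
    model_id = (env.get("MODEL_ID") or DEFAULT_MODEL_ID).strip()
    return {"api_key": api_key, "model_id": model_id}

# Example with an explicit mapping instead of the real environment.
settings = load_settings({"API_KEY": "your_gemini_api_key"})
```

Passing a mapping keeps the function easy to test without touching the real environment.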

Repository Structure

For this tutorial, clone the following repository, which contains the complete code.

RominaElenaMendezEscobar / brand-journalism-gemini

Tutorial about Brand Journalism Code Using Google Gemini

If you find this tutorial helpful, feel free to leave a star ⭐️ and follow me to get notified about new articles. Your support helps me grow within the tech community and create more valuable content! 🚀

project/
   ├── img/                    # Generated graphics
   ├── prompt/
   │   ├── prompt_search.txt       # Search Prompt
   │   └── prompt_storytelling.txt # Prompt for narrative
   ├── report/                # PDFs generated
   ├── brand_journalist_analyzer.py
   ├── report_plots.py
   ├── report_analysis.py
   ├── main.py
   └── .env

Main Files

1️⃣ Prompts (/prompt)

🗒prompt_search.txt: Here we define how to perform the search in Google Search and structure the results in JSON. This prompt instructs the model to return structured information with fields such as the article's title, source, date, URL, summary, sentiment, and category.
🗒prompt_storytelling.txt: In this file, we define how to generate conclusions and storytelling based on the articles found. It requests different types of outputs, including objective analysis, immersive narratives, and three tone variants (objective, emotional, and strategic).

2️⃣ Brand Journalism Analyzer

🗒brand_journalist_analyzer.py: This class is the core of the application and handles all interaction with the Gemini API. It implements three main functionalities: news retrieval using Google Search, structured storytelling generation, and analytical insights extraction. The most important method is search_news(), which executes real-time searches and returns structured data in JSON format. To use integrated Google Search, simply set config={"tools": [{"google_search": {}}]} in the API call.

import json  # required by json.loads() below

def search_news(self, max_retries=1, create_dataframe=True):
    """Search for news on a topic using Google Search."""
    prompt = self.search_prompt()

    # Enable the built-in Google Search tool so the answer is grounded in live results
    response = self.client.models.generate_content(
        model=self.model_id,
        contents=prompt,
        config={"tools": [{"google_search": {}}]}
    )

    # Take the text of the first candidate and strip any non-JSON wrapping
    txt = response.candidates[0].content.parts[0].text
    clean_text = self._clean_json_response(txt)

    return json.loads(clean_text)
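
The excerpt above relies on a `_clean_json_response` helper that is not shown. Models often wrap JSON in a Markdown code fence or add a sentence before it, so a helper of this kind usually isolates the JSON span. The following is one possible sketch of that step, not the repository's actual implementation. The name `clean_json_response` and the sample reply are hypothetical.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence, built here to avoid writing it literally

def clean_json_response(text):
    """Strip a Markdown code fence and surrounding prose from a model reply."""
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    # Locate the first opening bracket or brace, whichever comes first.
    positions = [i for i in (text.find("["), text.find("{")) if i != -1]
    if not positions:
        raise ValueError("no JSON object or array found in the response")
    start = min(positions)
    closer = "]" if text[start] == "[" else "}"
    end = text.rfind(closer)
    if end < start:
        raise ValueError("JSON payload looks truncated")
    return text[start:end + 1]

# Hypothetical reply with prose and a fenced JSON array.
reply = "Here are the results:\n" + FENCE + 'json\n[{"title": "A"}]\n' + FENCE + "\nDone."
articles = json.loads(clean_json_response(reply))
```

If `json.loads` still fails after cleaning, retrying the request is a reasonable fallback, which is presumably what the `max_retries` parameter is for.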

3️⃣ Visualization Generator

report_plots.py: This class creates all the report visualizations using Seaborn and Matplotlib. It generates three essential chart types: a bar chart showing which media outlets publish the most on the topic, a timeline visualizing the evolution of publications over time, and a heatmap that cross-references sentiment with content categories. All visual aspects are customizable: color palette, titles, axis labels, and save paths. The methods first prepare the data with Pandas aggregations and then generate the visualizations, automatically saving them as PNG files.
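
The three charts are built on three simple aggregations: articles per source, articles per date, and sentiment crossed with category. The sketch below computes them with the standard library's `Counter` as a stand-in for the Pandas aggregations used in the repository. The function `summarize` and the sample records are hypothetical, and dates are assumed to be ISO strings so that sorting them as text gives chronological order.

```python
from collections import Counter

def summarize(articles):
    """Compute the three aggregations that the charts are built from."""
    by_source = Counter(a["source_name"] for a in articles)
    by_date = Counter(a["date"] for a in articles)
    by_sentiment_category = Counter((a["sentiment"], a["category"]) for a in articles)
    # Sort dates so the timeline is drawn in chronological order.
    return by_source, dict(sorted(by_date.items())), by_sentiment_category

# Hypothetical records with only the fields the aggregations need.
articles = [
    {"source_name": "Outlet A", "date": "2025-09-10", "sentiment": "positive", "category": "analysis"},
    {"source_name": "Outlet B", "date": "2025-09-09", "sentiment": "neutral", "category": "rumor"},
    {"source_name": "Outlet A", "date": "2025-09-10", "sentiment": "positive", "category": "analysis"},
]

by_source, by_date, heatmap_counts = summarize(articles)
```

Each result maps directly to one chart: `by_source` to the bar chart, `by_date` to the timeline, and `heatmap_counts` to the cells of the heatmap.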

4️⃣ PDF Report Generator

report_analysis.py: This class assembles the final report in professional PDF format using ReportLab. It combines multiple elements: a customizable logo, corporate-style headers, informative tables about the analyzed dataset, pre-generated visualizations, formatted narratives with full Markdown support (including headings, lists, code, and emphasis), and conclusions and storytelling sections with different tone variations.

🎯Process Orchestration

The main.py file constitutes the application's main entry point, orchestrating the entire Brand Journalism pipeline. This script coordinates the interaction between all the developed classes, managing the flow from real-time information retrieval to the generation of the final document, ensuring that each component executes in the correct order and with the necessary dependencies.

🐍main.py

from brand_journalist_analyzer import BrandJournalistAnalyzer
from report_analysis import ReportAnalysis
from report_plots import DataVisualizer
from dotenv import load_dotenv
import os
import pandas as pd
from datetime import datetime

if __name__ == "__main__":
    # Load environment variables
    load_dotenv()
    API_KEY = os.getenv("API_KEY")
    MODEL_ID = os.getenv("MODEL_ID")

    # Build a timestamped output path
    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
    output_path = f"report/news_report_{timestamp}.pdf"

    # Initialize the analyzer
    analyzer = BrandJournalistAnalyzer(api_key=API_KEY)

    # Search for news, or load it from the cache if one exists
    response = analyzer._load_or_search(force_refresh=False)
    search_data = pd.DataFrame(response)

    # Generate storytelling and conclusions
    storytelling = analyzer.get_storytelling(response)
    conclusion = analyzer.get_conclusion(response)

    # Create the visualizations
    visualizer = DataVisualizer(dataset=search_data)
    visualizer.plot_news_by_source()
    visualizer.plot_news_over_time()
    visualizer.plot_sentiment_category_heatmap()

    # Generate the PDF report
    report = ReportAnalysis(
        dataset=search_data,
        filename=output_path,
        conclusion=conclusion,
        storytelling=storytelling,
    )
    report.create_report()
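
The script loads results from a cache when one exists, which avoids repeating the same grounded search on every run. The body of `_load_or_search` is not shown, so the following is only a sketch of the general pattern using a JSON file as the cache. The function `load_or_search`, the file name, and the stand-in search function are hypothetical.

```python
import json
import os
import tempfile

def load_or_search(cache_path, search_fn, force_refresh=False):
    """Return cached results if present; otherwise call search_fn and cache them."""
    if not force_refresh and os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as fh:
            return json.load(fh)
    results = search_fn()
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump(results, fh, ensure_ascii=False, indent=2)
    return results

# Stand-in search function that records how often it is called (no API call is made).
calls = []

def fake_search():
    calls.append(1)
    return [{"title": "Example headline", "sentiment": "neutral"}]

cache_file = os.path.join(tempfile.mkdtemp(), "news_cache.json")
first = load_or_search(cache_file, fake_search)   # runs the search and writes the cache
second = load_or_search(cache_file, fake_search)  # served from the cache
```

Passing `force_refresh=True` skips the cache and overwrites it, which is useful when the topic is moving quickly.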

🗒 Report Generation

The system automatically generates a professional PDF report using Seaborn/Matplotlib for visuals and ReportLab for document layout. It includes:

Media coverage charts
Temporal trends
Heatmap crossing content categories with sentiment
Structured storytelling and analytical conclusions

Final Report Structure

In this use case, we generated a four-page PDF report that provides a comprehensive overview of the analysis, starting with complete details of the websites and media outlets where relevant news stories on the researched topic were found.

The document includes graphical visualizations specifically designed to analyze temporal publishing trends, allowing for the identification of patterns of interest over time, as well as categorical classifications based on the criteria identified by the AI model following the instructions defined in the search prompt.

The final section of the report presents analytical conclusions based on quantitative data and storytelling narratives structured in different tones, providing multiple perspectives on the same information.

💡 Conclusions

AI can be a powerful tool for optimizing research and analysis processes, but I still believe that authentic company communication requires the perspective, sensitivity, and values that only people can provide.

This tutorial offers an automated starting point that:

Collects and structures scattered information
Identifies patterns and trends in large data volumes
Generates evidence-based insights

However, Brand Journalism work should remain in the hands of professionals who can:

Interpret data within the organizational context
Align narratives with real corporate values
Add nuances, experiences, and internal perspectives
Ensure the message genuinely reflects brand identity
Humanize content with empathy and authentic connection

AI provides the knowledge foundation, but people create the true connection with the audience. Therefore, effective storytelling emerges from combining automated analysis with human narrative craftsmanship.

📚 References:

What Is Brand Journalism — and Why It Matters.
The New York Times Licensing Group.
Retrieved from https://nytlicensing.com/latest/marketing/brand-journalism-and-why-it-matters/
Gemini About.
Google.
Retrieved from https://gemini.google/about/
Pichai, S., & Hassabis, D. (2023, December 6). Introducing Gemini: Our largest and most capable AI model.
Google Blog.
Retrieved from https://blog.google/technology/ai/google-gemini-ai/#sundar-note
Grounding with Google Search.
Google AI Documentation.
Retrieved from https://ai.google.dev/gemini-api/docs/google-search?hl=es-419

Do you have any other thoughts or suggestions? Leave them in the comments.

How to Use AI in Brand Journalism with Gemini to Transform Digital Information into Strategic Editorial Content?

Romina Elena Mendez Escobar — Mon, 20 Oct 2025 08:32:49 +0000

Introduction

In this tutorial, you will learn how to use Google Gemini to:

🔍 Search for information using generative AI integrated with Google Search
✍️ Transform findings into structured journalistic narratives
📊 Generate visual reports with graphics and automated storytelling

What is Brand Journalism?

According to an article by The New York Times Licensing Group, readers experience significant content fatigue: there are more than 1.8 billion websites and over 70 million blogs published each month.

Key Features

Journalistic techniques: Application of rigorous journalistic methods to create credible and well-structured content.
Audience interests: Focus on the real interests of the audience, not just the messages the brand wants to convey.
Quality and useful information: Content that educates, informs, or solves concrete problems.
Use of different formats: Variety of formats (reports, interviews, analyses, infographics, videos) to maintain engagement.
Storytelling: Narratives that connect emotionally with values, experiences, and social impact.

Benefits

Based on these features, the main benefits are:

Brand Positioning: Establish yourself as a thought leader in your industry.
Audience Loyalty: Build authentic and lasting relationships with your audience.
Differentiation against the Competition: Stand out from competitors through higher-quality editorial content.
Greater Organic Reach: Valuable content is naturally shared, amplifying reach without direct advertising investment.

What is Generative AI?

Additional Resources on GenAI

I have written a series of articles on the fundamentals of generative AI

Gemini

Gemini is a family of multimodal AI models developed by Google DeepMind. It integrates into multiple Google products and can process text, images, and other data types simultaneously.

Grounding with Google Search

For this use case, we will use the Grounding with Google Search functionality, which connects the model directly to Google to perform searches and obtain up-to-date information.

Main Advantages:

📏Increased Accuracy: Reduces model hallucinations by accessing verifiable information.
⚡️Real-Time Information: Access to current data, reducing uncertainty about the model's knowledge.
📚Citations and References: Retrieves source links and provides control over consulted data sources.

Use Case

Mentions of the company on different sites
Reputation and notable aspects
Trends and relevant conversations

This starting point helps those who write articles or create storytelling based not only on what the company wants to show but also on the external perspective others have of it.

Practical Example: 📱iPhone 17

Using the latest iPhone launch as an example, we will:

Search for recently published articles
Classify and analyze these documents
Generate a report with visualizations, conclusions, and structured narratives

Next, we will see how to implement this strategy through an automated workflow that integrates AI and data analysis.

Implementation Process

The following diagram illustrates how our automated analysis system works.

1️⃣ Search with Google Search

We use Grounding with Google Search to find relevant articles and request output in JSON format using this structure:

{ 
   "title": "full article title",
   "source_name": "media name",
   "date": "publication date",
   "url": "article link",
   "site_name": "website name",
   "summary": "2-4 line summary",
   "sentiment": "positive/negative/neutral",
   "category": "rumor/analysis/comparison/market/technical",
   "sentiment_score": "1-10 score"
 }

2️⃣ Storytelling Generation

We use another prompt to generate different types of narratives based on the articles found:

Analytical Insights: Compact analytical summary with concrete data.
Storytelling Narrative: Engaging mini-narrative based on dataset evidence.
Tone Variants (A/B/C): Three versions with different focuses: objective, emotional, and strategic.

3️⃣ Report Creation

We generate a PDF report including:

Charts created with Seaborn and Matplotlib
Visual trend analyses
Narrative conclusions based on generated storytelling
Layout customized with ReportLab

Tutorial

How Does Gemini Work with Google Search?

Pre-requisite: Access to Gemini API

Before starting, you need to get access to the Gemini API:

Open Google AI Studio
Sign in with your Google account, or create one if you don't have it
Generate your API key from the control panel

Note: You can use Gemini's free tier to test this project.

Once you have your API key, configure it in a .env file:

API_KEY = "your_gemini_api_key"
MODEL_ID = "gemini-2.5-flash"

We use Gemini 2.5 Flash because it is a cost-efficient model well suited to frequent, high-volume tasks.

Repository Structure

For this tutorial, clone the following repository, which contains the complete code.

RominaElenaMendezEscobar / brand-journalism-gemini

Tutorial about Brand Journalism Code Using Google Gemini

project/
   ├── img/                    # Generated graphics
   ├── prompt/
   │   ├── prompt_search.txt       # Search Prompt
   │   └── prompt_storytelling.txt # Prompt for narrative
   ├── report/                # PDFs generated
   ├── brand_journalist_analyzer.py
   ├── report_plots.py
   ├── report_analysis.py
   ├── main.py
   └── .env

Main Files

1️⃣ Prompts (/prompt)

🗒prompt_search.txt: Here we define how to perform the search in Google Search and structure the results in JSON. This prompt instructs the model to return structured information with fields such as the article's title, source, date, URL, summary, sentiment, and category.
🗒prompt_storytelling.txt: In this file, we define how to generate conclusions and storytelling based on the articles found. It requests different types of outputs, including objective analysis, immersive narratives, and three tone variants (objective, emotional, and strategic).

2️⃣ Brand Journalism Analyzer

🗒brand_journalist_analyzer.py: This class is the core of the application and handles all interaction with the Gemini API. It implements three main functionalities: news retrieval using Google Search, structured storytelling generation, and analytical insights extraction. The most important method is search_news(), which executes real-time searches and returns structured data in JSON format. To use integrated Google Search, simply set config={"tools": [{"google_search": {}}]} in the API call.

import json  # required by json.loads() below

def search_news(self, max_retries=1, create_dataframe=True):
    """Search for news on a topic using Google Search."""
    prompt = self.search_prompt()

    # Enable the built-in Google Search tool so the answer is grounded in live results
    response = self.client.models.generate_content(
        model=self.model_id,
        contents=prompt,
        config={"tools": [{"google_search": {}}]}
    )

    # Take the text of the first candidate and strip any non-JSON wrapping
    txt = response.candidates[0].content.parts[0].text
    clean_text = self._clean_json_response(txt)

    return json.loads(clean_text)

3️⃣ Visualization Generator

report_plots.py: This class creates all the report visualizations using Seaborn and Matplotlib. It generates three essential chart types: a bar chart showing which media outlets publish the most on the topic, a timeline visualizing the evolution of publications over time, and a heatmap that cross-references sentiment with content categories. All visual aspects are customizable: color palette, titles, axis labels, and save paths. The methods first prepare the data with Pandas aggregations and then generate the visualizations, automatically saving them as PNG files.

4️⃣ PDF Report Generator

report_analysis.py: This class assembles the final report in professional PDF format using ReportLab. It combines multiple elements: a customizable logo, corporate-style headers, informative tables about the analyzed dataset, pre-generated visualizations, formatted narratives with full Markdown support (including headings, lists, code, and emphasis), and conclusions and storytelling sections with different tone variations.

🎯Process Orchestration

🐍main.py

from brand_journalist_analyzer import BrandJournalistAnalyzer
from report_analysis import ReportAnalysis
from report_plots import DataVisualizer
from dotenv import load_dotenv
import os
import pandas as pd
from datetime import datetime

if __name__ == "__main__":
    # Load environment variables
    load_dotenv()
    API_KEY = os.getenv("API_KEY")
    MODEL_ID = os.getenv("MODEL_ID")

    # Build a timestamped output path
    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
    output_path = f"report/news_report_{timestamp}.pdf"

    # Initialize the analyzer
    analyzer = BrandJournalistAnalyzer(api_key=API_KEY)

    # Search for news, or load it from the cache if one exists
    response = analyzer._load_or_search(force_refresh=False)
    search_data = pd.DataFrame(response)

    # Generate storytelling and conclusions
    storytelling = analyzer.get_storytelling(response)
    conclusion = analyzer.get_conclusion(response)

    # Create the visualizations
    visualizer = DataVisualizer(dataset=search_data)
    visualizer.plot_news_by_source()
    visualizer.plot_news_over_time()
    visualizer.plot_sentiment_category_heatmap()

    # Generate the PDF report
    report = ReportAnalysis(
        dataset=search_data,
        filename=output_path,
        conclusion=conclusion,
        storytelling=storytelling,
    )
    report.create_report()

🗒 Report Generation

The system automatically generates a professional PDF report using Seaborn/Matplotlib for visuals and ReportLab for document layout. It includes:

Media coverage charts
Temporal trends
Heatmap crossing content categories with sentiment
Structured storytelling and analytical conclusions

Final Report Structure

💡 Conclusions

This tutorial offers an automated starting point that:

Collects and structures scattered information
Identifies patterns and trends in large data volumes
Generates evidence-based insights

However, Brand Journalism work should remain in the hands of professionals who can:

Interpret data within the organizational context
Align narratives with real corporate values
Add nuances, experiences, and internal perspectives
Ensure the message genuinely reflects brand identity
Humanize content with empathy and authentic connection

📚 References:

What Is Brand Journalism — and Why It Matters.
The New York Times Licensing Group.
Retrieved from https://nytlicensing.com/latest/marketing/brand-journalism-and-why-it-matters/
Gemini About.
Google.
Retrieved from https://gemini.google/about/
Pichai, S., & Hassabis, D. (2023, December 6). Introducing Gemini: Our largest and most capable AI model.
Google Blog.
Retrieved from https://blog.google/technology/ai/google-gemini-ai/#sundar-note
Grounding with Google Search.
Google AI Documentation.
Retrieved from https://ai.google.dev/gemini-api/docs/google-search?hl=es-419

Do you have any other thoughts or suggestions? Leave them in the comments.

How to Use AI in Brand Journalism with Gemini to Transform Digital Information into Strategic Editorial Content?

Romina Elena Mendez Escobar — Mon, 20 Oct 2025 08:08:06 +0000

Introduction

In this tutorial, you will learn how to use Google Gemini to:

🔍 Search for information using generative AI integrated with Google Search
✍️ Transform findings into structured journalistic narratives
📊 Generate visual reports with graphics and automated storytelling

What is Brand Journalism?

According to an article by The New York Times Licensing Group, readers experience significant content fatigue: there are more than 1.8 billion websites, and over 70 million blog posts are published each month.

Key Features

Journalistic techniques: Application of rigorous journalistic methods to create credible and well-structured content.
Audience interests: Focus on the real interests of the audience, not just the messages the brand wants to convey.
Quality and useful information: Content that educates, informs, or solves concrete problems.
Use of different formats: Variety of formats (reports, interviews, analyses, infographics, videos) to maintain engagement.
Storytelling: Narratives that connect emotionally with values, experiences, and social impact.

Benefits

Based on these features, we can identify the following benefits:

Brand Positioning: Establish yourself as a thought leader in your industry.
Audience Loyalty: Build authentic and lasting relationships with your audience.
Differentiation against the Competition: Stand out from competitors through higher-quality editorial content.
Greater Organic Reach: Valuable content is naturally shared, amplifying reach without direct advertising investment.

What is Generative AI?

Additional Resources on GenAI

I have written a series of articles on the fundamentals of generative AI.

Gemini

Gemini is a family of multimodal AI models developed by Google DeepMind. It integrates into multiple Google products and can process text, images, and other data types simultaneously.

Grounding with Google Search

For this use case, we will use the Grounding with Google Search functionality, which connects the model directly to Google to perform searches and obtain up-to-date information.

Main Advantages:

📏Increased Accuracy: Reduces model hallucinations by accessing verifiable information.
⚡️Real-Time Information: Access to current data, so answers do not depend only on what the model learned during training.
📚Citations and References: Retrieves source links and provides control over consulted data sources.
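To make the citations point concrete, here is a minimal sketch of how source links could be collected from a grounded response. It assumes the response object exposes `candidates[].grounding_metadata.grounding_chunks[].web` with `uri` and `title` attributes, as the google-genai Python SDK does for grounded calls. The helper name `extract_sources` and the stand-in response built with `SimpleNamespace` are hypothetical and exist only so the sketch runs without an API key.

```python
from types import SimpleNamespace


def extract_sources(response):
    """Collect (title, uri) pairs from a grounded response, if any exist."""
    sources = []
    for candidate in getattr(response, "candidates", None) or []:
        metadata = getattr(candidate, "grounding_metadata", None)
        # Ungrounded answers carry no metadata, so fall back to an empty list.
        for chunk in getattr(metadata, "grounding_chunks", None) or []:
            web = getattr(chunk, "web", None)
            if web is not None and getattr(web, "uri", None):
                sources.append((getattr(web, "title", None), web.uri))
    return sources


# Stand-in object that mimics the assumed response shape (not a real API reply).
fake_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(
            grounding_metadata=SimpleNamespace(
                grounding_chunks=[
                    SimpleNamespace(
                        web=SimpleNamespace(
                            title="example.com",
                            uri="https://example.com/article",
                        )
                    )
                ]
            )
        )
    ]
)

for title, uri in extract_sources(fake_response):
    print(f"{title}: {uri}")
```

Reading the attributes defensively with `getattr` keeps the helper from failing when a response was answered without any search.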

Use Case

The goal is to build an external view of a company by tracking:

Mentions of the company on different sites
Reputation and notable aspects
Trends and relevant conversations

This starting point helps those who write articles or create storytelling based not only on what the company wants to show but also on the external perspective others have of it.

Practical Example: 📱iPhone 17

Using the latest iPhone launch as an example, we will:

Search for recently published articles
Classify and analyze these documents
Generate a report with visualizations, conclusions, and structured narratives

Next, we will see how to implement this strategy through an automated workflow that integrates AI and data analysis.

Implementation Process

The following diagram illustrates how our automated analysis system works.

1️⃣ Search with Google Search

We use Grounding with Google Search to find relevant articles and request output in JSON format using this structure:

{
   "title": "full article title",
   "source_name": "media name",
   "date": "publication date",
   "url": "article link",
   "site_name": "website name",
   "summary": "2-4 line summary",
   "sentiment": "positive/negative/neutral",
   "category": "rumor/analysis/comparison/market/technical",
   "sentiment_score": "1-10 score"
}
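Because the model returns free text, it can help to validate each record before analyzing it. The following is a minimal sketch that assumes the field names and allowed values shown above. The helper `validate_article` and the decision to coerce `sentiment_score` to an integer are illustrative assumptions, not part of the repository.

```python
REQUIRED_FIELDS = {
    "title", "source_name", "date", "url", "site_name",
    "summary", "sentiment", "category", "sentiment_score",
}
SENTIMENTS = {"positive", "negative", "neutral"}
CATEGORIES = {"rumor", "analysis", "comparison", "market", "technical"}


def validate_article(record: dict) -> dict:
    """Return a cleaned copy of one article record, or raise ValueError."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    sentiment = str(record["sentiment"]).strip().lower()
    if sentiment not in SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {record['sentiment']!r}")

    category = str(record["category"]).strip().lower()
    if category not in CATEGORIES:
        raise ValueError(f"unexpected category: {record['category']!r}")

    # Assumption: the score arrives as an integer or a numeric string.
    try:
        score = int(record["sentiment_score"])
    except (TypeError, ValueError):
        raise ValueError("sentiment_score is not an integer") from None
    if not 1 <= score <= 10:
        raise ValueError(f"sentiment_score out of range: {score}")

    return {**record, "sentiment": sentiment,
            "category": category, "sentiment_score": score}
```

Records that fail validation can be logged and skipped so that a single malformed item does not break the charts.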

2️⃣ Storytelling Generation

We use another prompt to generate different types of narratives based on the articles found:

Analytical Insights: Compact analytical summary with concrete data.
Storytelling Narrative: Engaging mini-narrative based on dataset evidence.
Tone Variants (A/B/C): Three versions with different focuses: objective, emotional, and strategic.

3️⃣ Report Creation

We generate a PDF report including:

Charts created with Seaborn and Matplotlib
Visual trend analyses
Narrative conclusions based on generated storytelling
A custom layout built with ReportLab

Tutorial

How Does Gemini Work with Google Search?

Pre-requisite: Access to Gemini API

Before starting, you need to get access to the Gemini API:

Open Google AI Studio
Log in with your Google account, or create one if you do not have it yet
Generate your API key from the control panel

Note: You can use Gemini's free tier to test this project.

Once you have your API key, configure it in a .env file:

API_KEY = "your_gemini_api_key"
MODEL_ID = "gemini-2.5-flash"

We use Gemini 2.5 Flash because it offers a good balance of cost and speed for frequent, high-volume tasks.
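As a sketch of how these two values could be read safely, the snippet below checks that the key is present and falls back to a default model. It assumes `load_dotenv()` has already copied the .env values into the process environment. The helper `read_settings` and the fallback to `gemini-2.5-flash` are hypothetical choices made for illustration.

```python
import os
from typing import Mapping

DEFAULT_MODEL_ID = "gemini-2.5-flash"  # assumption: same model as in the .env example


def read_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read API_KEY and MODEL_ID from an environment-like mapping."""
    api_key = env.get("API_KEY", "").strip()
    if not api_key:
        # Fail early with a clear message instead of a late API error.
        raise RuntimeError("API_KEY is not set; add it to your .env file")
    model_id = env.get("MODEL_ID", "").strip() or DEFAULT_MODEL_ID
    return {"api_key": api_key, "model_id": model_id}
```

Passing the mapping as a parameter makes the function easy to try with a plain dictionary, for example `read_settings({"API_KEY": "abc"})`.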

Repository Structure

To follow this tutorial, clone the following repository, which contains the complete code.

RominaElenaMendezEscobar / brand-journalism-gemini

Tutorial about Brand Journalism Code Using Google Gemini

project/
   ├── img/                    # Generated graphics
   ├── prompt/
   │   ├── prompt_search.txt       # Search Prompt
   │   └── prompt_storytelling.txt # Prompt for narrative
   ├── report/                # PDFs generated
   ├── brand_journalist_analyzer.py
   ├── report_plots.py
   ├── report_analysis.py
   ├── main.py
   └── .env

Main Files

1️⃣ Prompts (/prompt)

🗒prompt_search.txt: Here we define how to perform the search in Google Search and structure the results in JSON. This prompt instructs the model to return structured information with fields such as the article's title, source, date, URL, summary, sentiment, and category.
🗒prompt_storytelling.txt: In this file, we define how to generate conclusions and storytelling based on the articles found. It requests different types of outputs, including objective analysis, immersive narratives, and three tone variants (objective, emotional, and strategic).

2️⃣ Brand Journalism Analyzer

🗒brand_journalist_analyzer.py: This class is the core of the application and handles all interaction with the Gemini API. It implements three main functionalities: news retrieval using Google Search, structured storytelling generation, and analytical insights extraction. The most important method is search_news(), which executes real-time searches and returns structured data in JSON format. To use integrated Google Search, simply set config={"tools": [{"google_search": {}}]} in the API call.

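import json

def search_news(self, max_retries=1, create_dataframe=True):
    """Search for news on a topic using Google Search."""
    # Excerpt: max_retries and create_dataframe are not used in the lines shown here.
    prompt = self.search_prompt()

    # Enable Grounding with Google Search as a tool for this request
    response = self.client.models.generate_content(
        model=self.model_id,
        contents=prompt,
        config={"tools": [{"google_search": {}}]}
    )

    # The model replies with text: take the first part and clean it
    txt = response.candidates[0].content.parts[0].text
    clean_text = self._clean_json_response(txt)

    # Parse the cleaned text into Python objects
    return json.loads(clean_text)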
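The body of `_clean_json_response` is not shown in this excerpt. A cleaning step like it is usually needed because models often wrap JSON in a Markdown code fence or surround it with extra sentences. The function below is a hypothetical stand-in written for illustration, not the repository's implementation: it strips a fence if one exists and otherwise keeps the outermost JSON array or object.

```python
import json
import re

FENCE = "`" * 3  # built this way so the sketch itself contains no literal fence
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE,
                     re.DOTALL | re.IGNORECASE)


def clean_json_response(text: str) -> str:
    """Extract the JSON payload from a model reply (illustrative sketch)."""
    match = _FENCED.search(text)
    if match:
        text = match.group(1)

    # Keep only the outermost JSON array or object found in the text.
    starts = [i for i in (text.find("["), text.find("{")) if i != -1]
    if not starts:
        raise ValueError("no JSON found in response")
    start = min(starts)
    closer = "]" if text[start] == "[" else "}"
    end = text.rfind(closer)
    if end < start:
        raise ValueError("unterminated JSON in response")
    return text[start:end + 1]


reply = "Here is the data:\n" + FENCE + 'json\n[{"title": "A"}]\n' + FENCE + "\nDone."
articles = json.loads(clean_json_response(reply))
```

This heuristic is deliberately simple. It will not repair truncated or invalid JSON, so `json.loads` can still raise and should be handled by the caller.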

3️⃣ Visualization Generator

report_plots.py: This class creates all the report visualizations using Seaborn and Matplotlib. It generates three essential chart types: a bar chart showing which media outlets publish the most on the topic, a timeline visualizing the evolution of publications over time, and a heatmap that cross-references sentiment with content categories. All visual aspects are customizable: color palette, titles, axis labels, and save paths. The methods first prepare the data with Pandas aggregations and then generate the visualizations, automatically saving them as PNG files.
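To illustrate the aggregation behind the heatmap, the sketch below counts articles per category and sentiment. It uses only the standard library as a stand-in for the Pandas aggregation the class performs, so it shows the idea rather than the repository's code. The function name `sentiment_by_category` is hypothetical.

```python
from collections import Counter

SENTIMENTS = ("positive", "neutral", "negative")


def sentiment_by_category(articles):
    """Count articles per (category, sentiment) pair as a nested dict."""
    counts = Counter((a["category"], a["sentiment"]) for a in articles)
    categories = sorted({category for category, _ in counts})
    # Counter returns 0 for pairs that never occurred, which fills empty cells.
    return {
        category: {s: counts[(category, s)] for s in SENTIMENTS}
        for category in categories
    }


articles = [
    {"category": "rumor", "sentiment": "positive"},
    {"category": "rumor", "sentiment": "negative"},
    {"category": "market", "sentiment": "positive"},
]
table = sentiment_by_category(articles)
```

Each inner dictionary corresponds to one heatmap row, with one cell per sentiment.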

4️⃣ PDF Report Generator

report_analysis.py: This class assembles the final report in professional PDF format using ReportLab. It combines multiple elements: a customizable logo, corporate-style headers, informative tables about the analyzed dataset, pre-generated visualizations, formatted narratives with full Markdown support (including headings, lists, code, and emphasis), and conclusions and storytelling sections with different tone variations.

🎯Process Orchestration

🐍main.py

from brand_journalist_analyzer import BrandJournalistAnalyzer
from report_analysis import ReportAnalysis
from report_plots import DataVisualizer
from dotenv import load_dotenv
import os
import pandas as pd
from datetime import datetime

if __name__ == "__main__":
    # Load environment variables
    load_dotenv()
    API_KEY = os.getenv("API_KEY")
    MODEL_ID = os.getenv("MODEL_ID")

    # Build a timestamped output path
    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
    output_path = f"report/news_report_{timestamp}.pdf"

    # Initialize the analyzer
    analyzer = BrandJournalistAnalyzer(api_key=API_KEY)

    # Search for news or load it (uses the cache if one exists)
    response = analyzer._load_or_search(force_refresh=False)
    search_data = pd.DataFrame(response)

    # Generate storytelling and conclusions
    storytelling = analyzer.get_storytelling(response)
    conclusion = analyzer.get_conclusion(response)

    # Create the visualizations
    visualizer = DataVisualizer(dataset=search_data)
    visualizer.plot_news_by_source()
    visualizer.plot_news_over_time()
    visualizer.plot_sentiment_category_heatmap()

    # Generate the PDF report
    report = ReportAnalysis(
        dataset=search_data,
        filename=output_path,
        conclusion=conclusion,
        storytelling=storytelling,
    )
    report.create_report()
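The script writes into the report/ folder and relies on a timestamp to keep file names unique. As a small sketch of that step, the helper below builds the same file name pattern and creates the folder if it is missing. The name `build_report_path` and the folder creation are assumptions added for illustration; main.py builds the path inline.

```python
from datetime import datetime
from pathlib import Path


def build_report_path(directory="report", now=None):
    """Return <directory>/news_report_<YYYYmmddHHMMSS>.pdf, creating the folder."""
    now = now or datetime.now()
    folder = Path(directory)
    # Creating the folder up front avoids a failure when the PDF is saved.
    folder.mkdir(parents=True, exist_ok=True)
    return folder / f"news_report_{now:%Y%m%d%H%M%S}.pdf"
```

Because the timestamp has one-second resolution, two runs within the same second would produce the same file name.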

🗒 Report Generation

The system automatically generates a professional PDF report using Seaborn/Matplotlib for visuals and ReportLab for document layout. It includes:

Media coverage charts
Temporal trends
Heatmap crossing content categories with sentiment
Structured storytelling and analytical conclusions

Final Report Structure

💡 Conclusions

This tutorial offers an automated starting point that:

Collects and structures scattered information
Identifies patterns and trends in large data volumes
Generates evidence-based insights

However, Brand Journalism work should remain in the hands of professionals who can:

Interpret data within the organizational context
Align narratives with real corporate values
Add nuances, experiences, and internal perspectives
Ensure the message genuinely reflects brand identity
Humanize content with empathy and authentic connection

📚 References:

What Is Brand Journalism — and Why It Matters.
The New York Times Licensing Group.
Retrieved from https://nytlicensing.com/latest/marketing/brand-journalism-and-why-it-matters/
Gemini About.
Google.
Retrieved from https://gemini.google/about/
Pichai, S., & Hassabis, D. (2023, December 6). Introducing Gemini: Our largest and most capable AI model.
Google Blog.
Retrieved from https://blog.google/technology/ai/google-gemini-ai/#sundar-note
Grounding with Google Search.
Google AI Documentation.
Retrieved from https://ai.google.dev/gemini-api/docs/google-search?hl=es-419

Do you have any other thoughts or suggestions? Leave them in the comments.
