<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Andreea Miclaus</title>
    <description>The latest articles on DEV Community by Andreea Miclaus (@andreeairina).</description>
    <link>https://dev.to/andreeairina</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1138220%2Fde29f612-f271-4848-be3a-02ea4f104360.jpeg</url>
      <title>DEV Community: Andreea Miclaus</title>
      <link>https://dev.to/andreeairina</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/andreeairina"/>
    <language>en</language>
    <item>
      <title>Unicorns and Rainbows: The Reality of Implementing AI in a Corporate</title>
      <dc:creator>Andreea Miclaus</dc:creator>
      <pubDate>Sat, 08 Feb 2025 10:21:08 +0000</pubDate>
      <link>https://dev.to/hyperplane/unicorns-and-rainbows-the-reality-of-implementing-ai-in-a-corporate-e8l</link>
      <guid>https://dev.to/hyperplane/unicorns-and-rainbows-the-reality-of-implementing-ai-in-a-corporate-e8l</guid>
      <description>&lt;p&gt;&lt;strong&gt;Unicorns and Rainbows.&lt;/strong&gt; Is it a metaphor? Is it a reality? Maybe both. Think of an unicorn dancing on top of a radiant rainbow. But, in fact, what does it mean?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rumcw3apup0khsp4rot.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rumcw3apup0khsp4rot.webp" alt="Image generated by AI" width="623" height="620"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image generated by AI&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Humanity has always been drawn to &lt;strong&gt;utopia&lt;/strong&gt; — a perfect, idealized future where all problems are solved. Believing that the world is steadily marching toward this vision is tempting. In the AI landscape, the unicorn (you have noticed the &lt;strong&gt;5th leg&lt;/strong&gt;, right?) represents the lofty promises, wild imagination, and relentless hype that paint a picture of transformative, almost magical technology.&lt;/p&gt;

&lt;p&gt;AI trends move at lightning speed, leaving the real engineers behind to fix the mess. Learn how to stop chasing trends and start architecting real solutions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://thehyperplane.substack.com/subscribe?" rel="noopener noreferrer"&gt;Subscribe now&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The rainbow, however, represents the real world: full of potential but riddled with imperfections, inconsistencies, and systemic barriers.&lt;/p&gt;

&lt;p&gt;Just like the stock market, AI has its ebbs and flows. Everything might seem to skyrocket, but a slight shift — technical debt, regulatory burdens, or enterprise realities — can send it crashing back to earth. The question is not whether AI is a transformative force (there is no doubt it is!) but whether we’re being realistic about its trajectory.&lt;/p&gt;

&lt;p&gt;This article will discuss the reality of using AI in the enterprise environment, address technical debt, bridge knowledge gaps, and understand the herd effect that fuels the AI bubble. We aim to offer a realistic roadmap for businesses navigating the complex AI landscape by critically analyzing these factors.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. The AI bubble
&lt;/h2&gt;

&lt;p&gt;We have been in Data &amp;amp; AI for over 10 years, and the AI bubble has never been this big. AI is everywhere: on our laptops, phones, and websites. The CEOs of Nvidia, Microsoft, Meta, and OpenAI keep announcing revolutionary AI technology, how AI agents will replace humans, how we will reach AGI soon, and how AI will be everywhere. We live in an AI bubble, and even though the technology is real, applying it to actual use cases and driving business value is far less trivial than advertised.&lt;/p&gt;

&lt;p&gt;The technological advancements in the AI field are significant, and the value AI generates is real. However, there are still many gaps that people who try to build AI products see clearly. Two factors contribute to the AI bubble: knowledge gaps and the herd effect. The two are related but distinct.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The Role of Knowledge Gaps
&lt;/h2&gt;

&lt;p&gt;The gap between AI insiders and the general public is one of the drivers of the AI bubble. The saying “Knowledge is power” remains valid for AI within its development and implementation context.&lt;/p&gt;

&lt;p&gt;People who are deeply invested in the development of AI are fully aware of the nuances, challenges, and limitations that come with the implementation of AI-based solutions.&lt;/p&gt;

&lt;p&gt;On the other hand, AI outsiders are constantly awe-struck by the marketing language associated with AI, which presents an entirely different world of possibilities. This knowledge gap lets misconceptions spread at an alarming rate, making the hype around AI take precedence over the reality of what AI systems can offer.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Herd Effect: Fear of Missing Out (FOMO)
&lt;/h2&gt;

&lt;p&gt;Another significant factor driving the AI hype is the herd effect or the Fear of Missing Out (FOMO).&lt;/p&gt;

&lt;p&gt;As more companies invest in AI and tout their successes, others feel compelled to follow suit, fearing they’ll fall behind if they don’t adopt AI technologies. This rush often leads to deploying AI solutions without a thorough understanding of their applicability or potential ROI, further inflating the AI bubble. The result is a market saturated with AI buzzwords and solutions that may not deliver the promised transformative impact.&lt;/p&gt;

&lt;p&gt;AI models (by which we mean foundation models) are used everywhere, even where a standard ML model should be used instead. This adds complexity and decreases reliability.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Back to basics
&lt;/h2&gt;

&lt;p&gt;Most companies cannot reliably bring standard machine learning models to production, and they lack monitoring practices.&lt;/p&gt;

&lt;p&gt;Despite what many people think, workflows that include AI models are, on average, more complex to bring into production and monitor — even in the simplest scenario, with no RAG or fine-tuning involved, where you just call a third-party API.&lt;/p&gt;

&lt;p&gt;In too many cases, we seem to have forgotten the basic principles of machine learning and blindly rely on what that API outputs. This is the danger of AI hype: AI has become accessible to everyone, and many software engineers treat it as just another API call.&lt;/p&gt;
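&lt;p&gt;The discipline this calls for can be sketched in a few lines. Everything here is illustrative: the &lt;code&gt;call_llm&lt;/code&gt; stub, the labels, and the schema are hypothetical, not any specific provider’s API.&lt;/p&gt;

```python
import json

# Hypothetical third-party call; swap in your provider's SDK.
def call_llm(prompt: str) -> str:
    return '{"label": "refund_request", "confidence": 0.62}'  # stubbed response

ALLOWED_LABELS = {"refund_request", "delivery_issue", "other"}
FALLBACK = {"label": "other", "confidence": 0.0}

def classify_ticket(text: str) -> dict:
    """Parse and validate the model output instead of trusting it blindly."""
    raw = call_llm(f"Classify this support ticket: {text}")
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return dict(FALLBACK)  # malformed output degrades gracefully
    label, conf = parsed.get("label"), parsed.get("confidence")
    if label not in ALLOWED_LABELS:
        return dict(FALLBACK)  # out-of-schema label
    if not isinstance(conf, (int, float)) or conf > 1.0 or 0.0 > conf:
        return dict(FALLBACK)  # nonsensical confidence score
    return {"label": label, "confidence": float(conf)}
```

&lt;p&gt;The same basic-ML discipline applies whether the model behind the API is a logistic regression or a frontier LLM.&lt;/p&gt;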

&lt;p&gt;What could go wrong? The data model has not changed, the code has not changed (and neither has the environment where it gets executed), and the version of the API has not changed. But this is the beauty of machine learning: even if everything in your control has not changed, the model can unexpectedly start performing poorly because the data distribution has changed.&lt;/p&gt;

&lt;p&gt;This does not just happen with standard machine learning models; it also happens with AI models — we just have fewer means to influence that behavior, and prompt fine-tuning becomes an essential part of the process.&lt;/p&gt;
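&lt;p&gt;One classic way to catch such a distribution shift is the Population Stability Index. The sketch below is not tied to any particular stack; it is a stdlib-only illustration of the metric:&lt;/p&gt;

```python
import math
from collections import Counter

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference sample and live data.

    Common rule of thumb: below 0.1 is stable, 0.1-0.25 is a moderate
    shift, and above 0.25 signals significant drift worth investigating.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def bucket(values):
        # Clamp out-of-range live values into the edge buckets.
        counts = Counter(
            max(0, min(int((x - lo) / width), bins - 1)) for x in values
        )
        # Tiny smoothing term keeps log() defined for empty buckets.
        total = len(values) + bins * 1e-6
        return [(counts.get(b, 0) + 1e-6) / total for b in range(bins)]

    ref, live = bucket(expected), bucket(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(ref, live))
```

&lt;p&gt;Comparing live inputs against a training-time reference on a schedule is often enough to flag drift before users notice it.&lt;/p&gt;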

&lt;h2&gt;
  
  
  5. Real-world, 2025
&lt;/h2&gt;

&lt;p&gt;Experts say that 2025 will be the year of AI agents. &lt;strong&gt;But is it really true?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;While the AI hype machine continues to boom, real-world adoption tells a different story. The promise of autonomous AI agents seamlessly operating across enterprises remains largely aspirational. The reality? AI in enterprise is still a work in progress — complex, expensive, and often misaligned with actual business needs.&lt;/p&gt;

&lt;p&gt;Take &lt;strong&gt;BBVA&lt;/strong&gt;, the Spanish bank that &lt;a href="https://www.wsj.com/articles/six-months-thousands-of-gpts-and-some-big-unknowns-inside-openais-deal-with-bbva-5d6f1c03?utm_source=chatgpt.com" rel="noopener noreferrer"&gt;went all in&lt;/a&gt; on OpenAI’s technology. They deployed over 2,900 AI models to enhance productivity, yet integrating them into their existing systems turned out to be a logistical nightmare. AI doesn’t operate in a vacuum; it needs to connect with legacy infrastructure, existing workflows, and strict regulatory requirements. And that’s where reality bites — scaling AI across an enterprise is exponentially harder than rolling out a chatbot.&lt;/p&gt;

&lt;p&gt;The UK government’s attempt to integrate AI into its welfare system &lt;a href="https://www.theguardian.com/technology/2025/jan/27/ai-prototypes-uk-welfare-system-dropped?utm_source=chatgpt.com" rel="noopener noreferrer"&gt;faced significant limitations&lt;/a&gt;. At least six AI prototypes, designed to enhance staff training, improve job center services, and streamline disability benefit processing, were discontinued due to issues in scalability, reliability, and insufficient testing. Officials acknowledged several “frustrations and false starts,” highlighting the complexities involved in deploying AI within public services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.architectureandgovernance.com/artificial-intelligence/new-research-uncovers-top-challenges-in-enterprise-ai-agent-adoption/?utm_source=chatgpt.com" rel="noopener noreferrer"&gt;A study&lt;/a&gt; highlighted several obstacles in developing and deploying AI agents within enterprises. Security concerns were identified as a top challenge by leadership (53%) and practitioners (62%). Other significant challenges included data governance, performance issues, and integration complexity. These findings underscore the multifaceted difficulties organizations face in implementing AI agents effectively.&lt;/p&gt;

&lt;p&gt;Reflecting on these examples, it’s evident that the widespread adoption of AI agents in enterprise settings faces significant limitations. While 2025 may usher in extensive research, proofs of concept (POCs), and minimum viable products (MVPs), the path to full-scale integration remains complex.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. AI in a corporate environment
&lt;/h2&gt;

&lt;p&gt;Big companies operate under strict rules, structured workflows, and a constant focus on ROI. Unlike agile startups that can adapt on the fly, large organizations have to deal with complex approval processes, compliance checks, and risk management. All this makes adopting AI a slower process, and the idea of rapid transformation often feels more like a distant dream than something achievable.&lt;/p&gt;

&lt;p&gt;Chip Huyen references the most common LLM applications in her AI engineering book. Enterprises are risk-averse and prefer to deploy internal-facing applications first. From what we have seen so far, even though there is initial support from leadership to deploy such applications, not enough funding goes to those projects (and likely never will), as they do not generate direct business value. We are not saying there is no value — there is, but it is challenging to convince stakeholders.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fal5fzov0jy0qrcr92g7k.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fal5fzov0jy0qrcr92g7k.webp" alt="_Image reinterpreted from Huyen, C. (2025). AI Engineering: Building Applications with Foundation Models. Available on Amazon_" width="800" height="286"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image reinterpreted from Huyen, C. (2025). AI Engineering: Building Applications with Foundation Models. Available on Amazon&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In enterprises, the most common use cases with direct business impact are related to customer service (forwarding customers to the right agents/processes) and reviewing contracts. These use cases have been around for a while, have historically been NLP-heavy, and AI models have helped to improve them.&lt;/p&gt;

&lt;p&gt;Some companies have tried to use LLMs for recommendations and chatbots, and the world has seen enough failures. Here are some examples:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DPD’s customer-facing&lt;/strong&gt; chatbot, “Ruby,” was designed to assist customers with their inquiries. However, due to insufficient safeguards, a user was able to provoke the bot into swearing and composing a poem criticizing the company itself. &lt;a href="https://www.cxtoday.com/conversational-ai/dpds-genai-chatbot-swears-and-writes-a-poem-about-how-awful-it-is/" rel="noopener noreferrer"&gt;This incident&lt;/a&gt; underscores the importance of implementing strict content moderation protocols and regularly updating AI systems to prevent such occurrences.&lt;/p&gt;

&lt;p&gt;Similarly, Pak’nSave’s AI meal planner app, intended to provide innovative recipe suggestions, &lt;a href="https://www.theguardian.com/world/2023/aug/10/pak-n-save-savey-meal-bot-ai-app-malfunction-recipes?utm_source=chatgpt.com" rel="noopener noreferrer"&gt;malfunctioned&lt;/a&gt; and recommended a combination of ingredients that would produce chlorine gas, labeling it as an “aromatic water mix.” This highlights the critical need for rigorous testing and validation of AI outputs, especially in applications directly impacting consumer health and safety.&lt;/p&gt;

&lt;p&gt;It feels like not everyone has learned from this, and we regularly see companies launching AI applications with no clear business value and poor guardrails, mainly for “marketing purposes”. Let’s hope it does not turn into bad marketing, as users will try to make the app do things it is not supposed to do “just for fun”.&lt;/p&gt;

&lt;p&gt;There are exceptions. Some companies have built genuinely useful LLM-powered recommendations — Zalando, for example. Its assistant has well-implemented guardrails and is useful to customers (it helps them find items that are otherwise hard to find via search). In October 2024, &lt;a href="https://corporate.zalando.com/en/newsroom/en/stories/zalando-expands-ai-powered-shopping-assistant-25-markets" rel="noopener noreferrer"&gt;Zalando expanded its AI-powered assistant to all 25 markets&lt;/a&gt;, supporting local languages. This expansion aims to provide customers with personalized fashion advice and insights into emerging local trends, thereby enhancing the shopping experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Areas of attention &amp;amp; conclusions
&lt;/h2&gt;

&lt;p&gt;There is great potential to leverage AI in a corporate setting. However, for enterprise adoption to happen, we must address &lt;strong&gt;security gaps, controlled environments, transparency and traceability&lt;/strong&gt;, and a way to monitor and evaluate AI systems.&lt;/p&gt;

&lt;p&gt;In enterprise ecosystems, AI systems need large volumes of data, including personal and proprietary information. Their role is to enhance workflows and boost efficiency, but they need access to critical systems, which is a security risk in itself. Organizations must focus on preventing unauthorized data access, breaches, and compliance violations.&lt;/p&gt;

&lt;p&gt;Threat actors can deploy malware that mimics AI behavior to breach networks, skew decisions, or steal secrets. AI agents act autonomously, making them harder to detect and control. This creates a major challenge: real-time oversight of AI systems.&lt;/p&gt;

&lt;p&gt;Monitoring is a persistent issue. Few companies have proper systems in place, and AI’s complexity makes it even harder. Owners must fully understand every decision their AI makes.&lt;/p&gt;

&lt;p&gt;AI’s transformative potential is undeniable, but the path from hype to reality is complex and challenging. Rather than chasing unicorns and rainbows, organizations must take a grounded, strategic approach — one that prioritizes real business needs, robust security frameworks, and a deep understanding of AI’s limitations. The road ahead is uncertain, but one thing is clear: the way we answer these questions will determine whether AI becomes a lasting force for good or just another passing bubble.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What do you think — are we ready?&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;Thanks for reading Hyperplane! Subscribe for free to receive new posts and support our work.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>All you know about RAG is a lie</title>
      <dc:creator>Andreea Miclaus</dc:creator>
      <pubDate>Wed, 29 Jan 2025 19:53:34 +0000</pubDate>
      <link>https://dev.to/hyperplane/all-you-know-about-rag-is-a-lie-49o9</link>
      <guid>https://dev.to/hyperplane/all-you-know-about-rag-is-a-lie-49o9</guid>
      <description>&lt;p&gt;Everything starts with a PoC, right? A client approaches you with basic requirements and a vision to create something groundbreaking. That’s when the excitement begins—turning an idea into a proof of concept (PoC) feels like the first step toward innovation.&lt;/p&gt;

&lt;p&gt;Over the past twelve months, I’ve gone through five different attempts to launch a fully functional Retrieval-Augmented Generation (RAG) system in production. Every single one ended up on the scrap heap for different reasons. Some projects died early in the prototyping phase, while others crashed and burned when scaling issues reared their ugly heads.&lt;/p&gt;

&lt;p&gt;The journey taught me one critical lesson: &lt;strong&gt;choosing the right focus areas during the PoC phase can make or break the project.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fujvv1i7oyd9gse6108t2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fujvv1i7oyd9gse6108t2.jpg" alt="RAG Main Component" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As shown in the graphic above, the RAG pipeline consists of multiple moving parts—from preprocessing documents to integrating with vector databases and large language models. Each layer comes with its own engineering challenges; not all are worth solving during a PoC.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The key to a successful PoC is identifying which parts of the RAG pipeline truly matter and warrant deeper engineering effort.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Focusing too broadly or tackling production-scale issues prematurely is a recipe for wasted time, blown budgets, and, ultimately, failed projects.&lt;/p&gt;

&lt;p&gt;In the following sections, I’ll share the lessons I learned across five different attempts, highlighting what worked and what didn’t and how careful selection during the PoC phase could have saved me a lot of headaches.&lt;/p&gt;




&lt;h3&gt;
  
  
  Project #1: Let’s "LangChain" everything
&lt;/h3&gt;

&lt;p&gt;Generative AI was everywhere.&lt;/p&gt;

&lt;p&gt;It seemed like everyone was talking about the next generation of chatbots, proclaiming that classical machine learning was outdated.&lt;/p&gt;

&lt;p&gt;There was a lot of noise in my head, so I decided to take what appeared to be the easiest route: using open-source LLM orchestrators like LangChain.&lt;/p&gt;

&lt;p&gt;I dug into the documentation, binge-watched YouTube tutorials, and for a moment, I felt invincible—like everything was finally falling into place, as if a divine hand was guiding me.&lt;/p&gt;

&lt;p&gt;Armed with an open-source framework, I figured hooking up a vector database to a large language model was no big deal. After all, I had worked with AI APIs and text embeddings before.&lt;/p&gt;

&lt;p&gt;But I couldn’t have been more wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What went wrong?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dependency hell:&lt;/strong&gt; LangChain and its associated libraries were frequently updated, and with every update came compatibility issues. The vector database APIs and LLM integrations would often break, requiring constant troubleshooting and rework.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Loss of control:&lt;/strong&gt; Using an external framework meant I had little control over its internal workings. Changes in the framework’s imports or logic disrupted my implementation, forcing me to rewrite parts of my code every time the framework evolved.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability issues:&lt;/strong&gt; While LangChain worked well for a single-user PoC, scaling it to multiple concurrent users introduced latency and resource allocation issues that the framework was not equipped to handle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security gaps:&lt;/strong&gt; Sensitive information, such as user data, leaked through generated responses because there was no built-in mechanism to manage private or confidential data securely. These leaks led to compliance concerns and blocked progress.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt;&lt;br&gt;
LangChain and similar frameworks are fantastic for building quick proofs of concept, offering a way to validate ideas and experiment with LLMs.&lt;/p&gt;

&lt;p&gt;However, transitioning to production requires an entirely different approach.&lt;/p&gt;

&lt;p&gt;For production, you need complete control over your pipeline, robust scalability strategies, and a security-first mindset. The flexibility and speed that make frameworks like LangChain appealing for PoCs can become liabilities when faced with real-world demands. &lt;/p&gt;




&lt;h3&gt;
  
  
  Project #2: The "It can’t be that hard" prototype → no frameworks, 100% control over data
&lt;/h3&gt;

&lt;p&gt;In my mind, data is the most crucial part of any AI system. So, in one of our projects, I decided to build the data ingestion and indexing components entirely from scratch. My thinking was simple: if we could ensure 100% control over the data pipeline, we’d avoid the issues that come with off-the-shelf frameworks and guarantee long-term flexibility.&lt;/p&gt;

&lt;p&gt;To make this approach even more robust, we decided to build custom data connectors for various sources like Google Drive, Microsoft Outlook, PDFs, and wikis.&lt;/p&gt;

&lt;p&gt;On top of that, we added the &lt;strong&gt;Ray&lt;/strong&gt; framework for distributed processing and used &lt;strong&gt;low-level control with the Qdrant SDK&lt;/strong&gt; for vector indexing. This would give us unparalleled control—or so I thought.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What went wrong?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document ingestion nightmares:&lt;/strong&gt; Parsing files turned into a complete fiasco. Hidden metadata in PDFs caused chunking logic to break. Microsoft Outlook attachments came in unpredictable formats, and wikis were riddled with inconsistent structure. Each source introduced unique quirks that required constant debugging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hallucinations:&lt;/strong&gt; Despite the focus on data quality, the LLM still generated references to nonexistent documents. Adjusting prompt parameters helped marginally, but hallucinations were far from eliminated.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity overload:&lt;/strong&gt; Developing custom connectors and indexing logic during the PoC phase created a flood of bugs. Prematurely adding production-level features—like distributed processing with Ray—complicated the system far beyond what was necessary for a proof of concept.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qdrant SDK challenges:&lt;/strong&gt; While Qdrant is powerful, using its low-level SDK demands a deeper understanding of how vector databases work. This introduced a steep learning curve, and bugs in query performance and indexing logic delayed progress.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt;&lt;br&gt;
Preprocessing and data consistency are critical to AI success, but trying to build everything from scratch for a PoC is overkill.&lt;/p&gt;

&lt;p&gt;Building custom data connectors is hard enough without the added complexity of integrating distributed frameworks like Ray or low-level vector database tools like Qdrant SDK. For a PoC, simplicity should be the priority—production-level features can (and should) wait for later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I learned that while data is king, focusing solely on data quality during a PoC can derail the entire project if it comes at the expense of speed and simplicity.&lt;/strong&gt;&lt;/p&gt;
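&lt;p&gt;To make “simplicity first” concrete: a PoC-level ingestion step can be as small as a fixed-size chunker with overlap. This is a stdlib-only sketch, not the project’s actual code:&lt;/p&gt;

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Fixed-size character chunking with overlap: crude, but PoC-appropriate.

    Edge cases that sink fancier pipelines (empty docs, short docs,
    overlap misconfiguration) are handled up front.
    """
    if not chunk_size > overlap:
        raise ValueError("chunk_size must be larger than overlap")
    text = text.strip()
    if not text:
        return []
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

&lt;p&gt;Structure-aware chunking, distributed processing, and custom connectors can all come later, once the PoC has proven the idea is worth the investment.&lt;/p&gt;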




&lt;blockquote&gt;
&lt;p&gt;👇👇👇&lt;br&gt;
&lt;em&gt;This is an excerpt from the &lt;a href="https://mlvanguards.substack.com/p/all-you-know-about-rag-is-a-lie?r=4zgnyb" rel="noopener noreferrer"&gt;full article over on Substack&lt;/a&gt;. If you found it helpful, please consider subscribing, it helps us know we're on the right track!&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>rag</category>
    </item>
    <item>
      <title>How to crawl your way into market dominance</title>
      <dc:creator>Andreea Miclaus</dc:creator>
      <pubDate>Sat, 28 Dec 2024 08:55:00 +0000</pubDate>
      <link>https://dev.to/hyperplane/how-to-crawl-your-way-into-market-dominance-2hae</link>
      <guid>https://dev.to/hyperplane/how-to-crawl-your-way-into-market-dominance-2hae</guid>
      <description>&lt;h3&gt;
  
  
  &lt;strong&gt;Why do so many AI projects feel like déjà vu?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;You start with bold ambitions, tackle a proof of concept, and… it stalls. Again.&lt;/p&gt;

&lt;p&gt;At &lt;strong&gt;&lt;a href="https://mlvanguards.substack.com" rel="noopener noreferrer"&gt;ML Vanguards&lt;/a&gt;&lt;/strong&gt;, we know this story all too well. It’s the cycle we’ve broken countless times in our work at &lt;strong&gt;&lt;a href="https://www.cube-digital.io/" rel="noopener noreferrer"&gt;Cube Digital&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The truth? Building production-grade AI isn’t about chasing buzzwords — it’s about combining &lt;strong&gt;engineering knowledge&lt;/strong&gt; with &lt;strong&gt;practical AI&lt;/strong&gt; to deliver &lt;strong&gt;systems&lt;/strong&gt; that actually work, scale, and drive real-world impact.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Table of contents:&lt;/strong&gt;
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Business strategy revolves around data&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Gathering relevant data points&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Challenges &amp;amp; pitfalls&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;1. Business strategy revolves around data&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;There is no surprise that data is power.&lt;/strong&gt; The stock market is data, consumer behavior is data; even clicks on buttons are relevant from a business perspective.&lt;/p&gt;

&lt;p&gt;There is a pressing need for automated solutions to streamline social media data collection and analysis. Knowing what your users want, and what they actually use, from your business is key to decision-making and strategy.&lt;/p&gt;

&lt;p&gt;This article outlines an end-to-end solution of a highly scalable data-ingestion pipeline tailored to a specific area: &lt;strong&gt;marketing intelligence&lt;/strong&gt;. This architecture caters to various analytical processes: sales, competitor analysis, market analysis, and customer insights to name a few.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;2. Gathering Relevant Data Points&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnt3xn35berm6hqdymsi3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnt3xn35berm6hqdymsi3.png" alt="Event-Driven &amp;amp; Highly Scalable Architecture for Data Ingestion Pipeline" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scheduler:&lt;/strong&gt; Despite its name, it plays multiple roles, but the most important one is triggering a crawler lambda for each page link it holds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Crawler:&lt;/strong&gt; The name states its purpose. If you’re not familiar with the term crawling, pause this article and read up on it before proceeding. This component takes the page link and starts crawling/extracting the posts and various information about them. More details will come in the implementation part.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Database:&lt;/strong&gt; Most posts are unstructured textual data, but we can extract other useful information from them, and MongoDB shines at handling semi-structured data.&lt;/p&gt;

&lt;p&gt;To trace the complete flow of the solution: the scheduler triggers a crawler lambda instance for each page, sending the page name and the link. The crawler extracts the posts from the last week and stores the raw content, the post’s creation date, the link, and the page name. It doesn’t stop there: you can extract more information depending on what the platform offers.&lt;/p&gt;

&lt;p&gt;Then, the scheduler waits for all lambda instances to finish their execution, aggregates the extracted posts from the database, and sends them, together with some prompt templates, to ChatGPT to generate reports.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1 Scheduler
&lt;/h3&gt;

&lt;p&gt;The reporting part is not the focus, although you can find it &lt;strong&gt;&lt;a href="https://github.com/mlvanguards/data-ingestion-crawlers-pipeline" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/strong&gt; along with all the code in this article. The leading actor is the scheduler itself: the main entry point of the system, where the whole flow starts and is orchestrated:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffe0spvcpejy5tdx5wfjg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffe0spvcpejy5tdx5wfjg.png" alt="Code Snippet" width="800" height="1219"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then, it stores the correlation ID of each lambda in a list and waits for all lambdas to finish their execution. The wait interval defined here is 15 seconds; you can tune it according to the average time your crawler takes to complete its task, so CloudWatch is not called too often.&lt;/p&gt;

&lt;p&gt;Last, it finds all crawled posts from these pages and sends them to the report generation phase.&lt;/p&gt;
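&lt;p&gt;The wait-and-aggregate step boils down to a small polling loop. The sketch below is simplified: the status check is injected as a plain function rather than a real CloudWatch query, so the shape of the loop stays visible.&lt;/p&gt;

```python
import time

def wait_for_crawlers(correlation_ids, is_finished, poll_interval=15, timeout=900):
    """Poll until every crawler lambda has reported completion.

    is_finished abstracts the real status lookup (here, a CloudWatch query
    by correlation ID); injecting it keeps the sketch testable, and a
    larger poll_interval means fewer CloudWatch calls.
    """
    pending = set(correlation_ids)
    deadline = time.monotonic() + timeout
    while pending and deadline > time.monotonic():
        pending = {cid for cid in pending if not is_finished(cid)}
        if pending:
            time.sleep(poll_interval)
    return not pending  # True only if every crawler finished in time
```

&lt;p&gt;Once this returns, the scheduler can safely query the database for the crawled posts and hand them off to report generation.&lt;/p&gt;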

&lt;h3&gt;
  
  
  2.2 Crawler
&lt;/h3&gt;

&lt;p&gt;We’ve defined a main abstraction point for all types of crawlers. It establishes a common interface that every derived crawler must implement: each subclass must provide its own implementation of the &lt;code&gt;extract()&lt;/code&gt; method, so whenever you need a new crawler, that method is all you have to write. Besides bringing a lot of reusability and uniformity, this abstraction has another valuable advantage, shown below:&lt;/p&gt;
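&lt;p&gt;A minimal sketch of that abstraction, with a hypothetical &lt;code&gt;BaseCrawler&lt;/code&gt; class name (only the &lt;code&gt;extract()&lt;/code&gt; contract comes from the article):&lt;/p&gt;

```python
from abc import ABC, abstractmethod

# Sketch of the base-crawler abstraction: class names besides
# extract() are assumptions for illustration.
class BaseCrawler(ABC):
    @abstractmethod
    def extract(self, link: str, since_days: int = 7) -> list:
        """Return the posts published on `link` within the time window."""

class InstagramCrawler(BaseCrawler):
    def extract(self, link, since_days=7):
        # A real subclass would drive Selenium here; we only show the contract.
        return [{"link": link, "content": "...", "source": "instagram"}]
```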

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9mqs0kzvmx2m7on4ewj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9mqs0kzvmx2m7on4ewj.png" alt="Code Snippet" width="800" height="539"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each new crawler can then be registered and invoked automatically. In this case, a dispatcher selects and instantiates the correct crawler class based on the link you’ve provided. It essentially acts as both a registry and a factory for the crawlers, managing them under the unified interface and structure we’ve created. The advantages?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Flexibility &amp;amp; scalability:&lt;/strong&gt; This component unlocks the possibility of easy addition without modifying the existing codebase. This makes the system easily expandable; you can include more domains and specialized crawlers—just plug and play them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Encapsulation &amp;amp; modularity:&lt;/strong&gt; The dispatcher encapsulates the logic for determining which crawler to use based on the link. This makes the system more modular and allows each crawler to focus on its core business logic without worrying about pattern matching.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
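&lt;p&gt;The dispatcher idea can be sketched like this (class and method names are our assumptions):&lt;/p&gt;

```python
import re

# Sketch of a dispatcher acting as registry + factory: crawlers register
# a URL pattern, and the dispatcher instantiates the matching class.
class CrawlerDispatcher:
    def __init__(self):
        self._registry = {}

    def register(self, domain_pattern: str, crawler_cls):
        # Plug-and-play: adding a crawler never touches existing code.
        self._registry[re.compile(domain_pattern)] = crawler_cls

    def get_crawler(self, link: str):
        for pattern, crawler_cls in self._registry.items():
            if pattern.search(link):
                return crawler_cls()
        raise ValueError(f"No crawler registered for {link}")
```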




&lt;h2&gt;
  
  
  3. Challenges &amp;amp; pitfalls
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Running headless browser instance with Selenium in Lambda runtime environment&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Lambda execution environment is read-only, so anything you want to write to disk must go into its temporary directory. This mostly ruins any hope of installing the binary driver automatically at runtime, so you need to install it directly in the Docker image and reference it manually in Selenium’s driver options. The only driver that worked for this setup was the Google Chrome binary driver.&lt;/p&gt;
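&lt;p&gt;As an illustration, with paths and flags that are assumptions rather than the exact repository values, the idea is to point every location Chrome writes to at &lt;code&gt;/tmp&lt;/code&gt; and reference binaries baked into the image:&lt;/p&gt;

```python
import tempfile

# Sketch (paths are assumptions): redirect everything Chrome wants to
# write into Lambda's only writable location, /tmp, and point Selenium
# at driver binaries installed in the Docker image.
CHROME_BINARY = "/opt/chrome/chrome"  # baked into the image
CHROMEDRIVER = "/opt/chromedriver"    # baked into the image

def headless_chrome_flags():
    user_data = tempfile.mkdtemp(dir="/tmp")  # Lambda's writable dir
    return [
        "--headless=new",
        "--no-sandbox",
        "--disable-dev-shm-usage",  # /dev/shm is tiny in Lambda
        "--user-data-dir=" + user_data,
        "--disk-cache-dir=" + tempfile.mkdtemp(dir="/tmp"),
    ]
```

&lt;p&gt;Each flag is then passed to &lt;code&gt;ChromeOptions.add_argument()&lt;/code&gt;, with &lt;code&gt;options.binary_location&lt;/code&gt; set to the Chrome binary and the driver path handed to Selenium’s &lt;code&gt;Service&lt;/code&gt; object.&lt;/p&gt;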

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Aggregate empty pages&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The initial monitoring algorithm was quite basic: it looped over the correlation IDs of each Lambda invocation and checked the database for any generated posts. However, we hit a corner case: when no new posts had been created for some pages within the searched time range, the algorithm entered an infinite loop.&lt;/p&gt;
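&lt;p&gt;One way to fix it, sketched here with assumed names, is to bound the loop and treat whatever is still pending after the deadline as an empty page:&lt;/p&gt;

```python
import time

# Sketch of the corner-case fix: bound the monitoring loop so pages
# with zero new posts in the search window can't stall the scheduler.
def wait_for_posts(correlation_ids, has_posts, wait_seconds=15, max_polls=20):
    """Return the IDs still without posts after the polling deadline."""
    pending = set(correlation_ids)
    for _ in range(max_polls):
        pending = {cid for cid in pending if not has_posts(cid)}
        if not pending:
            return []
        time.sleep(wait_seconds)
    # Deadline reached: treat the remaining IDs as empty pages, not failures.
    return sorted(pending)
```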

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Avoid being blocked by social media platforms&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A common issue, and one that could have consumed days of effort, had to be approached from a different angle. Popular social media platforms employ numerous anti-bot protection mechanisms to prevent crawling, such as request-header analysis, rate limiting, and IP blocking.&lt;/p&gt;
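&lt;p&gt;A mitigation sketch (entirely illustrative; the user-agent strings are examples): rotate realistic request headers and add jittered delays so the traffic pattern looks less machine-generated:&lt;/p&gt;

```python
import random
import time

# Illustrative anti-blocking helpers: rotate headers and jitter delays.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
]

def polite_headers():
    """Headers that resemble a real browser's, varied per request."""
    return {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept-Language": "en-US,en;q=0.9",
    }

def polite_sleep(base=2.0, jitter=1.5):
    # Randomized delay: fixed intervals are an easy rate-limiting signal.
    time.sleep(base + random.uniform(0, jitter))
```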




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we’ve explored a robust, end-to-end solution for building a &lt;strong&gt;Highly Scalable Data Ingestion pipeline&lt;/strong&gt; that can leverage existing data from multiple crawlable sources for processes like ML training and data analysis.&lt;/p&gt;

&lt;p&gt;We’ve gone through specific challenges you might face and how to overcome them in this process.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;em&gt;Check out &lt;strong&gt;&lt;a href="https://github.com/mlvanguards/data-ingestion-crawlers-pipeline" rel="noopener noreferrer"&gt;the code on GitHub&lt;/a&gt;&lt;/strong&gt; and support us with a&lt;/em&gt; ⭐️&lt;br&gt;
Thanks for reading ML Vanguards! &lt;a href="https://mlvanguards.substack.com/" rel="noopener noreferrer"&gt;Subscribe&lt;/a&gt; for free to receive new posts and support our work.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Within our newsletter, we keep things short and sweet.&lt;/p&gt;

&lt;p&gt;If you enjoyed reading this article, consider checking out the &lt;strong&gt;full version&lt;/strong&gt; on Medium. It’s still free ↓&lt;/p&gt;

&lt;p&gt;&lt;a href="https://medium.com/ml-vanguards/how-to-crawl-your-way-into-market-dominance-f42d47399332" rel="noopener noreferrer"&gt;Full article on Medium&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>aws</category>
      <category>mlops</category>
    </item>
  </channel>
</rss>
