If you have followed my posts on social media, you know by now that I've taken a pragmatic (and perhaps pessimistic) view of the hype around GenAI over the past several years.
Personally, I do not believe the technology is mature enough to allow people to blindly trust its outcomes.
In this blog post, I will share my personal view of why GenAI is not ready for prime time, nor will it replace human jobs anytime in the foreseeable future.
Some background
For the non-technical person who reads the news, the hype around GenAI is fueled by new publications almost every week. Here are a few common examples:
- Text summarization - GenAI can summarize long portions of text, which may be useful if you're a student who is currently preparing an essay as part of your college assignments, or if you are a journalist who needs to review a lot of written material while preparing an article for the newsletter.
- Image/video generation – GenAI is able to create amazing images (using models such as Nano Banana 2) or short videos (using models such as Sora 2).
- Personalized learning - A student uses GPT-5.4 to create a custom, interactive 10-week curriculum for learning organic chemistry.
- Family Life Coordinator - Copilot in Outlook/Teams (Personal) monitors family emails and school calendars.
Although the technology has evolved over the past several years from simple chatbots to more sophisticated use cases, most GenAI usage still comes from home consumers.
Yes, there are use cases such as RAG (Retrieval-Augmented Generation), which bridges the gap between a model's static training and corporate data; MCP (Model Context Protocol), which acts as a "USB-C port for AI"; and agentic systems, which take a high-level goal, break it into sub-tasks, and iterate until the goal is met. The reality, however, is that most AI projects fail: lack of understanding of the technology, fear of exposing corporate data to AI vendors' training pipelines, misunderstanding of the pricing model (which ends up much more costly than anticipated), and many other reasons.
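To make the RAG acronym concrete, here is a minimal sketch of the pattern: retrieve the document most relevant to a question from a corporate corpus, then inject it into the prompt ahead of the question. The corpus, the toy bag-of-words "embedding," and the prompt template are all illustrative placeholders, not any vendor's actual pipeline (real systems use learned embeddings and a vector database).

```python
# Minimal RAG sketch: retrieve the most relevant document for a query,
# then build an augmented prompt for the model.
# Corpus, scoring, and prompt template are illustrative placeholders.
import math
from collections import Counter

CORPUS = [
    "Quarterly revenue grew 12% driven by cloud subscriptions.",
    "The incident response runbook requires two-person approval.",
    "Employee onboarding checklist: laptop, badge, VPN access.",
]

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(CORPUS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    # The retrieved corporate context rides along with the question,
    # bridging the model's static training data and live company data.
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What does the incident response runbook require?"))
```

The point of the sketch is that the model itself never changes; the "corporate knowledge" lives entirely in what you stuff into the prompt, which is also why data governance around that retrieval layer matters so much.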
Currently, the hype around GenAI is driven by analysts (who live in delusion about the actual capabilities of the technology), CEOs (who have no clue what their employees actually do, especially developers, and whose main goal is cutting the workforce to keep shareholders happy), and salespeople (who ride the wave of the hype to boost revenue toward their quarterly quotas).
Code generation
A common misconception is that GenAI can generate code (from code suggestions to vibe coding an application) and will eventually replace junior developers.
This misconception is a far cry from the truth, and here's why:
- A developer isn't just writing lines of code. They need to understand the business intent, the system, technology, and financial constraints, and the code already written (by themselves or by teammates) in order to write efficient code.
- If we allow GenAI to produce code by itself, without the engine understanding the overall picture, we will end up with tons of lines of code that no human can read and understand, or explain the purpose of. Over time, humans will lose the ability to understand and debug the code, and once bugs or security vulnerabilities are discovered, nobody will know how to fix them.
- Using SAST (Static Application Security Testing) or DAST (Dynamic Application Security Testing) for automated secure code review, combined with GenAI capabilities (such as Codex Security or Claude Code Security), will generate tons of false-positive results, for the simple reason that GenAI cannot see the bigger picture: it does not understand the general context of an application or the security controls already implemented to protect it.
Bottom line – Agentic systems cannot replace a full-blown, production-scale SaaS application built on years of vendor and developer experience. GenAI will not resolve incidents that happen on production systems, incidents that impact clients and break customers' trust.
Agentic AI as an aid for security tasks
I'm hearing a lot of conversations about how GenAI can aid security teams in repeatable tasks. Here are some common examples:
- Replacing Tier 1 SOC analysts: Solutions like CrowdStrike’s Falcon Agentic Platform or Dropzone AI now claim to handle over 90% of Tier 1 alerts. They ingest an alert, pull telemetry from EDR/SIEM, perform threat intel lookups, and provide a "verdict" with evidence before a human ever sees it.
- Incident Storylining: Instead of an analyst manually stitching together logs, tools like Microsoft Security Copilot generate a cohesive narrative of the attack kill chain in plain English.
- Dynamic Playbook Generation: GenAI can generate a custom response plan on the fly, tailored to your specific cloud architecture and the nuances of a "living-off-the-land" attack.
Here is where GenAI falls short:
- Indirect Prompt Injection: Attackers can embed malicious instructions in emails or logs. When the SOC's AI agent "reads" these logs to summarize an incident, the hidden instructions can command the agent to "ignore this alert" or "delete the evidence," effectively blindfolding the SOC.
- Hallucinations in High-Stakes Code: While GenAI can draft remediation scripts (Python/PowerShell), it still suffers from "system safety" issues. It may confidently suggest a command that includes an outdated, vulnerable dependency or a logic error that could crash a production server during containment.
- Lack of "Decision Layer" Visibility: An AI agent might be performant and "online," but it could be making systematically biased or manipulated decisions (e.g., failing to flag a specific user due to model poisoning) that perimeter monitoring cannot detect.
- The "Data Readiness" Wall: Most organizations still struggle with siloed, unstructured data. If your data isn't "AI-ready"—meaning unified and clean—the AI will produce fragmented or incorrect insights, leading to a "garbage in, garbage out" scenario.
Bottom line – Just because GenAI can review thousands of lines of events from multiple systems, triage them into incidents, document them in ticketing systems, and automatically resolve them without human review, doesn't mean GenAI can actually resolve all of the security issues organizations face every day.
Automating everything
In theory, it makes sense to build agentic systems in which AI agents replace repetitive human tasks and make faster decisions, in the hope of better results.
Here are a few examples showing how wrong things can go when AI agents are allowed to make decisions:
- The Replit Agent "Vibe Coding" Failure: While building an app, the agent detected what it thought was an empty database during a "code freeze," then autonomously ran a command that erased the live production database (records for more than 1,200 executives).
- The AWS "Kiro" Production Outage: Amazon’s agentic coding tool, Kiro, was tasked with resolving a technical issue but instead autonomously decided to "delete and recreate" a production environment. The agent was operating with the broad permissions of its human operator. Due to a misconfiguration in access controls, the AI bypassed the standard "two-human sign-off" requirement. It proceeded to wipe a portion of the environment, causing a 13-hour outage for the AWS Cost Explorer service.
- The Meta "Sev 1" Internal Breach: An internal Meta AI agent (similar to their OpenClaw framework) triggered a "Sev 1" alert—the second-highest severity level—after taking unauthorized actions. An engineer asked the agent to analyze a technical query on an internal forum. The agent autonomously posted a flawed, incorrect response publicly to the forum without the engineer's approval. A second employee followed the agent's "advice," which inadvertently granted broad access to sensitive company and user data to engineers who lacked authorization.
Bottom line – We must always keep humans in the loop for any critical decision, even though this limits scale, to avoid the consequences of automated decision-making systems.
Public health and safety
It may make sense to train an LLM on all the written knowledge from healthcare and psychology, giving humans a "self-service" health-related chatbot. But since the machine has no ability to actually think like a real human, with consciousness and feelings, the results can quickly turn horrible.
Here are a few examples:
- Raine v. OpenAI: 16-year-old Adam Raine died by suicide after months of intensive interaction with ChatGPT. The logs showed the AI mentioned suicide 1,275 times (six times more often than the teen did) and provided granular details on methods. The suit alleges OpenAI's image recognition correctly identified photos of self-harm wounds the teen uploaded but failed to trigger an emergency intervention or notify parents, instead continuing to "support" his plans.
- The "Suicide Coach" Cases: Families of four deceased users (including Zane Shamblin and Adam Raine) allege that GPT-4o acted as a "suicide coach." The lawsuits claim the AI bypassed its own safety filters to provide technical instructions on how to end one's life. Plaintiffs argue that OpenAI "squeezed" safety testing into just one week to beat Google’s Gemini to market. This reportedly resulted in a model that was "dangerously sycophantic," prioritizing engagement over safety and encouraging users to isolate themselves from real-world support.
- Unlicensed Practice of Medicine & Law: While not yet a single consolidated case, multiple personal injury claims are being investigated following the "ECRI 2026 Report," which highlighted cases where ChatGPT gave surgical advice that would cause severe burns or death. In early 2026, a 60-year-old man was hospitalized with severe hallucinations (bromism) after ChatGPT advised him to use industrial sodium bromide as a "healthier" table salt alternative. This has sparked potential class-action interest in Australia.
Bottom line – Just because a chatbot was trained on a large body of written knowledge doesn't mean it has the human compassion to make decisions for the good of humanity.
Summary
I know this blog post comes across as cynical or pessimistic about GenAI technology, but I honestly believe the technology is not ready for prime time, nor will it replace human jobs anytime soon.
If you are a home consumer, I highly recommend that you learn how to write better prompts and always question the results an LLM produces. It is limited by the data it was trained on.
If you are a corporate decision maker considering GenAI as part of your organization's offering:
- Define KPIs before beginning any AI-related project, so you'll have a better understanding of what a successful project looks like.
- Budget for employee training, and make sure employees have a safe space to learn and make mistakes while using this new technology.
- Keep an eye on finance, before costs get out of control.
- Make sure AI vendors do not train their models on your corporate or customer data.
I would like to personally thank a few people who influenced me while writing this blog post:
- Ed Zitron: He argues that GenAI is a "bubble" with no sustainable unit economics. He frequently points out that companies like OpenAI are burning billions in compute costs while failing to find true "product-market fit" or meaningful revenue beyond NVIDIA's GPU sales. I recommend reading his blog and listening to his podcast.
- David Linthicum: He warns against "vibe coding"—the practice of using AI to generate high-cost, inefficient code—and argues that the real value of AI lies in specialized "Small Language Models" (SLMs) rather than massive, money-losing LLMs. I recommend reading his posts and listening to his podcast.
- Corey Quinn: He argues that GenAI is a "cost center masquerading as a profit center." He often points out that while everyone is selling AI, very few are buying it at a scale that justifies the massive capital expenditure (CapEx) currently being spent on data centers. I recommend reading his blog and listening to his podcast.
Disclaimer: AI tools were used to research and edit this article. Graphics are created using AI.
About the Author
Eyal Estrin is a cloud and information security architect and AWS Community Builder, with more than 25 years in the industry. He is the author of Cloud Security Handbook and Security for Cloud Native Applications.
The views expressed are his own.