Aazan Noor Khuwaja

What Happens When AI Stops Waiting for Your Prompts?

I recently read a research paper titled "The Rise of Agentic AI," and it entirely changed how I view technology's evolution.

Like most of us, I use a typical generative AI almost every day. It is impressively intelligent, yet I have come to realize it is also entirely passive: it sits there waiting to be prompted and returns text. Reading this paper, though, exposed me to an enormous shift toward agentic AI.

Rather than merely answering questions, an agentic AI is an autonomous, goal-directed digital worker. You give it a task to accomplish, such as "Plan my whole travel itinerary and book it within my budget," and it goes off and uses the internet to do it, with little to no hand-holding.

The Revelation: Inside the AI's "Brain"

As I continued reading, I needed to know how these agents pull this off. It turns out developers are creating architectures that replicate human thought.

This brain is divided into several interesting modules, which the paper breaks down:

Profile: First, the AI is assigned a particular persona, such as a "Senior Python Developer" or a "Financial Analyst," which grounds its behavior.

Memory: Like us, it has short-term memory for the task at hand, but above all it has long-term memory: it remembers its past mistakes so it can avoid repeating them.

Planning & Action: This was the most interesting part to read. The agent takes an overwhelming objective and divides it into small, actionable steps. If it hits an error while executing code or searching the web, it does not shut down and wait for me to fix it. It critiques its own mistake, revises its plan, and tries again.
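To make those modules concrete, here is a minimal sketch of such an agent loop in Python. All the names (`Agent`, `plan`, `act`, `run`) and the simulated tool failure are my own illustrations under these assumptions, not the paper's actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Toy agent combining the modules described above."""
    profile: str                                      # persona, e.g. "Senior Python Developer"
    short_term: list = field(default_factory=list)    # scratchpad for the current task
    long_term: list = field(default_factory=list)     # past mistakes, kept to avoid repeats

    def plan(self, goal):
        # Decompose the goal into small, actionable steps (hard-coded here).
        return [f"step {i} of '{goal}'" for i in range(1, 4)]

    def act(self, step):
        # A real agent would call a tool or API; we simulate one failure
        # that succeeds once the mistake is recorded in long-term memory.
        if "step 2" in step and step not in self.long_term:
            raise RuntimeError(f"tool error on {step}")
        return f"done: {step}"

    def run(self, goal):
        results = []
        for step in self.plan(goal):
            try:
                results.append(self.act(step))
            except RuntimeError:
                # Self-critique: remember the failure, revise, and retry.
                self.long_term.append(step)
                results.append(self.act(step))
            self.short_term.append(step)
        return results

agent = Agent(profile="Senior Python Developer")
print(agent.run("plan a trip"))
```

The key design point is that the error handler does not give up: it writes the failed step into long-term memory and retries, which is the self-correction loop the paper emphasizes.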

When Agents Team Up

The part that truly astonished me was multi-agent systems. Why settle for one AI when you can have an entire team? The paper points to architectures where developers build a small society of AIs: a manager agent decomposes a software project, a developer agent writes the code, and a tester agent checks the output. They share ideas, debate, and resolve issues together.

Grading the AI (Evaluation Metrics)

Naturally, an autonomous system cannot be graded with a multiple-choice test. The researchers outlined how difficult it is to assess these agents. Rather than judging only the final result, they evaluate the AI on Task Success Rate (did it actually finish the job?), Efficiency (how many wasted API calls did it make?), and Robustness (how well does it cope with unexpected errors?).
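Those three metrics are easy to sketch over run logs. The numbers and field names below are made up purely for illustration, not taken from the paper:

```python
# Hypothetical run logs: each entry records the outcome, the API calls made,
# and how many unexpected errors the agent recovered from.
runs = [
    {"success": True,  "api_calls": 7,  "errors_recovered": 1},
    {"success": True,  "api_calls": 4,  "errors_recovered": 0},
    {"success": False, "api_calls": 12, "errors_recovered": 3},
]

task_success_rate = sum(r["success"] for r in runs) / len(runs)
avg_api_calls = sum(r["api_calls"] for r in runs) / len(runs)    # efficiency proxy
recovery_total = sum(r["errors_recovered"] for r in runs)        # robustness proxy

print(task_success_rate, avg_api_calls, recovery_total)
```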

Roadblocks Ahead

But it is not all perfect. The paper was quite clear about how hard things have become. Security is a huge issue: because these agents can freely browse the web and execute scripts, they are prime targets for hackers. Another problem is coordination chaos, where multiple agents can end up in endless loops arguing with each other. And the biggest question of all: if an autonomous AI makes a disastrous financial trade or deletes files full of valuable information, who is responsible?

This paper tied everything together with my AI course lectures. We have been studying rational agents through the PEAS (Performance, Environment, Actuators, Sensors) framework. Remarkably, agentic AI takes the standard utility-based agent we analyze in the classroom and scales it up to the contemporary web. These agents do not operate on a simple 2D grid; they have web APIs as actuators and live internet data as sensors, navigating tricky digital environments to achieve their goals.
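For instance, a PEAS description of the travel-booking agent from earlier might look like this; the mapping is my own illustration, not taken from the paper:

```python
# A PEAS description for a web-based booking agent (illustrative values).
peas = {
    "Performance": "itinerary booked within budget",
    "Environment": "live web pages and booking APIs",
    "Actuators":   "HTTP requests / API calls",
    "Sensors":     "API responses and scraped page content",
}

for component, example in peas.items():
    print(f"{component}: {example}")
```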

My NotebookLM "Aha!" Moment

To tell the truth, when I first looked at the architectural diagrams in the paper, they seemed like a maze of flowcharts. So I uploaded the PDF to Google NotebookLM and asked it to describe the team coordination in simple terms. It immediately gave me a brilliant analogy, comparing the reactive and deliberative loops to human reflexes and strategic planning. That one interaction cut through all the thick academic verbiage and made the whole system orchestration comprehensible.

Agentic AI is not just a new tool; it is the start of a digital workforce. Having read this paper, I am terribly eager to see what comes next.

I also recorded a brief video overview in which I explain what I explored in this research paper in more depth.

Video:
All thanks to @raqeeb_26, our teaching assistant, for helping us explore this research work.
