Iain Thomson for Daily Context

Posted on Jul 1

AI is going loopy, but in a good way

#aie #ai #security #machinelearning

AI Engineer World's Fair Coverage

As you’d expect, the opening keynote of the AI Engineer World’s Fair was kicked off by one of its co-founders, @swyx (Shawn Wang), and he was in a poetic mood.

“In the beginning, there was the token, then there was the chat,” he said. “Then we're going to use tools, then we learn to set goals of skipping a few steps, and these days for all of the automations for all of the products. There's a lot of loops happening.”

At a basic level, AI agent loops work by having the system evaluate its own intermediate output — checking it against the task's success criteria or running it through an evaluator step — rather than simply returning the first response. If the evaluation indicates the task isn't complete, the system makes additional calls to the LLM, incorporating tool results or prior output, and repeats until the task is judged done, without needing a human to intervene at each step.
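As a rough illustration of the loop described above, here is a minimal Python sketch. None of it comes from the keynote: `run_agent_loop`, `generate`, `evaluate`, and the `max_steps` cap are hypothetical names, and the two `fake_*` functions stand in for a real LLM call and a real success check.

```python
from typing import Callable


def run_agent_loop(
    task: str,
    generate: Callable[[str, list[str]], str],
    evaluate: Callable[[str, str], tuple[bool, str]],
    max_steps: int = 5,
) -> tuple[str, int]:
    """Call `generate` until `evaluate` accepts the output or the step budget runs out."""
    history: list[str] = []  # prior attempts and evaluator feedback, fed back to the model
    output = ""
    for step in range(1, max_steps + 1):
        output = generate(task, history)
        done, feedback = evaluate(task, output)
        if done:
            return output, step
        history.append(f"attempt: {output}\nfeedback: {feedback}")
    # Budget exhausted: return the last attempt rather than looping forever.
    return output, max_steps


# Hypothetical stand-in for an LLM call: each attempt is numbered by history length.
def fake_generate(task: str, history: list[str]) -> str:
    return "draft " + str(len(history) + 1)


# Hypothetical stand-in for an evaluator: accepts only the third draft.
def fake_evaluate(task: str, output: str) -> tuple[bool, str]:
    ok = output.endswith("3")
    return ok, "" if ok else "not yet complete"


result, steps = run_agent_loop("summarize report", fake_generate, fake_evaluate)
print(result, steps)
```

The step cap is an assumption added here as a guard; an evaluator that never returns "done" would otherwise keep the loop running indefinitely.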

These loops can also compound over time: As employees correct and refine the system's outputs, those interactions can be captured to improve future performance — not just within a single task, but across the organization's use of the system.
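One way to picture that compounding is sketched below in Python, under heavy assumptions: the article does not say how corrections are stored or reused, so `Correction`, `CorrectionStore`, `build_prompt`, and the word-overlap matching are all hypothetical stand-ins for whatever a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Correction:
    task: str
    model_output: str
    corrected_output: str


@dataclass
class CorrectionStore:
    records: list[Correction] = field(default_factory=list)

    def capture(self, task: str, model_output: str, corrected_output: str) -> None:
        """Keep a record only when a person actually changed the output."""
        if model_output != corrected_output:
            self.records.append(Correction(task, model_output, corrected_output))

    def examples_for(self, task: str, limit: int = 3) -> list[Correction]:
        """Return the most recent corrections whose task shares a word with this one."""
        words = set(task.lower().split())
        matches = [r for r in self.records if words & set(r.task.lower().split())]
        return matches[-limit:]


def build_prompt(task: str, store: CorrectionStore) -> str:
    """Prepend earlier human corrections so a later call can learn from them."""
    lines = [f"Task: {task}"]
    for r in store.examples_for(task):
        lines.append(f"Earlier correction for '{r.task}': {r.corrected_output}")
    return "\n".join(lines)


store = CorrectionStore()
store.capture("draft invoice email", "Hi, pay us.", "Hello, please find the invoice attached.")
store.capture("summarize meeting", "ok", "ok")  # unchanged output, so nothing is stored
prompt = build_prompt("draft reminder email", store)
print(prompt)
```

Word overlap is a deliberately crude matching rule chosen to keep the sketch self-contained; it says nothing about how any vendor's product retrieves past feedback.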

Microsoft CEO Satya Nadella made this case in a LinkedIn post two weeks ago, framing it as something companies should own rather than cede to AI vendors: “This loop becomes the new IP of the firm. I think of it as a hill climbing machine. And unlike most assets, it compounds.”

Unsurprisingly, Pablo Castro, a distinguished engineer at Microsoft, agrees with his boss. He claimed that loops deployed within the company have produced major increases in the accuracy of AI outputs, and said Microsoft has built a component called Agent Optimizer into its Azure AI Foundry platform for customers to use.

“We want to make it easy for you to integrate them into agents you're building,” he explained.

“We do this from our agent platform that starts in GitHub, where we all go and build as a contextualization system, so you can ground your agents. When it comes to hosting some mobility and management, we do all of these in Foundry. We offer thousands of models in our model catalog there, so you can pick whatever is the right model for the right task, and we will keep adding more every day.”

‘There has never been a better time to be an engineer’

During OpenAI’s keynote, Romain Huet, its head of developer experience, said that technologies like loops have dramatically increased productivity within the company. Previously, OpenAI was releasing a new model every 15 months; now it takes six weeks.

“There has never been a better time to be an engineer, because engineering was never about writing code. Engineering has always been about solving problems for yourself and for other people as well,” he enthused.

“It's about taking the latest science and combining it with design, taste, judgment, and most of all, imagination to make something that people can actually use. We think it's a return to the roots of engineering and the technology we're building on is accelerating, getting faster and faster.”

Peter Steinberger, founder of OpenClaw, was also brought on stage to sing the praises of loops, saying they have completely changed the way he works. In January, Steinberger recounted, he was juggling 10 terminal windows at a time. Now he has a management system that clears away the easy tasks and leaves him free to deal with harder problems.

This doesn’t entirely solve bottlenecks in a project, however, he explained. “Last year, I was primarily constrained by tokens — I fixed it by joining OpenAI,” which caused some audience mirth. “Now I'm primarily constrained by attention, and unlike tokens or compute, I can't simply add more of it. So the most important skill today is deciding where to spend it.”

Training loops are absolutely crucial, Alexander Embiricos, OpenAI’s head of product for Codex, argued in his section of the keynote. Users need to teach the system not only what work is required, but why it has to be done. The result is faster prototyping and a major productivity boost.

Speeding up the learning process

In his keynote, just to show that there were no hard feelings, Thom Wolf, co-founder of Hugging Face, brought on stage someone he had tried to hire, but who had turned him down. Olive Son is the research lead at MiniMax, a company Wolf described as “one of the top of what we call the AI dragons in China.”

Last month, MiniMax released M3, an open weight large language model that can work from text, image, and video inputs. Son said that the video angle was particularly important, claiming that M3 could actually process YouTube videos and learn from them.

“We know that a lot of labs run into problems doing that — the model would collapse after a couple of training steps with both text and visual understanding, but we managed to solve that problem,” she explained.

“We did a lot of work on VIP (Value-Implicit Pre-training), and we did a lot of work on the data that we're actually training. For example, we do what we call interleaved data — it's actually natural data — but we keep the images and videos in instead of masking it out, and we do some pretty good cleaning and masking on the data, and we do very good reward modeling, so that we train it from the first step and it scales up a lot.”

This kind of model was envisaged by Yan Junjie before the company even started, Son said. It’s the most recent open weight multimodal model, meaning people can use it but cannot access the training data, training code, or inference operators. The support of the open source community was still important, she said: it could make the model better than it is now, and MiniMax can use that knowledge to improve future model builds.

Security track makes its debut

Finally, the first keynote session concluded with the announcement of an entirely new track for the Fair. Randall Degges, vice president of engineering and developer relations at AI security company Snyk, said the conference would now have a dedicated security track.

He acknowledged that there are security problems with the technology, but said that these are often overhyped — not least by governments. There were cheers from the audience when he cited the recent blocking of Anthropic's Fable 5 and Mythos 5, and the limited release of OpenAI GPT 5.6.

“As part of our ongoing engagement with the U.S. government, we previewed our plans and the models’ capabilities ahead of today’s launch,” OpenAI said last Friday, through gritted teeth, one suspects. “At their request, we are starting with a limited preview for a small group of trusted partners whose participation has been shared with the government, before releasing more broadly.”

Degges said the talks in the track should reassure many, as well as provide innovative new ways to use AI to lock out security gremlins.

Top comments (2)

Nazar Boyko • Jul 1

Calling the loop an asset that compounds is a good frame, and like any asset it can pick up liabilities too. If the IP is the pile of corrections captured from employees, it's only as trustworthy as those corrections, and there's no obvious moment where anyone audits whether the loop has quietly learned one loud team's habits. Compounding is great when the feedback is good, and a slow way to bake in bias when it isn't.

Alex Shev • Jul 2

Loops are powerful when each pass has a different job. If every iteration just rephrases the same uncertainty, the loop becomes motion; if it adds tests, constraints, or evidence, it becomes engineering.