Building Agentic AI: Nova Act and Strands Agents in Practice

Speaker: Haowen Huang @ AWS Amarathon 2025

Summary by Amazon Nova

Future Trends of Agentic AI

The evolution of generative AI has followed a clear progression, with each phase bringing new priorities and challenges for businesses.
Systems range from low agency (rule-based, high human oversight) to high agency (independent operation, strategic decision-making).
Currently, most systems are still in the early stages of agentic AI. Moving to higher agency requires more advanced technology as well as governance, trust, and organizational readiness.

Envisioning a Future Powered by AI Agents

Sequoia Capital predicts future AI systems will evolve into autonomously operating intelligent agents with capabilities for reasoning, planning, collaboration, and high autonomy.
The predicted agent economy in the 2030s will operate like a global neural network, composed of numerous agent operations forming an interconnected network.
This presents opportunities for developers to position themselves at critical nodes within this network, potentially leading to significant financial gains if their AI agents are widely adopted.
Sequoia Capital also predicts the emergence of "one-person unicorns": companies valued at $1 billion that are created and operated by a single individual. This shift will lead to new organizational models in which founders coordinate AI agent workflows rather than build traditional teams.
The founder's role will transform into that of a strategic orchestrator, deciding which functions to automate, which to delegate, and where to integrate human oversight.

Stochastic Mindset and Spec-Driven Development

Working with AI agents requires a stochastic mindset: the same prompt can produce different outputs from one run to the next, so developers may need to adjust how they communicate with large language models (LLMs) to improve output accuracy.
AI IDEs such as Kiro will feature spec-driven development. Developers should also prepare for changes beyond tooling, including new organizational structures and mental frameworks.

Foundational Infrastructure for Agentic AI

The foundational infrastructure for the future world of Agentic AI is under construction, including communication protocols essential for interoperability among AI agents and their tools.
Emerging standards such as the Model Context Protocol (MCP) and Agent2Agent (A2A) are addressing this interoperability challenge and have been adopted by industry leaders including Amazon, Anthropic, Meta, and Google.
These protocols facilitate connections between agent applications, models, and various resources.
AWS is actively participating in the standards committees for MCP and A2A, contributing its experience in distributed systems to enhance these protocols.
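To make the protocol layer more concrete, here is a minimal sketch of the kind of messages an MCP client sends. MCP messages use the JSON-RPC 2.0 envelope, and `tools/list` and `tools/call` are method names from the MCP specification. The tool name `get_weather` and its `city` argument are hypothetical and only serve as an illustration.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request, the envelope that MCP messages use."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name. "get_weather" and its argument are hypothetical.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Hong Kong"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

In practice an SDK builds and transports these messages for you; the sketch only shows what travels between an agent application and a tool server.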

Practical Agentic AI Application: Local Weather Information Retrieval

Scenario Overview:

The demonstration is a local weather app built with MCP, Amazon Bedrock, and the Hong Kong Observatory website.
The AI agent is built with Nova Act and interacts with the weather website using natural language prompts.

Functionality:

Users provide the Agent with the specific weather website URL.
The Agent responds to natural language prompts like "What's the current weather in Hong Kong?"
The Agent functions like a human web scraping engineer, autonomously locating weather information from the Hong Kong Observatory website.
The era of Agentic AI has arrived, necessitating preparation for the future.
AWS services like Nova Act can assist in this transition.
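The functionality described above might look roughly like the following sketch. It is not the speaker's code. `NovaAct(starting_page=...)` and `act()` follow the publicly documented Nova Act SDK; the exact Observatory URL, the prompt wording, the helper names, and reading the answer from `result.response` are assumptions made for illustration. Running it requires the `nova-act` package and a `NOVA_ACT_API_KEY`.

```python
import os

# Assumed entry page for the Hong Kong Observatory; the talk does not give the URL.
HKO_URL = "https://www.hko.gov.hk/en/index.html"


def build_prompt(question):
    """Wrap the user's question in instructions that keep the browser agent on task."""
    return (
        f"{question.strip()} "
        "Use only the information shown on this website and "
        "report the temperature and the general conditions."
    )


def ask_weather(question, url=HKO_URL):
    """Open the site in a Nova Act browser session and return the agent's answer."""
    if "NOVA_ACT_API_KEY" not in os.environ:
        raise RuntimeError("Set NOVA_ACT_API_KEY before running the agent.")

    # Imported lazily so this file can be loaded without the SDK installed.
    from nova_act import NovaAct  # pip install nova-act

    with NovaAct(starting_page=url) as nova:
        result = nova.act(build_prompt(question))
        # Assumption: the free-text answer is available as `response`.
        return result.response
```

A call such as `ask_weather("What's the current weather in Hong Kong?")` would then drive a real browser session, so the answer and the time it takes can vary between runs.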

Technical Details:

Designing agent applications using Nova Act.
Implementing an AI agent with MCP, showing how MCP bridges the gap between AI models and real-world data sources to create more powerful applications.
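One way to picture how MCP bridges a model and a data source is a small MCP server that exposes a single tool. The sketch below uses `FastMCP` from the official MCP Python SDK. The server name, the tool name, and the placeholder `lookup_weather` function are hypothetical; a real tool would call the browsing agent or a weather API instead of returning canned text.

```python
def lookup_weather(city):
    """Placeholder data source. A real tool would query a website or an API here."""
    canned = {"hong kong": "Placeholder reading for Hong Kong (not real data)."}
    return canned.get(city.strip().lower(), f"No data available for {city!r}.")


def build_server():
    """Create an MCP server that exposes lookup_weather as a tool."""
    # Imported lazily so this file can be loaded without the SDK installed.
    from mcp.server.fastmcp import FastMCP  # pip install mcp

    server = FastMCP("weather-demo")  # hypothetical server name

    @server.tool()
    def get_weather(city: str) -> str:
        """Return the current weather for a city."""
        return lookup_weather(city)

    return server


def main():
    # stdio transport: an MCP client launches this script as a subprocess
    # and exchanges messages with it over standard input and output.
    build_server().run(transport="stdio")
```

Calling `main()` starts the server. Any MCP-capable agent can then discover `get_weather` and call it, without knowing how the data is actually fetched.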

Building Custom Agentic AI Applications with Strands Agents

Challenges in Building Custom Agents:

Input Side: Need connectors for agents to interact with diverse enterprise systems, pulling live data and calling APIs to execute workflows (e.g., booking, updating, triggering processes). Tools and MCP help orchestrate these inputs.
Memory: Requires both short-term (session context) and long-term memory (learning and improvement over time) to make agents adaptive and context-aware.
Brain (LLM): Needs extension with reasoning frameworks like ReAct, Reflexion, and Chain-of-Thought for planning, reflection, and step-by-step reasoning, essential for reliability and traceability.
Persona: Every agent needs a defined set of roles and instructions (persona) to differentiate functionalities (e.g., HR agent vs. DevOps agent).
Observability and Guardrails: Customized AI agents need mechanisms to ensure safety, debuggability, and alignment with goals.
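The pieces listed above can be seen together in a toy, framework-free loop. This is a simplified ReAct-style sketch, not how Strands Agents or any specific framework is implemented. The "model" is a scripted stand-in for an LLM, and the `leave_balance` tool, its data, and the persona text are all made up.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    short_term: list = field(default_factory=list)  # turns of the current session
    long_term: dict = field(default_factory=dict)   # answers kept across sessions


def scripted_model(persona, history):
    """Stand-in for an LLM: pick the next step from the history so far.

    A real model would receive the persona and history as its prompt.
    """
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        return "Action: leave_balance[alice]"
    return "Final: " + observations[-1].split(": ", 1)[1]


# Input side: tools the agent may call (hypothetical HR lookup).
TOOLS = {"leave_balance": lambda employee: f"{employee} has 12 days of leave left"}


def run_agent(persona, question, memory, model=scripted_model, max_steps=5):
    memory.short_term.append(f"Question: {question}")
    for _ in range(max_steps):  # guardrail: bound the number of reasoning steps
        step = model(persona, memory.short_term)
        memory.short_term.append(step)  # observability: keep the full trace
        if step.startswith("Final:"):
            answer = step.split(": ", 1)[1]
            memory.long_term[question] = answer
            return answer
        # Parse "Action: tool[argument]" and run the named tool.
        name, argument = step.split(": ", 1)[1].rstrip("]").split("[", 1)
        memory.short_term.append(f"Observation: {TOOLS[name](argument)}")
    raise RuntimeError("step limit reached without a final answer")


memory = Memory()
print(run_agent("You are an HR assistant.", "How much leave does alice have?", memory))
print(memory.short_term)
```

The trace in `memory.short_term` alternates between actions and observations, which is what makes this style of agent easier to debug than a single opaque model call.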

Strands Agents:

An SDK by Amazon for building AI agents with minimal code.
Simplifies development by handling complex orchestration, leveraging state-of-the-art models for planning, chain-of-thought, tool calling, and reflection.
Developers define a prompt and a list of tools in their code, test locally, and deploy to the cloud.
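As a rough illustration of the "prompt plus a list of tools" workflow, the sketch below builds a Strands agent around one custom tool. `Agent` and `tool` come from the Strands Agents SDK; the `word_count` tool, the system prompt, and the question are hypothetical. By default the SDK calls a model on Amazon Bedrock, so running `main()` assumes AWS credentials and model access are already configured.

```python
def word_count(text: str) -> int:
    """Count the whitespace-separated words in a piece of text."""
    return len(text.split())


def build_agent():
    """Create a Strands agent from a system prompt and a list of tools."""
    # Imported lazily so this file can be loaded without the SDK installed.
    from strands import Agent, tool  # pip install strands-agents

    return Agent(
        system_prompt="You are a concise writing assistant.",
        tools=[tool(word_count)],  # wrap the plain function as an agent tool
    )


def main():
    agent = build_agent()
    # The agent decides on its own whether to call word_count.
    agent("How many words are in 'the quick brown fox'?")
```

The function's type hints and docstring matter here, because the SDK uses them to describe the tool to the model.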

Demonstration: Mathematical Animation with Manim:

Showcases creating mathematical animations using Manim, a Python library for high-quality mathematical visualizations.
Strands Agents process user prompts, write Manim scripts, and generate animations.
Key Points:
Strands Agents simplify the creation of complex, real-world Agentic AI applications.
Demonstration highlights the powerful capabilities that emerge from combining agentic workflows with specialized open-source tools.
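For readers who have not used Manim, the script an agent writes for this kind of request might resemble the sketch below. It is not the script generated in the talk. The function f(x) = x^3 - 3x is a hypothetical stand-in, since the summary does not state which cubic was used; the x range and the 9-second total follow the request described in this summary. The import fallback only exists so the helper can be loaded without Manim installed.

```python
try:
    from manim import BLUE, Axes, Create, Scene  # pip install manim
except ImportError:  # lets cubic() be imported and tested without Manim
    Scene = object


def cubic(x):
    """Hypothetical example function, f(x) = x^3 - 3x."""
    return x**3 - 3 * x


class CubicPlot(Scene):
    def construct(self):
        # Axes sized so that f(-3) = -18 and f(3) = 18 both fit.
        axes = Axes(x_range=[-3, 3, 1], y_range=[-20, 20, 5])
        graph = axes.plot(cubic, x_range=[-3, 3], color=BLUE)
        self.play(Create(axes), run_time=2)
        self.play(Create(graph), run_time=7)  # 2 s + 7 s = 9 s in total
```

Saved as `cubic_plot.py`, the scene can be rendered with `manim -pql cubic_plot.py CubicPlot`.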

Core Implementation Code Analysis

1 Importing Essential Components:

The code imports the Agent class from the strands module and the MCPClient class from strands.tools.mcp.
These imports establish the essential components for a system that combines agents with the Model Context Protocol (MCP) within the Strands framework.

2 Setting Up Connection to Manim MCP Server:

The code sets up a connection to a Manim MCP server using standard input/output (stdio) as the transport mechanism.

3 Establishing Interaction Context:

The code establishes a context for interaction with the Manim MCP server.
It retrieves available tools from the server using the MCP client.
An Agent is initialized with these tools, preparing it for the upcoming chat loop.

4 Processing Natural Language Prompt:

The code uses the Agent to process a natural language prompt requesting a Manim animation.
The prompt specifies a 9-second visualization of a specific cubic function graphed from x=-3 to x=3.
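The four steps above can be sketched as follows. This is a reconstruction based on the description and on the Strands Agents and MCP Python SDKs, not the code shown in the talk. The server launch command (`python manim_server.py`) and the prompt wording are assumptions; the summary does not give the server path or the exact cubic function.

```python
# Assumed wording; the talk's exact prompt is not given in this summary.
PROMPT = (
    "Create a Manim scene that plots a cubic function "
    "from x=-3 to x=3 as a 9-second animation."
)


def run_demo(server_command="python", server_args=("manim_server.py",)):
    """Connect to a Manim MCP server over stdio and send one request."""
    # Step 1: import the agent class and the MCP client wrapper.
    from mcp import StdioServerParameters, stdio_client
    from strands import Agent
    from strands.tools.mcp import MCPClient

    # Step 2: describe how to launch the MCP server over stdio.
    # The command and script name are hypothetical placeholders.
    manim_client = MCPClient(
        lambda: stdio_client(
            StdioServerParameters(command=server_command, args=list(server_args))
        )
    )

    # Step 3: open the connection, list the server's tools, build the agent.
    with manim_client:
        tools = manim_client.list_tools_sync()
        agent = Agent(tools=tools)
        # Step 4: hand the natural language request to the agent.
        return agent(PROMPT)
```

The agent has to be used inside the `with` block, because the tools stop working once the MCP connection is closed.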

Demonstration Process

Environment Setup:

Two terminal windows in VS Code are used:
Left Terminal: Runs the MCP server. It connects to the Manim MCP Server, which runs on the local machine (MCP's local deployment mode).
Right Terminal: Runs the MCP Client program, which launches the video generation chat interface.

User Interaction:

Users input natural language commands through the chat interface to generate animations of mathematical formulas.
Example command: "Create a Manim scene that draws a cubic function from some x range in 9 seconds."

Agent Processing:

After submission, the right terminal shows the agent processing the natural language request.
The agent initiates its first tool call, "execute manim code."

Adaptive Problem-Solving:

Upon detecting a compilation issue in the local environment, the agent intelligently creates a simplified version to complete the task, demonstrating its adaptive problem-solving capability.

Task Completion and Output:

The agent successfully completes the task and provides an output summary, rating its performance as "Perfect!" along with details about the key features of its implementation.
The final MP4 video file generated by the AI Agent is found in the "videos" directory.
The video confirms that the agent has created the animated video of the requested mathematical function.
Notably, the agent proactively added scaled X and Y axes, a feature not directly requested in the natural language input, demonstrating the AI Agent's intelligence and ability to anticipate user needs.

AWS Agentic AI Portfolio Architecture

Three Layers of Services:

[ 1 ] Infrastructure Layer:
Provides the foundational resources and capabilities necessary for running Agentic AI applications.
[ 2 ] AI and Agent Development Software and Services Layer:
Contains tools, SDKs, and services specifically designed for developing and managing AI agents.
Sub-layer of [ 2 ]: SDKs for Agents
Includes Amazon Nova and Bedrock Agents.
[ 3 ] Application Layer:
Houses the end-user applications and services built using the AI agents and infrastructure provided by the lower layers.

Specialized Service Categories: