So far, we’ve covered how to use agent tools, skills, and sub-agents. Now, it’s time to put everything into practice. I’m going to walk you through a small proof of concept (POC) project to show how these pieces actually work together.
For this POC, I built a simple DEV blog post writer using LangGraph multi-agents with memory and AWS Bedrock Nova models. The workflow starts by extracting keywords from a user prompt, researching related topics from DEV Community posts, and then generating a draft blog article. An evaluator agent reviews the output and provides feedback for refinement, creating a lightweight content improvement loop.
This is intentionally a POC rather than a production-grade or highly optimized blog-writing system. Alongside blog generation, the workflow also stores topic summaries in a memory.md file to retain useful context for future runs. The project is open for extension and experimentation, making it a practical starting point for exploring agentic content pipelines.
Why the Multi-Agent Generator & Evaluator Pattern?
- Improves output quality by combining creation (generator) with structured review (evaluator)
- Enables iterative refinement: generate → evaluate → improve until a quality threshold is met (see the sketch below)
- Increases reliability by separating responsibilities (build vs. critique)
- Works especially well for code, content, and complex reasoning tasks
How it generates & evaluates the blog post:
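At its core, this is a bounded refinement loop. Here's a minimal sketch with illustrative names only (the real LangGraph implementation follows below):

```python
# Minimal shape of the generator-evaluator loop (illustrative names, not the real API).
def refine(topic: str, max_iter: int = 3) -> str:
    draft, feedback = "", None
    for _ in range(max_iter):
        draft = generate(topic, feedback)          # generator: create or revise the draft
        verdict, feedback = evaluate_draft(draft)  # evaluator: verdict + structured critique
        if verdict == "accepted":
            break
    return draft                                   # best draft after at most max_iter rounds
```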
Whether you're exploring agent design or building your own system, this will give you a clear, practical starting point 😉
Table of Contents
- Dependencies & Configuration
- Implementing Memory
- Implementing Research Abilities
- Agents State & Evaluation Result
- Generator & Evaluator Agents Nodes
- Workflow & Route & Graph
- Run & Call AWS Nova
- All Code & Demo
- memory.md File
- Generated Blog Post
- Conclusion
- References
Dependencies & Configuration
- Please install dependencies:
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
# deactivate
requirements.txt:
langchain>=1.0.0
langchain-aws>=1.2.0
langgraph>=1.0.0
python-dotenv>=1.0.0
boto3>=1.34.0
langchain-community>=0.4.1
duckduckgo-search>=8.1.1
ddgs>=9.14.1
requests>=2.33.0
- Enable AWS Bedrock model access in your region (e.g. eu-central-1, us-east-1): AWS Bedrock > Bedrock Configuration > Model Access > AWS Nova-Pro or Claude 3.7 Sonnet. In this code, we'll use AWS Nova-Pro because AWS serves it in multiple regions.
- After enabling model access, grant access to AWS Bedrock services in your IAM user or role: AmazonBedrockFullAccess
Two options to reach the AWS Bedrock model with your AWS account:
- AWS config: run aws configure to create the config and credentials files.
- Environment variables via a .env file: add a .env file with:

AWS_ACCESS_KEY_ID=PASTE_YOUR_ACCESS_KEY_ID_HERE
AWS_SECRET_ACCESS_KEY=PASTE_YOUR_SECRET_ACCESS_KEY_HERE
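Either way, boto3 resolves the credentials automatically. As a quick sanity check, you can list the models your account can reach (a minimal sketch, assuming the .env option and the us-east-1 region):

```python
# Sanity check: confirm Bedrock credentials resolve and model access is enabled.
import boto3
from dotenv import load_dotenv

load_dotenv()  # loads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from .env
bedrock = boto3.client("bedrock", region_name="us-east-1")
for model in bedrock.list_foundation_models()["modelSummaries"][:5]:
    print(model["modelId"])
```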
Implementing Memory
- Memory Functions (init, read, append):
# Memory
def mem_init(topic: str) -> None:
    # Create memory.md with a fixed section skeleton the agents append to.
    MEM.write_text(
        f"# Memory — {topic} ({datetime.datetime.now():%Y-%m-%d %H:%M})\n\n---\n\n"
        "## Research\n\n## Sources\n\n## Critiques\n\n## Log\n",
        encoding="utf-8",
    )

def mem_read() -> str:
    return MEM.read_text(encoding="utf-8") if MEM.exists() else ""

def mem_append(section: str, content: str) -> None:
    # Insert content right below the "## <section>" heading (newest entries first).
    text = mem_read()
    eol = text.index("\n", text.index(f"## {section}") + len(f"## {section}"))
    MEM.write_text(text[:eol+1] + "\n" + content.strip() + "\n" + text[eol+1:], encoding="utf-8")
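A quick usage sketch (MEM is the output/memory.md Path defined later in the Run & Call AWS Nova section):

```python
# Usage sketch: initialize the file, then append under named sections.
mem_init("LangGraph agents")                     # writes the section skeleton
mem_append("Research", "### Iter 1\nnotes ...")  # lands right under "## Research"
mem_append("Log", "- **Iter 1** started")
print(mem_read().splitlines()[0])                # "# Memory — LangGraph agents (...)"
```

Note that mem_append inserts directly below the section heading, so the newest entries sit at the top of each section.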
Implementing Research Abilities
- Research functions (keyword generation, DuckDuckGo search, fetch, and collect):
from ddgs import DDGS
import re, sys, json, datetime, requests
# Research
def get_keywords(topic: str) -> list[str]:
    resp = GEN.invoke(
        f"Generate 8 diverse search queries for technical articles about: {topic}\n"
        "Return ONLY a JSON array of strings."
    )
    try:
        return json.loads(resp.content.strip())[:8]
    except Exception:  # model returned non-JSON: fall back to the raw topic
        return [topic]
def search(query: str) -> list[dict]:
    def run(q: str) -> list[dict]:
        with DDGS() as d:
            raw = list(d.text(q, max_results=MIN_ARTICLES * 4))
            return [{"title": r["title"], "url": r["href"], "snippet": r["body"]}
                    for r in raw if SITE in r.get("href", "")]
    # Try a site-restricted query first, then a plain query mentioning the site.
    for q in [f"site:{SITE} {query}", f"{query} {SITE}"]:
        try:
            hits = run(q)
            if hits: return hits[:MIN_ARTICLES]
        except Exception:  # search/network hiccup: try the next query form
            continue
    return []
def fetch(url: str, snippet: str = "") -> str:
    try:
        html = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10).text
        # Crude HTML-to-text: drop <style>/<script> blocks, then all remaining tags.
        text = re.sub(r"<(style|script)[^>]*>.*?</\1>", " ", html, flags=re.DOTALL)
        text = re.sub(r"<[^>]+>", " ", text)
        text = re.sub(r"\s{2,}", " ", text).strip()
        # Fall back to the search snippet when the extracted text is too thin.
        return text[:3000] if len(text) >= 300 else (snippet[:3000] or "[unavailable]")
    except Exception as e:
        return f"[failed: {e}]"
def collect(keywords: list[str]) -> tuple[str, list[str]]:
seen, found = set(), []
for kw in keywords:
if len(found) >= MIN_ARTICLES: break
for hit in search(kw):
if hit["url"] in seen or len(found) >= MIN_ARTICLES: continue
seen.add(hit["url"])
            content = fetch(hit["url"], hit["snippet"])
            # Skip thin pages and failed fetches ("[unavailable]" / "[failed: ...]").
            if len(content.strip()) < 300 or content.startswith("["): continue
found.append({**hit, "content": content})
print(f" {len(found)}/{MIN_ARTICLES} {hit['url']}")
if len(found) < MIN_ARTICLES:
print(f" WARNING: only {len(found)}/{MIN_ARTICLES} articles found")
sections = [
f"### [{a['title']}]({a['url']})\n**Snippet:** {a['snippet']}\n\n**Content:**\n{a['content']}\n"
for a in found
]
return "\n---\n".join(sections) or "_No results._", [a["url"] for a in found]
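These functions compose into a small research pipeline you can smoke-test on its own before wiring up any agents (a hedged sketch; it needs network access, and GEN, SITE, and MIN_ARTICLES come from the configuration section below):

```python
# Standalone smoke test for the research layer.
keywords = get_keywords("LangGraph multi-agent workflows")  # example topic
research_md, urls = collect(keywords)
print(f"Collected {len(urls)} articles")
print(research_md[:400])
```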
Agents State & Evaluation Result
- State is the shared data structure the agents use to communicate with each other.
- EvalResult is the structured schema the evaluator fills in: a verdict, four 1-5 scores, and optional feedback.
from typing import TypedDict, Literal, Optional
from pydantic import BaseModel, Field
class State(TypedDict):
topic: str; keywords: list[str]; blog: str
feedback: Optional[str]; verdict: str; iteration: int; max_iter: int
class EvalResult(BaseModel):
verdict: Literal["accepted", "rejected"]
depth_score: int = Field(ge=1, le=5)
recency_score: int = Field(ge=1, le=5)
structure_score: int = Field(ge=1, le=5)
writing_score: int = Field(ge=1, le=5)
feedback: Optional[str] = None
evaluator = EVAL.with_structured_output(EvalResult)
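with_structured_output makes the evaluator return a validated EvalResult object instead of free text. A sketch of what a call yields (the real prompt is assembled inside the evaluate() node below):

```python
# Hypothetical call: the actual evaluation prompt is built in evaluate().
result: EvalResult = evaluator.invoke("Evaluate this draft blog post: ...")
print(result.verdict)      # "accepted" or "rejected", enforced by Literal
print(result.depth_score)  # int constrained to 1..5 by Field(ge=1, le=5)
print(result.feedback)     # optional free-text critique, or None
```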
Generator & Evaluator Agents Nodes
- The generator creates keywords, runs the research, appends the data (links, summaries) to the memory file, and then generates the blog post.
# generator
def generator(state: State) -> dict:
n, topic = state["iteration"] + 1, state["topic"]
print(f"\n[Generator] Iter {n}/{state['max_iter']}")
keywords = state["keywords"] or get_keywords(topic)
research, urls = collect(keywords)
now = datetime.datetime.now().strftime("%H:%M:%S")
mem_append("Research", f"### Iter {n} — {now}\n**Keywords:** {', '.join(keywords)}\n\n{research}\n")
mem_append("Sources", f"### Iter {n}\n" + "\n".join(f"- {u}" for u in urls) + "\n")
mem_append("Log", f"- **Iter {n}** `{now}` — {len(urls)} articles\n")
rewrite = (f"\n\nPREVIOUS POST:\n{state['blog']}\nFEEDBACK:\n{state['feedback']}"
if state.get("feedback") and n > 1 else "")
# Strip the Log section — critiques + research are useful, raw logs are noise
memory_context = mem_read().split("## Log")[0][:4000]
blog = GEN.invoke(
f"Write a 1500-2000 word technical blog post about **{topic}** for senior AI/ML engineers.\n\n"
f"MEMORY CONTEXT (previous research, sources, critiques):\n{memory_context}\n\n"
f"CURRENT RESEARCH:\n{research[:8000]}"
f"{rewrite}\n\n"
"Requirements: cite ≥8 sources by title + URL, # title, ## sections, code examples.\n"
"End with ## References (real URLs only). Markdown only."
).content
BLOG.write_text(blog, encoding="utf-8")
print(f" Blog: {len(blog)} chars | {len(urls)} sources")
return {"blog": blog, "iteration": n, "keywords": keywords}
- The evaluator scores the post on defined criteria (depth, recency, structure, writing) and appends its critique to the memory file.
# evaluator
def evaluate(state: State) -> dict:
print(f"\n[Evaluator] Reviewing iter {state['iteration']}...")
e: EvalResult = evaluator.invoke(
f"Evaluate this blog post on **{state['topic']}** for senior AI engineers.\n\n"
f"POST:\n{BLOG.read_text(encoding='utf-8')}\n\nMEMORY:\n{mem_read()[:2000]}\n\n"
"Score 1-5 each: depth, recency (2023-2025), structure, writing. Accept only if ALL ≥ 4."
)
print(f" D:{e.depth_score} R:{e.recency_score} S:{e.structure_score} W:{e.writing_score} → {e.verdict.upper()}")
now = datetime.datetime.now().strftime("%H:%M:%S")
mem_append("Critiques",
f"### Iter {state['iteration']} — {now} — **{e.verdict.upper()}**\n"
f"| Depth | Recency | Structure | Writing |\n|---|---|---|---|\n"
f"| {e.depth_score}/5 | {e.recency_score}/5 | {e.structure_score}/5 | {e.writing_score}/5 |\n"
+ (f"\n**Feedback:** {e.feedback}\n" if e.feedback else "")
)
mem_append("Log",
f"- **Iter {state['iteration']}** `{now}` — {e.verdict.upper()} "
f"(D:{e.depth_score} R:{e.recency_score} S:{e.structure_score} W:{e.writing_score})\n"
)
return {"verdict": e.verdict, "feedback": e.feedback or ""}
Workflow & Route & Graph
- The agents are added as nodes to a LangGraph StateGraph.
- The route function decides whether to loop back for another iteration or stop.
from langgraph.graph import StateGraph, START, END
def route(state: State) -> str:
if state["iteration"] >= state["max_iter"]:
print(" Max iterations — publishing best version."); return "Accepted"
return "Accepted" if state["verdict"] == "accepted" else "Rejected"
# Graph
graph = StateGraph(State)
graph.add_node("generator", generator)
graph.add_node("evaluator", evaluate)
graph.add_edge(START, "generator")
graph.add_edge("generator", "evaluator")
graph.add_conditional_edges("evaluator", route, {"Accepted": END, "Rejected": "generator"})
pipeline = graph.compile()
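To double-check the wiring, the compiled graph can render itself as a Mermaid diagram:

```python
# Print a Mermaid diagram of the workflow: START -> generator -> evaluator -> (END | generator)
print(pipeline.get_graph().draw_mermaid())
```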
Run & Call AWS Nova
- Assign a ChatBedrockConverse client to each agent: a creative generator (temperature 0.8) and a deterministic evaluator (temperature 0).
from langchain_aws import ChatBedrockConverse
from pathlib import Path
from dotenv import load_dotenv
load_dotenv()
OUT = Path("output"); OUT.mkdir(exist_ok=True)
BLOG = OUT / "blog_post.md"
MEM = OUT / "memory.md"
SITE = "dev.to"
MIN_ARTICLES = 5
GEN = ChatBedrockConverse(model="us.amazon.nova-pro-v1:0", temperature=0.8)
EVAL = ChatBedrockConverse(model="us.amazon.nova-pro-v1:0", temperature=0)
def run(topic: str, max_iter: int = 3) -> None:
print(f"\n{'='*60}\n {topic}\n{'='*60}")
mem_init(topic)
result = pipeline.invoke({
"topic": topic, "keywords": [], "blog": "", "feedback": None,
"verdict": "", "iteration": 0, "max_iter": max_iter,
})
print(f"\n{'='*60}\n Done in {result['iteration']} iter(s) — output/ written\n{'='*60}\n")
if __name__ == "__main__":
run(" ".join(sys.argv[1:]) or "AI Agents with Memory on AWS Bedrock AgentCore")
All Code & Demo
GitHub Link: Project on GitHub
Run:
python3 agent.py
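Since run() joins sys.argv into the topic, you can also pass your own topic on the command line (any topic string works; this one is just an example):

python3 agent.py "Multi-agent RAG pipelines with LangGraph"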
memory.md File
- The generator collects blog posts from dev.to with their links and content.
- It collects all links under the Sources section for references.
## Research
### Iter 1 — 14:46:42
**Keywords:** AI Agents with Memory on AWS Bedrock AgentCore: Overview and Use Cases, Implementing Long-Term Memory for AI Agents using AWS Bedrock AgentCore, Comparing AWS Bedrock AgentCore Memory Features with Other AI Platforms, ...
### [AWS Bedrock AgentCore Memory: Give Your AI Agent a Brain That Actually ...](https://dev.to/sampathkaran/aws-bedrock-agentcore-memory-give-your-ai-agent-a-brain-that-actually-remembers-12ie)
**Snippet:** It's a managed memory service purpose-built for agents, with three distinct memory tiers and a retrieval API that plugs directly into the Bedrock agent runtime. ...
**Content:**
AWS Bedrock AgentCore Memory: Give Your AI Agent a Brain That Actually Remembers - Prerequisites Familiarity with AWS Bedrock, boto3, and building LLM-based agents. The Problem With Stateless Agents If we ship a Bedrock agent to production, we already hit this wall. Every invocation is stateless. We hack around it by stuffing conversation history into the prompt, bloating your token count, and eventually hitting context limits. Or you build your own memory layer...
## Sources
### Iter 1
- https://dev.to/sampathkaran/aws-bedrock-agentcore-memory-give-your-ai-agent-a-brain-that-actually-remembers-12ie
- https://dev.to/aws-builders/agent-memory-strategies-building-believable-ai-with-bedrock-agentcore-kn6
- https://dev.to/aws/build-production-ai-agents-with-managed-long-term-memory-2jm
- https://dev.to/sudarshangouda/ai-agent-memory-from-manual-implementation-to-mem0-to-aws-agentcore-2d7c
- https://dev.to/yuriybezsonov/ai-agent-memory-made-easy-amazon-bedrock-agentcore-memory-with-spring-ai-3bng
- The evaluator records its critiques under the Critiques section of memory.md:
## Critiques
### Iter 1 — 14:46:54 — **ACCEPTED**
| Depth | Recency | Structure | Writing |
|---|---|---|---|
| 5/5 | 5/5 | 5/5 | 5/5 |
## Log
- **Iter 1** `14:46:54` — ACCEPTED (D:5 R:5 S:5 W:5)
- **Iter 1** `14:46:42` — 5 articles
GitHub Link: memory.md on GitHub
Generated Blog Post
- It creates a full blog post draft. However, the aim is not to produce a perfect blog post.
# AI Agents with Memory on AWS Bedrock AgentCore
In the evolving landscape of AI and machine learning, the ability to create agents that can retain and utilize past interactions is paramount. AWS Bedrock AgentCore Memory provides a robust solution for building AI agents with memory capabilities. This article will delve into how AWS Bedrock AgentCore Memory works, its advantages, and how senior AI/ML engineers can leverage it to build more effective and context-aware AI agents.
## Introduction to AWS Bedrock AgentCore Memory
AWS Bedrock AgentCore Memory is a managed service designed specifically for AI agents, offering a sophisticated memory system that enhances the capabilities of these agents. Unlike traditional stateless agents, AWS Bedrock AgentCore Memory enables agents to retain context across interactions, making them more effective and efficient. This service is particularly beneficial for applications requiring long-term memory, such as customer support bots, virtual assistants, and conversational AI systems....
GitHub Link: Blog Post Output on GitHub
Conclusion
In this post, we covered:
- how to create multi-agents with LangGraph,
- how to use memory files (markdown files) between multi-agents,
- how to create nodes, routes, states, evaluation loop,
- how to use AWS Bedrock Nova.
This small POC project creates blog post drafts, but the goal is not to write a perfect blog post fully with AI. Instead, it helps people by doing tasks like research, organizing ideas, and creating a first draft. The draft can then be improved with feedback from agents and edited by a human. Writing is still a human process, and AI agents are used as assistants to save time and improve quality, not to replace the writer.
If you found the tutorial interesting, I’d love to hear your thoughts in the blog post comments. Feel free to share your reactions or leave a comment. I truly value your input and engagement 😉
For other posts 👉 https://dev.to/omerberatsezer 🧐
References
- https://docs.langchain.com/oss/python/langgraph/overview
- https://aws.amazon.com/bedrock
- https://github.com/omerbsezer/Fast-LLM-Agent-MCP/
Your comments 🤔
I’d love to hear how others are building agentic workflows.
- Are you using generator–evaluator patterns, memory layers, or multi-agent architectures in real projects?
- What do you think about LangGraph, multi-agents, and AWS Bedrock services?
Drop your thoughts, feedback, or improvements in the comments; I'm always curious to learn from different approaches. ☺️
