A step-by-step guide to automate infrastructure and scale a Flask application with Terraform, Docker, and AWS ECS.
Introduction
Welcome, Devs, to the world of AI and Automation!
Today, we're diving into an exciting hands-on project where infrastructure meets intelligence. We'll explore Cognee AI, a memory layer for LLMs that lets applications remember, retrieve, and build on prior context, and see how it works in action through the Cognee Starter App, built with Flask.
But we're not stopping there. Once the app is ready, we'll deploy it to AWS ECS (Fargate) using Terraform, bringing the power of Infrastructure as Code (IaC) to streamline and automate the entire deployment process.
By the end of this guide, you'll:
- Get familiar with Cognee AI and its role as a memory layer for LLMs.
- Containerize and prepare a Flask application for production.
- Provision AWS infrastructure using Terraform.
- Deploy the app seamlessly on AWS ECS with Fargate.
So without further ado, let's get started and build something awesome!
YouTube Demonstration
Prerequisites
Before we roll up our sleeves and dive into the deployment, let's make sure your local environment is ready. Having the right tools installed will make the process smooth and error-free.
Here's what you'll need on your system:
- AWS CLI configured with an IAM user that has ECS, VPC, and IAM Full Access permissions. If you're not familiar with this setup, check out my detailed step-by-step guide here: Learn How to Deploy a Three-Tier Application on AWS EKS Using Terraform.
- Docker, to containerize our Flask application before deploying it to ECS.
- Python, since Cognee Starter runs on Flask.
- Terraform CLI, the star of this blog, which we'll use to provision and manage our AWS infrastructure.
Once you've checked these boxes, you're all set to move on to the fun part: building and deploying our application!
Understanding Cognee AI
Before jumping into infrastructure, let's take a moment to understand what Cognee AI actually does, and why it matters.
In simple terms: "Cognee organizes your data into AI memory."
When you make a call to a Large Language Model (LLM), the interaction is stateless: the model doesn't remember what happened during previous calls or have access to your broader document context. This makes it difficult to build real-world applications that require context retention, document linking, or knowledge continuity.
That's where Cognee AI comes in. It acts as a memory layer for LLMs, allowing you to:
- Link documents and data sources together.
- Maintain context across multiple LLM calls.
- Create richer, more intelligent applications that can reason over previous interactions.
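To make the idea of a "memory layer" concrete, here's a toy, pure-Python illustration. This is not Cognee's API: the `add` and `search` functions below are stand-ins that mimic the concept with naive keyword matching instead of embeddings and knowledge graphs.

```python
# Toy memory layer: store facts once, retrieve the relevant ones later,
# and prepend them as context to each otherwise-stateless "LLM call".
memory = []

def add(fact: str) -> None:
    """Store a fact in the memory layer."""
    memory.append(fact)

def search(query: str) -> list:
    """Naive relevance: keyword overlap (real systems use embeddings + graphs)."""
    terms = set(query.lower().split())
    return [f for f in memory if terms & set(f.lower().split())]

add("Cognee turns documents into AI memory.")
add("ECS Fargate runs containers without managing servers.")

# Only the relevant fact comes back, ready to be handed to the LLM as context
print(search("What does Cognee do?"))
```

Cognee replaces the naive keyword match with embeddings plus a knowledge graph, but the flow is the same: store once, retrieve the right context, and deliver it to the LLM on every call.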
In this project, we'll be using the Cognee Starter App, which gives a hands-on introduction to this memory layer, and then we'll deploy it on AWS ECS (Fargate) to make it production-ready.
How Cognee Works
Cognee isn't just about storing data; it's about understanding and structuring it so LLMs can use it intelligently. When it comes to your data, Cognee knows what matters.
Four key operations power the Cognee memory layer:
1. .add: Prepare for cognification. This is the starting point. You send your data asynchronously, and Cognee cleans, processes, and prepares it for the memory layer.
2. .cognify: Build a knowledge graph with embeddings. Cognee splits your documents into chunks, extracts entities and relations, and links everything into a queryable knowledge graph, the core of its memory layer.
3. .search: Query with context. When you search, Cognee combines vector similarity with graph traversal. Depending on the mode, it can fetch raw nodes, explore relationships, or generate natural-language answers using RAG (Retrieval-Augmented Generation). It ensures the right context is always delivered to the LLM.
4. .memify: Semantic enrichment of the graph (coming soon). This will enhance the knowledge graph with deeper semantic understanding, adding richer contextual relationships.
In our hands-on demonstration, we'll use the first three methods, .add, .cognify, and .search, to see how Cognee works in action before deploying the app on AWS ECS.
A Small Demo to Understand Cognee Functions
Before we jump into deploying our Flask application on AWS, let's take a few minutes to see how Cognee AI works in practice.
I've already hosted the project code on GitHub:
GitHub Repository: terra-projects
Head into the cognee-flask directory, and you'll find the entire project structure there.
Step 1: Set Up Environment Variables
Inside the project folder, create a .env file. You can refer to the .env.example file for the format.
You'll need a Gemini API key, which is free to get. Paste it into your .env file like this:
LLM_PROVIDER="gemini"
LLM_MODEL="gemini/gemini-2.5-flash"
LLM_API_KEY="<your-gemini-key>"
# Embeddings
EMBEDDING_PROVIDER="gemini"
EMBEDDING_MODEL="gemini/text-embedding-004"
EMBEDDING_DIMENSIONS="768"
EMBEDDING_API_KEY="<your-gemini-key>"
If you're using a different LLM provider, follow this guide to configure it properly: Cognee Docs, Installation & Setup.
Step 2: Install Dependencies
To install all the required dependencies, run:
uv sync
This will set up everything you need to run the demo locally.
Step 3: Understanding testing_cognee.py
Now, open the testing_cognee.py file. Here's what it looks like:
from cognee import SearchType, visualize_graph
import cognee
import asyncio
import os, pathlib

async def main():
    # Create a clean slate for Cognee -- reset data and system state
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Add sample content
    text = "Cognee turns documents into AI memory."
    await cognee.add(text)

    # Process with LLMs to build the knowledge graph
    await cognee.cognify()

    graph_file_path = str(
        pathlib.Path(
            os.path.join(pathlib.Path(__file__).parent, ".artifacts/graph_visualization.html")
        ).resolve()
    )
    await visualize_graph(graph_file_path)

    # Search the knowledge graph
    graph_result = await cognee.search(
        query_text="What does Cognee do?", query_type=SearchType.GRAPH_COMPLETION
    )
    print("Graph Result: ")
    print(graph_result)

    rag_result = await cognee.search(
        query_text="What does Cognee do?", query_type=SearchType.RAG_COMPLETION
    )
    print("RAG Result: ")
    print(rag_result)

    basic_result = await cognee.search(
        query_text="What are the main themes in my data?"
    )
    print("Basic Result: ")
    print(basic_result)

if __name__ == '__main__':
    asyncio.run(main())
What's happening here:
- First, we purge any existing data using prune_data and prune_system.
- Then, we add new text data to Cognee using .add.
- We process and build a knowledge graph with .cognify.
- The graph visualization is stored in .artifacts/graph_visualization.html.
- Finally, we run three types of searches:
  - GRAPH_COMPLETION: explores relationships in the graph.
  - RAG_COMPLETION: generates natural-language answers.
  - Basic search (no query_type): retrieves the main themes.
Step 4: Run the Demo
Run the script with:
uv run testing_cognee.py
You should see outputs for the Graph, RAG, and Basic results in your terminal. You can also open the .artifacts/graph_visualization.html file in a browser to view the knowledge graph Cognee has generated.
Step 5: Run the Flask App Locally
Before deploying to the cloud, let's test the Flask app locally:
uv run app.py
Once the server starts, open http://localhost:5000
You'll see the Cognee AI Starter App UI. Here's what you can do:
- Add your own data.
- Ask a query.
- Wait 1 to 2 minutes.
You'll get:
- Graph Completion Result
- RAG Completion Result
- Basic theme of the text
- A knowledge graph visualization generated by Cognee.
Click on "View Knowledge Graph" to explore the graph and see how Cognee structured your data.
With this, we've understood Cognee's core functionality locally.
Next up: we'll take this application to the cloud by deploying it on AWS ECS using Terraform.
Deploying the Cognee App on AWS ECS Using Terraform
We've successfully tested the Cognee AI Starter App locally; now it's time to take things to the cloud.
We'll use Terraform to provision all the required AWS infrastructure and deploy our Flask application on ECS (Fargate).
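Before Terraform can run anything, the ECS task needs a container image to pull. For reference, a minimal Dockerfile for a uv-based Flask app could look roughly like this. It's a sketch under assumptions (dependencies live in pyproject.toml, the app listens on port 5000), and may differ from the repo's actual Dockerfile:

```dockerfile
# Hypothetical minimal image; not necessarily the repo's actual Dockerfile.
FROM python:3.12-slim
WORKDIR /app
COPY . .
# Install uv and sync the project's locked dependencies
RUN pip install --no-cache-dir uv && uv sync
EXPOSE 5000
# Assumes app.py binds to 0.0.0.0:5000 so ECS can route traffic to it
CMD ["uv", "run", "app.py"]
```

The image would then be built and pushed to a registry (Docker Hub or ECR) that the ECS task definition references.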
Inside your project directory, navigate to the terra-config folder. This is where all of our Terraform configuration files live.
Step 1: Understanding the Terraform Files
- provider.tf: Defines AWS as our cloud provider and sets the region and provider configuration.
- default_config.tf: Fetches the default VPC and subnets, and creates a security group for ECS tasks with port 5000 open to allow inbound traffic.
- main.tf (the heart of the project):
  - Creates an ECS cluster.
  - Defines the ECS task definition with the Docker image, container port, CPU and memory configuration, and CPU architecture.
  - Creates an IAM role for ECS.
  - Finally, provisions an ECS service that runs the task definition inside the cluster.
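As a rough sketch of what main.tf's core resources might look like (resource names, CPU/memory sizes, and the image URI below are placeholders, and the IAM role and networking data sources are assumed to be defined elsewhere in the config):

```hcl
resource "aws_ecs_cluster" "this" {
  name = "cognee-cluster" # placeholder name
}

resource "aws_ecs_task_definition" "app" {
  family                   = "cognee-flask"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = "512" # placeholder sizing
  memory                   = "1024"
  execution_role_arn       = aws_iam_role.ecs_exec.arn # role defined elsewhere

  runtime_platform {
    operating_system_family = "LINUX"
    cpu_architecture        = "X86_64"
  }

  container_definitions = jsonencode([{
    name         = "cognee-flask"
    image        = "<your-docker-image>" # e.g. a Docker Hub or ECR image
    essential    = true
    portMappings = [{ containerPort = 5000 }]
  }])
}

resource "aws_ecs_service" "app" {
  name            = "cognee-service"
  cluster         = aws_ecs_cluster.this.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = data.aws_subnets.default.ids      # from default_config.tf
    security_groups  = [aws_security_group.ecs_tasks.id] # port 5000 open
    assign_public_ip = true
  }
}
```

Note that `assign_public_ip = true` is what makes the Fargate task reachable directly, without a load balancer, which is why a script can later resolve its public IP.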
- get_ip.sh: A simple shell script that uses the AWS CLI to fetch the public URL where your application is running.
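For reference, a script like get_ip.sh can be built from three AWS CLI calls: find the running task, read its network interface, then look up that interface's public IP. The cluster name below is a placeholder and the repo's actual script may differ; it requires a configured AWS CLI and a running service:

```shell
#!/usr/bin/env bash
set -euo pipefail
CLUSTER="cognee-cluster" # placeholder; use your cluster name

# 1. Grab the ARN of the first running task in the cluster
TASK_ARN=$(aws ecs list-tasks --cluster "$CLUSTER" --query 'taskArns[0]' --output text)

# 2. Extract the task's elastic network interface (ENI) id
ENI_ID=$(aws ecs describe-tasks --cluster "$CLUSTER" --tasks "$TASK_ARN" \
  --query "tasks[0].attachments[0].details[?name=='networkInterfaceId'].value" --output text)

# 3. Resolve the ENI to its public IP and print the app URL
PUBLIC_IP=$(aws ec2 describe-network-interfaces --network-interface-ids "$ENI_ID" \
  --query 'NetworkInterfaces[0].Association.PublicIp' --output text)
echo "http://$PUBLIC_IP:5000"
```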
Step 2: Initialize and Deploy the Infrastructure
Run the following commands inside the terra-config directory:
terraform init
terraform plan
terraform apply -auto-approve
This will take around 2 minutes to provision the complete infrastructure on AWS, including the ECS cluster, service, task definition, and networking setup.
Step 3: Get the Application URL
Once the deployment finishes, make the script executable and run it:
chmod u+x get_ip.sh
./get_ip.sh
This will output the URL where your application is hosted.
Open the URL in your browser, and you'll see your Flask Cognee application live on AWS ECS.
From here, you can:
- Add your data.
- Ask queries.
- See the graph and AI-generated answers, exactly like in the local demo.
The only difference: it's now running inside a scalable, production-grade ECS cluster.
Step 4: Clean Up Resources
Once you're done testing, it's good practice to destroy the infrastructure to avoid unwanted AWS charges:
terraform destroy -auto-approve
This will tear down all ECS services, roles, and networking resources you created for this project.
And that's it!
You've successfully deployed your Flask Cognee AI Starter App on AWS ECS using Terraform. With just a few commands, we automated the entire infrastructure provisioning and deployment pipeline.
Conclusion
Congratulations! You've successfully taken a Flask-based Cognee AI Starter App from your local machine all the way to the cloud using AWS ECS and Terraform.
In this blog, we learned how to:
- Understand Cognee AI and its memory layer for LLMs.
- Explore and test its key operations: .add, .cognify, and .search.
- Run the app locally to see how Cognee organizes and queries your data.
- Provision AWS infrastructure using Terraform.
- Deploy the Flask application on ECS (Fargate) and access it via a public URL.
This hands-on project demonstrates the power of combining AI, containerization, and infrastructure automation to deploy intelligent applications in a scalable, production-ready environment.
Explore More
Check out Cognee AI here: https://cognee.ai
Follow me on socials for more DevOps, cloud, and AI tutorials:
- GitHub:Ā https://github.com/Pravesh-Sudha
- Blog:Ā https://blog.praveshsudha.com
- LinkedIn:Ā https://www.linkedin.com/in/pravesh-sudha
Keep exploring, keep building, and stay tuned for more hands-on projects combining AI and DevOps!











