Kazuya

AWS re:Invent 2025 - From vibe to live in minutes with Heroku AI PaaS (AIM250)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - From vibe to live in minutes with Heroku AI PaaS (AIM250)

In this video, Julián Duque, Principal Developer Advocate at Heroku, presents Heroku's AI platform offerings. He demonstrates Heroku's managed inference and agents supporting models like Claude 3.5 Sonnet, GPT-4o, and Amazon Nova, with Model Context Protocol integration for secure data access. A live demo shows a solar energy dashboard using agents with database query and Python code execution tools. He also showcases Heroku Vibes, a tool for building web applications through natural language prompts, demonstrating an AWS re:Invent agenda builder created and deployed entirely through conversational AI.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introducing Heroku's AI Platform: Building the 'Heroku of AI'

Hello and welcome to this presentation. I'm going to be talking about the AI offering that we have at Heroku. Before we start, I'm going to introduce myself. My name is Julián Duque, and I'm the Principal Developer Advocate for Heroku, which is a Salesforce company.

Thumbnail 20

As a Salesforce company, we are publicly traded, so please don't make any purchase decisions based on things that are not yet publicly available. I should mention that I may be discussing things that are not yet fully GA. Heroku is a Platform as a Service, and we have been in this business since 2010–2011.

Thumbnail 40

We were the first Platform as a Service for Ruby on Rails applications, and now we support more than nine programming languages and some other managed services. We are renowned for offering a great developer experience, and our developer experience is so good that many companies and open source products have created the "Heroku of X." They try to mimic our experience and create something based on what we do. A couple of years ago, we asked ourselves this question: Who is going to build the Heroku of AI? Who is going to bring the developer experience of building AI applications that Heroku offers?

Thumbnail 90

And of course, the answer was ourselves. We are going to be building the Heroku of AI.

Thumbnail 100

Building AI applications and agents is challenging. You need to make a lot of decisions and take care of things that are not usually obvious. For example, we need to select the right models—the large language models we are going to be using for our application. If it's a text-to-text model, an image generation model, or if we are going to do embeddings, we need to make sure that we are using the right model for our app.

Also, how are we going to integrate with other applications and services? How are we going to allow the LLM or the model to access data securely? Right now, one of the risks of building AI is data leakage: we are giving a third party access to our data, and that data might be vulnerable to certain security issues. Additionally, how are we going to operate the infrastructure behind AI? There are many hidden things we don't know about, like managing GPUs and keeping the models running. There is a lot of complexity behind it.

It's hard to make things easy for developers and easy for the company to operate and manage. Heroku has been working on a platform for a while, so we already have an integrated platform where you can use and deploy your applications and agents. We have great developer and operational experience. We have already selected a list of curated AI models that we know are used in the industry and that are working for these new types of applications. We also added extensibility through the Model Context Protocol, a protocol that gives LLMs access to more context through prompts, resources, and tools.
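The session doesn't show code for this, but as a rough, hedged sketch of what an MCP server looks like, here is a minimal example using the official MCP Python SDK (FastMCP). The server name, tool, and resource are made up for illustration and are not from the talk.

```python
# Minimal MCP server sketch: one tool and one resource, using the official
# MCP Python SDK (FastMCP). "solar-data", get_daily_production, and the
# schema:// resource are illustrative names, not from the presentation.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("solar-data")

@mcp.tool()
def get_daily_production(day: str) -> float:
    """Return total kWh produced on the given ISO date (stub data)."""
    return 42.5  # a real server would query the database here

@mcp.resource("schema://solar")
def solar_schema() -> str:
    """Expose the table schema so the model does not guess column names."""
    return "readings(ts timestamptz, kwh numeric)"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default; HTTP transports are also available
```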

Thumbnail 220

Getting to production is hard. We need to make a lot of complex decisions. For example, when we want to take an application or an agent to production, we need to choose where we are going to deploy this application and what the underlying operating system and network configuration will be for our app. How will CPU and memory be distributed for our application? What about data storage, logging, and observability? There are a lot of moving pieces involved in deploying an application to production.

Thumbnail 270

With Heroku, we simplify all of these choices and give you an opinionated platform. You can focus on building the application using our developer tools and then deploy it. You can operate your application by using the tools we give you to scale and monitor your app. We take care of the rest: the observability, the metrics, the support, and the tools to scale and maintain the application. This increases developer productivity, reduces the money you spend on DevOps and infrastructure operations, and increases the return on investment of your application.

Thumbnail 300

So let me tell you about the different AI offerings that we announced and released this year, and why we became the AI Platform as a Service.

Heroku Managed Inference and Agents: Live Demo of a Solar Energy Dashboard

First, we have managed inference and agents. We allow you to provision AI models to an app with just one click or one command. You can select from a list of curated AI models that allow you to do text generation, image generation, or embeddings with just one click. This is available right now.

We recently launched support for Claude 3.5 Sonnet, Claude 3.5 Haiku, Amazon Nova Lite, Amazon Nova Pro, and GPT-4o. We also added support for MCP, the Model Context Protocol, and this support works in two ways. We have a Heroku MCP server that allows developers and DevOps people to manage Heroku resources; this is for builders. But we also have a platform for you to securely deploy the MCP servers that you are building. You can deploy and host your MCP server on Heroku and access it remotely through an HTTP interface or directly from within the agent endpoint that we provide with Managed Inference and Agents.

We also have support for vector databases through pgvector, a PostgreSQL extension that lets you store vectors, perform similarity search, and build retrieval-augmented generation pipelines (a rough sketch of such a query follows below).

Back in October, we released the pilot of Heroku Vibes, a tool that allows you to build web applications using natural language from the web. This tool lets you build the app and deploy it automatically to Heroku, so you don't even need to worry about the infrastructure behind it. We take care of everything, and with just one prompt, plus a couple of adjustments if you want to iterate on it, you can take an application from an idea to production in minutes.
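Here is that pgvector sketch: a hedged example of a similarity search, assuming a documents table with an embedding column already populated. The table name, column names, and the tiny 3-dimensional vector are made up for illustration.

```python
# Hedged sketch of a pgvector similarity search against Heroku Postgres.
# Assumes a table like documents(id serial, body text, embedding vector(3))
# that already exists and is populated; all names here are illustrative.
import os
import psycopg2

# Stand-in query embedding; in a real RAG pipeline this would come from an
# embeddings model provisioned through Managed Inference.
query_embedding = [0.12, -0.03, 0.98]
vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"

conn = psycopg2.connect(os.environ["DATABASE_URL"])  # set by the Heroku Postgres add-on
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT id, body
        FROM documents
        ORDER BY embedding <-> %s::vector  -- L2 distance; use <=> for cosine distance
        LIMIT 5
        """,
        (vector_literal,),
    )
    for doc_id, body in cur.fetchall():
        print(doc_id, body[:80])
```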

Thumbnail 470

Let me show you a couple of demos that I have here. The first shows the Heroku managed inference and agents and how I built an application that uses Heroku AI to get access to data and perform smart actions within an application. The other example is Heroku Vibes, which lets you build applications using natural language.

Thumbnail 500

I built this application, a dashboard for a solar energy company. Let's say you have a solar installation in your house or business and you want to monitor how much energy you are producing, how much energy you are consuming, what your savings will be, and get insights from that data. This dashboard already pulls from my PostgreSQL database deployed on Heroku, but so far this is your typical web application. I also want to ask questions using AI to get insights from my data. How can I do this? Usually people build retrieval-augmented generation pipelines: you get data from the database, pass that data to the inference model, and then perform inference on top of that data.

What we did here is use agents. We have an agent using the Heroku endpoint and tools, tools that we maintain, which are essentially MCP servers hosted on Heroku, to give my agent access to the database. I am not handing over my database credentials; I am just giving read-only access to a follower of my database, so I can safely retrieve and analyze data. Let me open the agent here. I already have two queries that I ran, but I am also going to do something here live.
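As a hedged sketch of what such an agent call could look like, the request below assumes the endpoint path, tool name, config variable names, and runtime parameter keys; they are illustrations based on the talk, not an authoritative API reference, so check the Managed Inference and Agents docs for the exact shape.

```python
# Hedged sketch of calling the agent endpoint with a platform-hosted database
# tool. The /v1/agents/heroku path, the tool name, and the runtime_params keys
# are assumptions, not documented values.
import os
import requests

resp = requests.post(
    f"{os.environ['INFERENCE_URL']}/v1/agents/heroku",
    headers={"Authorization": f"Bearer {os.environ['INFERENCE_KEY']}"},
    json={
        "model": os.environ["INFERENCE_MODEL_ID"],
        "messages": [
            {"role": "user",
             "content": "What was my lowest production day this month?"}
        ],
        "tools": [
            {
                "type": "heroku_tool",
                "name": "postgres_run_query",                      # illustrative name
                "runtime_params": {"db_attachment": "DATABASE_READONLY_URL"},
            }
        ],
    },
    timeout=120,
)
print(resp.text)  # the real endpoint may stream tool-call and completion events
```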

Thumbnail 580

Thumbnail 600

For the first one, I asked what my lowest production day was this month: out of the whole month, which day had the lowest production? The agent is performing a tool execution, which is the database query. It is generating this query, so my agent already knows the shape of my database, but we also have tools to retrieve the schema of the database so it doesn't hallucinate the shape of your data. It generates the query, runs it, and gives me the response, and then the inference takes care of the answer.

Thumbnail 610

Thumbnail 650

The inference takes care of the answer, so we are essentially feeding the context of my prompt with tools. For the other question, about creating a chart or image of the hourly production data for today, the approach is pretty much the same. I need to go to the database, but then the question becomes how to create that image. For that, I'm using another tool, which is code execution. In this case, I'm using Python with dependencies like matplotlib, pandas, and NumPy to create the image on the fly, and then I'm uploading this image to Amazon S3 and returning a pre-signed URL.
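To make this concrete, here is an illustrative example of the kind of code the Python execution tool might generate and run. It is not the code from the demo; the bucket name, object key, and sample data are made up.

```python
# Illustrative sketch: render an hourly production chart with matplotlib,
# upload it to Amazon S3, and return a pre-signed URL.
import io
import boto3
import matplotlib
matplotlib.use("Agg")  # headless backend for a sandboxed dyno
import matplotlib.pyplot as plt

hours = list(range(24))
kwh = [0, 0, 0, 0, 0, 0.2, 0.8, 1.5, 2.4, 3.1, 3.6, 3.9,
       4.0, 3.8, 3.4, 2.8, 2.0, 1.2, 0.5, 0.1, 0, 0, 0, 0]  # sample data

fig, ax = plt.subplots()
ax.bar(hours, kwh)
ax.set_xlabel("Hour of day")
ax.set_ylabel("Production (kWh)")
ax.set_title("Hourly solar production")

buf = io.BytesIO()
fig.savefig(buf, format="png")
buf.seek(0)

s3 = boto3.client("s3")
bucket, key = "my-solar-reports", "charts/hourly-production.png"  # illustrative
s3.upload_fileobj(buf, bucket, key)
url = s3.generate_presigned_url(
    "get_object", Params={"Bucket": bucket, "Key": key}, ExpiresIn=3600
)
print(url)
```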

Thumbnail 660

Thumbnail 690

Thumbnail 700

We can see here that we have the database query fetching the data hour by hour for the day, and then it's passing that data to a Python execution tool. Since you can run any application on Heroku, what we're doing here is generating this code. The LLM generates the code, spins up a dyno on Heroku, which is a virtual machine, runs the code safely in a sandbox on Heroku, and gets the return value. You're only consuming the seconds that the code was running. I get the return, which is the pre-signed URL on Amazon S3, and now I have the answer, which is the report and the analysis. Let me trigger another prompt here.

Thumbnail 710

Thumbnail 720

Thumbnail 730

Thumbnail 740

The question is: what was the peak production in the past seven days? Again, I need to use a tool to get this data. In this case, I'm expecting that my agent, which I configured, is going to perform a database query, get that data from the database, and then do an analysis over it. This is the query. It failed. Interestingly, the system is now trying another query. Now it got the response, so you can see that it has a self-healing capability. This is not scripted; this is a live demo. It failed, then got the data and created the report I just asked for, on the fly.

Thumbnail 760

Thumbnail 780

Thumbnail 800

So how does this work? How do I provision this model to my application? This is an application that is deployed on Heroku: a Node.js application, an API that accesses the Heroku AI services. So how can I provision these AI services like any other service on Heroku? I'm on the Heroku dashboard in the Resources section, and I search for Heroku Managed Inference and Agents. It provides a list of the supported models. I can select the model; let's say we want Claude 4.5. I click through the order form to provision it, and now the model is available to my application. How can I access that model from my application? Basically, this gives me an API URL and an API key, and now I can use an SDK or directly perform an HTTP request to the inference endpoint and get the response. Everything runs on Heroku infrastructure, so your data is not going out to a third party. Everything stays within the same infrastructure, remains safe and trusted, and the model still gets access to the database.
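As a hedged sketch of that direct HTTP call (the config variable names INFERENCE_URL, INFERENCE_KEY, and INFERENCE_MODEL_ID are assumptions about what the add-on attaches; check your app's settings), a request could look roughly like this:

```python
# Minimal sketch of a direct HTTP call to the provisioned inference endpoint,
# assuming it exposes an OpenAI-style /v1/chat/completions route.
import os
import requests

resp = requests.post(
    f"{os.environ['INFERENCE_URL']}/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['INFERENCE_KEY']}"},
    json={
        "model": os.environ["INFERENCE_MODEL_ID"],
        "messages": [{"role": "user", "content": "Summarize today's solar production."}],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```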

Thumbnail 840

Thumbnail 850

Thumbnail 870

Thumbnail 880

Thumbnail 890

Thumbnail 900

If I go to the settings of my application, I can reveal the configuration variables. You can see here the different API keys and API URLs for the models I have provisioned. So what do I have to do in my app? Let me show you the code. This is my agent. I have a system prompt. In the system prompt, I'm specifying what the agent does: you are a solar energy agent, and these are the tools and libraries you have access to. As I mentioned, I'm giving it access to the libraries to upload to S3 and to do charting with matplotlib in Python. I'm specifying everything here very explicitly. After defining the prompt and the tools, you can see we have Heroku tools. These are tools that run on Heroku, not tools that I have to write, and I'm giving those tools access to the database.

What I'm doing is making an HTTP request, so you can just call an API. I'm using the same OpenAI API shape, so if you're using an SDK that talks to that specific API (for example, LangChain, the Bedrock AI SDK, or LlamaIndex, depending on the technology of your choice for building agents), you can use Heroku AI for that.
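Because the endpoint follows the OpenAI chat-completions shape, an OpenAI-compatible client can simply be pointed at it. Here is a minimal sketch with the openai Python package, reusing the same assumed config variable names as above.

```python
# Hedged sketch: point an OpenAI-compatible client at the Heroku inference
# endpoint, assuming it serves /v1/chat/completions under INFERENCE_URL.
import os
from openai import OpenAI

client = OpenAI(
    base_url=f"{os.environ['INFERENCE_URL']}/v1",
    api_key=os.environ["INFERENCE_KEY"],
)
completion = client.chat.completions.create(
    model=os.environ["INFERENCE_MODEL_ID"],
    messages=[{"role": "user",
               "content": "What was the peak production in the past 7 days?"}],
)
print(completion.choices[0].message.content)
```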

Thumbnail 960

Thumbnail 1020

Heroku Vibes: Building Applications from Natural Language Prompts

Now let me switch to the other AI tool that we're offering. The one I just showed is more for pro developers and builders who can use the services directly; this other tool is more for business users, people who aren't 100% technical but want to build applications and deploy them to Heroku. That's Heroku Vibes. On Heroku Vibes, I start from a prompt. I need to define the prompt of the application that I want to build. For this example, I'm going to be building an agenda builder for AWS re:Invent 2025. I'm asking it to create this agenda builder application so I can track my sessions and store the information in local storage, but it can also store it in a PostgreSQL database on Heroku. It will take care of the backend, and I'm giving it a data source: get the data from this CSV that lives on the internet.

Thumbnail 1040

Thumbnail 1060

Now I'm going to create that application. There are a couple of things that are going to happen here. First, it's going to enter plan mode and create a step-by-step plan of how the agent is going to build this application. You can approve the plan or suggest changes to it. Once that plan is approved, it's going to start building. Since I have just two minutes left in my presentation, I'm going to show you this exact prompt, the whole process, a couple of iterations I did on the prompt, and the final result.

Thumbnail 1070

Thumbnail 1080

Thumbnail 1090

Here you can see that I have a long conversation with the agent. But it started with the same exact prompt: I want my agenda builder, this is how I want it, this is where you're going to store the data, and this is where you're getting the data from. It provided a plan, I approved the plan, and then it went and implemented the application. Since we're using Heroku here, it deploys the application directly to Heroku. So the application that I have here is the final result. It's living on Heroku; you can see the Heroku app domain. It is live, and I can just use it. I can add sessions to my agenda, go check the agenda, and that's exactly what I asked for.

Thumbnail 1100

I then asked for some changes. For example, I want a dark and a light theme, or I want to change this color to something different, or I want it to have a backend and store the information in a PostgreSQL database. That's the iterative process you go through with these AI tools to build an application and take it live, going from a prompt to production.

So that's what Heroku has for you. One is the Heroku AI services for you to build and deploy custom agents. The other is a tool for building applications using AI that simplifies that process. That's pretty much it. Thank you so much. If you have any questions, we're located at booth 838, right next to the Salesforce booth. You can go there, visit us, and ask more about what we have for you to build AI applications and other types of applications. Please complete the session survey in your mobile app, and thank you so much for coming.


This article is entirely auto-generated using Amazon Bedrock.
