Introduction
In this post, we will extend the implementation of an AI Agent. This time, we have improved the agent’s accessibility and its ability to use different tools depending on the context.
TL;DR
All code used in this article is available in the repository.
Context
In addition to using AWS serverless services for rapid experimentation, we have added the agent’s ability to generate responses based on a predefined set of tools, and to operate with Gemini as the model provider.
What’s the objective?
We want to obtain information in different formats through tools and then present it to the client within a mobile application.
Why mobile?
The greatest benefits of AI come when the technology is accessible on the devices you carry every day.
I definitely don’t carry a laptop everywhere, do you?
Below is a high-level diagram representing what we want to achieve.
While there are no significant changes to the architecture, you will see the necessary considerations in how we handle responses from our Lambda.
Let’s move ahead.
Scope
We will use the Davis Instruments console manuals as data sources, and we will build a private chatbot for iPhone that answers questions from these manuals, as well as displaying data from the stations.
Functional requirements
- The system should allow you to consult and receive information related to Vantage Pro2 equipment, such as setup questions and/or requests for data from the weather stations.
- This interaction must be in natural language.
- The system should allow sending messages and receiving a response in real time as it is generated.
Non-functional requirements
As usual, we rely on serverless capabilities and add a few additional characteristics.
- The system that provides data to the app should be highly available.
- The system should scale automatically to handle variable traffic patterns.
- The system should remain secure with domain-level access control and encrypted traffic.
The following is a high-level diagram of the architecture.
Out of scope
We will omit the ability to request live data from the WeatherLink API.
However, you can find a project of mine that implements such an integration using AI and MCP.
It is also worth mentioning that just because this implementation is possible does not mean it is the ideal way to expose an API.
With that out of the way, let’s move on.
Cost breakdown
Considering a volume of 1,000 requests per month, which is very generous for our use case, we are running our solution with the following components.
Google AI API Calls
We will use Gemini 2.5 Flash for language inference and text-embedding-004 for embeddings.
- gemini-2.5-flash: input (500k tokens x $0.30/M) plus output (500k tokens x $2.50/M) = $0.15 + $1.25 = $1.40
- text-embedding-004 (500k tokens x $0.10/M) = $0.05
This gives us an approximate total of $1.45 USD per month.
AWS Lambda
We will configure a function with 1024 MB and a 60 second timeout. AWS provides 1M free requests and 400,000 GB-seconds monthly, which covers typical usage.
This results in $0.00 USD per month (within free tier).
CloudFront
We will use PriceClass_100 (US and Europe) assuming moderate traffic of 5 GB data transfer and 50,000 HTTPS requests.
- Data transfer: 5 GB x $0.085/GB = $0.43
- HTTPS requests: 50,000 x $0.01/10,000 = $0.05
This results in about $0.48 USD per month.
S3 Storage
For storing static assets and application data, estimating 2 GB of storage with 100,000 GET requests.
- Storage: 2 GB x $0.023/GB = $0.05
- GET requests: 100,000 x $0.0004/1,000 = $0.04
This results in about $0.09 USD per month.
Container Registry (AWS ECR)
For storing 3 images of 400 MB each (1.2 GB total). Billable storage after 500 MB Free Tier is 0.7 GB.
- 0.7 GB x $0.10/GB = $0.07
This results in about $0.07 USD per month.
Route 53
We will use a hosted zone ($0.50/month) and include a standard .com domain registration fee ($13.00/year or $1.08/month).
This results in about $1.58 USD per month.
CloudWatch
For logging with seven-day retention, estimating minimal volume (0.2 GB ingestion).
- Log Ingestion: 0.2 GB x $0.50/GB = $0.10
This results in about $0.10 USD per month.
Apple Developer Account
The Apple Developer Program membership is $99 USD per year. This is a fixed annual fee required to publish apps on the App Store.
- Annual Cost: $99.00 / 12 months = $8.25
This results in a monthly allocation of $8.25 USD per month.
Total Monthly Cost
This brings us to a total of approximately $12.02 USD per month.
Implementation
Let’s start with the app and progress towards the infrastructure.
Mobile App
For the technical implementation of the mobile application, I suggest using the Galaxies Dev repository. Big shout-out for his work.
Given that the video explains how to build the application from scratch, we will only highlight the changes we have made on our side.
The first thing to consider is that the application will use the following dependencies:
import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport } from "ai";
The important parts of the implementation at this point are the following:
- We need to sign POST requests, as mentioned in the documentation.
- We need to add the Clerk token to our request headers.
import { useMemo } from 'react';
import * as Crypto from 'expo-crypto';
import { fetch as expoFetch } from 'expo/fetch';

const transport = useMemo(() => new DefaultChatTransport({
  fetch: async (url, options) => {
    // CloudFront OAC signs origin requests with SigV4, which requires
    // the payload hash header on requests with a body
    if (options?.method === 'POST' && options.body && typeof options.body === 'string') {
      const hash = await Crypto.digestStringAsync(
        Crypto.CryptoDigestAlgorithm.SHA256,
        options.body
      );
      options.headers = {
        ...options.headers,
        'x-amz-content-sha256': hash,
      };
    }
    return expoFetch(url as string, options as any);
  },
  credentials: 'include',
  api: generateAPIUrl("/api/chat"),
  headers: async () => {
    // Attach the Clerk session token so the Lambda can verify the user
    const token = await getTokenRef.current();
    return {
      "Authorization": token ? `Bearer ${token}` : '',
      "X-Clerk-Token": token || '',
    };
  },
}), []);
...
const { messages, stop, sendMessage, status, setMessages } = useChat({
  transport: transport,
  onFinish: async ({ message }) => {
    scrollViewRef.current?.scrollToEnd({ animated: true });
    // Save assistant message to database
    const currentChatId = chatIdRef.current;
    if (currentChatId && message) {
      const chatIdNum = parseInt(currentChatId);
      if (isNaN(chatIdNum)) return;
      // Filter out tool invocations and empty parts
      const validParts = filterPartsForDB(message.parts);
      // Only save if there's actual text content
      if (validParts.length === 0) return;
      try {
        await addMessage(db, chatIdNum, {
          parts: validParts,
          role: Role.Assistant,
        });
      } catch {
        // Fail silently
      }
    }
  },
  onError: (err) => {
    console.error('Chat error:', err);
  },
});
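Three helpers appear above without their definitions: generateAPIUrl, getTokenRef, and filterPartsForDB. Here is a minimal sketch of what they might look like, illustrative only, assuming an EXPO_PUBLIC_API_URL variable and Clerk’s useAuth hook; the versions in the repository may differ:

import { useRef } from 'react';
import { useAuth } from '@clerk/clerk-expo';
import type { UIMessage } from 'ai';

// Hypothetical helper: prefix a relative path with the API base URL
export const generateAPIUrl = (relativePath: string) =>
  `${process.env.EXPO_PUBLIC_API_URL}${relativePath}`;

// Keep only non-empty text parts; tool invocations render live but are not persisted
export const filterPartsForDB = (parts: UIMessage['parts']) =>
  parts.filter((part) => part.type === 'text' && part.text.trim().length > 0);

// Inside the chat component: a ref that always points at the latest Clerk
// getToken, so the memoized transport never captures a stale closure
const { getToken } = useAuth();
const getTokenRef = useRef(getToken);
getTokenRef.current = getToken;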
And later, to display the messages from the API, we assume that certain messages include tool results within the payload.
type ChatMessageProps = {
  message: UIMessage;
  isLoading?: boolean;
  status?: ChatStatus;
};

const ChatMessage = ({ message, isLoading, status }: ChatMessageProps) => {
  const { user } = useUser();
  const { role, id, parts } = message;
  const isUser = role === Role.User;
  const isAssistant = role === Role.Assistant;

  // Check if this is a streaming message with no content yet
  const hasTextContent = parts?.some(
    (part) => part.type === 'text' && part.text && part.text.trim().length > 0
  );

  // Check if we have a data tool response to avoid duplicate rendering
  const hasTools = parts?.some(
    // @ts-ignore
    (part) => part.type === 'tool-dataTool' || part.type === 'tool-weatherTool' || part.type === 'tool-gddTool'
  );

  // Show loader for assistant messages that are empty and we're streaming
  const showLoader = isLoading || (isAssistant && !hasTextContent && !hasTools && status === 'streaming');

  // Don't render empty user messages
  if (isUser && !hasTextContent) {
    return null;
  }

  return (
    <View style={[styles.row, isUser && { flexDirection: 'row-reverse' }]}>
      <View style={[styles.item, { backgroundColor: '#fff', paddingTop: 5 }]}>
        <Image
          source={
            isUser
              ? { uri: user?.imageUrl || 'https://placehold.co/250x250?text=U' }
              : require('@/assets/images/logo-white.png')
          }
          style={styles.avatar}
        />
      </View>
      <View style={[styles.text, { flex: 1, paddingBottom: 10 }, isUser && styles.userMessageBubble]}>
        {showLoader ? (
          <LottieLoader />
        ) : (
          parts?.map((part, i) => {
            switch (part.type) {
              case 'text':
                if (!part.text || part.text.trim().length === 0) {
                  return null;
                }
                return <CustomMarkdown key={`${id}-${i}`} content={part.text} />;
              case 'dynamic-tool':
                return <LottieLoader key={`${id}-${i}`} />;
              case 'tool-weatherTool': {
                const weatherOutput = part.output as WeatherDataProps;
                if (weatherOutput) {
                  const { temperature, unit, description, forecast } = weatherOutput;
                  return <MessageData key={`${id}-${i}`} temperature={temperature} unit={unit} description={description} forecast={forecast} />;
                }
                return <LottieLoader key={`${id}-${i}`} />;
              }
              case 'tool-dataTool': {
                const soilOutput = part.output as SoilChartProps;
                if (soilOutput) {
                  const { labels, datasets, legend } = soilOutput;
                  return <MessageChart key={`${id}-${i}`} labels={labels} datasets={datasets} legend={legend} />;
                }
                return <LottieLoader key={`${id}-${i}`} />;
              }
              case 'tool-gddTool': {
                const gddOutput = part.output as GDDDataProps;
                if (gddOutput) {
                  const { labels, datasets } = gddOutput;
                  return <MessageGDD key={`${id}-${i}`} labels={labels} datasets={datasets} />;
                }
                return <LottieLoader key={`${id}-${i}`} />;
              }
              default:
                return null;
            }
          })
        )}
      </View>
    </View>
  );
};
With all that in place, most of the application should be working. We can now build the backend so that it responds to the app’s requests.
Implementing Mastra in Lambda
The first step is to create a simple flow in Lambda that validates the request’s credentials, then checks the path, and finally generates the response from the messages.
export const handler = awslambda.streamifyResponse<APIGatewayProxyEventV2>(
  async (event: APIGatewayProxyEventV2, responseStream: awslambda.HttpResponseStream) => {
    responseStream.setContentType('text/plain; charset=utf-8');

    if (!await checkAccess(event)) {
      responseStream.write('Unauthorized');
      responseStream.end();
      return;
    }

    if (event.rawPath !== '/api/chat') {
      const httpResponseMetadata = {
        statusCode: 404,
        statusMessage: "Not Found",
        headers: {},
      };
      responseStream = awslambda.HttpResponseStream.from(responseStream, httpResponseMetadata);
      responseStream.write('Not Found');
      responseStream.end();
      return;
    }

    try {
      const httpResponseMetadata = {
        statusCode: 200,
        statusMessage: "OK",
        headers: {},
      };
      responseStream = awslambda.HttpResponseStream.from(responseStream, httpResponseMetadata);

      // Parse request body
      if (event.body) {
        const body = JSON.parse(event.body);
        const { messages } = body;
        await runAgent(messages, responseStream);
      }
    } catch (error) {
      console.error(`Error in handler:`);
      console.error(error);
    } finally {
      responseStream.end();
    }
  }
);
Clerk authentication
Validating the user is quite simple with Clerk. You simply verify the token sent from the app, and Clerk takes care of the rest.
import type { APIGatewayProxyEventV2 } from "aws-lambda";
import { verifyToken } from '@clerk/backend';
import { env } from './env.js';

export const checkAccess = async (event: APIGatewayProxyEventV2): Promise<boolean> => {
  try {
    const headers = event.headers || {};
    const authHeader = headers['x-clerk-token'] ?? headers['authorization'] ?? headers['Authorization'];
    const token = authHeader?.replace(/^Bearer\s+/i, '');
    if (!token) {
      return false;
    }
    await verifyToken(token, {
      secretKey: env.CLERK_SECRET_KEY,
    });
    return true;
  } catch (error) {
    return false;
  }
}
All secrets and configuration variables are available on the Clerk dashboard.
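The env module imported above centralizes those values. Here is a minimal sketch of what env.ts might look like, assuming zod (which the project already uses) for validation; the variable list mirrors the Lambda environment defined later in the Terraform section:

import { z } from 'zod';

// Validate required configuration at startup so missing secrets fail fast
const envSchema = z.object({
  CLERK_SECRET_KEY: z.string().min(1),
  GOOGLE_GENERATIVE_AI_API_KEY: z.string().min(1),
  AWS_S3_VECTORS_BUCKET_NAME: z.string().min(1),
  AWS_S3_VECTORS_INDEX_NAME: z.string().min(1),
  LANGWATCH_API_KEY: z.string().min(1),
});

export const env = envSchema.parse(process.env);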
Mastra
Our implementation cannot use the native response helpers of ai-sdk or Mastra directly. We must use the pipeline function from node:stream/promises to pipe the response into the stream of our Lambda function.
import { mastra } from './app.js';
import { UIMessage, stepCountIs } from 'ai';
import { pipeline } from 'node:stream/promises';

export const runAgent = async (messages: UIMessage[], responseStream: awslambda.HttpResponseStream) => {
  const agent = mastra.getAgent("basic");
  const result = await agent.stream(messages, {
    stopWhen: stepCountIs(5),
    modelSettings: {
      temperature: 0.0,
      topK: 3,
    },
  });
  const sseResponse = result.aisdk.v5.toUIMessageStreamResponse({ sendReasoning: false });
  await pipeline(sseResponse.body!, responseStream);
}
Also, we must consider creating the agent and its tools. I am going to reuse the S3 functionality and use static functions that stand in for the real data sources.
Feel free to implement them yourself, or DM me for more specifics.
// agents.ts
import { Agent } from '@mastra/core/agent';
import { rag_system_prompt } from './prompt.js';
import { languageModel } from './model.js';
import { queryTool, weatherTool, gddTool, dataTool } from './tools.js';

export const basicAgent = new Agent({
  id: "basic",
  name: "basic",
  instructions: [
    {
      role: "system", content: rag_system_prompt
    },
  ],
  model: languageModel,
  tools: {
    queryTool,
    weatherTool,
    gddTool,
    dataTool,
  },
});
// tools.ts
import { MastraLanguageModel } from '@mastra/core/memory';
import { createVectorQueryTool } from '@mastra/rag';
import { createTool } from "@mastra/core/tools";
import { embeddingModel, languageModel } from './model.js';
import { S3VectorsStoreName } from './vector.js';
import { env } from '../env.js';
import { z } from 'zod';

export const queryTool = createVectorQueryTool({
  id: 'tool_knowledgebase_s3vectors',
  description:
    'Use it to search for relevant documentation about Vantage Pro 2',
  vectorStoreName: S3VectorsStoreName,
  indexName: env.AWS_S3_VECTORS_INDEX_NAME,
  includeVectors: false,
  model: embeddingModel,
  reranker: {
    model: languageModel as unknown as MastraLanguageModel,
    options: {
      topK: 3,
    },
  },
});

export const weatherTool = createTool({
  id: "weather-tool",
  description: "Fetches temperature from weather stations for a location",
  inputSchema: z.object({
    stationId: z.string(),
  }),
  outputSchema: z.object({
    temperature: z.string(),
    unit: z.string(),
    description: z.string(),
    forecast: z.array(z.string()),
  }),
  execute: async (inputData) => {
    // Static placeholder data standing in for a real station query
    return {
      temperature: "10",
      unit: "C",
      description: "Sunny",
      forecast: ["10", "11", "15", "10", "4"],
    };
  },
});

export const dataTool = createTool({
  id: "data-tool",
  description: "Fetches soil moisture data from weather stations for a location",
  inputSchema: z.object({
    stationId: z.string(),
  }),
  outputSchema: z.object({
    labels: z.array(z.string()),
    legend: z.array(z.string()).optional(),
    datasets: z.array(z.object({
      data: z.array(z.number()),
      color: z.string().optional(),
      strokeWidth: z.number().optional(),
      withDots: z.boolean().optional()
    }))
  }),
  execute: async (inputData) => {
    return {
      labels: ["08-12", "09-12", "10-12", "11-12", "12-12"],
      legend: ["20cm", "50cm", "75cm"],
      datasets: [
        {
          data: [80, 78, 108, 90, 85, 80],
          color: "rgba(80, 119, 29, 1)",
          strokeWidth: 2
        },
        {
          data: [92, 91, 91, 91, 92, 91],
          color: "rgba(120, 160, 250, 1)",
          strokeWidth: 2
        },
        {
          data: [101, 101, 102, 102, 101, 101],
          color: "rgba(250, 200, 50, 1)",
          strokeWidth: 2
        },
        {
          // Transparent series, commonly used to pin the chart's y-axis range
          data: [160],
          withDots: false,
          color: "transparent",
          strokeWidth: 0
        },
        {
          data: [0],
          withDots: false,
          color: "transparent",
          strokeWidth: 0
        }
      ]
    };
  },
});

export const gddTool = createTool({
  id: "gdd-tool",
  description: "Fetches GDD data from weather stations for a location",
  inputSchema: z.object({
    stationId: z.string(),
  }),
  outputSchema: z.object({
    labels: z.array(z.string()),
    datasets: z.array(z.object({
      data: z.array(z.number())
    }))
  }),
  execute: async (inputData) => {
    return {
      labels: ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"],
      datasets: [
        {
          data: [5.64, 15, 7.64, 1, 8, 6.1, 10]
        }
      ]
    };
  },
});
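Finally, the mastra instance imported in runAgent ties the agent and the vector store together. Here is a minimal sketch of app.ts, assuming vector.js also exports the store instance referenced by S3VectorsStoreName (s3VectorsStore is an illustrative name):

// app.ts (sketch)
import { Mastra } from '@mastra/core';
import { basicAgent } from './agents.js';
import { s3VectorsStore, S3VectorsStoreName } from './vector.js';

export const mastra = new Mastra({
  // Registered under the key used by mastra.getAgent("basic")
  agents: { basic: basicAgent },
  // Registered under the name the queryTool resolves via vectorStoreName
  vectors: { [S3VectorsStoreName]: s3VectorsStore },
});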
Let’s move ahead to define the infrastructure.
Infrastructure
The infrastructure doesn’t change that much from the previous implementation.
Lambda definition
The important part is mapping the corresponding environment variables.
locals {
  lambda_chat = {
    name        = "chat-${basename(path.cwd)}"
    image       = "${aws_ecr_repository.warike_development_ecr.repository_url}:chat-v1.0"
    description = "Lambda chat function for ${local.project_name}"
    memory_size = 1024
    timeout     = 60
    env_vars = {
      GOOGLE_GENERATIVE_AI_API_KEY = var.google_generative_ai_api_key
      GOOGLE_LANGUAGE_MODEL        = var.google_language_model
      GOOGLE_EMBEDDING_MODEL       = var.google_model_embedding
      AWS_S3_VECTORS_BUCKET_NAME   = var.vector_bucket_name
      AWS_S3_VECTORS_INDEX_NAME    = var.vector_bucket_index_name
      LANGWATCH_API_KEY            = var.langwatch_api_key
      CLERK_SECRET_KEY             = var.clerk_secret_key
      NODE_OPTIONS                 = "--enable-source-maps --stack-trace-limit=1000"
      NODE_ENV                     = "production"
    }
  }
}

## Lambda Chat
module "warike_development_lambda_chat" {
  source  = "terraform-aws-modules/lambda/aws"
  version = "~> 8.1.2"

  ## Configuration
  function_name = local.lambda_chat.name
  description   = local.lambda_chat.description
  memory_size   = local.lambda_chat.memory_size
  timeout       = local.lambda_chat.timeout

  ## Package
  create_package = false
  package_type   = "Image"
  image_uri      = local.lambda_chat.image
  environment_variables = merge(
    local.lambda_chat.env_vars,
    {}
  )
  create_current_version_allowed_triggers = false

  ## Permissions
  create_role = false
  lambda_role = aws_iam_role.warike_development_lambda_chat_role.arn

  ## Logging
  use_existing_cloudwatch_log_group = true
  logging_log_group                 = aws_cloudwatch_log_group.warike_development_lambda_chat_logs.name
  logging_log_format                = "JSON"
  logging_application_log_level     = "INFO"
  logging_system_log_level          = "WARN"

  ## Response Streaming
  invoke_mode = "RESPONSE_STREAM"

  ## Lambda Function URL for testing
  create_lambda_function_url = true
  authorization_type         = "AWS_IAM"
  cors = {
    allow_credentials = true
    allow_origins     = ["*"]
    allow_methods     = ["*"]
    allow_headers     = ["*"]
    expose_headers    = ["*"]
    max_age           = 86400
  }

  tags = merge(local.tags, { Name = local.lambda_chat.name })

  depends_on = [
    aws_cloudwatch_log_group.warike_development_lambda_chat_logs,
    aws_ecr_repository.warike_development_ecr,
    null_resource.warike_development_build_image_seed
  ]
}
CloudFront
The important part is mapping the domain names you have chosen to use.
locals {
  cloudfront_oac_lambda_function_url = "chat_lambda_function_url"
  zone_name                          = var.zone_name
  app_domain_name                    = var.app_domain_name
}

module "warike_development_cloudfront" {
  source  = "terraform-aws-modules/cloudfront/aws"
  version = "5.0.1"

  ## Configuration
  enabled                        = true
  price_class                    = "PriceClass_100"
  retain_on_delete               = false
  wait_for_deployment            = true
  is_ipv6_enabled                = true
  create_monitoring_subscription = true

  ## Extra CNAMEs
  aliases = [local.app_domain_name]
  comment = "Chat CloudFront Distribution"

  ## Origin access control
  create_origin_access_control = true
  origin_access_control = {
    "chat_lambda_function_url" = {
      description      = "CloudFront access to Lambda Function URL"
      origin_type      = "lambda"
      signing_behavior = "always"
      signing_protocol = "sigv4"
    }
  }

  origin = {
    "chat_lambda_function_url" = {
      domain_name           = trimsuffix(replace(module.warike_development_lambda_chat.lambda_function_url, "https://", ""), "/")
      origin_access_control = local.cloudfront_oac_lambda_function_url
      custom_origin_config = {
        http_port              = 80
        https_port             = 443
        origin_protocol_policy = "match-viewer"
        origin_ssl_protocols   = ["TLSv1", "TLSv1.1", "TLSv1.2"]
      }
    }
  }

  default_cache_behavior = {
    target_origin_id       = local.cloudfront_oac_lambda_function_url
    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["HEAD", "DELETE", "POST", "GET", "OPTIONS", "PUT", "PATCH"]
    cached_methods         = ["GET", "HEAD", "OPTIONS"]

    ## Cache policy disabled
    cache_policy_id          = "4135ea2d-6df8-44a3-9df3-4b5a84be39ad"
    origin_request_policy_id = "b689b0a8-53d0-40ab-baf2-68738e2966ac"

    ## Forwarded values disabled
    use_forwarded_values = false

    ## TTL settings
    min_ttl     = 0
    default_ttl = 0
    max_ttl     = 0
    compress    = true
  }

  viewer_certificate = {
    acm_certificate_arn = module.warike_development_acm.acm_certificate_arn
    ssl_support_method  = "sni-only"
  }

  tags = merge(local.tags, { Name = local.cloudfront_oac_lambda_function_url })

  depends_on = [
    module.warike_development_acm,
    module.warike_development_lambda_chat,
  ]
}
Testing & Observability
As the implementation came together, we needed a way to understand whether the tools were being used correctly.
Running tests for the Lambda functionality
The implementation of the tests for the Lambda logic is quite straightforward.
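As an illustration, one of these tests might look like the following. This is a sketch, assuming a Jest setup with the Clerk and env modules mocked; the file paths are illustrative:

// src/__tests__/auth.spec.ts (sketch)
jest.mock('@clerk/backend', () => ({
  verifyToken: jest.fn().mockResolvedValue({}),
}));
jest.mock('../env.js', () => ({
  env: { CLERK_SECRET_KEY: 'test-secret' },
}));

import type { APIGatewayProxyEventV2 } from 'aws-lambda';
import { checkAccess } from '../auth.js';

describe('checkAccess', () => {
  it('rejects requests without a token', async () => {
    const event = { headers: {} } as unknown as APIGatewayProxyEventV2;
    await expect(checkAccess(event)).resolves.toBe(false);
  });

  it('accepts requests with a valid bearer token', async () => {
    const event = {
      headers: { authorization: 'Bearer valid-token' },
    } as unknown as APIGatewayProxyEventV2;
    await expect(checkAccess(event)).resolves.toBe(true);
  });
});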
PASS src/__tests__/mastra/app.spec.ts
PASS src/__tests__/mastra/agent.spec.ts
PASS src/__tests__/mastra/tool.spec.ts
PASS src/__tests__/auth.spec.ts
PASS src/__tests__/mastra/prompt.spec.ts
Test Suites: 5 passed, 5 total
Tests: 10 passed, 10 total
Snapshots: 0 total
Time: 3.628 s, estimated 4 s
Ran all test suites.
Watch Usage: Press w to show more.
However, what we ideally want to know is whether the agent will behave according to the prompt we have defined.
LangWatch
Here is where I will use the LangWatch Scenario tool. It helps me test my AI agent and integrates quite easily with Mastra.
import { type AgentAdapter, AgentRole, AgentReturnTypes } from "@langwatch/scenario";
import { mastra } from "../mastra/app.js";

export const basicAgent: AgentAdapter = {
  role: AgentRole.AGENT,
  call: async (input) => {
    const basic = mastra.getAgent("basic");
    const result = await basic.generate(input.messages);
    return result.response.messages as unknown as AgentReturnTypes;
  },
};
Next, the tests for the agent’s behavior are defined.
import scenario from "@langwatch/scenario";
import { describe, it, expect } from "vitest";
import { basicAgent } from "./agent-adapter.spec";

describe("Weather Station RAG Agent", () => {
  it("should provide accurate calibration instructions", async () => {
    const result = await scenario.run({
      name: "barometric pressure calibration request",
      description: `The user needs help calibrating their Vantage Pro2 weather station's barometric pressure sensor.`,
      agents: [
        basicAgent,
        scenario.userSimulatorAgent(),
        scenario.judgeAgent({
          criteria: [
            "Agent should provide calibration steps directly without asking follow-up questions",
            "Response should include step-by-step instructions",
            "Response should cite the Vantage Pro2 Operations Manual as source",
            "Response should use telegraphic style (imperative verbs, minimal articles)",
            "Agent should not include greetings or filler text",
            "Instructions should mention setting correct elevation first",
          ],
        }),
      ],
      script: [
        scenario.user("How do I calibrate pressure?"),
        scenario.agent(),
        scenario.judge(),
      ],
    });
    expect(result.success).toBe(true);
  }, 30_000);

  it("should use weatherTool for temperature and humidity requests", async () => {
    const result = await scenario.run({
      name: "current weather data request",
      description: `The user wants current temperature and humidity readings.`,
      agents: [
        basicAgent,
        scenario.userSimulatorAgent(),
        scenario.judgeAgent({
          criteria: [
            "Agent should use the weatherTool to fetch current weather data",
            "Agent should respond with 'Here's the weather data that you requested'",
            "Agent should not ask follow-up questions",
          ],
        }),
      ],
      script: [
        scenario.user("What's the current temperature and humidity?"),
        scenario.agent(),
        scenario.user("Station ID is TEST123009"),
        scenario.agent(),
        scenario.judge(),
      ],
    });
    expect(result.success).toBe(true);
  }, 30_000);

  it("should use dataTool for soil moisture requests", async () => {
    const result = await scenario.run({
      name: "soil moisture data request",
      description: `The user wants current soil moisture data from their weather station.`,
      agents: [
        basicAgent,
        scenario.userSimulatorAgent(),
        scenario.judgeAgent({
          criteria: [
            "Agent should request for Station ID",
            "Agent should use the dataTool to fetch soil moisture data",
            "Agent should respond with 'Here's the soil moisture data that you requested'",
            "Agent should not provide additional explanation beyond tool usage",
          ],
        }),
      ],
      script: [
        scenario.user("Show me current soil moisture levels"),
        scenario.agent(),
        scenario.user("Station ID is TEST123009"),
        scenario.agent(),
        scenario.judge(),
      ],
    });
    expect(result.success).toBe(true);
  }, 30_000);

  it("should use gddTool for growing degree day requests", async () => {
    const result = await scenario.run({
      name: "gdd data request",
      description: `The user wants current growing degree day data from their weather station.`,
      agents: [
        basicAgent,
        scenario.userSimulatorAgent(),
        scenario.judgeAgent({
          criteria: [
            "Agent should request for Station ID",
            "Agent should use the gddTool to fetch growing degree day data",
            "Agent should respond with 'Here's the growing degree day data that you requested'",
            "Agent should not provide additional explanation beyond tool usage",
          ],
        }),
      ],
      script: [
        scenario.user("Show me current GDD data"),
        scenario.agent(),
        scenario.user("Station ID is TEST123009"),
        scenario.agent(),
        scenario.judge(),
      ],
    });
    expect(result.success).toBe(true);
  }, 30_000);
});
And then I can view the results both in my terminal and on the LangWatch platform.
Additionally, I can configure LangWatch for the observability of my traces.
import { OtelExporter } from "@mastra/otel-exporter";
import { env } from "../env.js";

export const observabilityExporter = new OtelExporter({
  provider: {
    custom: {
      endpoint: "https://app.langwatch.ai/api/otel/v1/traces",
      headers: { "Authorization": `Bearer ${env.LANGWATCH_API_KEY}` },
    },
  },
});
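The exporter still needs to be registered with the Mastra instance. A hedged sketch of that wiring, extending the earlier app.ts sketch and assuming the observability configuration shape from Mastra’s tracing docs (the config key and service name here are illustrative):

// app.ts (sketch): register the exporter so agent runs emit traces
export const mastra = new Mastra({
  agents: { basic: basicAgent },
  vectors: { [S3VectorsStoreName]: s3VectorsStore },
  observability: {
    configs: {
      langwatch: {
        serviceName: 'warike-chat-lambda',
        exporters: [observabilityExporter],
      },
    },
  },
});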
Then you can see the traces in the dashboard.
Human testing of the app
To test the application, I will use the iPhone simulator.
Et voilà!
Conclusions
This article demonstrates a complete, end-to-end implementation for deploying a powerful, private AI assistant accessible via a mobile application.
By combining the accessibility of Expo on the frontend with the robustness of AWS Lambda and S3 on the backend, we created a scalable RAG solution.
A core takeaway is the effective orchestration of the AI Agent using Mastra. The agent is configured with multiple tools, including RAG for documentation and specific data fetching functionalities. This tool-calling capability is crucial for delivering diverse and context-specific responses directly to the user.
Furthermore, integrating LangWatch proved invaluable for ensuring the agent’s reliability. The use of scenarios for testing agent behavior, especially its tool-use decisions, confirms that the system adheres to the defined prompts and requirements before deployment.
This makes a personal assistant a very achievable goal.