Kazuya

AWS re:Invent 2025 - Toss Securities’ AI Transformation with automated Super App Ops. (GBL301)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Toss Securities’ AI Transformation with automated Super App Ops. (GBL301)

In this video, Jongmin Moon from AWS and Namkyu from Toss Securities demonstrate how Toss Securities transformed their super app testing operations using Amazon Bedrock and Amazon Nova. Toss Securities, which became Korea's number one overseas stock trading platform in just three years, faced critical challenges: shipping dozens of daily releases across banking, payments, securities, and commerce while struggling with manual testing across diverse devices and time zones. Their solution was building an AI-powered automation platform on AWS that enables remote device control with virtually no latency, automated workflow creation through natural language conversations, and intelligent test execution using Amazon Nova Micro and Nova Lite for multimodal analysis. The platform uses RAG for retrieving similar test cases and combines text analysis with spatial image understanding. Operating in an isolated AWS environment with Direct Connect for secure access, the system complies with Korea's strict financial regulations. The agent follows four steps: plan, analyze, act, and review. Future plans include Figma integration and an Atlas map project that automatically captures app screens and transitions, working toward fully autonomous testing through natural language scenarios.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Thumbnail 30

Introduction to Amazon Bedrock and Amazon Nova: Foundation Models for AI Applications

Hi everyone. Thank you for joining us for this session. I'm Jongmin Moon, Senior SA Manager at AWS. I'm so excited to be here with Toss Securities, one of the top Korean tech companies. Today we will show you how Toss Securities transformed their super app operations using AI technology and AWS services. Here's the plan: I will give you a quick overview of Amazon Bedrock and Amazon Nova, then hand it over to Namkyu to share their real story, pain points, solutions, and results.

Alright, let me talk about Amazon Bedrock. This is our fully managed platform for building generative AI applications, with five key capabilities. First, choice of model. You can choose the best model for your business among many industry-leading foundation models, and you don't have to change your production code when you want to switch foundation models. Second, cost and performance. You can optimize cost and performance through choice of model, intelligent prompt routing, prompt caching, and flexible inference options. Third, easy customization. You can fine-tune models with your own data and set up RAG without complexity. Fourth, security. Your data stays private within AWS with enterprise-grade security built in, plus guardrails that filter harmful content. Last but not least, agents. You can build and deploy agents that work on complex tasks beyond simple chatbots.
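
For readers new to Bedrock, here is a minimal sketch of what that "choice of model" point looks like in practice with the AWS SDK for JavaScript v3 Converse API. The region, model ID, and prompt are illustrative, and some accounts need a cross-region inference profile ID rather than the bare model ID.

```typescript
import {
  BedrockRuntimeClient,
  ConverseCommand,
} from "@aws-sdk/client-bedrock-runtime";

// Region and model ID are illustrative placeholders.
const client = new BedrockRuntimeClient({ region: "us-east-1" });

// Ask any Bedrock model a question; swapping models is a one-string change,
// which is the "choice of model" point above.
async function ask(modelId: string, prompt: string): Promise<string> {
  const response = await client.send(
    new ConverseCommand({
      modelId,
      messages: [{ role: "user", content: [{ text: prompt }] }],
      inferenceConfig: { maxTokens: 512, temperature: 0.2 },
    })
  );
  return response.output?.message?.content?.[0]?.text ?? "";
}

ask("amazon.nova-micro-v1:0", "Summarize the failed test runs from today.").then(console.log);
```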

Thumbnail 40

Now, let me briefly introduce Amazon Nova. This is our own family of foundation models, available only on Bedrock, with industry-leading price performance. The Nova family has four categories. For understanding, we have Nova Micro, which takes text input and produces text output, and Nova Lite, Pro, and Premier, which handle multimodal inputs (text, images, and videos) to generate text outputs. For creative content generation, Nova Canvas generates images and Nova Reel creates videos. Nova Sonic is our speech-to-speech model that processes voice input and generates voice output. And Nova Act automatically performs tasks on behalf of users within web browsers. Among these models, Toss Securities is leveraging Nova Micro and Nova Lite for test automation.
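
Since the talk later relies on Nova Lite's image understanding for spatial analysis, here is a small hedged sketch of a multimodal Converse call. The screenshot path, prompt, and model ID are assumptions for illustration, not Toss Securities' actual code.

```typescript
import { readFileSync } from "node:fs";
import {
  BedrockRuntimeClient,
  ConverseCommand,
} from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: "us-east-1" });

// Send a device screenshot together with a text prompt to Nova Lite and
// get back a text description of what is on screen.
async function listTappableElements(screenshotPath: string): Promise<string> {
  const response = await client.send(
    new ConverseCommand({
      modelId: "amazon.nova-lite-v1:0",
      messages: [
        {
          role: "user",
          content: [
            { text: "List the tappable elements visible in this app screen." },
            {
              image: {
                format: "png",
                source: { bytes: readFileSync(screenshotPath) },
              },
            },
          ],
        },
      ],
    })
  );
  return response.output?.message?.content?.[0]?.text ?? "";
}

listTappableElements("./screen.png").then(console.log);
```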

Thumbnail 120

Thumbnail 190

Thumbnail 200

Toss Securities' AI-Powered Test Automation Platform: Solving Rapid Growth Challenges with AWS

Now let's hear from Namkyu on how Toss Securities solved their challenges on AWS. Hi, I'm Namkyu from Toss Securities. Let me briefly introduce Toss. Toss is a super app that powers almost every part of daily life in Korea, from banking, investing, and shopping to gaming and ride-hailing. With this broad ecosystem, Toss has been growing at extraordinary speed. To give you one example, Toss Securities, the business I'm part of, became the number one platform for overseas stock trading in Korea in just three years after launch.

Thumbnail 240

Thumbnail 250

Thumbnail 290

With this kind of rapid growth, we naturally face significant growing pains. In short, fast growth created even faster problems. We ship dozens of new versions every single day. These updates cover banking, payments, securities, commerce, and even games, all happening at the same time. The problem was simple. Our testing and validation just couldn't keep up. Unit tests were already painful, and integration tests made everything even slower. So we ended up with a critical problem. Too many releases, too little time. Testing across countries, devices, and OS versions was a daily struggle. Working remotely made it even harder. Not everyone had access to the same device. Sometimes we had to hunt down someone with the exact model just to test a specific feature.

Thumbnail 320

When daylight saving time begins, global market hours shift unexpectedly. Our trading system has to be tested exactly when the market opens, even if that means 7 a.m. Our trading engineers often stayed up through the night manually checking whether trades executed correctly. It was exhausting and hard to sustain.

Thumbnail 350

Thumbnail 370

In summary, the challenges we faced fell into three main categories: it was hard to maintain quality at the speed we were deploying, testing across diverse device models was a constant struggle, and the remote environment was extremely complex. We lacked real-time visibility to monitor, schedule, and verify logs. That's exactly why we built this platform.

Thumbnail 390

Thumbnail 400

Thumbnail 410

Thumbnail 420

You can access the web interface and control any device instantly with virtually no latency. As you can see, there is virtually no delay even while the YouTube video plays. You can freely inspect all network packets generated during a test, so you can instantly verify whether the expected logs were produced through an automated report. You can create automated workflows that run tests repeatedly, visually review and edit each step, and schedule them to run exactly when you need.

Thumbnail 430

Thumbnail 440

Thumbnail 450

Thumbnail 460

Thumbnail 480

Thumbnail 500

This video shows an example workflow that launches YouTube and automatically switches between tabs. After creating a workflow, you just click to run it. Here, the workflow opens YouTube and detects the elements that are clickable; when I click element number 2, the tab changes.

Thumbnail 510

Thumbnail 520

Thumbnail 530

Thumbnail 540

Thumbnail 550

Thumbnail 560

Thumbnail 570

But we needed to make sure that building these workflows didn't feel like work. This next video shows how an LLM can drive the site and the actions. I simply asked it to launch Naver.com, which is like Google for Korea. The workflows it creates are saved so they can be reused anytime. I can ask the agent what the interactive buttons are, and it answers by calling the MCP tools; follow-up questions work the same way.
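
The talk doesn't show the actual MCP tools, but as a rough idea of what exposing device actions to the agent could look like, here is a hypothetical sketch using the MCP TypeScript SDK. The tool names, parameters, and device helpers are invented for illustration.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Stubs standing in for the platform's real device API (not shown in the talk).
async function dumpUiHierarchy(deviceId: string): Promise<object[]> {
  return [];
}
async function tap(deviceId: string, elementId: string): Promise<void> {}

const server = new McpServer({ name: "device-controller", version: "0.1.0" });

// Let the agent ask "what are the interactive buttons?" on a device screen.
server.tool(
  "list_interactive_elements",
  "Return the clickable elements on the connected device's current screen.",
  { deviceId: z.string() },
  async ({ deviceId }) => ({
    content: [{ type: "text", text: JSON.stringify(await dumpUiHierarchy(deviceId)) }],
  })
);

// Let the agent act on the screen, e.g. launch a site by tapping an element.
server.tool(
  "tap_element",
  "Tap an element identified by its id on the device screen.",
  { deviceId: z.string(), elementId: z.string() },
  async ({ deviceId, elementId }) => {
    await tap(deviceId, elementId);
    return { content: [{ type: "text", text: `tapped ${elementId}` }] };
  }
);

await server.connect(new StdioServerTransport());
```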

Thumbnail 580

Thumbnail 590

You can build workflows in real time through a conversation with the agent. In this example, I first asked it to send me a Slack DM saying hello world every 5 seconds. Then I asked it to change the interval to 3 seconds, as shown on the right side. The workflow updated immediately and worked exactly as expected.
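
As a rough illustration of why such a conversational edit is cheap, here is a hypothetical workflow definition in the spirit of the prev/next node model described later in the talk. The schema and field names are invented, not Toss Securities' actual format.

```typescript
// Hypothetical shape of the workflow the agent created; field names are invented.
interface WorkflowNode {
  id: string;
  type: string;
  params: Record<string, unknown>;
  prev?: string;
  next?: string;
}

const slackDmWorkflow: WorkflowNode[] = [
  { id: "trigger-1", type: "interval-trigger", params: { everySeconds: 5 }, next: "action-1" },
  { id: "action-1", type: "slack-dm", params: { text: "hello world" }, prev: "trigger-1" },
];

// "Change it to 3 seconds" amounts to a single-field update that the running
// scheduler can pick up immediately.
slackDmWorkflow[0].params["everySeconds"] = 3;
```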

Thumbnail 610

Thumbnail 620

Thumbnail 660

Thumbnail 670

Thumbnail 690

Thumbnail 700

Thumbnail 710

Here's how we designed and built it. We operate a fully isolated AWS environment, and with Direct Connect our teams can securely access the development network from both the office and home. Our services are deployed seamlessly, and workflow scripts are efficiently managed through our Git server in AWS. To comply with Korea's strict financial regulations, we leverage isolated models on Amazon Bedrock to build AI features safely. We use a component called the provider, which can be an employee's PC or a server running in our data centers. When a phone is connected to the provider via USB, the mobile controller is automatically installed on the device. The provider communicates with this controller over gRPC to control the device, and with the automation platform servers over HTTP. Users can then control multiple devices, connect their own devices, and run existing test scripts without any additional steps.
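
The talk doesn't show the provider's code, but a rough sketch of the relay loop it implies (HTTP toward the platform, gRPC toward the on-device controller) might look like this. The endpoints, types, and polling approach are assumptions for illustration.

```typescript
// Types and endpoints below are assumptions; the real platform API is not shown.
interface DeviceCommand {
  deviceId: string;
  action: string;
  args: Record<string, unknown>;
}

interface MobileControllerClient {
  // Wraps the gRPC stub that talks to the controller installed on the USB-connected phone.
  execute(cmd: DeviceCommand): Promise<{ ok: boolean }>;
}

async function runProvider(platformUrl: string, controller: MobileControllerClient) {
  while (true) {
    // 1. Ask the automation platform (over HTTP) for the next queued command.
    const res = await fetch(`${platformUrl}/api/providers/next-command`);
    if (res.status === 204) {
      await new Promise((r) => setTimeout(r, 500)); // nothing to do yet
      continue;
    }
    const cmd: DeviceCommand = await res.json();

    // 2. Forward it to the on-device controller over gRPC and wait for the result.
    const result = await controller.execute(cmd);

    // 3. Report the outcome back to the platform over HTTP.
    await fetch(`${platformUrl}/api/providers/result`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ deviceId: cmd.deviceId, ok: result.ok }),
    });
  }
}
```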

Thumbnail 720

Thumbnail 740

Thumbnail 770

Most of the existing workflow solutions were overly complex. They were really difficult to adapt to strict financial regulations, and they didn't deliver the performance we needed. So we set out to build a workflow system that is fast enough, flexible enough, and simple enough. Instead of relying on complex algorithms like topological sorting, we designed the system to operate using only the information about the connecting nodes (the previous and next ones) plus the trigger ID. This reflects not just my personal development philosophy, but also the broader culture that simple is best.
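
A minimal sketch of that execution model, assuming a linear chain where each node only knows its neighbors; names and types are illustrative.

```typescript
// Each node only knows its previous and next node, so executing a workflow is
// simply following the chain from the trigger node. Names are illustrative.
interface WorkflowNode {
  id: string;
  prev?: string;
  next?: string;
  run(input: unknown): Promise<unknown>;
}

async function runWorkflow(nodes: Map<string, WorkflowNode>, triggerId: string) {
  let current = nodes.get(triggerId);
  let payload: unknown;
  while (current) {
    payload = await current.run(payload); // execute this node with the previous output
    current = current.next ? nodes.get(current.next) : undefined; // follow the link
  }
  return payload;
}
```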

Thumbnail 790

Thumbnail 800

Thumbnail 810

In terms of flexibility, we designed the system so that new nodes can be added at runtime and immediately affect any workflow. We chose TypeScript because its type system makes it easy to define and validate the input parameters for each node, and input parameters can also carry default values. Now let's take a look at how AI actually works in this project. When a user says something like "tell me the Amazon stock price," the agent first performs a retrieval-augmented (RAG) search to find similar test cases. If no similar examples are found, the agent falls back to direct inference without retrieval. Next, to determine which elements need to be tapped on the screen, the agent uses Amazon Nova to interpret both the text and image components. We use Amazon Nova because it is multimodal. It first analyzes the dumped text and structured data from the device, and if the target element cannot be found there, it falls back to spatial image analysis and identifies the element directly through image understanding.
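
As a hedged sketch of how TypeScript's type system can express node input parameters with defaults and validation (the actual node definitions aren't shown in the talk), something like the following would work; the node and field names are invented.

```typescript
// Illustrative parameter specs; the real node definitions are not shown in the talk.
interface ParamSpec<T> {
  description: string;
  default?: T;
  validate?: (value: T) => boolean;
}

const slackDmParams: Record<string, ParamSpec<any>> = {
  channel: { description: "Target user or channel", validate: (v) => v.length > 0 },
  message: { description: "Message body", default: "hello world" },
  delaySeconds: { description: "Delay before sending", default: 0, validate: (v) => v >= 0 },
};

// Missing inputs fall back to defaults; present inputs must pass validation.
function resolveParams(
  spec: Record<string, ParamSpec<any>>,
  input: Record<string, unknown>
): Record<string, unknown> {
  const resolved: Record<string, unknown> = {};
  for (const [key, s] of Object.entries(spec)) {
    const value = input[key] ?? s.default;
    if (value === undefined) throw new Error(`Missing required parameter: ${key}`);
    if (s.validate && !s.validate(value)) throw new Error(`Invalid value for ${key}`);
    resolved[key] = value;
  }
  return resolved;
}

console.log(resolveParams(slackDmParams, { channel: "@namkyu", delaySeconds: 3 }));
```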

Thumbnail 870

Thumbnail 880

In summary, what the agent really does comes down to just four steps: it plans, analyzes, takes action, and reviews. In fact, the workflow agent follows the very same principles. It starts by analyzing the existing workflow, then identifies any missing components or required changes and applies them directly. Building an agent is much like creating a humanoid robot.
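
A minimal sketch of that plan, analyze, act, review loop, with placeholder helpers standing in for the real planner, the Nova-based analysis, the MCP-driven actions, and the reviewer; none of these names come from the talk.

```typescript
interface Step {
  description: string;
}

// Placeholder helpers standing in for the real planner, analyzer, actor, and reviewer.
async function makePlan(goal: string): Promise<Step[]> {
  return [{ description: goal }];
}
async function analyzeScreen(): Promise<string> {
  return "dumped UI hierarchy or screenshot analysis";
}
async function act(step: Step, context: string): Promise<string> {
  return `performed: ${step.description} (given ${context.length} chars of context)`;
}
async function review(step: Step, outcome: string): Promise<boolean> {
  return outcome.includes(step.description);
}

// 1. plan, 2. analyze, 3. act, 4. review; re-plan when a step fails.
async function runAgent(goal: string): Promise<void> {
  const plan = await makePlan(goal);
  for (const step of plan) {
    const context = await analyzeScreen();
    const outcome = await act(step, context);
    const ok = await review(step, outcome);
    if (!ok) return runAgent(goal); // start over from the current app state
  }
}
```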

Thumbnail 940

All we really need is eyes to understand the context and arms to carry out the necessary actions. Today, we live in a time where those arms can be implemented effectively through approaches like MCP. In the end, it was simply about putting the pieces together and reinventing what already existed. Now, let's look at the impact this platform has created.

Thumbnail 970

We are no longer tied to physical locations. We can access any device remotely, anytime, much like digital nomads, and we can run and verify tests automatically wherever and whenever needed, ensuring consistent quality. We get to focus on meaningful work instead of repetitive tasks.

Thumbnail 990

Thumbnail 1020

As our next step, we plan to integrate with design systems like Figma, giving the agents much richer context about our products. We are also building an Atlas map that automatically captures and analyzes every new screen and transition detected in the app. This Atlas project acts like a neural map of our app, linking all transitions together and making the agent significantly smarter. This becomes the foundation for our ultimate goal: an agent that can understand natural language test scenarios and autonomously test every possible interaction.
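
The Atlas map itself isn't shown, but as a hypothetical sketch, screens and transitions naturally form a graph that the agent can search to reach any target screen; the types and the breadth-first path finder below are illustrative assumptions.

```typescript
// Hypothetical Atlas types: screens as nodes, user actions as edges.
interface ScreenNode {
  screenId: string;
  elements: string[]; // interactive element identifiers found on the screen
}

interface Transition {
  from: string; // screenId
  to: string; // screenId
  action: string; // e.g. "tap:buy-button"
}

interface AtlasMap {
  screens: Map<string, ScreenNode>;
  transitions: Transition[];
}

// Reaching any target screen becomes a plain breadth-first graph search.
function findPath(atlas: AtlasMap, from: string, to: string): Transition[] | null {
  const queue: { at: string; path: Transition[] }[] = [{ at: from, path: [] }];
  const visited = new Set<string>([from]);
  while (queue.length > 0) {
    const { at, path } = queue.shift()!;
    if (at === to) return path;
    for (const t of atlas.transitions) {
      if (t.from === at && !visited.has(t.to)) {
        visited.add(t.to);
        queue.push({ at: t.to, path: [...path, t] });
      }
    }
  }
  return null;
}
```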

Thumbnail 1040

As always, these technologies ultimately make our work and our lives easier. With coding agents helping out, I get to manage my team and have fun building full tech projects like this end to end on my own. Thank you for your time and attention today.


This article is entirely auto-generated using Amazon Bedrock.
