DEV Community

Kazuya


AWS re:Invent 2025 - TDK SensEI built scalable IoT platform with AWS for sensor insights (NTA204)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - TDK SensEI built scalable IoT platform with AWS for sensor insights (NTA204)

In this video, Oren Waldman and Bob Roth from TDK SensEI demonstrate how their edge intelligence platform transforms industrial operations on AWS. The solution evolved from condition-based monitoring to agentic AI that predicts machine failures and automatically orchestrates responses—coordinating maintenance, verifying parts, and assigning technicians. Built on AWS IoT Core, Greengrass, SageMaker, and Bedrock, the architecture features security-first design with tenant isolation and outbound-only connectivity. The platform creates digital twins combining sensor data, maintenance logs, and production schedules, enabling AI agents to autonomously solve problems before failures occur. Now available on AWS Marketplace, it serves manufacturing, smart buildings, logistics, and energy sectors.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part


Introduction: TDK SensEI's Edge Intelligence Platform Transforming Industrial Operations

Good afternoon. I'm really excited to share with you today how TDK SensEI is transforming industrial operations using AWS. For those of you who don't know, TDK SensEI is building something truly innovative, a next generation edge intelligence platform that fuses advanced sensors, AI, and machine learning to deliver real-time actionable insights for manufacturing and industrial environments.

Now what makes this particularly compelling is the problem that they're trying to solve. In manufacturing, when something goes wrong, technicians typically spend 80% of their time just searching for the problem. TDK SensEI is changing that equation entirely. Their platform has evolved from basic condition-based monitoring to predictive maintenance and now to agentic AI, meaning it doesn't just predict when a machine will fail, it actually recommends solutions and orchestrates a response.

Think about that, automatically coordinating maintenance schedules, verifying parts availability, and assigning qualified technicians all before a failure even occurs. And they've built this entire solution on AWS. They're leveraging IoT Core and Greengrass for edge computing, SageMaker for machine learning, and Bedrock for their agentic AI capabilities. The architecture is security first with comprehensive device provisioning and tenant isolation built in from day one.

And the business impact is real. They're reducing unplanned downtime, optimizing maintenance costs, and creating digital twins of operational data that enable better decision making across manufacturing, smart buildings, logistics, and energy sectors. Now what's particularly exciting for all of us is that TDK SensEI is now available on the AWS Marketplace, making this solution accessible to your customers who are facing these exact challenges.

So I'm Oren Waldman. I'm a senior solutions architect here. I've had the great privilege of closely working with TDK SensEI throughout this year, and it's been incredible to watch them go from development to active commercialization. And now it's my pleasure to introduce Bob Roth, CTO of TDK SensEI, who will walk you through their journey and show you how they built the platform. Bob, take it away.

Thank you, Oren.


Evolution from Condition-Based Monitoring to Agentic AI in Manufacturing

All right. I'll be covering five key areas as we go through the presentation. First, an overview with a little more depth; Oren did a nice job of introducing the company, and I have a little more to add there. Then the evolution we see going on in industrial automation, from condition-based monitoring to predictive and now agentic. After that, the platform we've built, how the architecture works together, and finally where we're going and how we see this space evolving.


So first and foremost, who is TDK SensEI? When most people hear TDK, they think of cassette tapes or audio tapes, so this is a bit of a change. We were formed just a little over a year ago with the mission of bringing real-time intelligence and predictive, AI-driven technologies to factory management and manufacturing solutions. As a division of TDK, we have the privilege of access to TDK's industrial sensors and other components that we can bring into our solution, as well as a broad spectrum of TDK's internal factories that we can leverage for building our solutions.

Overall, we have built a strong SaaS-based platform that has both cloud and on-premise capabilities, and we've got a team that has a global footprint across North America, Asia, Japan, and other parts of the world. Currently, we have four key market areas that we focus on: manufacturing, smart buildings, logistics and distribution centers, and the energy sector, and I'll talk a little bit more about how we apply to each of those soon.


All right, so what's going on in the area of machine maintenance and facility management? This is an evolving space, and it continues to evolve today. Initially, the focus was on condition-based monitoring. It was really about detecting problems as they occur, leveraging machine learning to evaluate what's going on in real time. The primary value proposition is relieving the burden of monitoring and decreasing the time to react when a problem does occur.

However, that space is evolving more towards predictive maintenance, trying to detect problems before they occur so you can reduce and remove downtime, not just minimize it, but prevent it from happening in the first place. And then where we're focused today, which is where I like to think the magic is, is on the prescriptive or agentic, not just predicting the problem, but also analyzing and providing the solution to that problem and executing it for you, basically allowing you to have a strong solution that not just figures out what's wrong, but how to fix it.


All right. So a little more depth. This is a little bit deeper dive into the overall ecosystem that we've put together.

Successfully automating and driving efficiency across factory management requires gathering significant amounts of data. Data is a key factor. We collect sensor data from machines, operations information like maintenance logs, and other information around the facilities such as production schedules. We combine all of this into a digital twin that AI agents and LLMs can then operate on, which is work they are well suited for. All of that is then delivered to the customer as an overall factory solution, through mobile or other EdgeRX interfaces, with notifications that tell you what the system is doing, when, why, and how.


As we look through what our solution encompasses, this is really what we have deployed today. We have a gateway device connected to many sensors, which are located within the factory and read information. All of that is connected to the SaaS platform in the cloud, where AWS services provide data storage and ML model training. I will go into quite a bit more detail on the architecture behind that shortly. Overall, this is hosted within various AWS regions, so we have a global platform across different parts of the world, and we asynchronously communicate what's going on to the customer through the dashboard and other notification mechanisms.


AWS-Based Architecture: Security-First Design with IoT Core, Greengrass, and Multi-Tenancy

Now I'm going to take a pretty deep dive into what's behind this, what's the architecture, and what have we learned as we were building this platform. This diagram represents a high-level overview of the overall set of services that we've built for this solution. This contains the components of the solution which are on the edge in the factory itself, as well as the components which live in the cloud. These are key things that we've built ourselves and areas where we've leveraged components from AWS and their services.

Things like AWS IoT Core and Greengrass provide a strong foundation for the IoT software integration, communications to the facilities, and provisioning solutions. We have significant amounts of time series data, so we use Amazon Timestream as the database for that. We leverage Apache Iceberg for long-term data storage, as well as containers, Kubernetes, and other solutions for the things that we've built. For the AI and ML components, we heavily leveraged Amazon SageMaker Pipelines and AWS Step Functions for building out the solutions, as well as some other components which aren't depicted here, such as Amazon Cognito for our identity management. Overall, the AWS foundation provides a very strong platform on which we can build a custom solution that blends the two worlds of the facility and the cloud.


Here I'll dive down a little bit into the boxed area. This really highlights the interface between the facility itself and what's in the cloud. I'm zooming in here on IoT Core, Greengrass, and MQTT. We've built a solution where all communication from the facility out to the cloud runs over a single protocol, MQTT encrypted with TLS. This is a very important part of the architecture because security and access to the facility is a massive concern for these manufacturers. They really don't want to expose something that's making thousands of computer chips or wafers to data outside their firewall. IoT Core and Greengrass provide for that provisioning and for the ability to communicate and manage this data efficiently and effectively.
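To make the single-protocol idea concrete, here is a minimal Python sketch of the device side: a TLS context that verifies the broker, and a telemetry message ready to be published over MQTT. The topic layout, field names, and function names are assumptions made for illustration, not TDK SensEI's actual scheme, and the MQTT client itself is deliberately left out.

```python
import json
import ssl
import time


def build_tls_context() -> ssl.SSLContext:
    """Create a client-side TLS context that verifies the broker's certificate."""
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # A real device would also load its own certificate and private key here
    # with context.load_cert_chain(); the file locations are deployment-specific.
    return context


def build_telemetry(tenant_id, device_id, vibration_mm_s, ts=None):
    """Return (topic, payload) for one reading. The topic layout is hypothetical."""
    topic = f"tenants/{tenant_id}/devices/{device_id}/telemetry"
    payload = json.dumps(
        {"ts": time.time() if ts is None else ts, "vibration_mm_s": vibration_mm_s},
        sort_keys=True,
    )
    return topic, payload
```

The context and the `(topic, payload)` pair would then be handed to whichever MQTT client the gateway uses; the device only ever opens the connection outward.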


As I was saying, security is a huge part of the solution that we've built. Factories are very concerned about IT security, and this is another place where we're able to build on top of what AWS provides. They have a very strong set of services, things like logging and monitoring with CloudWatch and EventBridge. These tools allow us to automatically capture what's occurring in the facility and alert customers to it. AWS has built AI and ML into these tools as well, with things like threat intelligence through GuardDuty. By leveraging these services, we're able to understand whether there are security concerns for our customers. The other thing is that many of these factories have ISO 9001 and other certifications. Compliance is a huge thing: making sure that the processes and data they're using are being used in the correct way.

Automated compliance checks through AWS Config allow us to make sure that we stay compliant with their ISO 9001 rules for everything we deploy inside their factory.


The other thing to consider is scale: we have tens of facilities today and are going to have tens of thousands, with multiple facilities per customer. So the architecture we've built takes a strongly federated and segmented approach to multi-tenancy. The only services that are exposed publicly are the ones that allow the devices to connect, and each customer lives in their own private subnet with their own private services. Each facility is isolated from the others. That means ML data does not cross from one client's training environment to another's.

So data isolation and the ability to keep all this training information separate is another critical aspect of building a platform in the cloud with AWS at scale. The other thing that's interesting is that all connectivity runs from the facility out to the cloud. There's never any inbound connectivity from the cloud to the facility. This reduces IT configuration, because customers don't have to open holes in firewalls. It's all outbound. All of the firmware updates happen through that phone-home mechanism that we've built into the architecture.
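One way to picture the isolation rule is as a namespace check: a device may only publish inside its own tenant's area. The sketch below is an illustration under assumed names (`tenant_prefix`, `may_publish`, and the `tenants/<id>/` topic layout are hypothetical); the talk describes isolation through private subnets and per-customer services, which this simplified check does not reproduce.

```python
def tenant_prefix(tenant_id: str) -> str:
    """Topic prefix owned by one tenant (hypothetical layout)."""
    return f"tenants/{tenant_id}/"


def may_publish(device_tenant: str, topic: str) -> bool:
    """Allow a publish only inside the device's own tenant namespace."""
    # MQTT wildcards are only valid in subscriptions, never in a publish topic.
    if "+" in topic or "#" in topic:
        return False
    # The trailing slash in the prefix keeps "acme" from matching "acme-other".
    return topic.startswith(tenant_prefix(device_tenant))
```

In a managed broker the same rule would normally be expressed as a per-device access policy rather than application code.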


So here this is a quick example of one of the things I was speaking to. This is the device provisioning. We ship out sensors and gateways and all of those devices are equipped with a certificate that simply allows them to connect to what we call the lobby. They connect out to the lobby, they understand what device account, what customer account they're registered to, and then they receive new credentials that allow them to connect directly to that customer's federated tenancy area of the solution.

So through this approach, we can preconfigure all these devices in manufacturing and ship them out, and they remain secure because they can only speak to the lobby until they've received the appropriate credentials through this automated process. This means we don't have to preconfigure customer-specific settings in the manufacturing process. We can ship the devices out and then rapidly get them deployed.
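The lobby flow described above can be sketched as a small credential exchange. Everything here is an assumption for illustration: the `Lobby` and `Credentials` names, the in-memory registry, and the random token standing in for a real certificate. It is not the production provisioning service.

```python
import secrets
from dataclasses import dataclass

LOBBY_SCOPE = "lobby"


@dataclass(frozen=True)
class Credentials:
    device_id: str
    scope: str  # "lobby" for factory-issued credentials, otherwise a tenant id
    token: str  # stand-in for a certificate


class Lobby:
    """Exchanges factory-issued lobby credentials for tenant-scoped ones."""

    def __init__(self, registry):
        self._registry = registry  # maps device_id -> tenant_id

    def exchange(self, creds: Credentials) -> Credentials:
        if creds.scope != LOBBY_SCOPE:
            raise PermissionError("only lobby credentials are accepted here")
        tenant_id = self._registry.get(creds.device_id)
        if tenant_id is None:
            raise LookupError(f"device {creds.device_id} is not registered")
        # Issue fresh credentials that are valid only for this tenant's area.
        return Credentials(creds.device_id, tenant_id, secrets.token_hex(16))
```

A device shipped with `Credentials("gw-1", "lobby", ...)` can do nothing except call `exchange`; only after registration to a customer account does it receive credentials scoped to that tenant.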


AI and ML Capabilities: Automated Predictive Maintenance with Agent-Driven Solutions

All right, so I'm going to shift gears a little bit here and talk about what is probably my favorite part of this: the AI and ML components, the new area we're focused on. In the same architecture, we now have what I consider to be the magic, the hallmark of the TDK SensEI solution, which is the power of AI and ML applied to automate, predict, and create a solution.

The core foundation is highlighted here: we leverage Amazon SageMaker and Step Functions to automatically build and train ML models for our sensors. Our customers, the facility leaders, don't have to deploy an ML model or do any kind of training themselves. We deploy the sensors, they learn what normal operation looks like in the facility, and the resulting model is built into our device and runs directly in the sensor. From that point on, it can detect anomalous behavior.
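As a rough illustration of "learn what normal is, then flag deviations", here is a minimal baseline detector in Python. It uses a simple z-score against the mean and standard deviation of a learning window; that technique is swapped in for illustration and is not the model TDK SensEI trains with SageMaker. The class name and the 3-sigma default are assumptions.

```python
from statistics import mean, stdev


class BaselineDetector:
    """Learns a normal range from healthy readings and flags large deviations."""

    def __init__(self, threshold_sigmas: float = 3.0):
        self.threshold_sigmas = threshold_sigmas
        self.mu = None
        self.sigma = None

    def fit(self, normal_readings):
        """Record the mean and sample standard deviation of the learning window."""
        self.mu = mean(normal_readings)
        self.sigma = stdev(normal_readings)  # needs at least two readings

    def is_anomalous(self, reading: float) -> bool:
        if self.mu is None:
            raise RuntimeError("fit() must be called before is_anomalous()")
        if self.sigma == 0:
            # A perfectly constant baseline: any change counts as a deviation.
            return reading != self.mu
        return abs(reading - self.mu) / self.sigma > self.threshold_sigmas
```

Because the fitted state is just two numbers, a detector of this kind is small enough to run on a constrained device, which is the property the talk emphasizes.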


So this is a very automated solution. There's no manual configuration required to get these sensors deployed, and it's all built on top of the rich set of AWS primitives that we've built out here. All right. Now, where's this all going? I'll talk a little bit about the next phase, which is not just telling you you're going to have a problem, but solving the problem for you before it occurs. That's really where agents come to bear: agents taking action on your behalf, with the permissions you've given them, based on the rich set of data we have from our sensors.


So I'm going to walk through an example here. This is what it would look like if you're a factory manager and you have an upcoming failure. This is the experience of the EdgeRX platform that we've built out. You as a factory manager would get a notification saying, hey, you've got a pump that has bearing wear. We've detected it's going to fail in the next two weeks. We already know what parts are required because our agent has read the manual for this pump and it knows how to do the repair.


It knows what's in your ERP system. It knows what parts you have available, it knows what parts are necessary, and it knows what skills are required for the repair. So you get all that in a summary. As the factory manager, you can dive down for a little more depth. You can ask, okay, why? What's going to happen? Here is an example: we've gone through the information coming from our sensors, and we're predicting this failure because the vibration on this particular pump is going to cross the acceptable threshold.
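The idea of predicting when vibration will cross an acceptable threshold can be illustrated with a straight-line extrapolation. This is a simplified stand-in, assuming a linear trend and a hypothetical function name; the platform's actual prediction model is not described in the talk.

```python
def days_until_threshold(days, values, threshold):
    """Estimate days from the last sample until a fitted line reaches threshold.

    Returns 0.0 if the latest value is already at or over the threshold, and
    None if the trend is flat or falling (no crossing predicted).
    """
    n = len(days)
    if n < 2 or n != len(values):
        raise ValueError("need at least two samples of equal length")
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("samples must span more than one point in time")
    # Ordinary least-squares slope and intercept.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values)) / sxx
    intercept = mean_y - slope * mean_x
    if values[-1] >= threshold:
        return 0.0
    if slope <= 0:
        return None
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - days[-1])
```

For example, readings rising by 0.5 per day from 2.0 over days 0 to 4, against a threshold of 7.0, give an estimate of 6 days after the last sample.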


We know what parts are required to make the repair on this, and the system has looked at it and recognized that one of the parts is not in inventory. If you trust it and have it configured that way, we have an agent that could make the purchase for you and then track the availability of that part being shipped. Alternatively, it could inform your parts department that they need to make the purchase if you prefer to keep the financial aspect inside your organization. It will track when the part is available and then schedule the repair when it's ready to be done. It knows exactly what skill set is required and is able to connect into your time management system that's available in the factory to determine what resources are available for scheduling. Effectively, what happens is this factory manager now has a solution to this problem without having had to do any actual work to address it.


So what's happening behind the scenes is a framework of agents that work together. We have an orchestration agent that's responsible for understanding the workflow. For example, do you allow your agents to make purchases, or do we need to notify a human that they need to make a purchase? All of the information and data available from the sensors and tools is connected together. We use retrieval-augmented generation (RAG) to do this, and we effectively have a rich digital twin of the architecture. That digital twin covers not just what's happening in the factory, but all the supporting systems like ERP, personnel scheduling, and so on.
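The purchase-or-notify decision made by the orchestration agent can be sketched as a small planning function. This is a plain rule-based illustration with hypothetical names (`Policy`, `plan_repair`, and the action labels); the real system uses agents on Amazon Bedrock, which this sketch does not attempt to model.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    # Whether the customer has authorized agents to buy parts on their behalf.
    allow_agent_purchases: bool = False


def plan_repair(required_parts, inventory, policy):
    """Return an ordered list of (action, detail) steps for one repair."""
    actions = []
    missing = [part for part in required_parts if inventory.get(part, 0) < 1]
    for part in missing:
        if policy.allow_agent_purchases:
            actions.append(("purchase", part))
        else:
            # Keep the financial decision inside the customer's organization.
            actions.append(("notify_parts_department", part))
    if missing:
        actions.append(("wait_for_parts", tuple(missing)))
    actions.append(("schedule_repair", tuple(required_parts)))
    return actions
```

With a bearing out of stock and purchases disabled, the plan notifies the parts department, waits for delivery, and only then schedules the repair, which matches the sequence described in the walkthrough.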


A little bit of a summary here. Overall, we've built a platform leveraging AWS and all the things we've learned with the sensors and capabilities that TDK has in its rich 90-plus year history to produce something that has the ability to have low latency and do real-time processing. We built privacy and data security, which is such a critical aspect for these factories, from the get-go. It's built in from the very beginning, not something we added later. We have the ability to do resource optimization with the edge AI resources we've got here. We know when things might go down and are able to keep you from outages.

All of these capabilities can be personalized to the role of the persona. Are you the factory manager? Are you the maintenance worker who needs to see what parts are available? All of this interaction is very persona-driven. Connectivity independence means that this solution continues to work even during a cloud outage. It's still receiving data, it's still processing, and we can still access the vibration data. We can still detect failures even when the cloud is unavailable. So the system is resilient to those types of events, and we're moving away from just prediction to creating solutions as well.
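One common way to get this kind of connectivity independence is a store-and-forward buffer at the edge: readings are kept locally and sent once the cloud is reachable again. The sketch below assumes that pattern and uses hypothetical names (`EdgeBuffer`, `send`); the talk does not say how the gateway implements it.

```python
from collections import deque


class EdgeBuffer:
    """Bounded local queue that holds readings while the cloud is unreachable."""

    def __init__(self, capacity: int):
        # With maxlen set, the oldest reading is dropped when the buffer is full.
        self._queue = deque(maxlen=capacity)

    def record(self, reading):
        self._queue.append(reading)

    def flush(self, send) -> int:
        """Send buffered readings in order; stop at the first failed send.

        `send` is a caller-supplied function returning True on success.
        Returns the number of readings delivered.
        """
        sent = 0
        while self._queue:
            if not send(self._queue[0]):
                break
            self._queue.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._queue)
```

Local anomaly detection keeps running against readings as they are recorded, so an outage only delays the upload, not the detection.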


As an overall summary, as I think Oren mentioned, we are available on the AWS Marketplace now. The solution has a variety of different sensors that we can bring to bear. We have mobile as well as web-based dashboard interactions. Again, we're focused on some key areas in infrastructure, the manufacturing space, and energy. Smart buildings are almost like manufacturing plants these days. They have all the smart pumps and fans and all those things, so a smart building is almost a factory in and of itself. Logistics and distribution solutions are another really big area where we're focused today.

That's it. Thank you for coming and hearing a little bit more about TDK SensEI and kind of where we're trying to take this. We're building a solution that we think is really the next generation of industrial automation and factory management. Thank you.


This article is entirely auto-generated using Amazon Bedrock.
