🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.
Overview
📖 AWS re:Invent 2025 - Beyond SFTP and NFS: Automate enterprise file transfers at scale (STG361)
In this video, Jeff Bartley and Smitha Sriram from AWS demonstrate how DataSync and Transfer Family modernize managed file transfers across three key use cases: migrations, recurring transfers, and B2B data exchange. They showcase DataSync's enhanced mode achieving 8x faster speeds for petabyte-scale migrations, with London Stock Exchange Group moving 30 petabytes in 3 months while reducing storage costs by 80%. A live demo illustrates DataSync copying data from NFS to S3, then accessing it via Transfer Family web apps. The session covers FICO's infrastructure reduction using Transfer Family, Trustwell Food Logic's improved uptime with AS2/SFTP automation, and a food & beverage company saving over $1 million annually by migrating 100+ trading partners to AWS B2B Data Interchange. Real architectures demonstrate event-driven workflows combining Step Functions, EventBridge, and DynamoDB for automated file processing, malware scanning, and EDI document translation using generative AI.
; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
Introduction: Modernizing Managed File Transfers with AWS
Good morning and welcome to re:Invent. We're excited to be with you and hope you have a wonderful week here. If this is your first time at re:Invent, it's a marathon, so pace yourself. You'll be exhausted by Thursday, but hopefully it's a great week for all of you. My name is Jeff Bartley, and I'm a Principal Product Manager on the DataSync team. I'm going to be joined by Smitha Sriram, who owns product management for a couple of our services, including Transfer Family. She'll talk about that as well. So again, we're excited to have you with us.
In this session, we really want to show you how the services that AWS provides can help you modernize your managed file transfers. When we talk about managed file transfers, we're talking about the ability to move files at scale, whether it's migrations, synchronization, or part of your business processes. We're really enabling you to move files at scale quickly, securely, and reliably. We've put a lot of work into making it effortless to integrate our services into your business workflows, and we hope you'll leave this session with ideas on how you can better leverage AWS for your file transfer needs.
We'll start this session by talking about some of the challenges that our customers experience today when it comes to managed file transfers, some top use cases they're trying to address, and then our approach to making it easy for our customers in each of those use cases. As we go, we'll weave in some real customer examples, talk through some architectures, and I'll show you a demo of using two of our services, DataSync and Transfer Family web apps together, so you can see how those can integrate with each other. Then we'll wrap up, leaving you some resources to try it out in your own AWS environment.
Managed file transfers are critical to a wide variety of use cases across businesses, organizations, and industries such as financial services, healthcare, life sciences, manufacturing, and many more. When we talk about unstructured data, we're referring to things such as files like spreadsheets, machine-generated data, AI models, reports, images, and videos. It's everywhere. If you manage storage of any kind, you're probably well aware of the proliferation of this data. There's a ton of it out there. Enabling customers to move this data quickly and reliably is key to managed file transfers.
With data increasing in size and volume and many organizations being spread more globally, exchanging this data quickly and reliably has become more important than ever.
As we go out and talk to our customers about what they're trying to do, especially when it comes to file transfers, we consistently hear about some of the challenges they face with their legacy file transfer solutions, and maybe you face similar challenges. These include managing things like hardware and software renewals or responding to security issues that can take so much time and effort to deal with. We're also hearing more and more from customers about how difficult it is to find talent that's familiar with these legacy solutions. They need to manage them effectively, but it's becoming difficult to find people with the right kind of experience to manage these solutions. So there's a variety of challenges that our customers are facing when it comes to managing these legacy MFT solutions.
AWS's Seven-Year Journey in MFT: Services, Principles, and Partner Success
Over the last seven years, we've been helping customers address some of these challenges, helping them migrate their legacy MFT solutions to AWS and taking away a lot of that undifferentiated heavy lifting so they can focus on the core competencies of their business. DataSync and Transfer Family were both launched at re:Invent 2018, just about seven years ago, with the goal of making data movement effortless for our customers regardless of where their data is located. Over the years, as we've talked to customers like you, we've expanded our capabilities to support more data types, more protocols, and more ways of moving data quickly and securely.
A significant expansion of our capabilities came with the launch of B2B Data Interchange, which we often call B2BI, enabling customers to automate the exchange and processing of EDI documents with their trading partners. EDI is core to many workflows that depend upon managed file transfers, and B2BI addresses key challenges in this area around automation and document translation. Smitha is going to talk about that a little bit later in this session.
Our approach to tackling the challenges of MFT really focuses on some of these key principles. Our services encrypt data in flight and at rest, and we validate that all data is moved over the wire byte for byte using checksums all along the way. Automation is key to MFT workflows. If you've built some of your own, you know this, and our solutions integrate seamlessly with other AWS services like Step Functions, Lambda, and EventBridge so you can build those automated workflows.
We also know that generating reports of file transfers is critical to meeting regulatory and compliance needs, and our services provide extensive logging to help you meet those needs.
This focus on addressing customer challenges is why tens of thousands of customers around the world have benefited from our services in the MFT space, and you can see just some of the customers across a wide variety of industries. MFT workflows can cover a broad array of business-specific challenges, so often this requires a high degree of customization to meet the needs of your specific business. This is where partners can really help, and we've had a variety of customers who have told us that working with partners has really helped them streamline and accelerate the migration process of bringing their legacy MFT solutions into AWS, leveraging services like Transfer Family and DataSync. Partners can help tailor these solutions, providing customization and automation, integrating automation with some of those other services I mentioned to really meet the needs of their specific business.
As an example, these are a number of partners who are validated to follow best practices with AWS Transfer Family. Scale Capacity, down there in the right-hand corner, helped the City of Los Angeles who was working on a file transfer system where they needed to enable their employees to download financial records or allow partners to access these records. They were initially looking at a solution to run on-premises and found that it was going to cost them hundreds of thousands of dollars annually. By working with Scale Capacity, they were able to implement a cloud-native solution leveraging AWS Transfer Family, and it saved them a significant amount of money, many thousands of dollars annually. As a citizen of Los Angeles, it's always nice to see my city using tax dollars efficiently. This was a great success story for the City of LA, leveraging a partner specifically to get this work done.
AWS DataSync: Accelerating Large-Scale Data Migrations
So as I mentioned, we're going to cover a few use cases. The first of them is migrations, so we'll dive into that and I'll start by talking about AWS DataSync, which was really built to help our customers streamline their migrations into AWS. DataSync was built to move data quickly. It uses a custom in-house built protocol that moves data over a network in a parallel fashion, enabling it to move data very quickly at high scales and high speed. It's built to be secure and reliable. All data is encrypted in flight using TLS, and we check every byte that gets transferred with checksums all along the way, from reading from the source to writing to the destination. We're validating checksums all along the way to ensure the data is transferred securely and reliably.
DataSync can scale to massive sizes. We've got customers using DataSync to move petabytes of data, billions of files, very large data sets. DataSync can scale to that, and it's designed to be easy to use, so it has built-in filtering, scheduling, and audit logging, which is critical for a number of MFT use cases. DataSync really focuses on moving data in three areas, and the first of these is enabling on-premises transfers. DataSync supports systems that can talk a variety of protocols. This could be like a NAS file server that can communicate over NFS or SMB protocols. It supports on-premises object storage systems that use the S3 protocol, and it also supports Hadoop clusters. So if you're transferring or migrating data from Hadoop into the cloud, you can use DataSync for that.
At the other end, DataSync works natively with a variety of our storage services in AWS, including Amazon S3, Amazon FSx (all FSx file system types), and Amazon EFS. So you can use DataSync to copy data both to and from these various locations. DataSync also supports moving data between other clouds and AWS. We support a number of clouds including Google Cloud Storage, Azure Blob Storage, and Oracle Cloud. Basically, if a cloud has cloud storage that supports the S3 protocol, we can typically talk to it.
You can use DataSync to move between other clouds and AWS. We've also got a lot of customers who are using DataSync to move data between storage services within AWS entirely. This may be for backup purposes or to move data around between regions. Using DataSync, you can move between any combination here. You could go S3 to FSX, EFS to S3, any combination of those. We support intra-region, cross-region, and cross-account transfers. All data transferred with DataSync is encrypted and written over the AWS network. The data never leaves the AWS network at all in this particular situation.
To help our customers streamline their migrations, we launched enhanced mode, which helps customers move data sets with virtually unlimited numbers of files at speeds up to 8x faster than basic mode. Enhanced mode provides detailed metrics that help you track files throughout the data transfer process. It also streamlines cross-cloud transfers, allowing you to move data between clouds without deploying any infrastructure. We've had a number of customers take advantage of that, and they really like the ability to move data between clouds completely API driven with no infrastructure to deploy, making it super easy for customers.
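As a rough illustration of how API-driven this is, here is a minimal sketch using boto3 that creates an enhanced-mode task between two existing DataSync locations. The location ARNs are hypothetical placeholders, and the parameters should be checked against the current DataSync API.

```python
# Minimal sketch: create a DataSync task in enhanced mode (boto3).
# The location ARNs below are hypothetical placeholders.
import boto3

datasync = boto3.client("datasync")

task = datasync.create_task(
    SourceLocationArn="arn:aws:datasync:us-east-1:111122223333:location/loc-source",
    DestinationLocationArn="arn:aws:datasync:us-east-1:111122223333:location/loc-destination",
    Name="cross-cloud-migration",
    TaskMode="ENHANCED",  # enhanced mode: higher throughput, virtually unlimited file counts
)
print(task["TaskArn"])
```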
Let's dive a little deeper into how DataSync helps with accelerating migrations. If you've performed a data migration in the past, you're probably familiar with this kind of pattern. Most migrations go this way, where they start with an initial transfer of a large amount of data that needs to be copied to the destination. Then you run a series of incremental transfers over time on a typically fixed period, maybe daily or every couple of days, until you're ready to do your cutover, which is the point at which your application switches from the original data set to the data set you've migrated.
DataSync is really built to help with each of these stages, so it has a number of key capabilities that help make migrations simple and efficient. It transfers both data and metadata, which is critical when you want to maintain things like file system permissions. It also provides filters that help you ensure that only the data you need to move is actually migrated. It can scale up to maximize your network bandwidth, and it has those metrics I mentioned earlier with enhanced mode that can really help you understand the scope of your data and what was moved.
Here's a quick video of DataSync running an initial transfer so you can see it's using enhanced mode. These metrics can help you understand exactly what's being moved. This is a data set of about 200,000 files. You can see as DataSync goes, it's listing, preparing, transferring, and verifying. One of the advantages of enhanced mode is that it does all of those in parallel. As it's going, it's transferring and verifying, and it's hitting a throughput speed of close to 600 megabytes per second, which is really nice for that initial transfer. It helps you see that all the data found at the source was transferred all the way through and ultimately verified with good throughput. In under 30 minutes, we transferred close to 1 terabyte of data.
That was the initial transfer. As you go into incremental transfers, DataSync also has additional capabilities. After that initial transfer, you can configure DataSync to run on a schedule so that you can perform those incremental transfers until it's time to cut over. Running transfers on a schedule is really important because it can help you estimate how long your cutover will take, which is a frequent ask of customers when they're working with us. They ask, "When do I have to cut over? Because cutover involves downtime, how long is that cutover going to take?" So when you run that incremental transfer with DataSync, it can help you estimate roughly how long you need for a cutover period.
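A minimal sketch of what that scheduling might look like through the API, assuming the task ARN from the initial transfer; the cron expression here is just an example (nightly at 02:00 UTC).

```python
# Minimal sketch: put an existing DataSync task on a nightly schedule (boto3).
# The task ARN is a hypothetical placeholder.
import boto3

datasync = boto3.client("datasync")

datasync.update_task(
    TaskArn="arn:aws:datasync:us-east-1:111122223333:task/task-0123456789abcdef0",
    Schedule={"ScheduleExpression": "cron(0 2 * * ? *)"},  # incremental transfer every night
)
```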
Here's an example of an incremental transfer using that same data set. In this case, what you'll see is that DataSync is still listing all the files and finding them. But during the prepare phase, it's comparing those files against the destination. In this case, there were only 581 files that actually changed, which is pretty typical. When you're doing an incremental transfer, you'll find that most of the data is static and not changing as you go, so those are all marked as skipped, meaning they didn't need to be transferred or verified. In this case, you can see it took a lot less time—under 3 minutes. So if this was typical of the change rate of my files, I would probably estimate a 5-minute cutover.
If you had more data or your change rate was higher, it could take a little bit longer, but this is a good way to estimate how long you need for that cutover period when you're performing migrations with DataSync. Then there is a pattern that a number of our customers use when they really want to maximize their network bandwidth. They may have tens of gigabits per second of network bandwidth, and a single DataSync task may not be enough to totally fill that pipeline.
So they can leverage this pattern, which is basically partitioning your data set into multiple subsets of data and then running individual tasks to move that data in parallel. We've had customers use this to move petabytes of data a day. If you want to learn more about that pattern, definitely check out that QR code there. That was a storage blog that was written by one of our solutions architects who's sitting in this crowd right now.
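As a hedged sketch of that partitioning pattern, the loop below creates one task per top-level prefix and starts them all so the transfers run in parallel. The location ARNs and prefix names are hypothetical, and in practice you would size the partitions to your data set.

```python
# Minimal sketch: partition a data set into prefixes and run one DataSync task per prefix.
# Location ARNs and prefixes are hypothetical placeholders.
import boto3

datasync = boto3.client("datasync")

SOURCE = "arn:aws:datasync:us-east-1:111122223333:location/loc-source"
DESTINATION = "arn:aws:datasync:us-east-1:111122223333:location/loc-destination"

for prefix in ["/partition-a", "/partition-b", "/partition-c", "/partition-d"]:
    task = datasync.create_task(
        SourceLocationArn=SOURCE,
        DestinationLocationArn=DESTINATION,
        Name=f"migrate{prefix.replace('/', '-')}",
        TaskMode="ENHANCED",
        Includes=[{"FilterType": "SIMPLE_PATTERN", "Value": f"{prefix}/*"}],  # only this subset
    )
    datasync.start_task_execution(TaskArn=task["TaskArn"])  # run the partitions in parallel
```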
In fact, we had a customer, London Stock Exchange Group, who needed to do a large migration of about 30 petabytes of historical market data from another cloud into AWS. They leveraged a partner, DataArt, to help them scale out this migration. They moved 30 petabytes in about 3 months. I remember when they were doing this, they were moving close to a petabyte of data a day. That's a lot of data to move, but DataSync can scale, and they completed the migration well ahead of their target timelines.
Another win for them is that by moving off of the other cloud and into AWS, specifically into Amazon S3 and S3 Glacier, they were able to see a significant reduction in their storage costs. They saw about an 80 percent reduction in this case, so a big win. They were able to meet their migration timelines as well as save a significant amount of money on their storage costs.
Recurring Transfers: From Data Protection to Machine-Generated Workflows
So we talked about migrations. Let's move into a second use case around recurring transfers. Whereas migrations are usually project-based or one-time data movement projects, there are other use cases where data is being moved on a regular basis. We're going to cover some of these with customer examples throughout this section.
A lot of our customers continue to have data on on-premises storage either because they haven't migrated the application yet or maybe they can't move the application and it will need to continue to run on-premises for an indefinite period of time. But they still want to make a second copy of that data just to protect it. So they'll often leverage DataSync using this pattern where they'll set up a DataSync agent in their on-premises environment and configure DataSync to run on a schedule, then just run it on a regular basis. This enables them to meet their data protection goals.
Using this pattern, DataSync will automatically run on the schedule, safely and securely moving that data anywhere it's needed. A lot of our customers will often use this to make a second copy of data in S3, taking advantage of its low-cost storage. It's a great pattern. In fact, we had a customer, Santos, who needed to do something similar to this. They produce terabytes of backup files on a daily basis, and they needed to make a second copy of that data in order to meet some of their internal compliance needs.
The customer started off, as I've seen a lot of customers do, trying to write their own solution and they quickly ran into a number of headaches that DataSync was designed to address. Things around the overhead of managing and deploying scripts, dealing with errors, and validating data along the way. After learning about DataSync, they were able to get it up and running quickly and vastly simplify their processes. It really enabled them to focus on more of their critical business challenges and just get away from do-it-yourself solutions.
Now at this point we're all well aware of the explosion of the use of generative AI, and we're seeing more customers where DataSync is becoming increasingly important to their data pipeline, particularly when it comes to building out data lakes on AWS for AI training and inference purposes. These data sets can often consist of billions of files and petabytes of data. DataSync can help customers get that data into AWS storage services like FSx for Lustre or Amazon S3, which were built to scale out and achieve the levels of data movement necessary for a lot of these generative AI workflows.
Now another area where we see DataSync helping customers with recurring transfers is with machine-generated data. And if you think about all the machines out there that are running in on-premises environments, they're generating enormous amounts of data.
These include genome sequencers, manufacturing systems, and lab instruments in hospitals. Our customers want to get all of that data into the cloud so they can process and analyze it without building out compute systems on premises. DataSync helps customers streamline these data pipelines and reduce the time it takes to get results from their data.
This is an example of a common solution we see with customers using machine-generated data. The data is typically generated and stored temporarily in some kind of on-premises storage system, such as a file server. Customers set up DataSync to copy the data, triggering a DataSync task when the lab instrument job starts, and transferring the data into something like S3. When the transfer is complete, it triggers an EventBridge notification, which then triggers cleanup of the on-premises environment to remove the data that's no longer needed since it's all up in the cloud. This makes room for new jobs that need to be run on premises, allowing customers to keep their on-premises storage uncluttered and available for all their processing needs.
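As a sketch of that cleanup trigger, the snippet below creates an EventBridge rule that matches DataSync task-execution state changes and invokes a hypothetical cleanup Lambda. The detail-type and detail fields are my assumptions about the event shape, so verify them against the events in your own account.

```python
# Minimal sketch: trigger an on-premises cleanup Lambda when a DataSync transfer finishes.
# The detail-type/State values and the Lambda ARN are assumptions for illustration.
import boto3

events = boto3.client("events")

events.put_rule(
    Name="datasync-transfer-complete",
    EventPattern="""{
      "source": ["aws.datasync"],
      "detail-type": ["DataSync Task Execution State Change"],
      "detail": {"State": ["SUCCESS"]}
    }""",
)

events.put_targets(
    Rule="datasync-transfer-complete",
    Targets=[{
        "Id": "cleanup-on-prem",
        "Arn": "arn:aws:lambda:us-east-1:111122223333:function:cleanup-lab-storage",
    }],
)
```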
Merck is an example of a customer in the life sciences space that had the challenge of reducing high levels of rejects in their manufacturing processes. Working with AWS, they realized that some of the AI services and machine learning services we provide could help them address this challenge. They utilized DataSync to bring in a lot of the reject data and defect images they were gathering across various sites into S3 so it could be utilized by Amazon SageMaker. Leveraging these services, Merck was able to root cause their issues, optimize their processes, and ultimately reduced their rejects across their product lines by fifty percent. This is a great story showing how DataSync was part of the overall workflow of bringing the data in so it could be better analyzed in AWS.
Demo: Integrating DataSync and Transfer Family Web Apps for Seamless Data Access
Now I'm going to go into a demo of DataSync and Transfer Family web apps working together, building upon that pattern in life sciences and healthcare where machines or lab instruments are generating data on premises and customers need to process that data in the cloud. Many times these processes produce reports that need to be reviewed by users in on-premises environments, such as scientists across the organization or quality control personnel. Using Transfer Family web apps, which we launched at re:Invent last year, end users can now securely and easily access their data stored in S3, such as generated reports. With Transfer Family web apps, there's no need to give your users access to the S3 console or complicated third-party applications. Using these two services, DataSync and web apps together, our customers are able to streamline their machine-generated workflows and pipelines and make that data really available to their global workforce.
Let me dive a little bit deeper into web apps and what it is. Web apps is a capability of Transfer Family that you enable to give users access to data stored in S3. It's a fully managed web application, configured entirely from the AWS console, that uses AWS IAM Identity Center for authentication and S3 Access Grants to authorize access to data in your S3 buckets. This enables you to leverage your existing identity mechanisms. You can also customize web apps with your own branding, and it has a simple interface that makes it easy to get your end users up and running.
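As a hedged sketch of the authorization piece, here is what granting an Identity Center user read access to a prefix might look like with S3 Access Grants via boto3. It assumes an Access Grants instance and a registered location already exist, and the account ID, location ID, and grantee identifier are placeholders.

```python
# Minimal sketch: grant an IAM Identity Center user READ access to part of a bucket
# with S3 Access Grants. Account ID, location ID, and grantee ID are placeholders.
import boto3

s3control = boto3.client("s3control")

s3control.create_access_grant(
    AccountId="111122223333",
    AccessGrantsLocationId="default",                                    # the registered default location
    AccessGrantsLocationConfiguration={"S3SubPrefix": "jb-web-apps/*"},  # the demo bucket
    Grantee={
        "GranteeType": "DIRECTORY_USER",
        "GranteeIdentifier": "a1b2c3d4-5678-90ab-cdef-example11111",     # Identity Center user ID
    },
    Permission="READ",
)
```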
What I'm going to do now is go into a demo. I'll specifically show DataSync being used to copy data from an NFS server in the demo, and then it will be copied into an S3 bucket. With the data in S3, I'll then show how easy it is to leverage web apps in order to access that data in S3 and make it available to your end users.
Let's kick off the demo. Here I'm logged into my NFS server, and you can see that my current directory is /mnt/nfs/demo, which is the folder that contains the data I want to copy. I have an NFS export on my server at /mnt/nfs, which will be important when we configure DataSync. I have five folders in there. I want to copy the two archive folders up into my S3 bucket, but I'm going to keep the production ones on there for now. You can see what's in my archive folders. I have a handful of files. I have a text file that gives a listing of MD5 sums in the archive one folder, and then I have this ignore.tmp file which I don't want to copy. I'll use that to show how you can use exclude filters with DataSync to avoid copying data that you don't want to copy.
From here, we'll move over to the AWS console. This is the S3 bucket that I'm going to use to store my data. This is where I'm going to copy data from my NFS server into this bucket. You can see it's currently empty. There's no data in it right now, but it's called jb-web-apps. That's where we'll store our data in a little bit. I'm going to jump over to the DataSync console and create a task. A task tells DataSync how to move data from one place to another.
I start off by configuring a source location. My source is my NFS server. I deployed an agent earlier, and an agent is a virtual machine that I use to connect to my on-premises NFS server. I select that agent and then specify the IP address or domain name of my NFS server and the mount path, which is how DataSync is going to access the server. If you remember earlier, I showed you that mount point which was /mnt/nfs, so I use that. Then I'm going to specify my destination location, which is an S3 bucket, that jb-web-apps bucket that I showed you earlier. That's going to be my destination location.
The next phase is to configure my DataSync task. I'll give it a simple name so I can recognize it when I'm working in the DataSync console. Then I'm going to specify which data I want to copy. In this case, I'm going to use filtering. I'll add an include filter. If you remember, I only want to copy my archive folders, so I'll specify an include filter that looks like that. Then, if you remember, I also wanted to exclude the temporary file, so I'm going to exclude anything that ends in .tmp. I'll just copy those files and exclude the temp file. Everything else I'll keep the same. I'll review it, everything looks good, and I'll go ahead and create my task.
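For reference, the same include and exclude filters from the demo could be expressed through the API when starting the task; this is a minimal sketch with a hypothetical task ARN.

```python
# Minimal sketch: run the demo task with the archive-only include filter and the .tmp exclude.
# The task ARN is a hypothetical placeholder.
import boto3

datasync = boto3.client("datasync")

datasync.start_task_execution(
    TaskArn="arn:aws:datasync:us-east-1:111122223333:task/task-0fedcba9876543210",
    Includes=[{"FilterType": "SIMPLE_PATTERN", "Value": "/archive*"}],  # copy only the archive folders
    Excludes=[{"FilterType": "SIMPLE_PATTERN", "Value": "*.tmp"}],      # skip temporary files
)
```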
At this point, I've created my DataSync task and I'm going to go ahead and start it. That will kick off the process of actually running the task to copy the data from my NFS server into my S3 bucket. While that's running, we're going to move over to the AWS Transfer Family console and take a look at a web app that I configured already. I already configured this web app with my identity provider. You can see up there that's the IAM Identity Center provider that I created earlier, and I have a user Jeff down there. I've given Jeff access to my S3 bucket via S3 Access Grants. I have my web apps endpoint, which is the actual endpoint that I will access in my browser to access the web apps interface itself. That gets automatically generated for you. This is what it looks like, a very simple interface. You can see my bucket is in there. I click on my bucket and it's empty right now. There's no data in there. Once the data is copied, we'll come back and take a look at that.
We'll jump back over to the DataSync console. You can see we've copied our data. It's about 14 files, a small amount of data, not a lot, but it was copied in a quick amount of time. We can see that the data was copied successfully. Then if we jump over to our S3 bucket and do a refresh, we can see that the data was copied in there successfully. I have my two archive folders. The production folders were not copied. There's the MD5 text file that I'll look at later, and then you can see that the temp file was not copied because it was excluded using that exclude filter that we specified in DataSync.
So we successfully copied the data over. I go and refresh my web apps environment and you can see it just mirrors the data that's in S3. This is because of how I configured the S3 access grants to provide access to my web app. You can see I can download the file. If I open up that MD5 text, it matches what I had on my NFS server.
You can use that if you want to look at any of the MD5 sums for any of the files there. You can see we utilized DataSync to copy the data from on-premises up into AWS, and then from there, leveraging web apps to access that data in S3. The great thing about web apps is that you don't have to go through the S3 console. You can easily enable your end users to access data stored in S3, leveraging single sign-on capabilities or whatever authentication mechanisms you have available today.
This is a great example of showing DataSync and Transfer Family web apps working together. With that, I'm going to hand it off to Smitha, who is going to talk more about Transfer Family and some other use cases.
AWS Transfer Family: Secure File Exchange Across Business Boundaries
Thanks, Jeff. Let's switch gears a little bit here. Jeff talked about DataSync and a little bit about Transfer Family. Now, what if you don't have control over both ends of the file transfer? Let's say there's an application that's generating data that you need, or you need to deliver data to an application that's outside of your purview in a business partner's environment. That's where AWS Transfer Family comes in with its fully managed file transfers over industry standard protocols like SFTP and AS2.
There are a lot of use cases that customers use AWS Transfer Family for today. Notably, one is exchanging documents that are subject to regulatory needs like PII, PHI, and HIPAA. If you're operating in financial services, healthcare, pharmaceuticals, or oil and gas, a number of industries need to exchange data securely, and that's where Transfer Family's MFT capabilities are used. Second is ERP integration. Let's say you have an SAP implementation and so does your business partner, and you want both of these to talk to each other because maybe you operate a supplier and you need to get purchase orders from your manufacturer or your end customer. That's where MFT Transfer Family is used to connect these two applications that are across business boundaries.
If you are a content distributor and you produce value-added data sets, whether that data is for analytics, media, or software, you want an MFT that can scale and expand your subscriber growth as well as protect your revenue source. Transfer Family's fine-grained access controls help you give access to just the data that your users are subscribed to. Fourth, if you are operating a manufacturing facility and you want to capture that data into S3 so that you can detect errors soon or run proactive maintenance, Transfer Family's SFTP is commonly used as a data ingestion pipeline. There are many more use cases where Transfer Family is used, and these are just a few examples.
Looking at this architecture, Transfer Family helps you capture data from a number of different source types. First, manually written scripts, such as SFTP or FTP scripts. Second, non-technical users, similar to what Jeff showed in the demo with Transfer Family web apps, who want to get data in through a commonly used client, the web browser. Third, a remote SFTP source: you might be processing healthcare claims or be part of a payments reconciliation workflow and need to talk to a clearing house's SFTP server. Finally, you might be using the protocol called Applicability Statement 2 (AS2) and need to interact with a partner's AS2 implementation.
Transfer Family gives you a number of resources, from servers to web UI to connectors, to be able to get the data in your Amazon S3 bucket or EFS file system. After that, as the slide says, the possibilities within AWS are endless. One specific service that I'll talk about later is AWS B2B Data Interchange for managed EDI integrations. We've invested a lot of time into making MFT event-driven so that you can automate your file processing end to end.
Let's take a quick look at this architecture. Let's say your business partner sends you a file over one of these industry standard protocols and you're using AWS Transfer Family. As soon as the file lands in your S3 bucket, AWS Transfer Family emits a very specific file transfer event that has all the context about the file transfer.
These detailed events contain all the context about the file transfer, including whether the transfer failed or was successful, whether it was a partial upload, the user's source IP, the username, and the filename. You get a lot of information about the file transfer directly from Transfer Family. Along with that, you can easily kick off a Step Functions workflow, and as part of that workflow, you can tag the file and archive it for auditors later on.
Many of our customers want to perform file format checks. For example, if you expected a CSV from your trading partner but they sent you a JSON, you want to reject that file right away so you can make it right as part of that workflow. Pretty Good Privacy encryption is commonly used, so your trading partner encrypts the file because it contains PII data. As soon as the file lands, you want to be able to automatically decrypt the file. Soon after that, if the file traversed the internet, many of our customers want to scan for malware. If you detect any malware, you want to be able to discard the file right there, and then only let clean files be used by your internal business systems.
You can run all of this within Step Functions, and the beauty of this is that you combine Amazon EventBridge so that you can tailor your processing on a per-user or per-source file basis. There are two components here that help you do that customization. One is our integration with EventBridge, and the second is the use of Amazon DynamoDB to store a lot of this information where the service can read from. The service emits these detailed events, as you can see. You can parse these events and run rules-based processing on them. For example, if the file came from this source IP, this is what you want to do, or if this file was a partial upload, you don't want to take it into your data pipelines because it's going to mess up your calculations. You can run a lot of these rules as part of your Step Functions workflow.
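As a sketch of the EventBridge piece, the rule below matches Transfer Family file-transfer events and routes them to a Step Functions state machine. The detail-type string, state machine ARN, and role ARN are assumptions for illustration, so check the actual event samples your server emits.

```python
# Minimal sketch: route Transfer Family upload events into a Step Functions workflow.
# Detail-type, state machine ARN, and role ARN are assumptions for illustration.
import boto3

events = boto3.client("events")

events.put_rule(
    Name="transfer-family-upload",
    EventPattern="""{
      "source": ["aws.transfer"],
      "detail-type": ["SFTP Server File Upload Completed"]
    }""",
)

events.put_targets(
    Rule="transfer-family-upload",
    Targets=[{
        "Id": "mft-file-processing",
        "Arn": "arn:aws:states:us-east-1:111122223333:stateMachine:mft-file-processing",
        "RoleArn": "arn:aws:iam::111122223333:role/eventbridge-invoke-step-functions",
    }],
)
```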
Because the service recognizes your DynamoDB tables, you can use it to store the username, the credentials, what data they have access to, what source IP you expect them to come from when they access your data, and most importantly, what rules you want to run. For example, if this trading partner is from Company X and they sent a JSON, how do you want to process it? This is super customizable at the end of the day in terms of what kind of processing you want to run in AWS. There is a QR code there with a self-paced workshop where you can build the exact same architecture that I showed you on the previous slide and automate your MFT for greater scale and flexibility.
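And as a sketch of the DynamoDB piece, a step in that workflow might look up per-partner rules by username; the table and attribute names here are hypothetical.

```python
# Minimal sketch: look up per-partner processing rules from DynamoDB inside the workflow.
# Table name and attribute names are hypothetical.
import boto3

dynamodb = boto3.resource("dynamodb")
partners = dynamodb.Table("mft-trading-partners")

def rules_for(username: str) -> dict:
    """Return the processing rules stored for this Transfer Family user."""
    item = partners.get_item(Key={"username": username}).get("Item", {})
    # e.g. {"expected_format": "CSV", "allowed_source_ip": "203.0.113.10", "scan_for_malware": True}
    return item.get("rules", {})
```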
I want to talk about FICO because they built this exact architecture. FICO helps businesses globally in around 80 countries with anything from protecting credit cards from fraud to improving financial inclusion or even increasing supply chain resiliency. As a global leader in credit card analytics, FICO processes massive volumes of sensitive data using MFT. This means secure and efficient file transfers are super critical to their operations. Earlier, they were using a legacy MFT implementation, and with that, they had to manage a whole lot of infrastructure, like SFTP servers and storage, whether it was in use or not. That resulted in a lot of management overhead and costs.
Once FICO switched to AWS Transfer Family, that eliminated a lot of the infrastructure they needed to manage before. They were able to deploy their infrastructure for MFT globally using infrastructure as code within minutes, which was not possible before. They have also lowered their total cost of ownership, and not just lowered TCO. Now they can calculate the costs of running an MFT more granularly and more accurately across different business units at FICO. There is a QR code there that details FICO's MFT transformation journey that I would love for you to check out.
Streamlining Regulatory Submissions with Transfer Family
Let's talk about another important use case for many of you who need to submit regulatory filings as part of operating in pharmaceuticals or the food industry. What our customers have been able to do using Transfer Family is really streamline those submissions. For example, let's say a file is generated by an application running in AWS, such as a PDF of medical records, and you drop that file into S3.
Once you drop it, Transfer Family can listen to that event and send those files over to a documents portal or a documents server hosted by the regulatory body like the FDA or the SEC. Once they receive the file, if you're using AS2, you can receive a confirmation of receipt from the regulatory body so that you can audit and archive that receipt for future purposes. You can automate this whole thing using Transfer Family and Amazon S3. That way, the whole submissions process scales as your business grows and remains reliable, ensuring you make these submissions on time.
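A minimal sketch of that outbound automation: an S3-triggered Lambda handler that sends each newly generated report to the regulator through a Transfer Family connector (AS2 or SFTP). The connector ID is a hypothetical placeholder.

```python
# Minimal sketch: push newly generated reports to a regulator via a Transfer Family connector.
# The connector ID is a hypothetical placeholder.
import boto3

transfer = boto3.client("transfer")

def handler(event, context):
    for record in event["Records"]:                     # S3 ObjectCreated notification records
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        transfer.start_file_transfer(
            ConnectorId="c-0123456789abcdef0",          # AS2 or SFTP connector to the regulator
            SendFilePaths=[f"/{bucket}/{key}"],         # path format: /bucket-name/object-key
        )
```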
Let me talk about Trustwell's Food Logic platform. They help their customers manage supply chain and traceability challenges, which is important for customers operating in the food and drug industry. Trustwell's Food Logic platform uses AS2 and SFTP to receive messages from their supply chain partners. Earlier, they were also using a legacy MFT implementation. When the license expired, their MFT would go down, or if there was a certificate expiration, they wouldn't even know when that happened. This brought down the MFT and disrupted their business for days.
Food Logic consolidated all of those file transfers onto AWS Transfer Family. Now they run their AS2 and SFTP using our service. As a result, they have improved availability and uptime. They're able to onboard new relationships in a much shorter time than before, and because it's a managed service, they're automatically up to date with the latest security and compliance standards.
AWS B2B Data Interchange: Automating EDI Processing with Generative AI
Now let's talk about when processing is part of the file transfer. Electronic Data Interchange is a major driver of that type of processing used in different industries. If you're in healthcare and doing claims processing, or in transportation and logistics tracking shipments, or in manufacturing with an SAP implementation sending a purchase order to your supplier's ERP, this is where EDI is prevalent because it's an industry standard. Some of the standards are X12, EDIFACT, and in healthcare, HL7v2 is very common.
While EDI is used as part of moving data, the problem is that it's not natively compatible with your business applications or your data lakes, which you want to feed with these data pipelines for analytics and even AI/ML. That's exactly the reason we launched AWS B2B Data Interchange at re:Invent 2023 to automate the validation and processing of X12 documents to and from JSON and XML. JSON and XML are common representation formats that you could use with your business systems and data lakes. The service gives you extensive logging and monitoring. Errors are common in this technology, so you want to be able to troubleshoot faster before they become a business problem for you.
Being an AWS managed service, the service automatically scales to your needs and has high uptime and availability. My favorite feature of this service is that it empowers you to use generative AI to generate mapping code between X12 and your custom XML or JSON format. That really reduces the time and effort needed to generate that mapping code. It's a one-time setup, after which documents are automatically translated between these different formats.
Let's take a look at an architecture diagram. Say you interact with a trading partner and they send you a purchase order like X12 850. Many of you might have heard of that. If you receive that document over AWS Transfer Family, say SFTP, FTPS, or AS2, an event is emitted, as I mentioned earlier. Now B2B Data Interchange listens on that event in your bucket to automatically pick up that file, transform it, tell you whether it's good or bad, and give you acknowledgment.
The processed file is then placed into another S3 bucket, or the same bucket if it's yours, so you can use that data to either build your data lake or feed it to an application like SAP, your transportation management system, or your claims processing engine.
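As a rough sketch of that last hop, a Lambda triggered on the output bucket could load the translated JSON and hand it to a downstream system; the queue URL is a hypothetical stand-in for whatever integration your ERP, TMS, or claims engine uses.

```python
# Minimal sketch: consume B2BI's translated output and forward it to a downstream system.
# The SQS queue URL is a hypothetical stand-in for your ERP/TMS/claims integration.
import json
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

def handler(event, context):
    for record in event["Records"]:                     # S3 ObjectCreated notification records
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        document = json.loads(body)                     # e.g. the translated X12 850 purchase order
        sqs.send_message(
            QueueUrl="https://sqs.us-east-1.amazonaws.com/111122223333/erp-inbound",
            MessageBody=json.dumps(document),
        )
```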
There are two things I really want to call out about this architecture. First, you're combining time-tested industry formats like SFTP, AS2, and X12 with modern paradigms like event-driven and generative AI so that you can achieve scale, flexibility, and lower your total cost of ownership of managing an end-to-end MFT. Second, which may not be very obvious, is that you've leveraged your trading partner relationship into a data pipeline. Your trading partners and your trading relationship are now doing double duty by giving you a data pipeline so that you can get real-time insights into your transactional data.
There is a video we just published last week that demonstrates this architecture where the business application is SAP. I'd love for you to watch the video. It's just about eight minutes long and talks about how you can build this end-to-end architecture using SAP as the ERP. Obviously it will work with many other business applications, but the demo focuses more on SAP.
BisCloud Experts is an IT consulting company that automates, accelerates, and empowers customers to scale and innovate with confidence. A customer in the food and beverage industry approached BisCloud Experts because they were using a legacy EDI solution that was very difficult to manage and extremely expensive. BisCloud Experts stepped in, developed their expertise around our services, and helped this customer move their 100+ trading partner relationships and 250,000 message volume per month from their legacy implementation to AWS B2B Data Interchange in a very short period of time.
They also avoided renewing their license and now save over a million dollars annually in licensing costs as a result of this migration. Additionally, they are able to onboard trading partner relationships much faster, in days as you can see from the slide, because the customer now controls the entire process themselves. I just wanted to give a quick shout out to BisCloud Experts who really helped this customer in their time of need.
Wrap-Up: Getting Started with AWS Managed File Transfer Solutions
I know we talked about a lot of concepts today. Just to quickly wrap it up, here's my request. This is the time to pull out your mobile phone and scan that QR code. Send an email to your AWS account manager saying, "Hey, I attended STG361 with Jeff and Smitha, and I learned about how making file transfers effortless can help me unlock innovation in my company. I learned about three use cases and architectures for each of these use cases, I want to easily meet my compliance and regulatory needs, and I want to get started today." We really hope to hear from you so we can help you with that.
There are a few more sessions later today, tomorrow, and Wednesday where each of these sessions will dive into the services that we talked about. I'd love for you to attend and learn more. Finally, thank you for attending. I'd appreciate it if you can fill out the session survey and leave us feedback so we can keep refining our content. Thank you.
; This article is entirely auto-generated using Amazon Bedrock.