Kazuya

AWS re:Invent 2025 - Amazon S3 security and access control best practices (STG316)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Amazon S3 security and access control best practices (STG316)

In this video, Bryant and Akshat from Amazon S3 present security and access control best practices, covering three core tenets and eight actionable recommendations. They explain how S3 views security as an integrated component alongside durability, availability, and performance, emphasizing continuous improvement through features like Block Public Access and encryption by default. The eight best practices are:

1. Block public access.
2. Enable bucket-level keys (which have saved customers $400 million in KMS costs).
3. Divide responsibilities through S3 Access Points, Access Grants, token vending machines, and the newly launched ABAC support.
4. Test security changes on model environments.
5. Leverage AWS Organizations with resource control policies.
6. Extend S3 security to applications through checksums and encryption.
7. Enable logging for reactive security.
8. Plan for durability with features like conditional writes, versioning, Object Lock, and replication.

They provide specific takeaways for developers (adopt the new tagging APIs, enable logging, implement checksums) and leaders (enforce Block Public Access at the organization level, adopt ABAC, invest in test stacks).


; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introduction: Amazon S3 Security and Access Control Best Practices

Hello everyone. Welcome to our session on Amazon S3 Security and Access Control Best Practices. We are excited to talk to you about this today. We have given smaller chalk talks in previous years about our best practices and what you can do in S3 to ensure your data stays secure, but we are glad to be able to talk to a broader audience and hopefully give you some really good tips today.

My co-presenter here is Akshat. He is one of the best product managers I have worked with at AWS. This is Bryant, Principal Engineer at Amazon S3. The running joke is that between the two of us we have 22 years of experience, with 17 of that being Bryant's. If anybody has any questions after the session, we will be outside and happy to answer them.

Thumbnail 70

We are going to be talking today about some of the ways that we think about security at S3, some of our tenets that guide us when we are trying to make decisions. We also have some best practices for you. This presentation goes to 11 because we have 3 tenets and 8 best practices. We will end this session with some actionable next steps, things that you can do today as leaders in your organizations or as developers to make your data more secure.

Thumbnail 110

Thumbnail 130

S3 Security Philosophy: An Integrated Approach to Durability, Availability, and Performance

Let me start with how AWS thinks about security. We went back and forth on this part of the session and essentially we boiled it down to this: security is one part of an integrated whole. Security is not unto itself for us at S3. When we say one part of a whole, the question is what are the other parts of this whole. This is what we talk about internally. The real value is how security works with the other three pillars: durability, availability, and performance, and how they enable customers to meet their needs.

Thumbnail 140

Moving on to what these needs are, working backwards from customer requirements, these are the things that we think about. Customers function on AWS and S3. They are focused on ensuring continuous business access to data. They need their users or their end customers to have access to data at all times. We are looking at ways in which we minimize and mitigate risks to the business, and then we think about delivering business value efficiently. When we think about security, it is all about how security contributes to these things. We want to help serve these business needs.

Thumbnail 190

At S3, we think about security as being everybody's job. This is the whole AWS shared security model: AWS is responsible for security of the cloud, and customers are responsible for security in the cloud. At S3, as designers, it is my job to think about security even on features that might not necessarily be security related. We think about security as a lateral function that works across all of the products that we work on and launch.

Operators at S3 each own their own security model, and then there is a central security team that partners with them. When we talk about it being everybody's job, I might be working on a performance optimization in S3 or a brand new feature with new APIs, but as I think about those things, I need to also be thinking about security. We have a centralized security team in S3 that helps those teams focus on the right things and concentrate on those things, because there is pressure to stay in your lane, to do performance or work on availability improvements without thinking about how that impacts security. We really need to think of it as one integrated whole, like the whole product is trying to solve a problem for customers.

Thumbnail 270

The last one is customers. All of you here, we think security is your job as well. We build tools essentially for any job. We think about S3 as the bottom turtle. If it is turtles all the way down, the only thing lower is the hardware. It is just the hardware, right? It is hardware, S3, and then everybody builds on top. We think about that, and it is part of our best practices today. We will talk about 8 different things that we have seen customers do that have helped them stay safe and secure in the cloud, and also things we generally recommend that customers do.

Thumbnail 310

The last piece I want to talk about is that continuous iterative improvement is a must. This is something that we take extremely seriously at S3. Continuous improvement essentially works in two different ways. One is how we launch new features based on customer feedback to make S3 easier to use and help customers stay safe on the cloud. The second is how we simplify security.

Thumbnail 340

Continuous Iterative Improvement: Simplifying Security Through Default Settings and Feature Reduction

Some examples of that from the last couple of years are the top two on the left: S3 Block Public Access enabled by default and ACLs disabled by default. This is actually the first launch that Bryant and I worked on when I came to S3. It's a remarkable change. You create new buckets and Block Public Access is turned on. Most customers don't use S3 for public access. Most people don't turn it off. They just stay secure by default.

All new objects are encrypted. You cannot land unencrypted objects into S3. Even if you send us plain text bytes, we encrypt the data as the first step before we store the objects in storage. Checksums are now enabled by default. That was one of the things I worked on last year. It helps you guarantee data durability end to end. Since launch, S3 has taken checksums at the front door and carried them all the way down to disk, to storage. Now you can extend that all the way to your applications. If you upgrade your SDK, you get that for free. So everybody please upgrade your SDKs.

The SOAP interface is at end of life. SSE-C, which is a niche encryption offering that very few customers use, is going to be disabled by default soon. The announcements are out there. As you look at these iterative improvements, you see some of them are launching new things, like always encrypting your data by default. That's one more thing you don't have to worry about. It becomes even easier. But others of these things are shutting things down. It's making sure that the surface area of S3 doesn't grow and grow and grow over time as we launch new features. We want to keep it as small, focused, and effective for you to think about as possible.

Thumbnail 460

Thumbnail 470

If we just always add features and never make them smaller at all, then you'll get lost in all the options. Too many things to think about. We talk about this a lot as security isn't only about saying no, even though no is probably the easiest word to say. Ships in harbor are safe, but that is not what ships are built for. It's the same thing with S3. Disconnected data is safe. That is not what S3 is built for. S3 originally launched as internet facing storage. It is meant to be used. It is meant for people to interact with their data and make use of the data.

This is what we think about when we think about security. It's not just about denying access to everybody or denying access to the wrong users. It's about building those paths to get the right users access to the right data. This ties in with those earlier tenets as well. When we think about security being everybody's job, we don't want it to be security people saying no and other people trying to push past security as an obstacle. As an organization, security needs to be about delivering new features and offering new capabilities to customers and providing it as efficiently as we can.

Thumbnail 530

Thumbnail 550

Best Practice #1: Block Public Access and Use CloudFront for Web Hosting

I'm going to hand off to Bryant to talk about some best practices. We'll talk about eight specific things that we strongly recommend that our customers do and that all of you take away from this today. As we do this, we're going to try to keep in mind those tenets. We're going to keep in mind that security is just part of a broader product, that it's not about saying no, and that we need to iteratively improve all the time. The first of the best practices is some really basic stuff that you should all be doing. You should all be blocking public access to your data. It's just table stakes.

S3 launched Block Public Access more than five years ago. Block Public Access examines the policies that you write and ensures that you never write so broad a policy that your data is publicly accessible. That analysis is done by an automated reasoning library we use internally. It protects you from inadvertently misconfiguring your data. When you've got a developer who's just trying to get something done, the easy button is to say allow all; the incentives aren't aligned there, and Block Public Access prevents you from tripping over that. It makes sure that an overly broad policy never sneaks in.

Thumbnail 600

Block Public Access was turned on by default for all general purpose buckets a few years ago. When you create a new bucket now, it already has Block Public Access enabled.

As for Block Public Access on the new S3 resources that we've launched recently, whether that's tables and table buckets or vectors and vector indices, these new S3 resources have Block Public Access intrinsically turned on. We just don't allow you to ever turn that off. The one use case we see people trying to use public access for is web hosting. S3 was internet-facing storage when it launched, but we strongly recommend that you go to CloudFront for all of your web hosting needs. It's just a better option. It provides you with TLS support and arbitrary domain names rather than trying to route everything through a subdomain. You get better caching, so your end users get better performance. The list goes on and on.

Turn on block public access and use Origin Access Control with CloudFront. That's the way to give only CloudFront access to your bucket. You can use Amplify as the easy button for hosting a static website on top of S3. Your S3 bucket stays private, but it's the origin of a public CloudFront distribution. If you go to the S3 console and try to set up a website, there's actually a one-click button that takes you to Amplify and shortens the process. You don't even have to work through all of the steps required to set up a CDN distribution. You can just click the button and go.
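If you still have older buckets that predate the default, a minimal boto3 sketch of turning all four Block Public Access settings on might look like the following; the bucket name is a placeholder.

```python
import boto3

s3 = boto3.client("s3")

# Turn on all four Block Public Access settings for an existing bucket.
# New general purpose buckets already get these defaults.
s3.put_public_access_block(
    Bucket="my-example-bucket",  # placeholder
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```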

Thumbnail 730

Best Practice #2: Turn On Bucket-Level Keys to Reduce KMS Costs

Best practice number two is to turn on bucket-level keys. Bucket-level keys is a great cost-saving feature that we've recently launched. We're all about delivering business value efficiently. We want to make sure that this doesn't compromise security in any way, but that it helps you do things with less cost and easier to use. When you want to use your own keys with S3's encryption options, you do that with KMS, our Key Management Service. That means that every time you put or get an object, S3 needs to go talk to KMS on your behalf and use the key that you have managed there with your objects.

Thumbnail 770

If you have millions, billions, or trillions of objects in your S3 storage, that request volume can really add up. Bucket-level keys allows you to tell S3 that we're allowed to use an intermediate key, a key that will be used to encrypt the data keys used in our envelope encryption on our buckets. That key is going to be decrypted using KMS and then cached securely temporarily, so that we don't have to go to KMS every single time. It is optional on general purpose buckets. We do know that there are compliance use cases that require that you use that KMS access as a backstop and a second level of authorization, but for everybody else, this is an easy button to click.

You just push that button in the console and you're going to dramatically reduce your KMS volume. Since launch, customers have saved four hundred million dollars in AWS KMS costs just by turning on bucket keys. Four hundred million is not a small amount of money to save in requests, and this is outside of their S3 request costs as well. This is automatically turned on for S3 Tables, so there's no setting required. It's already in place for you. These first two that we talked about are things everyone should do. Everyone should turn on bucket keys, and everyone should block public access.
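As a rough sketch of what that setting looks like through the API, here is how you might enable default SSE-KMS encryption with a bucket key via boto3; the bucket name and KMS key ARN are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Set default SSE-KMS encryption with a bucket key, so S3 caches a
# bucket-level key instead of calling KMS on every request.
s3.put_bucket_encryption(
    Bucket="my-example-bucket",  # placeholder
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    # placeholder customer-managed key
                    "KMSMasterKeyID": "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
                },
                "BucketKeyEnabled": True,
            }
        ]
    },
)
```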

Thumbnail 830

Thumbnail 860

Best Practice #3: Divide and Conquer with S3 Access Points, Access Grants, Token Vending Machines, and ABAC

This next one is a little bit more nuanced and it's about how you scale. When we talk about a big problem like millions of users or thousands of applications or billions of end users, if you're a big application or social media property, how do I solve that problem? How do I manage so many different pieces? We think that the right way to do this is to divide and conquer, to eat that problem a little bit at a time. That's not only about trying to keep your applications separate or your production stages separate architecturally, but it's also about making sure that organizationally you divide the responsibilities for management of your bucket efficiently between teams.

You might have a centralized security team whose most important job is enforcing the best practices, or the guardrails as we talk about them in AWS, for your whole enterprise. They're going to be the gatekeepers. But then you're going to have other teams who are just trying to get stuff done. They're trying to launch an application and they need to create a new S3 bucket. Who has to approve my policy? We don't want that to be a bottleneck for your developers.

Thumbnail 910

Thumbnail 930

Let your users do their thing, but also maintain the security of a centrally managed set of guardrails. We're going to talk about a couple of different options here today—four different options for dividing the responsibilities between the guardrails and the enforcement of policy, and the enabling and the agile development. The first one we're going to talk about today is S3 Access Points. Access points are use case specific endpoints that you can provision in front of an S3 bucket. You can have many different access points all pointed at the same bucket. Each one of those endpoints gets its own access control policy. These are layered on top of the S3 bucket policy, and so you divide the responsibilities here. We're talking about how you divide and conquer the problem. You put the guardrails in your bucket policy. Nobody should ever access this bucket from outside of your VPC, right? And then your access points can individually enable every new application that comes along.

Maybe you need one read-only access point for your application and one read-write access point for your auditors. Maybe you have a development access point that can only talk to the partition of your bucket or the prefix within your bucket that's for ephemeral storage and not to the customer data. Access points allow you to modularize your bucket policy and allow you to deploy applications independently without touching a shared policy resource. They allow you to scale organizationally by splitting up the responsibility for the buckets and the storage and the primitives of how you're storing data from individual use cases where you're enabling access.

Yes, and the scale really works well. We see customers put tens of thousands of access points in front of single buckets and then lend them out specifically for different applications and teams. Access points scale up to 100,000 per customer account per region, and that's a soft limit if you need to talk to us about a larger use case. However, access points may not be the best solution if you're thinking about individual users having access to individual files. If you're a large enterprise with 1,000 developers and 10 million customers, some specific PDF that they can see or some specific spreadsheet they need access to, you don't want to manage the cross product of all those files and all those users as access points.
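As an illustration of that division of responsibility, a boto3 sketch of provisioning a use-case-specific access point with its own policy follows; the account ID, bucket, VPC ID, role, and region are placeholders.

```python
import json
import boto3

s3control = boto3.client("s3control")
account_id = "111122223333"  # placeholder

# Provision a use-case-specific endpoint in front of an existing bucket,
# reachable only from one VPC.
s3control.create_access_point(
    AccountId=account_id,
    Name="analytics-readonly",
    Bucket="my-example-bucket",                  # placeholder
    VpcConfiguration={"VpcId": "vpc-0abc1234"},  # placeholder
)

# Give this endpoint its own access policy: read-only, one prefix only.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/analytics-app"},
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:us-east-1:{account_id}:accesspoint/analytics-readonly/object/reports/*",
    }],
}
s3control.put_access_point_policy(
    AccountId=account_id,
    Name="analytics-readonly",
    Policy=json.dumps(policy),
)
```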

Thumbnail 1060

That's why we launched S3 Access Grants. Access Grants allow you to use either IAM users or users federated in from a corporate directory to get access to specific S3 files or prefixes or buckets. They're stored in a list of grants. It's very easy to programmatically add and remove grants from that list without affecting anyone else's access. The way this works is that your application is going to talk to Access Grants, request access for a specific S3 bucket or prefix, and then it gets a temporary credential that's been issued by AWS STS, or Security Token Service. That temporary credential will then be used for the S3 access.

We've got some magic sprinkled into our AWS SDK so it will juggle all of these credentials for you. If you need to access hundreds of different prefixes or thousands of different files, it will on demand fetch the credentials it needs to make those requests so you can make that access opaque to your application. Your application doesn't have to think about it if you use the SDK. One key point to note here is how powerful the corporate directory is. It just shows up as one part of the design, but when we talk about bringing in federated users and groups, you could have a prefix called finance that's mapped to your IDP group called finance. So when you join the finance team, you automatically now get access to the files you need. When you leave the finance team, you lose access to the files you don't need.

It goes back to the separation of responsibilities that we were talking about, because the guardrails live on the bucket. You have your bucket policy that prevents any public access, that prevents unencrypted access, that prevents access maybe from outside your organization's VPCs. But then you don't have to stand in the way of individual users granting access to files or adding new grants. In fact, this is the foundation of our storage browser product, which gives you a non-technical file browser kind of interface on top of S3. It's going to show you only the things that you can see because it has this list of grants to pull from. It doesn't have to evaluate your bucket policy every time.
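For a sense of the call pattern (outside of the SDK integration that hides it for you), a hedged boto3 sketch of requesting a scoped credential from Access Grants could look like this; the account ID, bucket, and prefix are placeholders, and the caller must already be a grantee on a matching grant.

```python
import boto3

s3control = boto3.client("s3control")

# Ask Access Grants for a short-lived credential scoped to one prefix.
resp = s3control.get_data_access(
    AccountId="111122223333",                   # placeholder
    Target="s3://my-example-bucket/finance/*",  # placeholder grant target
    Permission="READ",
    DurationSeconds=900,
)

creds = resp["Credentials"]
scoped_s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
# This client can now only read objects under finance/ in that bucket.
```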

S3 Access Grants are a special case of a broader pattern we call the token vending machine. The token vending machine hands out on-demand temporary credentials that are tightly scoped

Thumbnail 1210

to a specific S3 bucket, a specific S3 table, or a specific vector store. These credentials are acquired from STS, the Security Token Service, and we do that via a scoping policy. You take a role as your foundation—the role that has access to those buckets and tables—and then you add a scoping policy to that temporary session that restricts it to only a subset of that resource, such as a specific object or a specific column.

These custom sessions that you create could be based on any kind of authorization you want, and this is the power of the token vending machine concept: it's arbitrary logic. When I want to only allow Bob from accounting to have access to a file, I can do that with S3 Access Grants. But suppose I want him to have access to a file only on alternating Tuesdays when the sun is high and it's a certain time in the afternoon, only if I've recently turned on an audit, and only if he's in the office. You pull from all of those data sources—whether they're directories, corporate databases, or policy files—in your own arbitrary logic, which might be a Lambda function, an EC2 instance, or an EKS container. You process all of that, make a decision, and then issue a temporary credential that embodies that decision, a temporary grant of access.
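A minimal sketch of that vending step with boto3: run whatever authorization logic you like, then mint an STS session whose permissions are the intersection of the base role and a scoping policy. The role ARN, bucket, and prefix argument are placeholders.

```python
import json
import boto3

sts = boto3.client("sts")

def vend_token(allowed_prefix: str) -> dict:
    """Hypothetical token vending machine: after your own authorization
    logic decides what the caller may see, return STS credentials scoped
    down to that prefix via a session policy."""
    scoping_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::my-example-bucket/{allowed_prefix}/*",
        }],
    }
    resp = sts.assume_role(
        RoleArn="arn:aws:iam::111122223333:role/s3-data-access",  # placeholder base role
        RoleSessionName="tvm-session",
        # Effective permissions are the intersection of the role's policy
        # and this session policy.
        Policy=json.dumps(scoping_policy),
        DurationSeconds=900,
    )
    return resp["Credentials"]
```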

Thumbnail 1300

That's the token vending machine as a concept. We've got one more way to divide these responsibilities, and that is attribute-based access control, also known as ABAC. You may have heard this also called tag-based access control. When you do tag-based access control, rather than basing your access control decisions on the names of the users or the resources involved, you use tags or metadata that's been associated with those principals and resources. Rather than writing a policy that says Bob, a user, has access to this S3 bucket, I would say users like Bob who are on the accounting team have access to buckets that are tagged as accounting.

This makes my policies more semantically meaningful, and it means that if I am a centralized compliance team—somebody who's writing these security policies—I can write those policies once. I can say accounting should have access to the accounting stuff and marketing should have access to the marketing stuff, and my mobile users should have access to their own data. Then I never touch that policy again because I've devolved the responsibility for tagging individual users as developers, as marketing people, or as end users to another piece of the system. So one person owns the tagging and another person owns the policy authoring.
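A sketch of that write-it-once policy, expressed here as a Python dict for an IAM identity policy: access is allowed only when the caller's team tag matches the resource's team tag. The tag key is a placeholder, and the aws:PrincipalTag/aws:ResourceTag pattern shown is the general AWS ABAC convention; confirm the exact condition keys supported for S3 bucket tags against the S3 ABAC documentation.

```python
# "Accounting gets accounting, marketing gets marketing" written once,
# with the actual mapping delegated to whoever manages the tags.
abac_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
        "Resource": "*",
        "Condition": {
            "StringEquals": {
                # the caller's team tag must equal the bucket's team tag
                "aws:ResourceTag/team": "${aws:PrincipalTag/team}"
            }
        },
    }],
}
```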

Thumbnail 1390

We're very pleased to announce that just a few days ago we added ABAC support to S3. This is a huge deal for us and has been multiple years in the making. We are now supporting the standardized AWS TagResource API for bucket tags. Previously, you would have used the PutBucketTagging API to replace all of the bucket's tags with a single API call. Now you have granular tag and untag resource permissions on individual tags being added to buckets. These are the same tags you use today for cost allocation, so you don't have to re-tag your buckets if you've already carefully tagged them, but they're now available for access control as you opt buckets in.

Tables and vectors now also support ABAC. If you want to start today with a policy on a user that has access to only resources tagged development, you can go ahead and tag your tables, tag your table buckets, your vectors, and your S3 buckets now with that tag, and then write a policy that means something. To tie those things together to the organization, I think the key piece here is that this works across AWS services. Using tags to bring semantic information from your world into AWS makes it easy to reason about. You can do that with STS with all of our bucket types, and you can also do that with all of the other services and just have a simple model that everybody can think and reason about within your organizations.

This is coming back to that tenet about making sure that we enable you to deliver things efficiently. This is yet another feature launching in S3, but because it's a standardized feature across other AWS services, it's actually a simplifier. It makes it easier to think about all your resources in the same way. This isn't an S3-specific thing at all.

A specific thing about this is that when you opt into a bucket supporting ABAC, you're going to be opting yourself out of that old bucket tagging API. It's fundamentally incompatible with the tag resource permissions, so you really need to take this as a staged thing. You're going to start using the new API, check to make sure that all your tags are the way you want them to be, and then you will opt into ABAC, which opts you out of those older APIs.
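A hedged sketch of that staged adoption, assuming the standardized TagResource and ListTagsForResource calls in the S3 Control API accept bucket ARNs as the session describes; check the S3 ABAC launch documentation for the exact calls and the opt-in step in your SDK version. The account ID, bucket ARN, and tag are placeholders.

```python
import boto3

s3control = boto3.client("s3control")
account_id = "111122223333"                    # placeholder
bucket_arn = "arn:aws:s3:::my-example-bucket"  # placeholder

# Step 1: tag with the standardized API instead of PutBucketTagging.
s3control.tag_resource(
    AccountId=account_id,
    ResourceArn=bucket_arn,
    Tags=[{"Key": "team", "Value": "accounting"}],
)

# Step 2: review the tags before opting the bucket into ABAC.
tags = s3control.list_tags_for_resource(
    AccountId=account_id, ResourceArn=bucket_arn
)["Tags"]
print(tags)
```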

Thumbnail 1550

We have four different ways to divide and conquer. We talked about access points, access grants, token vending machines, and ABAC. This can sound like a lot, right? It's a lot of acronyms and a lot of saying access over and over again. Access points and access grants and access control lists. People say, "Isn't this supposed to be the simple storage service?" The answer to this is that the right tool for the job changes as you scale up.

Thumbnail 1560

When you start with just a simple application, stereotypically you'd be in a garage hacking away on something. You've got a bucket and a policy, and life is good. You don't have to do more than that. If that's the scale of your application, we've got a simple answer for you. But if I'm talking about millions of users accessing trillions of objects stored in S3, a file that listed who has access to everything would be horrible. That would be the worst possible experience trying to manage everything in one big policy like that.

We are trying to provide you with the right tools, trying to meet you wherever you're at on that scaling curve. You might be a tiny little baby's first application all the way up to a huge enterprise deployment, and we're trying to provide you with tools to meet wherever you are. The analogy I like to use is that S3 has pickaxes and S3 also has earth movers. Depending on the stones that you want to move, you either choose a pickaxe or use an earth mover, and there's everything in the middle. It's about finding the right one to use. It would be hard to hoe a garden with an earth mover, but it would be awfully hard to build a giant building if you only had a pickaxe.

Thumbnail 1650

Best Practice #4: Always Test Security Changes on a Model Environment

We're going to move on to our next best practice, and this is that you should always test your security changes on a model. This is probably the number one thing I've seen as a development practice that customers struggle with. So often, somebody says, "I made a change to my bucket policy, and now my application doesn't work and my users don't have access. Everybody's getting errors. Things are going wrong." We strongly recommend that you do a model of your production system.

This is easier than ever with infrastructure as code tools like AWS CloudFormation. There are plenty of third-party tools as well. You want to just set up a copy of everything that's in your production environment: all the same buckets, all the same replication and lifecycle rules, all of the same EC2 deployments. It doesn't have to be all the same data, right? It's not about duplicative copies. It's just having the same structure that you have. No production data touches this test stack, but it's set up with all the same controls.

Thumbnail 1710

Then you want to make a tweak. You change your policy in that test stack and run all your production workflows through it. Does it still work? You also test the negative test cases: what shouldn't work. We've seen customers who say, "I changed my policy and it broke everything, so I quickly changed it back to an allow star and everything went back to working. So now I'm good, right?" It's like, no, no, no. You missed the part where you get it back to its original state, where it didn't allow the things that shouldn't happen.

Once you've tested your positive and your negative test cases, the secret sauce here is that you keep those test cases and you run them every single time that you make a change. Whenever I decide I'm going to add a new bucket, I rerun all the test cases. Whenever I decide my production environment is going to have a slightly different access grants setup, I run all the previous test cases. This corpus of test data is going to give you confidence that you are deploying into production with something that's going to work.
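A minimal sketch of what that growing corpus can look like as code, assuming a model stack with its own bucket and per-application roles (all names are placeholders): one positive case and one negative case, rerun on every policy change.

```python
import boto3
import botocore

BUCKET = "model-stack-bucket"  # placeholder: the copy in your test stack

def client_for_role(role_arn: str):
    """Build an S3 client acting as a specific role in the model environment."""
    creds = boto3.client("sts").assume_role(
        RoleArn=role_arn, RoleSessionName="policy-test"
    )["Credentials"]
    return boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

def test_app_can_still_write(app_role_arn: str):
    # Positive case: the application role can still write under its prefix.
    client_for_role(app_role_arn).put_object(
        Bucket=BUCKET, Key="app-data/smoke.txt", Body=b"ok"
    )

def test_auditor_still_cannot_write(auditor_role_arn: str):
    # Negative case: the read-only auditor role must stay denied.
    try:
        client_for_role(auditor_role_arn).put_object(
            Bucket=BUCKET, Key="app-data/smoke.txt", Body=b"nope"
        )
    except botocore.exceptions.ClientError as err:
        assert err.response["Error"]["Code"] == "AccessDenied"
    else:
        raise AssertionError("write should have been denied")
```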

It's never going to be a matter of causing customer impact and then rolling that back quickly. You can stop holding your breath every time you deploy. For what it's worth, this is exactly how we do it in S3. In a service as large as S3, our developers write tens of thousands of test cases. There's a running bank of them, and every new code change has to go through all of the test cases before it's deployed. We also use some of our own internally built automated reasoning tools that continuously test against the stack with expected behavior. It doesn't always have to be positive. Sometimes you have negative test cases and it checks for the expected behavior. We've done both of these to run a service like S3. That's the best practice across the board. Everybody should be setting up a test stack for all their deployments to make sure that everything works.

We think about it as a cost because now we have to go and do all that setup over again, but it's really not a cost. It's an enabling factor. It's a way to make things move faster. It's a way for security to say yes, we have confidence in your change, rather than saying we don't know if this is going to work or not.

Thumbnail 1840

Best Practice #5: Use AWS Organizations to Enforce Enterprise-Wide Guardrails

Best practice number five is about the AWS Organizations product. AWS Organizations is a way to model your multi-account deployment with explicit organizational units and organizations. Your organization as a whole is going to contain multiple organizational units, which can be nested. Each of those organizational units contains one or more accounts. You've got an administrator account with special privileges that lives outside of any of those. At each level in this tree of organizational units, you have resource control policies, which apply to every resource owned by any account in that organizational unit, and service control policies, which apply to every principal in every account within that organizational unit.

Thumbnail 1890

Organizations are a great way to organize your accounts, but they are also not just for keeping things in tidy boxes. They're for separation of concerns. When we talked about scaling previously, you want to divide and conquer. We recommend that you put your guardrails, your most important enterprise-wide guardrails into a resource control policy. Anyone in your enterprise that creates a new bucket is going to have a resource control policy that prohibits unencrypted access to that bucket. It's going to prohibit access from outside of your organization. Maybe you've got a specific organizational unit for your web hosting, and it's got some special provisioning to allow it to still have a public CloudFront distribution. But the whole rest of your organization lives in an organizational unit where that's prevented.

These guardrails allow your developers the creative freedom to build whatever they need to. They can set up their accounts however they like. It doesn't matter what they do to their bucket policies, to their table policies, to their vector policies. You know that they're going to be compliant with the organization-wide guardrails that you've set with the resource control policies. It's also a great ease-of-life enhancement. Imagine you can now have a million general purpose buckets in an account. Instead of applying the exact same set of policy statements to each of those buckets, just do it once at the organization level and then you can forget about it.
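As an illustrative sketch of such a guardrail, here is a resource control policy that denies unencrypted (non-TLS) access to S3 across an organization, created and attached with boto3; the policy type string follows the Organizations RCP documentation, and the root/OU ID is a placeholder.

```python
import json
import boto3

# Organization-wide guardrail: nobody can reach your S3 resources over an
# unencrypted connection, regardless of individual bucket policies.
rcp = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "EnforceTLS",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": "*",
        "Condition": {"BoolIfExists": {"aws:SecureTransport": "false"}},
    }],
}

orgs = boto3.client("organizations")
policy = orgs.create_policy(
    Name="s3-tls-guardrail",
    Description="Deny unencrypted access to S3 across the organization",
    Type="RESOURCE_CONTROL_POLICY",
    Content=json.dumps(rcp),
)
orgs.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="r-examplerootid",  # placeholder: root or OU ID
)
```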

We see customers with thousands of accounts running tens of thousands of buckets, and every one of those buckets has a boilerplate snippet that's been copied and pasted by their infrastructure as code. They hope that those policies are never mutated to remove that important piece. With resource control policies, you just don't have to worry about that. Another thing organizations can do is provide you with easier debugging. S3 earlier this year launched the ability to get more informative error messages whenever you have an access denied problem and the caller and the resource owner are within the same organization.

Thumbnail 2070

Rather than just saying access denied with an opaque error message that leaks no information about your S3 deployment, you get something more helpful. It's going to help you pinpoint exactly which policy failed to allow you access or explicitly denied you access. This is the same improved error messaging that you've been able to get within an account for a while now, but now within an organization you also get those more informative error messages. We're also very pleased that just after Thanksgiving, we launched a new organizational policy for S3. The organizational policy applies to all the resources in an organizational unit and now allows you to set Block Public Access controls at that organizational unit level, and this overrides any account-level settings that you have in those accounts.

If you turn on block public access at the top of your organization, it's going to apply to all of the buckets owned by all of the accounts and all of the organizational units in your entire organization. With one single API call, you just never have to worry about public access being a thing in your enterprise ever again. You also don't have to worry about somebody mistakenly or out of frustration changing it at the account level because you have the administrator account. Those are the keys to the kingdom when it comes to block public access. In this case, only the administrator account is ever going to touch this setting.

Thumbnail 2080

Thumbnail 2100

Best Practice #6: Extend S3 Security Beyond S3 with Checksums, Encryption, and TLS

Let's move on to the next best practice, which is to extend S3 security beyond S3. This might sound a little convoluted, but we want customers to think about security the same way we do internally. There are three things to think about. Since launch, when data lands at S3's front door, we checksum the data all the way down to storage. We use those checksums to validate the data every time we pull it out of storage.

Our recommendation is to extend that durability all the way to your applications. It doesn't have to be only when you're talking to S3. A simple example would be if you're a studio with cameras and recording equipment. Take checksums of the data as soon as the data is generated, right, and send the checksums along at every step of the process. On the way to S3, we can validate the checksums before accepting the data. Packets with this data flowing all the way across the world could experience a bit flip in a server in the Bahamas, right? Your data could be affected, but using checksums is a way to guarantee that doesn't happen during transit.

I want to give a quick plug here for the work we did last year. If you upgrade your SDKs and use the AWS SDKs, they will calculate a checksum for your data automatically. My recommendation is to extend that all the way to your applications to where the data is generated, but at the minimum, use the upgraded SDKs that will calculate the checksum so that when you start the upload process to S3, your data is protected.
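A small boto3 sketch of that end-to-end idea: compute the checksum where the data is created and hand it to S3 so the upload is rejected if anything changed in transit. The file, bucket, and key names are placeholders.

```python
import base64
import hashlib
import boto3

s3 = boto3.client("s3")

# Checksum the data at the point of creation.
with open("capture-0001.raw", "rb") as f:   # placeholder source file
    data = f.read()
sha256_b64 = base64.b64encode(hashlib.sha256(data).digest()).decode()

# S3 recomputes the checksum on arrival and rejects the put if it differs.
s3.put_object(
    Bucket="my-example-bucket",             # placeholder
    Key="captures/capture-0001.raw",
    Body=data,
    ChecksumSHA256=sha256_b64,
)
# Keep sha256_b64 in your own records; you can compare it against the stored
# checksum later without re-reading the object data.
```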

Our solutions architects at S3 have actually already built some tooling that will take those returned checksums. If you keep hold of them and store them longer term on your client side, you can then provide those to tooling that will verify the authenticity and integrity of the objects in your bucket. Without ever having to go grab all the objects or look at all the data, you're just checking to make sure that those checksums match years or decades later. You're able to make sure that it's exactly the same data because you are being a part of this, right? It's all one integrated whole and security is everybody's job.

It can be partly your job as well as you build checksums into your application. This is going to make everything better, not just your S3 interface, but also your local access to that data. We spent the last couple of years really expanding the list of algorithms we support. We went from supporting only MD5 to CRC32, CRC32C, SHA-1, and SHA-256, and we added CRC64 last year. So there's a wide array of options based on wherever you are on your journey. If you need something really fast and performant, you can use CRC64. If you need a secure algorithm, use SHA-256, the best of the line.

Whichever algorithm you want, we validate for free and we store the checksums with your objects. The second piece is on the encryption side. The same encryption that you do in S3, we recommend that you do at your own end. Think of it this way: if you're using KMS, Key Management Service, with customer-managed keys on S3, there is probably a compliance or regulatory requirement that requires you to do so. You can use the same keys to encrypt data in your local systems. It doesn't make sense to me that you would have data encrypted on S3 but not on the local client-side systems.

The Amazon S3 Encryption Client is a great way to use KMS keys to encrypt your data. You can do that client-side and keep your encrypted copy locally, but then also send that same encrypted copy to S3. The last one is to always use encrypted communications when you talk to S3. Use TLS, and if you use VPCs, don't transit the public internet. Use private VPCs. We use secure communications between all of the things within S3, and so we recommend that everybody talks to S3 with your data encrypted in transit. It's just a basic protection that you can have across the board. Just do the same thing externally that we do internally in S3 and keep your data safe.
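Tying those last two recommendations to a concrete guardrail, here is a sketch of a bucket policy that denies non-TLS requests and requests that don't arrive through a specific VPC endpoint; the bucket name and endpoint ID are placeholders, and in practice you would exempt break-glass or administrative roles from the VPC restriction.

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        },
        {
            "Sid": "DenyOutsideVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            # placeholder VPC endpoint ID
            "Condition": {"StringNotEquals": {"aws:SourceVpce": "vpce-0abc1234"}},
        },
    ],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```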

Thumbnail 2340

Best Practice #7: Turn On Logging to Enable Reactive Security and Automated Remediation

Next, turn on logging to enable reactive security. The biggest ask that we hear from customers that we cannot meet is: can you give me logs for when I hadn't turned on logging? It's really hard because we've tried to invent time travel, but it's not really working out. Not even Prime ships by yesterday, exactly. Turn on logging proactively so that you can react to security events that happen within your organization.

Thumbnail 2370

You've got two options. AWS CloudTrail supports logging from across AWS with really easy-to-use event filters, CloudWatch ingestion, and dashboarding. There are a bunch of options. On the other hand, you have S3 Server Access Logs, which launched when S3 launched. These are S3-specific activity logs that can be useful when your buckets are really noisy because you only pay for the storage, not for the cost of generating the logs themselves.
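Turning on server access logs is a single call; a boto3 sketch is below. The bucket names are placeholders, and the target bucket needs a policy that lets the S3 logging service write to it.

```python
import boto3

s3 = boto3.client("s3")

# Deliver S3 server access logs for a noisy bucket to a dedicated log bucket.
s3.put_bucket_logging(
    Bucket="my-example-bucket",   # placeholder source bucket
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-example-logs",                # placeholder
            "TargetPrefix": "access-logs/my-example-bucket/",
        }
    },
)
```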

Thumbnail 2410

Reactive security is more than just logging. The typical use case we hear about is forensic audit—when you think something has gone wrong and need to figure out who did what action against which resource. That's the usual reason customers think about logging. But there's also anomaly and drift detection. When you get enough scale with tens of thousands of buckets, trillions of objects, and hundreds of users, all of these things can get really unwieldy to manage. The only way to detect anomalies and drift is to look at the logs and figure it out.

So many times when we think about security, we think about access control and encryption—those are preventative controls. But you need that reactive control as well. You need to treat security not just as an afterthought where something has happened but you didn't have logging turned on. It needs to be there from day one so you've got an audit story.

Automated instant remediation is what we talk to a lot of customers about, especially when organizations get really large. You can set up rules in advance that act based on your logs. A simple example: if somebody turns off Block Public Access (BPA), a rule can immediately turn it back on. We've linked a blog post that we really like, which we worked with some of our solution architects to get out there. The logic is similar—you have some sort of event, a Lambda fires, takes some other action, and fixes it. That's something you can do across a bunch of use cases. It doesn't have to be limited to just S3 logging. You've got a bunch of different services that do monitoring of your configuration and monitoring of your traffic patterns.
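A hedged sketch of that remediation pattern: a Lambda function, triggered for example by an EventBridge rule matching the relevant CloudTrail events, that simply re-applies Block Public Access on the affected bucket. The event shape assumed here is the standard CloudTrail-via-EventBridge format.

```python
import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    """Hypothetical auto-remediation: re-apply Block Public Access whenever a
    CloudTrail event shows it was changed or removed on a bucket."""
    bucket = event["detail"]["requestParameters"]["bucketName"]
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    return {"remediated": bucket}
```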

Thumbnail 2530

We have AWS Config, AWS Control Tower, Amazon GuardDuty, and AWS Security Hub. AWS Config has rules that you can deploy and check against. AWS Control Tower helps you create safe environments by automatically creating resources with the right configurations. GuardDuty is for detection and monitoring. Security Hub brings all of these findings together. As we were talking about tools for the job, there are lots of tools available. It's about choosing the right one depending on your skill. Ultimately, all of these rely on turning on logging. You need the events there to have GuardDuty findings, to have Security Hub alert you to problems, and to have Config notice things. You need to turn on logging on your buckets.

Thumbnail 2590

Thumbnail 2600

Best Practice #8: Plan in Advance for Durability and Recovery with Versioning, Object Lock, and Replication

The last best practice is to plan in advance for durability and recovery. A similar concept applies here—you can only react if you've planned in advance and taken action up front. I'm going to talk through some of the options we offer. On your extreme left, you have S3 conditional writes, which is something we launched last year. It's very simple to use and simple to enforce. You can set it up so you can only put the object if it's not present to prevent overwrites of your data. You can also enforce that using the bucket policy, ensuring that all writes on your bucket use those conditions and never accidentally overwrite something. It's a simple way to ensure your data is not overwritten, especially if you have downstream applications that rely on this data in a bucket where it's not ephemeral storage and always needs to be used.
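A minimal boto3 sketch of a conditional write: the put succeeds only if nothing exists at the key yet, and a concurrent or repeated write gets a 412 Precondition Failed instead of silently overwriting. The bucket and key are placeholders.

```python
import boto3
import botocore

s3 = boto3.client("s3")

try:
    s3.put_object(
        Bucket="my-example-bucket",   # placeholder
        Key="results/run-42.json",
        Body=b"{}",
        IfNoneMatch="*",              # only create if the key does not exist
    )
except botocore.exceptions.ClientError as err:
    if err.response["ResponseMetadata"]["HTTPStatusCode"] == 412:
        print("Key already exists; refusing to overwrite")
    else:
        raise
```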

Next is S3 object versioning. You can create a version stack so every write that comes in doesn't overwrite your data but just adds a version on top. All of the versions remain accessible. You use the version ID to make your request, so your applications continue to function while you accept new data that lands on top of the same key.

What if someone were to delete one of those object versions that you wanted to keep? That's where S3 Object Lock comes in. You've got compliance mode and governance mode with lots of options. You can set up rules for how much time an object cannot be deleted, or you could say it can't be deleted until a hold is taken off. You can set that up accordingly.
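A sketch of those two features together with boto3: versioning so that writes stack instead of overwrite, and an Object Lock compliance retention on a specific version. The bucket (which must have been created with Object Lock enabled), key, version ID, and retention date are placeholders; note that compliance-mode retention cannot be shortened or removed once set.

```python
import datetime
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder; created with Object Lock enabled

# Keep every write as a new version instead of an overwrite.
s3.put_bucket_versioning(
    Bucket=bucket, VersioningConfiguration={"Status": "Enabled"}
)

# Protect one specific version from deletion until the retention date passes.
s3.put_object_retention(
    Bucket=bucket,
    Key="ledger/2025-q4.parquet",
    VersionId="EXAMPLEVERSIONID",     # placeholder version ID
    Retention={
        "Mode": "COMPLIANCE",
        "RetainUntilDate": datetime.datetime(2026, 12, 31, tzinfo=datetime.timezone.utc),
    },
)
```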

The next one is S3 Replication. The typical use case for how people think about replication is that they need their data in another region because their compute is there and all their users are there. They would rather pay to move the data once than access it across regions every time. But it's also a good way to think about disaster recovery.

It's not just latency improvements or something. It's also having a second copy of your data, exactly. And then the last one is AWS Backup, right? So this is for your most business critical information, things that you consider essential for your business to function. Back it up, right? Create a full copy, have that ready to go. AWS Backup is a native service that does a lot of these things and also provides you with the RPO and all of the other things that you would care about if you were to create a full second copy of your data. Yeah, they say one is none, right? So this is for anything that's going to be business critical, anything that you can't recreate. This is not for every ephemeral bit of training data that comes out of your EC2 clusters, but it is super important for that long-term financial data, for that long-term user data that needs to be retained indefinitely. Yep, cool.
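For the replication piece, a hedged boto3 sketch of a minimal cross-region replication rule; both buckets must have versioning enabled, and the replication role ARN and bucket names are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Replicate new objects from a source bucket to a DR bucket in another region.
s3.put_bucket_replication(
    Bucket="my-example-bucket",  # placeholder source (versioning enabled)
    ReplicationConfiguration={
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",  # placeholder
        "Rules": [
            {
                "ID": "dr-copy",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},  # replicate the whole bucket
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": "arn:aws:s3:::my-example-dr-bucket"},
            }
        ],
    },
)
```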

Thumbnail 2770

Key Takeaways: Actionable Next Steps for Developers and Security Leaders

So these are 8 best practices. Again, the first two, block public access to your data and turn on bucket keys, are just things every single AWS developer should do. Some of these others are more nuanced. You gotta think about exactly which of the approaches matches your scale when it comes to dividing and conquering scaling problems. Everyone should be testing. You can use AWS Organizations to simplify things if you've got a multi-account setup. Akshat talked at length about the checksums that you can do and the local encryption and the local TLS connections that you can use to extend S3 security out into your application.

Thumbnail 2810

Turn on logging, like you can't turn it on yesterday, but you can turn it on today, right? You can have that audit record from now, and you need to have a plan for your durability and recovery. So let's talk about the specific takeaways from today. If I am a developer, right? I'm an individual contributor. I'm just a regular engineer, which is what I am, by the way. What I need to take away today is that I should be switching to the new tagging APIs right away. PutBucketTagging still exists, right? We're not turning anything off today. But those new TagResource and UntagResource APIs are AWS standard and they work now on buckets. You should be doing tagging the right way, so start today and get ahead of that inevitable shift as we move to the new tagging APIs.

Thumbnail 2830

Can't emphasize it enough, we've said it already. Turn on logging on your critical buckets today, and you can do that in a click, right, on the console. Logging doesn't feel like it's necessary until suddenly it is necessary and it's necessary that you already did it. So turn that on today and then last of all, really think about this concept of turning on checksums in your application, right where you create your data, having a checksum from the very beginning to ensure that object's integrity and durability. This is partly a mindset thing as well. Checksums are a great technical way to ensure integrity. But making sure you always create checksums where you create your data is part of getting your whole organization to shift to a mindset like S3 where we think about the durability of data as we're creating it.

If I've got a camera or an IoT device or a user application taking submissions, right, these are all things that I should be worried about from the beginning. What's the durability of this data? How am I going to make sure that it is safe? Part of the journey to thinking about that all the time is checksumming from the beginning: checksum early, checksum often, and make sure that those match system-wide. At the minimum, upgrade your SDKs and let the SDKs do it automatically for you, right? Yeah. And then leaders in the audience, if you're a CISO or you're part of a team that manages security for the organization, first is please enforce BPA at the organization level. Can't recommend this enough. It's not that people in the organization want to do the wrong thing, but you want to set up guardrails in a way that it's very, very hard, if not impossible, to do the wrong thing, right?

Thumbnail 2960

Enforce BPA at the organization level. Second is push teams towards using ABAC. Attribute-based access control is great because it works across AWS. You could also take all of the other options of using the token vending machine, et cetera, but ABAC is a great way to bring context from your organization into AWS and use that for access control. ABAC is great because it scales, right? Tagging and just making sure the tags match can be a useful access control paradigm when you are at 2 buckets and 3 developers, right, but it scales all the way up to millions of users and trillions of objects. So it's one of those things you can learn those skills early on and then keep with you as your organization grows.

And then the last one is allocate resources for test stacks and audits. This might seem like an expense and everybody's short on headcount, but it's not an expense, it's an investment. It's about starting early as you're building your platform because then as you scale, it's an investment that keeps paying dividends with the problems that do not happen because you've invested in these. Yeah. All right, that's the last of our best practices today as a breakout session. We don't have a Q&A section, but we're happy to meet with you outside the room here after we finish our presentation. Akshat and I have a long history of having wonderful hallway conversations that maybe are even better than some of the sessions at re:Invent if I'm allowed to say that. So we'd love to talk to you.

Also we'd strongly recommend you check out some of the other sessions that are going to be talking about S3 security here at re:Invent. I particularly recommend STG 414 (I think that's the number), which covers more of these best practices with a hands-on look at how you would implement some of these things, as well as the durability thinking and disaster recovery in STG 344. Yeah, cool. All right, thanks very much.


; This article is entirely auto-generated using Amazon Bedrock.
