Felix

Posted on Jul 2, 2023

How to spot and fix issues with publicly accessible AWS S3 buckets

#devops #aws #s3 #opensource

Introduction

Publicly accessible buckets are one of the most common problems encountered with AWS S3, and they can expose sensitive data stored there. The cause is a bucket configured with a public access policy. In this article, we discuss how to detect and prevent this issue.

Understanding the Problem

If files in a storage bucket can be fetched directly, without any AWS credentials, the bucket is publicly accessible.

As seen in the image above, we can directly access files in the storage bucket. Now let's understand why this is happening.
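One way to confirm this kind of direct access from a script is to send a request with no credentials and see whether it succeeds. The sketch below is only an illustration under stated assumptions: the helper names `object_url` and `is_object_public` are hypothetical, the object key `index.html` is a placeholder, and the URL uses the standard virtual-hosted style endpoint. The `opener` parameter exists only so the function can be exercised without a network connection.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def object_url(bucket, key):
    """Build the virtual-hosted style URL for an object (hypothetical helper)."""
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


def is_object_public(bucket, key, opener=urllib.request.urlopen):
    """Return True if an unauthenticated HEAD request for the object succeeds.

    Any HTTP error response (for example 403 Forbidden) is treated as
    "not publicly readable". Network failures such as DNS errors are not
    caught here and will propagate to the caller.
    """
    request = urllib.request.Request(object_url(bucket, key), method="HEAD")
    try:
        with opener(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False


# Building the URL does not touch the network; "index.html" is a placeholder key.
print(object_url("selefra-test-omkqt", "index.html"))
```

Calling `is_object_public("selefra-test-omkqt", "index.html")` would issue the actual request. A result of `False` for one object does not prove the whole bucket is private, so this is a spot check rather than an audit.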

If we examine the policy for this bucket, we might see a configuration similar to the following:

{
    "Version": "2012-10-17",
    "Id": "test",
    "Statement": [
        {
            "Sid": "test",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::selefra-test-omkqt/*"
        }
    ]
}

In this policy, everyone ("Principal": "*") is granted the s3:GetObject permission on every object in the selefra-test-omkqt bucket, so the objects in the bucket are publicly readable.
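The same reasoning can be sketched as a small Python check. This is an illustration, not part of Selefra: the function name `is_publicly_readable` is hypothetical, the action set mirrors the three patterns that the Selefra rule later in this article looks for, and `Condition` blocks (which can narrow an otherwise public grant) are not evaluated.

```python
import json

# Read-granting action patterns; the same three the Selefra rule in this article checks.
READ_ACTIONS = {"s3:GetObject", "s3:Get*", "s3:*"}


def _as_list(value):
    """Policy fields may hold a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def _is_public_principal(principal):
    """True for "Principal": "*" or "Principal": {"AWS": "*"} (string or list form)."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in _as_list(principal.get("AWS"))
    return False


def is_publicly_readable(policy):
    """Return True if any Allow statement grants a read action to everyone.

    Hypothetical helper for illustration; Condition blocks are ignored.
    """
    if isinstance(policy, str):
        policy = json.loads(policy)
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        if not _is_public_principal(statement.get("Principal")):
            continue
        if READ_ACTIONS & set(_as_list(statement.get("Action"))):
            return True
    return False


# The bucket policy shown in this article.
example_policy = {
    "Version": "2012-10-17",
    "Id": "test",
    "Statement": [
        {
            "Sid": "test",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::selefra-test-omkqt/*",
        }
    ],
}

print(is_publicly_readable(example_policy))  # prints True
```

Normalizing `Action` and `Principal` to lists is a deliberate choice here, because a policy may write either field as a single string or as an array.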

Remediation

We should follow the principle of least privilege, which means allowing only specific users to have specific permissions instead of granting access to all users.

For storage buckets that must be public for business reasons, do not store sensitive data in them.
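As a rough sketch of the least-privilege approach, the public statement can be replaced with one that names specific principals. The helper name `build_read_policy`, the account ID `111122223333`, and the role name `app-reader` are placeholders chosen for illustration; they are not values from this article. The function only builds the JSON document and does not call AWS.

```python
import json


def build_read_policy(bucket_name, principal_arns):
    """Build a bucket policy that allows s3:GetObject only for named principals.

    Hypothetical helper for illustration; it constructs the policy document
    and leaves applying it to the caller.
    """
    if not principal_arns or any(arn == "*" for arn in principal_arns):
        raise ValueError("name explicit principals; '*' would make the bucket public")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadForNamedPrincipals",
                "Effect": "Allow",
                "Principal": {"AWS": list(principal_arns)},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


policy = build_read_policy(
    "selefra-test-omkqt",
    ["arn:aws:iam::111122223333:role/app-reader"],  # placeholder ARN
)
print(json.dumps(policy, indent=4))
```

If the printed JSON is saved to a file such as policy.json, it could then be applied with `aws s3api put-bucket-policy --bucket selefra-test-omkqt --policy file://policy.json`.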

Using Selefra for Quick Discovery

Manually discovering these issues can be time-consuming and cannot be done in bulk. Using Selefra can help you quickly identify these risks.

Selefra project repository: github.com/selefra/selefra

Typical Usage of Selefra

First, let's install Selefra:

brew tap selefra/tap
brew install selefra/tap/selefra

Next, create a new project folder:

mkdir selefra-test
cd selefra-test

Copy the following YAML file into this folder:

selefra:
    name: selefra-test
    connection:
      type: postgres
      username: your_username
      password: your_password
      host: 127.0.0.1
      port: 5432
      database: postgres
      sslmode: disable
    log_level: info
    providers:
        - name: aws
          source: aws
          version: v0.1.0
providers:
    - name: aws
      provider: aws
      cache: 7d
rules:
    - name: bucket_publicly_readable
      metadata:
        title: S3 bucket public readable
      query: |-
        SELECT
          DISTINCT(a1.*)
        FROM
          aws_s3_buckets a1,
          json_array_elements (a1.policy :: json -> 'Statement') a2
        WHERE
          (
            a2 ->> 'Action' = 's3:GetObject'
            OR a2 ->> 'Action' = 's3:Get*'
            OR a2 ->> 'Action' = 's3:*'
          )
          AND a2 ->> 'Effect' = 'Allow'
          AND (
            a2 ->> 'Principal' = '*'
            OR a2 -> 'Principal' ->> 'AWS' = '*'
          );
      output: "S3 bucket public readable, arn: {{.arn}}"

In the selefra block, set your own PostgreSQL host, username, and password under connection. In the top-level providers block, cache sets how long retrieved data is cached. The rules block defines what to detect: title is the name of the detection policy, and query holds the SQL that Selefra runs against the database to find at-risk resources.

Before starting the detection, configure your AWS credentials using the following command:

aws configure

Then run the following command to start Selefra:

selefra apply

Selefra will start the detection process. Here is an example of the result:

In the result, we can see the at-risk buckets. Apart from the above method, Selefra also integrates a ChatGPT feature that allows you to discover risk points by directly asking Selefra.

Selefra GPT Feature

Similar to the previous steps, create a new folder and copy the following YAML file into the folder:

selefra:
    name: selefra-test
    connection:
      type: postgres
      username: yourusername
      password: yourpassword
      host: 127.0.0.1
      port: 5432
      database: postgres
      sslmode: disable
    log_level: info
    openai_api_key: your_openai_api_key
    openai_mode: gpt-4
    openai_limit: 10
    providers:
        - name: aws
          source: aws
          version: v0.1.0
providers:
    - name: aws
      provider: aws
      cache: 7d
rules:

In this case, you need to provide your OpenAI API key and choose whether to use GPT-4 or GPT-3.5 in the openai_mode field. Additionally, keep the rules block empty as the AI will generate it automatically.

Before starting the detection, configure your AWS credentials, and then you can use the GPT feature:

selefra gpt "Query publicly accessible S3 buckets."

As seen in the example above, with a simple sentence, you can find publicly accessible buckets. It's very convenient.

Conclusion

Public access to S3 buckets is a frequent and pressing issue. We hope this article has helped you understand and address the problem of public access to storage buckets in AWS S3. Selefra makes the cloud more secure.
