Data API for Amazon Aurora Serverless v2 with AWS SDK for Java - Part 1 Introduction and set up of the sample application

#aws #serverless #java #database

Introduction

At the end of December 2023, AWS announced the Data API for Amazon Aurora Serverless v2 and Amazon Aurora provisioned clusters. At the same time, AWS announced that Aurora Serverless v1 will no longer be supported after December 31, 2024.
In this article series, I'd like to dig deeper into the new Data API for Amazon Aurora Serverless v2. In part 1, we'll introduce this new Data API and set up the sample Aurora cluster and application to demonstrate its functionality.

What is the Data API?

The Data API is an intuitive, secure HTTPS API for running SQL queries against a relational database.

According to AWS, the Data API has been rebuilt for Aurora Serverless v2 and Aurora provisioned to operate at the scale and high availability levels required by its biggest customers. AWS lists the following improvements:

- Because the Data API now works with both Aurora Serverless v2 and provisioned instances, database failover is supported to provide high availability.
- The limit of 1,000 requests per second has been removed. The only factor that limits requests per second with the Data API for Aurora Serverless v2 and Aurora provisioned is the size of the database instance and, therefore, the available resources.
- Although the Data API for Aurora Serverless v2 and Aurora provisioned was initially launched on Amazon Aurora PostgreSQL-Compatible Edition, support for Amazon Aurora MySQL-Compatible Edition will soon follow.

Setting up the sample application, including the Aurora cluster

To demonstrate the Data API, I wrote a small application, which I published on my GitHub account. The application has an API Gateway in front of Lambda functions, which communicate with an Aurora Serverless v2 PostgreSQL database via the Data API to retrieve a product stored in the database by its id.

Let's look at the infrastructure as code part, for which we use AWS SAM.

This is how we set up the Aurora Serverless v2 cluster:

  AuroraServerlessV2Cluster:
    Type: 'AWS::RDS::DBCluster'
    DeletionPolicy: Delete
    Properties:
      DBClusterIdentifier: !Ref DBClusterName
      Engine: aurora-postgresql
      EnableHttpEndpoint: true
      MasterUsername: !Join ['', ['{{resolve:secretsmanager:', !Ref DBSecret, ':SecretString:username}}' ]]
      MasterUserPassword: !Join ['', ['{{resolve:secretsmanager:', !Ref DBSecret, ':SecretString:password}}' ]]
      DatabaseName: !Ref DatabaseName
      ServerlessV2ScalingConfiguration:
        MinCapacity: 0.5
        MaxCapacity: 1
      DBSubnetGroupName:
        Ref: DBSubnetGroup

We use aurora-postgresql as the database engine. Setting EnableHttpEndpoint to true enables the Data API. To save cost, we start with 0.5 ACUs (Aurora Capacity Units) and allow the cluster to scale up to at most 1 ACU.

Here we also see the reference to AWS Secrets Manager, where we store the database user and the generated password.

  DBSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: !Ref UserSecret
      Description: RDS database auto-generated user password
      GenerateSecretString:
        SecretStringTemplate: !Sub '{"username": "${DBMasterUserName}"}'
        GenerateStringKey: "password"
        PasswordLength: 30
        ExcludeCharacters: '"@/\'

This is the part that sets up the Aurora Serverless v2 database instance:

  AuroraServerlessV2Instance:
    Type: 'AWS::RDS::DBInstance'
    Properties:
      Engine: aurora-postgresql
      DBInstanceClass: db.serverless
      DBClusterIdentifier: !Ref AuroraServerlessV2Cluster

Setting DBInstanceClass to db.serverless means that the instance uses Aurora Serverless v2 capacity.

To make this work in your account, replace the default subnets in the parameter section of the SAM template (see the code snippet below) with your own subnet IDs:

Subnets:
    Type: CommaDelimitedList  
    Default: subnet-0787be4d, subnet-88dc46e0
    Description: The list of SubnetIds, for at least two Availability Zones in the
      region in your Virtual Private Cloud (VPC)

We also have our Lambda function named GetProductByIdViaAuroraServerlessV2DataApiLambda, which needs permissions to communicate via the Data API and to read the secret from Secrets Manager (the Data API uses this secret to authenticate against the database).

   Policies:
        - Version: '2012-10-17' # Policy Document
          Statement:
            - Effect: Allow
              Action:
                - rds-data:*
              Resource:
                 !Sub arn:aws:rds:${AWS::Region}:${AWS::AccountId}:cluster:${DBClusterName}
            - Effect: Allow
              Action:
                - secretsmanager:GetSecretValue
              Resource:
                !Ref DBSecret

Instead of giving Lambda access to do everything with the Data API via

Effect: Allow
Action:
   - rds-data:*

we can define more fine-grained permissions like:

- rds-data:BatchExecuteStatement
- rds-data:BeginTransaction
- rds-data:CommitTransaction
- rds-data:ExecuteStatement
- rds-data:RollbackTransaction

We also pass some parameters, like the Aurora cluster ARN and the Secrets Manager secret ARN, to the Lambda function via environment variables.

After we deploy the SAM template, we should see our Aurora Serverless v2 cluster and the serverless database instance up and running.
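On the Lambda side, reading these two values could look roughly like the sketch below. It reads both ARNs once and fails fast if either one is missing. The variable names DB_CLUSTER_ARN and DB_SECRET_ARN and the DataApiConfig class are assumptions made for illustration, not names taken from the sample application; use whatever names your SAM template defines.

```java
import java.util.Map;

/** Holds the two ARNs that Data API calls need (illustrative helper, not part of the sample app). */
final class DataApiConfig {
    // Hypothetical environment variable names; align them with your SAM template.
    static final String CLUSTER_ARN_VAR = "DB_CLUSTER_ARN";
    static final String SECRET_ARN_VAR = "DB_SECRET_ARN";

    private final String clusterArn;
    private final String secretArn;

    private DataApiConfig(String clusterArn, String secretArn) {
        this.clusterArn = clusterArn;
        this.secretArn = secretArn;
    }

    /** Reads the configuration from the process environment, as a Lambda function would. */
    static DataApiConfig fromEnvironment() {
        return from(System.getenv());
    }

    /** Reads the configuration from the given map, which keeps the lookup easy to test. */
    static DataApiConfig from(Map<String, String> env) {
        return new DataApiConfig(require(env, CLUSTER_ARN_VAR), require(env, SECRET_ARN_VAR));
    }

    // Returns the value for the given name or throws if it is missing or blank.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    String clusterArn() {
        return clusterArn;
    }

    String secretArn() {
        return secretArn;
    }
}
```

Creating the configuration once, for example in a static initializer of the handler class, avoids repeating the lookup on every invocation and surfaces a misconfigured template on the first call.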

In the "Connectivity & Security" Tab of the Aurora V2 Cluster, we can check whether the RDS Data API has been enabled as intended.

We can use the Query Editor in the RDS console to connect to the Aurora cluster via the Data API with a Secrets Manager ARN.

Let's create the products table there:

CREATE TABLE tbl_product (
    id bigint NOT NULL,
    name varchar(255) NOT NULL,
    price decimal NOT NULL,
    PRIMARY KEY (id)    
);

and insert some random products with id 1 to 50 like this:

INSERT INTO tbl_product (id, name, price)
VALUES (1, 'Photobook A3', 52.19); 
...
INSERT INTO tbl_product (id, name, price)
VALUES (50, 'Calender A5', 43.65);

Now you can make the call via API Gateway with ${YOUR_GENERATED_API_GATEWAY_URL}/products/{id} to retrieve those products.
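As a minimal sketch, the same request can be sent from Java with the JDK's built-in HTTP client (Java 11 or later). The GetProductClient class and its method names are hypothetical, and the sketch assumes the endpoint can be called without extra headers; if your API Gateway stage requires an API key or another form of authorization, the request needs to be extended accordingly.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Illustrative client for the /products/{id} route (not part of the sample app). */
final class GetProductClient {

    /** Builds <baseUrl>/products/<id>, tolerating a trailing slash on the base URL. */
    static URI productUri(String baseUrl, long id) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(base + "/products/" + id);
    }

    /** Sends a GET request and returns the raw response body for a 200 response. */
    static String fetchProduct(String baseUrl, long id) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(productUri(baseUrl, id))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status code: " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling GetProductClient.fetchProduct(baseUrl, 1) with your generated API Gateway URL as baseUrl should return the response body for the product with id 1, provided the table has been populated as described above.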

Conclusion

In this part of the series, we introduced the new Data API for Aurora Serverless v2 and set up the sample application, which has an API Gateway in front of a Lambda function that communicates with an Aurora Serverless v2 PostgreSQL database via the Data API to retrieve a product stored in the database by its id. In the next part, we'll dive deeper into the new Data API for Aurora Serverless v2 itself and its capabilities for executing SQL statements, and we will (of course) use the AWS SDK for Java for it.

Please also check out my website for more technical content and upcoming public speaking activities.