Implementing Multi-Tenancy in DynamoDB with Python

#aws #python #architecture #database

When building scalable applications, especially in a SaaS (Software-as-a-Service) environment, multi-tenancy is a common architecture pattern. In a multi-tenant system, a single instance of an application serves multiple clients (tenants) while keeping each tenant's data isolated from the others.

Amazon DynamoDB, a fully managed NoSQL database, is an excellent choice for such systems due to its ability to scale horizontally and its high availability. However, for a multi-tenant setup, the design of your data model is essential to ensure proper data isolation and performance.

In this article, we will demonstrate how to implement multi-tenancy in DynamoDB using Python and the boto3 SDK. We'll create a single shared table, store data in a way that isolates tenants' data, and interact with the data by adding users and orders for multiple tenants.

What is Multi-Tenancy in DynamoDB?

In a multi-tenant architecture, you need to logically partition the data to keep each tenant's data isolated. This can be done using a single shared table and partitioning the data by a unique tenant identifier (tenant_id).

For example, let's say we have two tenants, Tenant A and Tenant B, each having users and orders. We can store the following in DynamoDB:

Tenant A's data:
- PK = tenant#tenantA, SK = user#user1
- PK = tenant#tenantA, SK = order#order1
Tenant B's data:
- PK = tenant#tenantB, SK = user#user2
- PK = tenant#tenantB, SK = order#order2

Each piece of data is associated with a PK (Partition Key) representing the tenant, and the SK (Sort Key) differentiates users from orders within that tenant.

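With this design, each tenant's data is logically isolated while living in a shared table, which is typically more cost-effective and easier to manage than one table per tenant. The isolation is logical rather than physical: it holds only as long as every read and write includes the tenant's partition key.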
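
The key layout above can be captured in a few small helpers. The following is a minimal sketch: the function names (tenant_pk, user_sk, order_sk, split_sk) are hypothetical and not part of boto3, and it assumes that identifiers never contain the # separator.

```python
from typing import Tuple

SEPARATOR = "#"


def _check(identifier: str) -> str:
    """Reject empty ids and ids containing the separator, which would make keys ambiguous."""
    if not identifier or SEPARATOR in identifier:
        raise ValueError(f"invalid identifier: {identifier!r}")
    return identifier


def tenant_pk(tenant_id: str) -> str:
    """Partition key shared by every item that belongs to one tenant."""
    return f"tenant{SEPARATOR}{_check(tenant_id)}"


def user_sk(user_id: str) -> str:
    """Sort key for a user item."""
    return f"user{SEPARATOR}{_check(user_id)}"


def order_sk(order_id: str) -> str:
    """Sort key for an order item."""
    return f"order{SEPARATOR}{_check(order_id)}"


def split_sk(sk: str) -> Tuple[str, str]:
    """Split a sort key into (entity_type, entity_id)."""
    entity_type, _, entity_id = sk.partition(SEPARATOR)
    return entity_type, entity_id


print(tenant_pk("tenantA"), user_sk("user1"))
print(split_sk("order#order1"))
```

Building keys in one place makes it harder for a stray string format elsewhere in the codebase to write an item under the wrong tenant.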

Designing DynamoDB for Multi-Tenancy

To implement multi-tenancy, we'll follow these design rules for the DynamoDB table:

- Partition Key (PK): Includes the tenant_id, for example, tenant#tenantA.
- Sort Key (SK): Differentiates between data types within a tenant, such as user#user1 or order#order1.

This design ensures that all data belonging to a particular tenant is grouped together under the same partition key but differentiated by the sort key. We can then query tenant-specific data by specifying the partition key (PK) and narrow the results with a condition on the sort key (SK), such as a begins_with prefix match.
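
To make the query pattern concrete, here is a minimal sketch that builds the arguments for such a query. The helper name tenant_query_kwargs is hypothetical; it uses DynamoDB's string expression syntax with placeholder names and values, and it always includes the tenant's partition key so that a query cannot span tenants.

```python
from typing import Any, Dict, Optional


def tenant_query_kwargs(tenant_id: str, sk_prefix: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for a query that is always scoped to one tenant."""
    kwargs: Dict[str, Any] = {
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": "PK"},
        "ExpressionAttributeValues": {":pk": f"tenant#{tenant_id}"},
    }
    if sk_prefix is not None:
        # Narrow the result to one item type, e.g. everything starting with "order#"
        kwargs["KeyConditionExpression"] += " AND begins_with(#sk, :prefix)"
        kwargs["ExpressionAttributeNames"]["#sk"] = "SK"
        kwargs["ExpressionAttributeValues"][":prefix"] = sk_prefix
    return kwargs


print(tenant_query_kwargs("tenantA", "order#"))
```

With a boto3 Table resource, the result would be passed along as table.query(**tenant_query_kwargs("tenantA", "order#")).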

Python Code to Implement Multi-Tenancy

Now that we have our design, let's look at the Python code that will perform the following actions:

- Add Users: Add a user to a specific tenant.
- Add Orders: Add an order associated with a user.
- Retrieve Users: Retrieve a user by tenant_id and user_id.
- Retrieve Orders for Tenant: Retrieve all orders for a given tenant.

We'll use the boto3 library to interact with DynamoDB, so make sure it’s installed by running:

pip install boto3

Full Python Code:

import boto3
from uuid import uuid4
from boto3.dynamodb.conditions import Key
from decimal import Decimal

# Initialize the DynamoDB resource
dynamodb = boto3.resource('dynamodb')

# Function to create the DynamoDB table
def create_table():
    # Create the table if it doesn't exist
    try:
        table = dynamodb.create_table(
            TableName='MultiTenantTable',
            KeySchema=[
                {
                    'AttributeName': 'PK',
                    'KeyType': 'HASH'  # Partition key
                },
                {
                    'AttributeName': 'SK',
                    'KeyType': 'RANGE'  # Sort key
                }
            ],
            AttributeDefinitions=[
                {
                    'AttributeName': 'PK',
                    'AttributeType': 'S'
                },
                {
                    'AttributeName': 'SK',
                    'AttributeType': 'S'
                }
            ],
            ProvisionedThroughput={
                'ReadCapacityUnits': 5,
                'WriteCapacityUnits': 5
            }
        )
        print("Creating table... Please wait until it's created.")
        table.meta.client.get_waiter('table_exists').wait(TableName='MultiTenantTable')
        print("Table 'MultiTenantTable' created successfully!")
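    except dynamodb.meta.client.exceptions.ResourceInUseException:
        # The table already exists, so there is nothing to create
        print("Table 'MultiTenantTable' already exists.")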

# Initialize table after creation (if it does not exist)
create_table()

# Now, we can reference the table
table = dynamodb.Table('MultiTenantTable')

def add_user(tenant_id, user_name, user_email):
    user_id = str(uuid4())  # Generate a UUID for user_id
    pk = f"tenant#{tenant_id}"
    sk = f"user#{user_id}"

    # Add user item to DynamoDB table
    table.put_item(
        Item={
            'PK': pk,
            'SK': sk,
            'user_name': user_name,
            'user_email': user_email,
            'user_id': user_id  # Storing the user_id here for future use
        }
    )
    print(f"User {user_name} added for tenant {tenant_id}")
    return user_id  # Return user_id so we can use it later for querying

def add_order(tenant_id, user_id, order_amount):
    order_id = str(uuid4())  # Generate a UUID for order_id
    pk = f"tenant#{tenant_id}"
    sk = f"order#{order_id}"

    # Add order item to DynamoDB table
    table.put_item(
        Item={
            'PK': pk,
            'SK': sk,
            'user_id': user_id,
            # Convert via str() so that float amounts (e.g. 19.99) keep their exact decimal value
            'order_amount': Decimal(str(order_amount))
        }
    )
    print(f"Order {order_id} added for tenant {tenant_id}")

def get_user(tenant_id, user_id):
    pk = f"tenant#{tenant_id}"
    sk = f"user#{user_id}"

    response = table.get_item(
        Key={
            'PK': pk,
            'SK': sk
        }
    )

    item = response.get('Item')
    if item:
        print(f"User found: {item}")
    else:
        print(f"User {user_id} not found for tenant {tenant_id}")
    return item  # None when the user does not exist

def get_orders_for_tenant(tenant_id):
    pk = f"tenant#{tenant_id}"

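    # A single query call returns at most 1 MB of data; larger result sets
    # require following LastEvaluatedKey, which this example omits
    response = table.query(
        KeyConditionExpression=Key('PK').eq(pk) & Key('SK').begins_with("order#")
    )

    orders = response.get('Items', [])
    if orders:
        print(f"Orders for tenant {tenant_id}: {orders}")
    else:
        print(f"No orders found for tenant {tenant_id}")
    return orders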

# Example of adding data for multiple tenants
tenant_1_id = str(uuid4())
tenant_2_id = str(uuid4())

# Add users and get user IDs
user_1_id = add_user(tenant_1_id, 'Alice', 'alice@example.com')
user_2_id = add_user(tenant_2_id, 'Bob', 'bob@example.com')

# Add orders using the generated user_ids
add_order(tenant_1_id, user_1_id, 150)
add_order(tenant_2_id, user_2_id, 200)

# Example of querying data
get_user(tenant_1_id, user_1_id)
get_orders_for_tenant(tenant_1_id)
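
get_orders_for_tenant() above issues a single query call, and a single call returns at most 1 MB of data. For tenants with many orders, the remaining pages have to be fetched by passing the returned LastEvaluatedKey back as ExclusiveStartKey. The following is a minimal sketch of that loop: iter_tenant_orders is a hypothetical helper, and table is assumed to be a boto3 Table resource like the one above (any object with a compatible query() method works).

```python
from typing import Any, Dict, Iterator


def iter_tenant_orders(table: Any, tenant_id: str) -> Iterator[Dict[str, Any]]:
    """Yield every order item of one tenant, following pagination."""
    kwargs: Dict[str, Any] = {
        "KeyConditionExpression": "#pk = :pk AND begins_with(#sk, :prefix)",
        "ExpressionAttributeNames": {"#pk": "PK", "#sk": "SK"},
        "ExpressionAttributeValues": {
            ":pk": f"tenant#{tenant_id}",
            ":prefix": "order#",
        },
    }
    while True:
        response = table.query(**kwargs)
        yield from response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            break  # No more pages
        # Resume the query where the previous page stopped
        kwargs["ExclusiveStartKey"] = last_key
```

It could be used as orders = list(iter_tenant_orders(table, tenant_1_id)). Yielding items lazily avoids holding every page in memory when the caller only needs to stream through them.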

Explanation of the Code

create_table():
This function creates a DynamoDB table named MultiTenantTable with a partition key (PK) and a sort key (SK). The PK contains the tenant identifier, and the SK differentiates between users, orders, etc. The provisioned throughput is set to 5 read and 5 write capacity units. Adjust this based on your application's scale.
add_user() and add_order():
These functions add users and orders to the table. Data is associated with the tenant using PK, and the specific item type (user or order) is differentiated using SK.
get_user() and get_orders_for_tenant():
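get_user() fetches a single item by its full key (PK and SK) using get_item, while get_orders_for_tenant() queries the tenant's partition for sort keys that begin with order#. Because both always include the tenant's PK, they can only return that tenant's data.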
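
Since isolation depends on every call including the right PK, one option is to bind the tenant once and route all access through a thin wrapper. The following is a minimal sketch under that assumption: TenantScopedTable and its put/get methods are hypothetical names, not part of boto3, and table is assumed to be a boto3 Table resource or any object with compatible put_item() and get_item() methods.

```python
from typing import Any, Dict, Optional


class TenantScopedTable:
    """Wraps a table-like object so that every call is pinned to one tenant's PK."""

    def __init__(self, table: Any, tenant_id: str) -> None:
        self._table = table
        self._pk = f"tenant#{tenant_id}"

    def put(self, sk: str, attributes: Dict[str, Any]) -> None:
        # PK and SK are written last so caller-supplied attributes cannot override them
        item = {**attributes, "PK": self._pk, "SK": sk}
        self._table.put_item(Item=item)

    def get(self, sk: str) -> Optional[Dict[str, Any]]:
        # The partition key always comes from the wrapper, never from the caller
        response = self._table.get_item(Key={"PK": self._pk, "SK": sk})
        return response.get("Item")
```

A request handler would then create TenantScopedTable(table, tenant_id) once per request and never touch the shared table directly. This is an application-level safeguard only; it does not replace access control on the database side.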

Conclusion

By using DynamoDB with a well-designed schema that incorporates multi-tenancy principles, we can efficiently store and query data for multiple tenants in a shared table. This approach keeps each tenant's data logically isolated, provided every request is scoped to the tenant's partition key, while leveraging DynamoDB’s scalability and performance.