DEV Community

Joao Marques

Optimizing Bank Data Growth with Sharding Architecture

As the number of customers and the volume of related data increase, applications tend to become slower due to bottlenecks in the database. Databases need to perform increasingly complex searches on larger volumes of data.

In the case of banks, this is crucial: most financial institutions aim to serve millions of customers rather than hundreds. Developers therefore need to architect the backend and the database from the outset to scale with that growth and avoid the bottlenecks it causes.

By adopting certain practices, developers can build robust and scalable systems that meet the demands of large financial institutions. The use of the Spring Framework, with its tools and libraries, facilitates the implementation of these techniques, providing a backend capable of growing along with the bank and its customers.

The Importance of Scalability in Banks

For banks, database scalability is crucial for several reasons:

High Availability: In a bank, any interruption in services can have significant consequences. A scalable database helps ensure that services remain available even with a sudden increase in the number of transactions.

Consistent Performance: As the volume of data increases, it is vital that queries and operations in the database maintain consistent performance to provide a good customer experience.

Compliance and Security: Banks handle sensitive data and face strict regulations. A scalable database architecture must also consider security and regulatory compliance aspects. If these are not designed in from the start, compliance validations can become very costly.

Database Sharding

In this article, I will discuss one of the most commonly used techniques by large banks to ensure the scalability and efficiency of their databases: database sharding.

Sharding, a form of horizontal partitioning, is a scalability technique that involves dividing a large database into smaller, more manageable pieces called shards. Each shard is a complete and independent database containing a portion of the total data. These shards can be distributed across different servers, which helps distribute the workload and improve system performance.

Why is Sharding Important for Banks?

For large banks, sharding offers several advantages:

Better Performance: By dividing the data into shards, queries and operations are distributed among multiple servers. This reduces the load on each individual server, resulting in faster response times.
Horizontal Scalability: As the volume of data and transactions grows, new shards can be added as needed. This allows the system to expand efficiently and economically.
Fault Isolation: If a shard fails, only a portion of the data is affected, while the other shards continue to operate normally. This increases the resilience and availability of the system.
Region-Specific Data Management: For banks that expand internationally, like Nubank, shards can be located near the regions where the data is most accessed, further improving performance.

[Image: architecture]

Note that in the image, the information about which shard contains the user's data is stored in the authentication database. After login, this information is passed to the API Gateway and then to the Spring Framework components. This ensures that requests are correctly directed to the appropriate shard, optimizing system performance and scalability.
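
The lookup described above can be sketched as follows. This is a minimal illustration, assuming an in-memory map stands in for the authentication database. `ShardDirectory`, `assignShard`, and `shardFor` are hypothetical names, and the hash-based placement at signup is only one possible policy, not something this article prescribes.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical directory that remembers which shard holds each user's data. */
class ShardDirectory {

    private final List<String> shardKeys;
    // Stands in for a column in the authentication database (an assumption).
    private final Map<Long, String> shardByUser = new ConcurrentHashMap<>();

    ShardDirectory(List<String> shardKeys) {
        if (shardKeys.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shardKeys = List.copyOf(shardKeys);
    }

    /** Picks a shard once, at signup, and stores the decision. */
    String assignShard(long userId) {
        return shardByUser.computeIfAbsent(userId, id ->
                shardKeys.get(Math.floorMod(id.hashCode(), shardKeys.size())));
    }

    /** Looked up at login; empty if the user was never assigned a shard. */
    Optional<String> shardFor(long userId) {
        return Optional.ofNullable(shardByUser.get(userId));
    }
}
```

Because the assignment is stored rather than recomputed on every login, users who already have a shard stay where they are even if the list of shards changes later.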

Architecture Advantage for Banks
Banks have a significant advantage: users generally do not need to be aware of other users within the bank. A few banks, like PicPay, offer features that expose other users, but they are rare. The relationship is always between the user and the central bank, meaning the bank itself. This simplifies the backend design, since the user's shard only needs to be looked up once, during authentication. After that, all of the user's account data can be found in that shard.

[Image: Central bank data flow]

Microservice code example

To receive the shard information, read it from the request coming from the API gateway and immediately set it in the shard context:

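@PostMapping("/user/deposit")
public ResponseEntity<depositResponseDTO> userDeposit(
        @RequestAttribute long userId,
        @RequestAttribute String shard,
        @Valid @RequestBody depositRequestDTO request) {
    log.info(USER_DEPOSIT, userId);
    // Bind the shard to the current thread so the routing data source can read it.
    ShardContext.setCurrentShard(shard);
    try {
        return new ResponseEntity<>(userService.deposit(request, userId), HttpStatus.OK);
    } finally {
        // Request threads are pooled; clear the key so it cannot leak into the next request.
        ShardContext.clear();
    }
}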

Here we create a class, ShardContext, that holds which shard the current request belongs to. While the current thread is handling the request, we can call ShardContext.getCurrentShard() at any point to know which shard to access.

public class ShardContext {

    private static final ThreadLocal<String> currentShard = new ThreadLocal<>();

    public static void setCurrentShard(String shardKey) {
        currentShard.set(shardKey);
    }

    public static String getCurrentShard() {
        return currentShard.get();
    }

    public static void clear() {
        currentShard.remove();
    }
}
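
Since the shard key lives in a ThreadLocal, it stays attached to the thread until it is removed. The sketch below, in plain Java with no Spring involved, shows what can happen on a reused thread when clear() is never called. `ShardLeakDemo` and its method are hypothetical names, and the single-thread executor is only an assumed stand-in for a server's request thread pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Shows why a ThreadLocal must be cleared when threads are reused. */
class ShardLeakDemo {

    // Same shape as the field inside ShardContext.
    private static final ThreadLocal<String> CURRENT_SHARD = new ThreadLocal<>();

    /** Runs two tasks on one pooled thread and returns what the second task sees. */
    static String shardSeenBySecondTask(boolean clearAfterFirst) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // First "request": sets the shard and optionally cleans up afterwards.
            Runnable firstRequest = () -> {
                CURRENT_SHARD.set("shard2");
                if (clearAfterFirst) {
                    CURRENT_SHARD.remove();
                }
            };
            // Second "request": runs on the same thread and never sets a shard.
            Callable<String> secondRequest = CURRENT_SHARD::get;

            pool.submit(firstRequest).get();
            return pool.submit(secondRequest).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(shardSeenBySecondTask(false)); // prints "shard2": stale value from the first task
        System.out.println(shardSeenBySecondTask(true));  // prints "null"
    }
}
```

In a banking system a stale key could send one customer's query to another customer's shard, so it is worth calling ShardContext.clear() in a finally block or a request filter once the request is done.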

Here we dynamically determine the data source to use in the current context by calling the getCurrentShard() method explained above.

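import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

public class RoutingDataSource extends AbstractRoutingDataSource {

    // Called by Spring whenever a connection is requested; the returned key selects the shard.
    @Override
    protected Object determineCurrentLookupKey() {
        return ShardContext.getCurrentShard();
    }
}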
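
To make the routing idea concrete without Spring on the classpath, here is a simplified model of key-based lookup. It is not Spring's implementation: `ShardRouter` and `resolve` are hypothetical names, and the targets are plain objects rather than real data sources.

```java
import java.util.Map;
import java.util.function.Supplier;

/** Simplified model of key-based routing; not Spring's actual implementation. */
class ShardRouter<T> {

    private final Map<String, T> targets;
    private final T defaultTarget;
    private final Supplier<String> lookupKey;

    ShardRouter(Map<String, T> targets, T defaultTarget, Supplier<String> lookupKey) {
        this.targets = Map.copyOf(targets);
        this.defaultTarget = defaultTarget;
        this.lookupKey = lookupKey;
    }

    /** Resolves the target for the current key; uses the default only when no key is set. */
    T resolve() {
        String key = lookupKey.get();
        if (key == null) {
            return defaultTarget;
        }
        T target = targets.get(key);
        if (target == null) {
            // Fail loudly instead of silently querying the wrong shard.
            throw new IllegalStateException("No target configured for shard key: " + key);
        }
        return target;
    }
}
```

This sketch deliberately rejects unknown keys. AbstractRoutingDataSource is lenient by default and falls back to the default data source when a key has no match; calling setLenientFallback(false) on it gives the stricter behavior, which is probably the safer choice when the default happens to be a real customer shard.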

Finally, we define the data sources. In a real project you should configure them in the application-{profile}.yml file; they are hard-coded here only to show how the routing works.

Notice that I'm using the RoutingDataSource class defined above.

@Configuration
public class DataSourceConfig {

    @Bean
    public DataSource dataSource() {
        // Configuration of multiple data sources (shards)
        DataSource shard1 = createDataSource("jdbc:mysql://shard1-url", "username", "password");
        DataSource shard2 = createDataSource("jdbc:mysql://shard2-url", "username", "password");

        // Routing data source
        Map<Object, Object> targetDataSources = new HashMap<>();
        targetDataSources.put("shard1", shard1);
        targetDataSources.put("shard2", shard2);

        RoutingDataSource routingDataSource = new RoutingDataSource();
        routingDataSource.setTargetDataSources(targetDataSources);
        routingDataSource.setDefaultTargetDataSource(shard1);

        return routingDataSource;
    }

    private DataSource createDataSource(String url, String username, String password) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl(url);
        config.setUsername(username);
        config.setPassword(password);
        return new HikariDataSource(config);
    }
}

These principles form the foundation of a robust architecture. Banks require highly scalable systems to manage an ever-increasing volume of data efficiently. By implementing sharding and leveraging frameworks like Spring, developers can help banking systems remain resilient, performant, and capable of growing alongside their customer base. In an industry where high availability, consistent performance, and stringent security are paramount, a scalable database architecture isn't just an advantage, it's a necessity.

Top comments (2)

M. Akbar Nugroho

Thanks for sharing, Marques!

I saw your code is written in Java. You know you can use this markdown snippet to colorize your code. So, the code looks nicer 😉

```java
public class ShardContext {}
```

Here's an example...

@PostMapping("/user/deposit")
public ResponseEntity<depositResponseDTO> userDeposit(
        @RequestAttribute long userId,
        @RequestAttribute String shard,
        @Valid @RequestBody depositRequestDTO request) {
    log.info(USER_DEPOSIT, userId);
    ShardContext.setCurrentShard(shard);
    return new ResponseEntity<>(userService.deposit(request, userId), HttpStatus.OK);
}

More about the editor here...

Joao Marques

wow! thats amazing!
thank you so much for this information!