Ken Tune for Aerospike

Automated Aerospike All Flash Setup

Introduction

Aerospike is a key-value database that makes the most of SSD/flash technology to offer best-in-class throughput and latency at petabyte scale.

Standard Aerospike usage keeps the primary key index in DRAM and the data on SSD. Aerospike's DRAM usage is very low at 64 bytes per object, but for very large numbers of objects (100bn+) users might wish to consider all-flash mode, in which the primary key index is also placed on disk. More detail at all flash usage.

There are a number of non-trivial steps to go through to set up all flash. For that reason I've extended aerospike-ansible to allow automation of this process. This article walks through the automated process. It's envisaged that this will be useful for those evaluating the feature, or looking to get up and running with it quickly.

A working knowledge of aerospike-ansible is assumed. This introductory article may also be useful.

All Flash Calculations

To configure a system correctly for all flash, you need to know the number of partition-tree-sprigs appropriate for the object count you will have in your database. You can think of a partition tree sprig as a mini primary key index. We use sprigs to keep the primary key tree shallow, which lets us look up record locations more rapidly. More detail at sprigs.

The sprig count is important for all-flash because we size the system so that each sprig fits inside a single disk block, minimising read and write overhead.
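To make the sizing idea concrete, here is a rough sketch of how a sprig count could be estimated. It is not the spreadsheet's formula. It assumes 4096 partitions per namespace, a 4 KiB disk block, the 64-byte index entry mentioned in the introduction, a target of half-full sprigs, and a result rounded up to a power of two. The function names are made up for illustration.

```python
import math

PARTITIONS = 4096        # Aerospike divides each namespace into 4096 partitions
INDEX_ENTRY_BYTES = 64   # per-object index size quoted in the introduction
BLOCK_BYTES = 4096       # assumption: 4 KiB disk block
TARGET_FILL = 0.5        # assumption: aim for half-full sprigs to leave headroom


def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def estimate_partition_tree_sprigs(unique_objects: int) -> int:
    """Estimate sprigs per partition so that each sprig fits in one block."""
    entries_per_block = BLOCK_BYTES // INDEX_ENTRY_BYTES
    target_entries_per_sprig = int(entries_per_block * TARGET_FILL)
    objects_per_partition = math.ceil(unique_objects / PARTITIONS)
    sprigs_needed = math.ceil(objects_per_partition / target_entries_per_sprig)
    return next_power_of_two(max(sprigs_needed, 1))


print(estimate_partition_tree_sprigs(100_000_000))
```

Under these assumptions, 100 million objects gives 1024, which agrees with the value used in the example later in this article. Treat the spreadsheet as the authority if the two ever disagree.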

You can find details of the calculation here, but to make life easier a spreadsheet can be found in aerospike-ansible at assets/all-flash-calculator.xlsx.

[Screenshot: all-flash-calculator.xlsx]

Populate the yellow cells - # of objects, replication factor and object size.

The spreadsheet will calculate required partition-tree-sprigs.

It will also determine the fraction of available disk space that should be given over to the primary key index, based on the object size. In the screenshot, we can see that for 100m records, replication factor 2, average record size 1024 bytes, the overhead per record is 172 bytes and the overall record footprint is 2220 bytes, so approx 1/13 of the disk space should be allocated to the index.
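The arithmetic behind the 1/13 figure can be checked in a few lines. This sketch uses only the numbers quoted above. The footprint formula (replication factor × record size + overhead) is inferred from those numbers, since 2 × 1024 + 172 = 2220, and is not taken from the spreadsheet itself. The function name is illustrative.

```python
def index_share(record_bytes: int, replication_factor: int, overhead_bytes: int):
    """Return (overall footprint, fraction of the footprint taken by overhead)."""
    # Assumption: footprint = replicated record data plus the per-record overhead.
    footprint = record_bytes * replication_factor + overhead_bytes
    return footprint, overhead_bytes / footprint


footprint, share = index_share(record_bytes=1024, replication_factor=2, overhead_bytes=172)

# One partition in every N is given to the index, so N is roughly 1 / share.
partitions_per_device = round(1 / share)

print(footprint, round(share, 4), partitions_per_device)  # prints: 2220 0.0775 13
```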

Using Aerospike-Ansible

In vars/cluster-config.yml

  • Set partitions_per_device to the value given in the spreadsheet - 13 in the example. The first partition on each device is used for the all flash index to ensure the correct index:data disk space ratio.
  • Add partition_tree_sprigs: YOUR_VALUE, where YOUR_VALUE is the partition-tree-sprigs figure from the spreadsheet - 1024 for this example

You will also need to

  • Set all_flash: true
  • Set enterprise: true
  • Provide a path to a valid Aerospike feature key using feature_key: /your/path/to/key. You must therefore be either a licensed Aerospike customer, or running an Aerospike trial.
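Putting the settings together, the sketch below renders the entries you would expect to see in vars/cluster-config.yml for the worked example and runs a few sanity checks first. The sprig variable is written as partition_tree_sprigs, to match the partition-tree-sprigs setting discussed earlier, so check the exact name against your copy of aerospike-ansible. The helper functions are hypothetical and not part of aerospike-ansible, and the power-of-two check reflects my understanding of the server's requirement for partition-tree-sprigs.

```python
def check_all_flash_settings(settings: dict) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if settings.get("all_flash"):
        if not settings.get("enterprise"):
            problems.append("all_flash requires enterprise: true")
        if not settings.get("feature_key"):
            problems.append("all_flash requires a feature_key path")
    sprigs = settings.get("partition_tree_sprigs", 0)
    # A positive power of two has exactly one bit set.
    if sprigs < 1 or sprigs & (sprigs - 1):
        problems.append("partition_tree_sprigs should be a power of two")
    return problems


def render_cluster_config(settings: dict) -> str:
    """Render flat key/value settings as YAML-style lines."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


settings = {
    "partitions_per_device": 13,          # from the spreadsheet example
    "partition_tree_sprigs": 1024,        # from the spreadsheet example
    "all_flash": True,
    "enterprise": True,
    "feature_key": "/your/path/to/key",   # placeholder path, as in the text
}

assert check_all_flash_settings(settings) == []
print(render_cluster_config(settings))
```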

Having done that, run

ansible-playbook aws-setup-plus-aerospike-install.yml

You should check that the aggregate disk space across your cluster exceeds the amount recommended in the spreadsheet.
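A quick way to do that check is sketched below. The cluster shape (3 nodes, one 100 GB device each) is purely hypothetical. The requirement here is the raw footprint of the worked example, 100m objects at 2220 bytes each. The spreadsheet's recommended figure may be higher, in which case use that value instead.

```python
def aggregate_capacity_bytes(nodes: int, devices_per_node: int, device_bytes: int) -> int:
    """Total raw disk capacity across the cluster."""
    return nodes * devices_per_node * device_bytes


def raw_footprint_bytes(objects: int, footprint_per_object: int) -> int:
    """Raw space needed for all objects, replicas and overhead included."""
    return objects * footprint_per_object


# Worked example from the article: 100m objects, 2220 bytes overall footprint each.
required = raw_footprint_bytes(100_000_000, 2220)

# Hypothetical cluster: 3 nodes, 1 device per node, 100 GB per device.
available = aggregate_capacity_bytes(nodes=3, devices_per_node=1, device_bytes=100 * 10**9)

print(f"required={required} available={available} ok={available > required}")
```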

Verification

Once the setup process is complete, log into one of your cluster nodes

./scripts/cluster-quick-ssh.sh 

Then start asadm (the admin tool) and run the info command.

[Screenshot: asadm]

The index type comes up as 'flash' as per the highlight.

Data Load

You can follow the instructions in benchmarking to quickly load some data into the new configuration.

As before, we can use asadm to examine the disk footprint of the primary key index (highlighted), in this case for 10m records (20m including replicas).

[Screenshot: asadm-2]

Conclusion

The aerospike-ansible tooling makes it easy to set up all flash for Aerospike and benefit from the DRAM saving it offers.


Cover image Michał Mancewicz
