Ken Tune for Aerospike

Posted on Jul 10, 2020 • Edited on Jul 14, 2020

Record Aggregation in Aerospike For Performance and Economy

#aerospike

A strong differentiator for Aerospike vs other key value databases is its DRAM economy. Every object has a 64 byte DRAM footprint no matter what the size of the actual object is. You can manage a billion objects using only 128GB of DRAM, allowing for 2x replication.

Great news! A billion is a pretty big number and 3 * 512GB nodes gets me to 12bn. Within reason I can have as many objects as I like. I should start making them right away with no further thought required.

Hold your horses, cowboy. It might not be as simple as that.

For instance, what if your objects are very small? Worst case, they’re so small that they’re of the order of 64 bytes, so now your memory footprint is similar to your disk footprint. It might even be the case that your memory to disk ratio is such that your DRAM is full when your disk is half empty. In a bare metal situation, buy less disk / more DRAM for sure, but you might be in the cloud where you’re stuck with certain DRAM / storage ratios. Or maybe these machines got brought to you for re-purposing.

A technique informally known as blocking can help you. You store your objects within larger objects. Your API turns this into an implementation detail. Blocking can reduce your memory footprint, helping with the small object use case.

Aerospike lets you do this by offering a comprehensive List and Map API which allows you to place objects within objects and retrieve them efficiently. List structures can be used for storing structured data such as time series of a regular frequency, while maps can be used to reduce your key space. The API offered is a distinguishing feature of Aerospike when contrasted with other key value databases.

Let’s look again at our example. Suppose your key space is composed of device ids, and these are fundamentally UUIDs — 128 bit numbers or 32 digit hexadecimal numbers. Let’s say you anticipate you may need to store as many as 15bn of these, but each record is only around 200 bytes. Your DRAM requirement with Aerospike would be of the order of

64 (bytes) * 15bn * 2 (replication) = ~1.9TB

Not the end of the world, but you could do better by working smarter.
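The 64-bytes-per-record figure can be turned into a quick back-of-envelope calculation. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes the index cost is exactly 64 bytes per record copy, as described above.

```java
/** Rough index DRAM estimate, assuming 64 bytes per record copy. */
final class IndexMemoryEstimate {
    static final long INDEX_BYTES_PER_RECORD = 64L;

    /** Index bytes needed across the cluster for a record count and replication factor. */
    static long indexBytes(long recordCount, int replicationFactor) {
        // multiplyExact throws on overflow instead of silently wrapping
        long perCopy = Math.multiplyExact(recordCount, INDEX_BYTES_PER_RECORD);
        return Math.multiplyExact(perCopy, (long) replicationFactor);
    }

    public static void main(String[] args) {
        long bytes = indexBytes(15_000_000_000L, 2);
        // Report in both decimal (TB) and binary (TiB) units
        System.out.println(bytes + " bytes");
        System.out.println(bytes / 1e12 + " TB, " + bytes / Math.pow(2, 40) + " TiB");
    }
}
```

For 15bn records at replication factor 2 this gives 1,920,000,000,000 bytes, which is where the figure of roughly 2TB comes from.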

Assume also that we want to keep our physical object size below 128KB, a good starting point for optimal block size on a flash device, which is recommended for Aerospike. We can fit 655 objects of 200 bytes each into 128KB.

If each physical object contains 655 actual objects then we require 15bn / 655 = 22.9m container object keys. The question then is how we map from a device id to the container (physical) object key, and how we reliably look up a logical object inside the container object. The answer is that we do this using bit-masking.
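The container arithmetic can be checked with a few lines of Java. This is a minimal sketch with hypothetical names, assuming 128KB means 128 * 1024 bytes and that every logical object is exactly 200 bytes.

```java
/** Sketch: objects per container and number of containers needed. */
final class ContainerSizing {
    /** Whole logical objects that fit in one container. */
    static long objectsPerContainer(long containerBytes, long objectBytes) {
        return containerBytes / objectBytes;
    }

    /** Containers needed to hold every logical object, rounding up. */
    static long containersNeeded(long objectCount, long objectsPerContainer) {
        return (objectCount + objectsPerContainer - 1) / objectsPerContainer;
    }

    public static void main(String[] args) {
        long perContainer = objectsPerContainer(128 * 1024, 200);
        long containers = containersNeeded(15_000_000_000L, perContainer);
        // prints "655 objects per container, 22900764 containers"
        System.out.println(perContainer + " objects per container, " + containers + " containers");
    }
}
```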

A 128 bit key space can be converted into a key space of size 2, 4 … 65536 … 2²⁰ keys by AND-ing the key with a binary number composed of 1, 2 … 16 … 20 etc. trailing ones preceded by leading zeros, which keeps only the low-order bits of the key. For our example we need a bit mask whose size in bits is the exponent of the first power of two above our key space size, which can be calculated as

ceiling(log(22.9 *10⁶) / log(2)) = 25 bits

This gives us a key space of size 2²⁵ = ~33.5m so we’ve got our maths correct.
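To make the masking step concrete, here is a small sketch that reduces a UUID to a 25 bit container key in two equivalent ways. The class and method names are hypothetical. The `long` variant is an assumption of mine rather than part of the original technique: it works only because a mask of up to 63 bits fits entirely within the low 64 bits of the UUID.

```java
import java.math.BigInteger;
import java.util.UUID;

/** Sketch: derive a container key from the low-order bits of a UUID. */
final class LowBitsKey {
    /** BigInteger route: parse the 32 hex digits (radix 16) and AND with 2^bits - 1. */
    static BigInteger viaBigInteger(UUID id, int bits) {
        BigInteger value = new BigInteger(id.toString().replace("-", ""), 16);
        BigInteger mask = BigInteger.ONE.shiftLeft(bits).subtract(BigInteger.ONE);
        return value.and(mask);
    }

    /** long route: valid for 0..63 bits, since those bits all live in the low half of the UUID. */
    static long viaLong(UUID id, int bits) {
        if (bits < 0 || bits > 63) {
            throw new IllegalArgumentException("bits must be between 0 and 63");
        }
        return id.getLeastSignificantBits() & ((1L << bits) - 1);
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        // Both lines print 1523712 (0x174000)
        System.out.println(viaBigInteger(id, 25));
        System.out.println(viaLong(id, 25));
    }
}
```

Either route relies on the low-order bits being well distributed. That holds for random (version 4) UUIDs; for time-based (version 1) UUIDs the low 48 bits are the node identifier, so hashing the id before masking may be a safer choice there.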

Let’s look at how we make use of this in Aerospike.

	import java.math.BigInteger;
	import java.util.Map;
	import java.util.UUID;
	import com.aerospike.client.AerospikeClient;
	import com.aerospike.client.Key;
	import com.aerospike.client.Value;
	import com.aerospike.client.cdt.MapOperation;
	import com.aerospike.client.cdt.MapPolicy;
	import com.aerospike.client.policy.WritePolicy;

	// hostName, aerospikeServicePort, deviceDataNamespace and deviceDataSetName
	// are assumed to be defined elsewhere in the enclosing class
	public static final int BIT_MASK_SIZE = 25;
	// 2^BIT_MASK_SIZE - 1, i.e. the low BIT_MASK_SIZE bits set
	public static final BigInteger BIT_MASK = BigInteger.ONE.shiftLeft(BIT_MASK_SIZE).subtract(BigInteger.ONE);
	public static final String METADATA_BIN_NAME = "deviceMetaData";
	public static final AerospikeClient aeroClient = new AerospikeClient(hostName,aerospikeServicePort);

	/**
	* Store device meta data in Aerospike using 'blocking' technique
	* @param deviceIDasUUID - device id
	* @param deviceMetaData - metadata
	*/
	public void storeDeviceMetaData(UUID deviceIDasUUID ,Map<String,Object> deviceMetaData){
		// Turn UUID into a BigInteger to help with bit masking
		// (32 hex digits once the dashes are removed, so the radix is 16)
		BigInteger deviceID = new BigInteger(deviceIDasUUID.toString().replace("-",""),16);
		// Do bit masking
		BigInteger physicalKey = deviceID.and(BIT_MASK);
		// Construct Aerospike key using key built using bit masking
		Key aerospikeKey= new Key(deviceDataNamespace,deviceDataSetName,physicalKey.toString());

		// Store using Map API - stores device metadata in an Aerospike object using the bit masked key
		// but within that object inserts to a map using the device id as a key
		aeroClient.operate(new WritePolicy(),aerospikeKey,
			MapOperation.put(new MapPolicy(),METADATA_BIN_NAME,Value.get(deviceIDasUUID.toString()), Value.get(deviceMetaData)));
	}


This function stores our device metadata inside a physical object. As described, the physical object key is derived using bit masking. Note this is efficient from a network capacity point of view — only the metadata gets sent across the network, not the full physical object.

We also need to see how to retrieve our object.

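	import java.math.BigInteger;
	import java.util.Map;
	import java.util.UUID;
	import com.aerospike.client.AerospikeClient;
	import com.aerospike.client.Key;
	import com.aerospike.client.Record;
	import com.aerospike.client.Value;
	import com.aerospike.client.cdt.MapOperation;
	import com.aerospike.client.cdt.MapReturnType;

	// hostName, aerospikeServicePort, deviceDataNamespace and deviceDataSetName
	// are assumed to be defined elsewhere in the enclosing class
	public static final int BIT_MASK_SIZE = 25;
	// 2^BIT_MASK_SIZE - 1, i.e. the low BIT_MASK_SIZE bits set
	public static final BigInteger BIT_MASK = BigInteger.ONE.shiftLeft(BIT_MASK_SIZE).subtract(BigInteger.ONE);
	public static final String METADATA_BIN_NAME = "deviceMetaData";
	public static final AerospikeClient aeroClient = new AerospikeClient(hostName,aerospikeServicePort);

	/**
	* Get device meta data from Aerospike having employed 'blocking' technique
	* @param deviceIDasUUID - device id
	* @return device meta data, or null if none is stored for this device
	*/
	public Map<String,Object> getDeviceMetaData(UUID deviceIDasUUID){
		// Turn UUID into a BigInteger to help with bit masking
		// (32 hex digits once the dashes are removed, so the radix is 16)
		BigInteger deviceID = new BigInteger(deviceIDasUUID.toString().replace("-",""),16);
		// Do bit masking
		BigInteger physicalKey = deviceID.and(BIT_MASK);
		// Construct Aerospike key using key built using bit masking
		Key aerospikeKey= new Key(deviceDataNamespace,deviceDataSetName,physicalKey.toString());

		// Retrieve record from map. We use the 'physical' aerospikeKey to identify the object
		// and the device id to retrieve the metadata from the map
		Record r = aeroClient.operate(null,aerospikeKey,
			MapOperation.getByKey(METADATA_BIN_NAME,Value.get(deviceIDasUUID.toString()), MapReturnType.VALUE));

		// Guard against a missing container record before reading the bin
		if (r == null) return null;
		return (Map<String, Object>) r.getMap(METADATA_BIN_NAME);
	}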


The construction of the physical key is as before. This time we use the getByKey operation to retrieve the device metadata.

An important point to note is that only the metadata requested is transmitted across the network, not the entire physical object. This consideration applies in general to the calls offered by the List/Map API. This is what we mean by ‘economy’ in the article title.
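The store and retrieve pair can be exercised without a cluster by swapping the Aerospike calls for an in-memory map. The sketch below is a stand-in of my own, not Aerospike code: `InMemoryBlockedStore` and its methods are hypothetical, and the outer `HashMap` plays the role of the record store while each inner map plays the role of the map bin. It shows that callers only ever see device ids, so blocking stays an implementation detail.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** In-memory stand-in for the blocked layout: container key -> (device id -> metadata). */
final class InMemoryBlockedStore {
    private final long mask;
    private final Map<Long, Map<String, Map<String, Object>>> containers = new HashMap<>();

    InMemoryBlockedStore(int maskBits) {
        if (maskBits < 0 || maskBits > 63) {
            throw new IllegalArgumentException("maskBits must be between 0 and 63");
        }
        this.mask = (1L << maskBits) - 1;
    }

    /** Container key: the low maskBits bits of the UUID. */
    long containerKey(UUID deviceId) {
        return deviceId.getLeastSignificantBits() & mask;
    }

    /** Insert the metadata into its container's map, creating the container if needed. */
    void put(UUID deviceId, Map<String, Object> metadata) {
        containers.computeIfAbsent(containerKey(deviceId), k -> new HashMap<>())
                  .put(deviceId.toString(), metadata);
    }

    /** Look up one device's metadata; null if the container or the entry is missing. */
    Map<String, Object> get(UUID deviceId) {
        Map<String, Map<String, Object>> container = containers.get(containerKey(deviceId));
        return container == null ? null : container.get(deviceId.toString());
    }

    /** Number of physical containers currently in use. */
    int containerCount() {
        return containers.size();
    }
}
```

With a 4 bit mask, `new UUID(1L, 0x10L)` and `new UUID(2L, 0x20L)` share container 0, so storing both creates one container rather than two, yet each is still retrieved by its own device id.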

Finally, a code snippet showing how to calculate the bit mask using the ‘natural’ inputs.

	import java.math.BigInteger;

	/**
	* Obtain bit mask required to ensure we have on average a specific number of logical objects in each physical object
	* @param keySpaceSize - size of key space
	* @param objectsPerBlock - average number of objects stored in each physical object
	* @return bit mask as a BigInteger
	*/
	public static BigInteger getBitMask(long keySpaceSize,int objectsPerBlock){
		// No of bits required is
		// log(keySpaceSize/objectsPerBlock) / log(2)
		// as reqd # of keys is keySpaceSize/objectsPerBlock
		// cast to double so the division is not truncated to an integer
		// round up as bit count has to be integral, and never go below zero bits
		int bitMaskSize = Math.max(0,(int)Math.ceil(Math.log((double)keySpaceSize/objectsPerBlock) / Math.log(2)));
		// Bit mask - 2^mask size - 1 to get all bits set
		return (new BigInteger("2")).pow(bitMaskSize).subtract(new BigInteger("1"));
	}
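As an alternative to the logarithm, the bit count can be computed with integer arithmetic only. This is a sketch under my own assumptions, with hypothetical names, not part of the original gist. The motivation is that a floating-point `log(x) / log(2)` can in principle land a hair above a whole number when the block count is an exact power of two, which would round the bit count up by one.

```java
/** Sketch: number of mask bits needed, using integer arithmetic only. */
final class MaskBits {
    /** Smallest n such that 2^n >= ceil(keySpaceSize / objectsPerBlock). */
    static int maskBits(long keySpaceSize, long objectsPerBlock) {
        if (keySpaceSize <= 0 || objectsPerBlock <= 0) {
            throw new IllegalArgumentException("inputs must be positive");
        }
        // Ceiling division without risking overflow in the numerator
        long blocks = (keySpaceSize - 1) / objectsPerBlock + 1;
        // Bits needed to represent (blocks - 1) equals ceil(log2(blocks))
        return 64 - Long.numberOfLeadingZeros(blocks - 1);
    }

    public static void main(String[] args) {
        // prints 25 for the 15bn objects / 655 per block example
        System.out.println(maskBits(15_000_000_000L, 655));
    }
}
```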


The net benefit of all the above is that the memory footprint will be reduced in this case to

2²⁵ (keys) * 64 (DRAM cost per record) * 2 (rep factor) = 4GiB (about 4.3GB)

from

15bn * 64 (DRAM cost per record) * 2 (rep factor) = ~1.75TiB (about 1.9TB)

The reduction factor is 447, lower than the '655' quoted above because, on average, our 128KB blocks will not be completely filled: rounding the number of keys up to a power of two leaves each block holding about 447 objects.
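The reduction factor and the average block fill follow directly from the number of logical objects and the mask size. The sketch below uses hypothetical names and assumes the objects spread evenly over the 2²⁵ keys; the 64 bytes and the replication factor cancel out of the ratio.

```java
/** Sketch: index memory reduction and average block fill after blocking. */
final class BlockingSavings {
    /** Ratio of index entries before and after blocking: objects / 2^maskBits. */
    static double reductionFactor(long logicalObjects, int maskBits) {
        return (double) logicalObjects / (double) (1L << maskBits);
    }

    /** Average fraction of a container occupied, assuming an even spread of objects. */
    static double averageFill(long logicalObjects, int maskBits, long objectBytes, long containerBytes) {
        return reductionFactor(logicalObjects, maskBits) * objectBytes / containerBytes;
    }

    public static void main(String[] args) {
        System.out.println(reductionFactor(15_000_000_000L, 25));
        System.out.println(averageFill(15_000_000_000L, 25, 200, 128 * 1024));
    }
}
```

For the example in this article the reduction factor comes out at about 447 and the average fill at about 68% of a 128KB block.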

Before we close, it’s worth noting our Enterprise-only ‘all flash’ capability, which allows both index and data to be placed on disk, reducing DRAM usage to very low levels. This was developed specifically with the use cases of small objects and/or very large numbers of objects (~10¹² = 1 trillion) in mind. It will incur higher latency (~5ms vs ~1ms at the 95th percentile) but it’s still competitive vs any other database out there.

The above solution is a good example of a differentiating feature, our List and Map API, providing a distinguishing optimisation under constraints. The technique of ‘blocking’ can also be used for time series data, which I hope to explore in a future article.

Cover image with thanks to Nana Smirnova
