Aquiles

Posted on Mar 31

JOLT: A Complete Guide to JSON Transformation (Basics to Advanced)

#json #api #java #webdev

JOLT is a powerful tool for transforming JSON, but understanding how it works—especially wildcards and nested structures—can be challenging.

This guide walks from basic concepts to advanced transformations with detailed examples.

Introduction

JOLT (JsOn Language for Transform) is a language used to transform JSON structures and is commonly found in integration layers where payloads need to be normalized, reshaped, enriched, or reduced before being consumed by downstream systems.

This article was written as a detailed learning reference for developers with little or no prior experience with JOLT. The objective is to start from the basics and gradually move to more advanced behavior, preserving the details that are usually necessary to truly understand how JOLT works in practice.

To test the examples in this article, use the JOLT playground.

JOLT — JsOn Language for Transform

JOLT is a language used for the transformation of JSON. It uses the following basic structure:

[
  {
    "operation": "",
    "spec": {}
  }
]

Where:

operation: defines the type of transformation that will be applied
spec: field where the transformation is defined
[]: the basic JOLT structure is also a JSON list, therefore we can chain multiple operations inside it

The transformation details will always depend on the input JSON.
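Because the top-level structure is a list, the operations run in order and each one receives the output of the previous one. The sketch below is a plain-Python illustration of that chaining idea, not JOLT itself: `run_chain`, `op_default`, and `op_sort` are hypothetical names, and the two handlers are simplified stand-ins (top-level keys only) for operations covered later in this guide.

```python
def op_default(data, spec):
    """Toy stand-in: add each spec key only when the input lacks it."""
    result = dict(data)
    for key, value in spec.items():
        result.setdefault(key, value)
    return result


def op_sort(data, spec=None):
    """Toy stand-in: order top-level keys alphabetically."""
    return {key: data[key] for key in sorted(data)}


# Hypothetical registry mapping an operation name to its handler.
HANDLERS = {"default": op_default, "sort": op_sort}


def run_chain(chain, data):
    """Apply each step in order, feeding its output to the next step."""
    for step in chain:
        handler = HANDLERS[step["operation"]]
        data = handler(data, step.get("spec"))
    return data


chain = [
    {"operation": "default", "spec": {"country": "United States"}},
    {"operation": "sort"},
]
result = run_chain(chain, {"name": "Sample Client", "email": "sample-client@email.com"})
print(result)
```

The second step sees the `country` field added by the first one, which is the point of chaining: later operations work on the already transformed JSON.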

Operations

There are several types of operations in JOLT.

They are:

shift
default
remove
sort
cardinality
modify-default-beta
modify-overwrite-beta

Each operation has its own set of rules and works in different ways.

However, they all follow the same principle for transformations: navigation through the input JSON structure.

`Shift`

Used to change the structure of a JSON while preserving values contained in that same JSON.

To use it, we navigate the JSON structure to the field or object whose value we want to extract and then specify where this value should be placed in the new JSON.

Let’s see the example below.

We have an input JSON containing information about clients:

{
  "client": {
    "name": "Sample Client",
    "email": "sample-client@email.com",
    "ssn": "123.456.789.10",
    "birthDate": "02/15/1985",
    "address": "Sample Client street, 123",
    "country": "United States",
    "number": "8888-8888"
  }
}

And we want a new JSON with the following structure:

{
  "customer": {
    "fullName": "Sample Client",
    "birthDate": "02/15/1985",
    "phoneNumber": "8888-8888",
    "mobileNumber": "8888-8888",
    "address": {
      "street": "Sample Client street, 123",
      "country": "United States"
    }
  }
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "client": {
        "name": "customer.fullName",
        "birthDate": "customer.birthDate",
        "address": "customer.address.street",
        "country": "customer.address.country",
        "number": ["customer.phoneNumber", "customer.mobileNumber"]
      }
    }
  }
]

What we did above was navigate to the fields of interest and specify where the value of each one should be placed.

Through the dot (.) notation, we are able to define levels in the new JSON we want to create.

With:

"name": "customer.fullName"

we take the value of the field name and place it into the field fullName inside the object customer.

And in:

"address": "customer.address.street"

we take the value of the field address and place it into the field street inside the object address, which is also contained inside the object customer.

Tip: We can take the same value and place it into more than one field at the same time.

In:

"number": ["customer.phoneNumber", "customer.mobileNumber"]

we take the value of the field number and place it into the fields phoneNumber and mobileNumber, both contained in customer. This approach allows us to transpose one value into n new fields.

Important: In this operation, only the fields explicitly manipulated in the transformation will be replicated. Any data in the input JSON that is not transformed will be discarded. In the previous example, ssn is ignored and does not appear in the output.

Final JSON:

{
  "customer": {
    "fullName": "Sample Client",
    "birthDate": "02/15/1985",
    "phoneNumber": "8888-8888",
    "mobileNumber": "8888-8888",
    "address": {
      "street": "Sample Client street, 123",
      "country": "United States"
    }
  }
}
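To make the dot notation more concrete, here is a minimal plain-Python sketch of what a shift with literal field names does conceptually. It is not how JOLT is implemented: `put` and `shift_literal` are hypothetical helpers, the input is reduced to the inner client object, and the sketch simply overwrites when two values land on the same path.

```python
def put(out, path, value):
    """Write value at a dotted path, creating intermediate objects as needed."""
    *parents, leaf = path.split(".")
    node = out
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def shift_literal(source, mapping):
    """mapping: input key -> output path, or a list of output paths."""
    out = {}
    for key, targets in mapping.items():
        if key not in source:
            continue  # nothing to move for this rule
        if isinstance(targets, str):
            targets = [targets]
        for path in targets:
            put(out, path, source[key])
    return out


client = {
    "name": "Sample Client",
    "ssn": "123.456.789.10",
    "address": "Sample Client street, 123",
    "country": "United States",
    "number": "8888-8888",
}
mapping = {
    "name": "customer.fullName",
    "address": "customer.address.street",
    "country": "customer.address.country",
    "number": ["customer.phoneNumber", "customer.mobileNumber"],
}
result = shift_literal(client, mapping)
print(result)
```

Because the output starts empty and only mapped keys are copied, `ssn` never reaches the result, which mirrors the behavior described above.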

`Default`

Used to add new fields or objects to a JSON if they do not already exist.

Its usage consists of navigating the JSON structure to the desired level and inserting the field or object with its respective value.

Important: If the field declared in the transformation already exists in the input JSON, the transformation has no effect.

Let’s see the example below.

We have an input JSON containing information about a customer:

{
  "customer": {
    "name": "Customer Default",
    "ssn": "123.456.789.10"
  }
}

However, we need a JSON that, in addition to the name and ssn, also contains the customer’s date of birth:

{
  "customer": {
    "name": "Customer Default",
    "ssn": "123.456.789.10",
    "birthDate": "01/01/1970"
  }
}

Our transformation will be:

[
  {
    "operation": "default",
    "spec": {
      "customer": {
        "birthDate": "01/01/1970"
      }
    }
  }
]

Above, we navigate to the object customer and add the field birthDate with a default value.
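The "only if missing" rule can be sketched in plain Python as a recursive fill. This is an illustration under assumptions, not JOLT's implementation: `apply_defaults` is a hypothetical helper, it only checks whether a key is present, and the date `05/20/1988` is made up for the second case.

```python
import copy


def apply_defaults(data, spec):
    """Fill in spec values only where the input has no such key."""
    for key, default in spec.items():
        if key not in data:
            data[key] = copy.deepcopy(default)
        elif isinstance(default, dict) and isinstance(data[key], dict):
            apply_defaults(data[key], default)  # descend, never overwrite
    return data


spec = {"customer": {"birthDate": "01/01/1970"}}

# Case 1: birthDate is missing, so the default is added.
missing = apply_defaults({"customer": {"name": "Customer Default"}}, spec)

# Case 2: birthDate already exists, so the input value is kept.
present = apply_defaults(
    {"customer": {"name": "Customer Default", "birthDate": "05/20/1988"}}, spec
)
print(missing)
print(present)
```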

`Remove`

Used to remove fields or objects from a JSON.

To use it, we navigate the input JSON structure to the desired level and specify the field to be removed.

Let’s see the example below.

We have an input JSON containing information about a customer:

{
  "customer": {
    "name": "Customer Default",
    "ssn": "123.456.789.10",
    "birthDate": "01/01/1970"
  }
}

However, we need a JSON that only contains the customer’s name and ssn:

{
  "customer": {
    "name": "Customer Default",
    "ssn": "123.456.789.10"
  }
}

Our transformation will be:

[
  {
    "operation": "remove",
    "spec": {
      "customer": {
        "birthDate": ""
      }
    }
  }
]

What we did above was navigate to the field birthDate and assign it an empty string ("").

The field to be removed must always be assigned an empty string; otherwise the transformation fails with an error and does not run.
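As a rough mental model, the remove spec can be read as "walk down the objects, delete wherever the value is an empty string". The plain-Python sketch below follows that reading; `remove_fields` is a hypothetical helper, and raising `ValueError` for any other value is this sketch's way of imitating the error described above.

```python
def remove_fields(data, spec):
    """Delete every key that the spec maps to an empty string."""
    for key, rule in spec.items():
        if isinstance(rule, dict):
            child = data.get(key)
            if isinstance(child, dict):
                remove_fields(child, rule)  # keep navigating downward
        elif rule == "":
            data.pop(key, None)  # keys absent from the input are ignored
        else:
            raise ValueError(f"spec value for {key!r} must be '' or an object")
    return data


customer = {
    "customer": {
        "name": "Customer Default",
        "ssn": "123.456.789.10",
        "birthDate": "01/01/1970",
    }
}
result = remove_fields(customer, {"customer": {"birthDate": ""}})
print(result)
```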

`Sort`

Used to sort fields and objects in a JSON in alphabetical order.

Important: The sort order cannot be configured, so the entire JSON is always sorted alphabetically. Only field and object names are ordered, not their values.

Let’s see the example below.

We have an input JSON containing information about an employee:

{
  "employee": {
    "phone": "9 9999-9999",
    "name": "Employee Sort",
    "birthDate": "01/01/1980",
    "role": "JOLT Analyst"
  }
}

We need all fields contained in the input JSON to be ordered alphabetically:

{
  "employee": {
    "birthDate": "01/01/1980",
    "name": "Employee Sort",
    "phone": "9 9999-9999",
    "role": "JOLT Analyst"
  }
}

For the sort operation, we do not need to define the spec object.

Our transformation will be:

[
  {
    "operation": "sort"
  }
]
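Conceptually, this is a recursive sort of object keys. The plain-Python sketch below shows the idea; `sort_keys` is a hypothetical helper, and the handling of lists (element order kept, objects inside them sorted) is an assumption of the sketch rather than something stated above.

```python
def sort_keys(value):
    """Return a copy with every object's keys in alphabetical order."""
    if isinstance(value, dict):
        return {key: sort_keys(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        return [sort_keys(item) for item in value]  # element order is kept
    return value  # plain values are never reordered


employee = {
    "employee": {
        "phone": "9 9999-9999",
        "name": "Employee Sort",
        "birthDate": "01/01/1980",
        "role": "JOLT Analyst",
    }
}
result = sort_keys(employee)
print(list(result["employee"]))
```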

`Cardinality`

Used to transform simple fields and objects into lists of objects and vice versa.

Important: When we transform a list of objects into a simple field or object, only the first element of the list is considered.

Let’s see the example below.

We have an input JSON containing information about products:

{
  "products": {
    "name": "Product A",
    "id": "123-A",
    "value": 10
  }
}

We need to transform the products object into a list:

{
  "products": [
    {
      "name": "Product A",
      "id": "123-A",
      "value": 10
    }
  ]
}

Our transformation will be:

[
  {
    "operation": "cardinality",
    "spec": {
      "products": "MANY"
    }
  }
]

In case we have a list of products:

{
  "products": [
    {
      "name": "Product A",
      "id": "123-A",
      "value": 10
    },
    {
      "name": "Product B",
      "id": "456-B",
      "value": 20
    }
  ]
}

And we need to transform that list into a simple object:

{
  "products": {
    "name": "Product A",
    "id": "123-A",
    "value": 10
  }
}

Our transformation will be:

[
  {
    "operation": "cardinality",
    "spec": {
      "products": "ONE"
    }
  }
]
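The two directions can be summarized with a small plain-Python sketch. `to_many` and `to_one` are hypothetical names for the MANY and ONE behaviors described above, and returning `None` for an empty list is a choice made by this sketch, not a documented JOLT rule.

```python
def to_many(value):
    """MANY: wrap a single value in a list; leave lists untouched."""
    return value if isinstance(value, list) else [value]


def to_one(value):
    """ONE: keep only the first element of a list; leave other values untouched."""
    if isinstance(value, list):
        return value[0] if value else None  # empty list: sketch-specific choice
    return value


product_a = {"name": "Product A", "id": "123-A", "value": 10}
product_b = {"name": "Product B", "id": "456-B", "value": 20}

as_list = {"products": to_many(product_a)}
as_object = {"products": to_one([product_a, product_b])}
print(as_list)
print(as_object)
```

Note that ONE is lossy: Product B is dropped, and applying MANY afterwards does not bring it back.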

Advanced Concepts

So far, we have covered the essential concepts needed to understand and use JOLT. Before moving to the remaining operations, we need to learn a few more elaborate concepts.

As a prerequisite for moving forward, here are two terms that are used in the explanations that follow:

LHS (Left Hand Side): used to reference the left side of the transformation.
RHS (Right Hand Side): used to reference the right side of the transformation.

That is, in each line of a spec, everything before the colon (:) is the LHS, and everything after the colon is the RHS.

Transformation example:

[
  {
    "operation": "shift",
    "spec": {
      "customer": {
        "name": "client.fullName",
        "birthDate": "client.dateOfBirth",
        "address": "client.address.street",
        "country": "client.address.country"
      }
    }
  }
]

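In the first line of the customer object, for example, "name" is the LHS and "client.fullName" is the RHS.

Now we can move forward.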

The great power of JOLT lies in the ability to handle transformations dynamically. For this, we use wildcards: special characters that can be placed in different positions of a transformation, each with its own function.

One wildcard can have different functions depending on its usage (LHS or RHS), and we can also combine different wildcards in the same transformation.

Below we will see their definitions and some usage examples.

`&`

It uses the content declared in the LHS to compose the structure of the output JSON, without needing to make this content explicit in the transformation. This wildcard is based on the navigation performed during the transformation.

Usage: RHS (the case covered here); & is also valid on the LHS
Operation: shift

Example

We have a JSON that contains customer data:

{
  "name": "Client Example",
  "email": "client-example@email.com"
}

And we need this data inside an object called client:

{
  "client": {
    "name": "Client Example",
    "email": "client-example@email.com"
  }
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "name": "client.&",
      "email": "client.&"
    }
  }
]

With &, each value is placed in a field inside the client object that has the same name as the key matched on the LHS: name stays name and email stays email. Thus, we create a new JSON while preserving the input field names.

`*` (asterisk)

References all fields and objects in a JSON without needing to explicitly declare their names in the transformation.

Usage: LHS
Operations: shift, default, remove, cardinality, modify-default-beta, modify-overwrite-beta

Example

We have an input JSON containing customer data:

{
  "name": "Customer Example",
  "email": "client-example@email.com",
  "document": "1234567890",
  "birthDate": "10/31/1990",
  "address": "Customer Example Street"
}

And we need this data inside an object named customer, but we need to change the field document to a field named ssn:

{
  "customer": {
    "name": "Customer Example",
    "email": "client-example@email.com",
    "ssn": "1234567890",
    "birthDate": "10/31/1990",
    "address": "Customer Example Street"
  }
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "*": "customer.&",
      "document": "customer.ssn"
    }
  }
]

In the line:

"*": "customer.&"

we take every field of the input JSON that has no explicit rule of its own and place it in an object named customer, keeping the original field name and value.

The field document does have an explicit rule, and an explicit field name takes precedence over *, so its value is assigned only to a field named ssn inside the same object.

Using the wildcard * next to & means that for each field that * finds, & keeps its name and value. This combined usage is very useful because it allows us to manipulate a JSON without needing to know and declare all of its content beforehand.
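The interaction between a literal rule, * and & can be sketched in plain Python as "use the literal rule if there is one, otherwise fall back to the wildcard rule and substitute the matched key for &". This is only a model of the behavior shown in this example: `put` and `shift_level` are hypothetical helpers, and the sketch handles a single flat object.

```python
def put(out, path, value):
    """Write value at a dotted path, creating intermediate objects as needed."""
    *parents, leaf = path.split(".")
    node = out
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def shift_level(source, literal_rules, wildcard_rule):
    """Literal rules win; every other key falls back to the wildcard rule."""
    out = {}
    for key, value in source.items():
        template = literal_rules.get(key, wildcard_rule)
        put(out, template.replace("&", key), value)  # & becomes the matched key
    return out


customer = {
    "name": "Customer Example",
    "email": "client-example@email.com",
    "document": "1234567890",
    "birthDate": "10/31/1990",
    "address": "Customer Example Street",
}
result = shift_level(customer, {"document": "customer.ssn"}, "customer.&")
print(result)
```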

`@`

References the value of a field or object contained in the input JSON, but produces different effects depending on how it is used.

Usage: LHS and RHS
Operations: shift (LHS and RHS), modify-default-beta (RHS), modify-overwrite-beta (RHS)

`Shift` example

We have a JSON containing isolated product information:

{
  "key": "code",
  "value": "123-ABC"
}

And we need to group this into a product object, relating the key field to the value field:

{
  "product": {
    "code": "123-ABC"
  }
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "value": "product.@(1,key)"
    }
  }
]

In:

"@(1,key)"

we are taking the value of the field key to be used as the name of the field that will receive the value of the field value.

The use of @ involves declaring how many levels up from the current position we look for the information. Level 0 is the matched field itself (here, value), and level 1 is the object that contains it.

In this case, the field key is in the same object as the field value, so we use the number 1.

The usage of @ in the LHS follows the same principle as in the RHS.
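One way to picture the level number is as an index into the chain of nodes visited while navigating to the current field. The plain-Python sketch below models that idea for the example above; `resolve_at` and the `ancestors` list are hypothetical constructs for illustration, not part of JOLT.

```python
def resolve_at(ancestors, level, key):
    """Look up key on the node `level` steps above the matched node.

    ancestors runs from the root down to the matched node, so level 0 is
    the matched node itself and level 1 is the object that contains it.
    """
    return ancestors[-1 - level][key]


data = {"key": "code", "value": "123-ABC"}

# Navigation path for the rule on "value": the root object, then the field.
ancestors = [data, data["value"]]

field_name = resolve_at(ancestors, 1, "key")  # what @(1,key) refers to
result = {"product": {field_name: data["value"]}}
print(result)
```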

`modify-default-beta` and `modify-overwrite-beta` example

We have a JSON containing isolated product information:

{
  "product": {
    "name": "Product A",
    "price": 10
  },
  "manufacturer": "Company A"
}

And we need the product object to contain a company field with the value of the field manufacturer, which is outside product:

{
  "product": {
    "name": "Product A",
    "price": 10,
    "company": "Company A"
  },
  "manufacturer": "Company A"
}

The transformation will be:

[
  {
    "operation": "modify-default-beta",
    "spec": {
      "product": {
        "company": "@(2,manufacturer)"
      }
    }
  }
]

For modify-overwrite-beta, we would have the same transformation structure.

What we did was create a field named company and assign to it the value of the field manufacturer. For that, we moved up to level 2 in order to be able to see the field manufacturer and retrieve its value.

The difference between modify-default-beta and modify-overwrite-beta is that in modify-default-beta the inclusion of the field company only happens if there is no other field named company inside product.

For modify-overwrite-beta, the field company is included even if the field company already exists inside product. As the name suggests, if the content already exists in the input JSON, it is overwritten.
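The difference between the two operations comes down to one condition. The plain-Python sketch below models it for this example only: `modify_company` is a hypothetical helper, the value "Company B" is made up to show a pre-existing field, and the sketch checks only whether the key is present.

```python
import copy


def modify_company(data, overwrite):
    """Copy manufacturer into product.company, default-style or overwrite-style."""
    result = copy.deepcopy(data)
    product = result["product"]
    value = result["manufacturer"]  # what @(2,manufacturer) resolves to here
    if overwrite or "company" not in product:
        product["company"] = value
    return result


source = {
    "product": {"name": "Product A", "price": 10, "company": "Company B"},
    "manufacturer": "Company A",
}

default_style = modify_company(source, overwrite=False)
overwrite_style = modify_company(source, overwrite=True)
print(default_style["product"]["company"])
print(overwrite_style["product"]["company"])
```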

`$`

References the name of a field or object contained in the input JSON so that this name can be used as the value of a field or object in the output JSON.

Usage: LHS
Operation: shift

Example

We have an input JSON containing product data:

{
  "product": {
    "name": "Product Example",
    "value": 10,
    "category": "CATEG-1",
    "weight": 25
  }
}

And we need a JSON to know what product information is being provided:

{
  "product": [
    "name",
    "value",
    "category",
    "weight"
  ]
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "product": {
        "*": {
          "$": "product[]"
        }
      }
    }
  }
]

What we did was select all (*) the fields of the object product, then take the name ($) of each field and assign them to a list named product.

That way, we obtain the field names rather than their values.
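In plain Python terms, the combination of * and $ behaves like iterating over an object's keys instead of its values. A minimal sketch, with `field_names` as a hypothetical helper:

```python
def field_names(obj):
    """Collect the names of the fields, in input order, ignoring the values."""
    return [name for name in obj]


data = {
    "product": {
        "name": "Product Example",
        "value": 10,
        "category": "CATEG-1",
        "weight": 25,
    }
}
result = {"product": field_names(data["product"])}
print(result)
```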

`#`

If used in the LHS, it inserts values manually in the output JSON.

In the RHS, it is applicable only when creating lists and is used to group certain content of the input JSON within the list being created.

Usage: LHS and RHS
Operation: shift

LHS example

We have an input JSON with product information:

{
  "product": {
    "name": "Product Example",
    "value": 10,
    "weight": 25
  }
}

And we need a JSON that contains name, value, weight, and category:

{
  "product": {
    "name": "Product Example",
    "value": 10,
    "category": "CATEG-1",
    "weight": 25
  }
}

However, the input JSON never provides category, so we need to add this field manually:

[
  {
    "operation": "shift",
    "spec": {
      "product": {
        "*": "product.&",
        "#CATEG-1": "product.category"
      }
    }
  }
]

The value contained after the wildcard # will always be assigned to the field declared in the RHS, which in our case is the field category inside the object product.

RHS example

We have an input JSON containing a list of products:

{
  "products": [
    {
      "code": "PROD-A",
      "value": 10
    },
    {
      "code": "PROD-B",
      "value": 20
    }
  ]
}

And we need only to change the name of the field value to price:

{
  "products": [
    {
      "code": "PROD-A",
      "price": 10
    },
    {
      "code": "PROD-B",
      "price": 20
    }
  ]
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "products": {
        "*": {
          "code": "products[#2].&",
          "value": "products[#2].price"
        }
      }
    }
  }
]

The use of # in the RHS involves declaring the level at which we are seeking information. The declaration [#2] represents:

the creation of a list ([])
and that it must group (#) all the information found 2 levels above: counting up from code or value, level 1 is the current list item (*) and level 2 is the list products

We need this declaration in order to guarantee the correct grouping of each product with its respective code and price.

That is, in:

"code": "products[#2].&"

we are taking the value from the field code and placing it into another field code inside a list named products.

And in:

"value": "products[#2].price"

we take the value from the field value and place it into a field named price inside the same list.

When creating the list products, we look 2 levels above (the level of the list products) and the way each item is grouped in the input list is preserved in the new list.

This level-based grouping is one of the most important details to understand when working with JOLT lists.
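The grouping can be pictured as "every value goes into the output slot that corresponds to the input element it came from". The plain-Python sketch below models that with the element index; `shift_list` is a hypothetical helper, and it assumes every input element contains at least one mapped field.

```python
def shift_list(items, mapping):
    """Rebuild a list of objects, writing each value into the slot of its source element."""
    out = []
    for index, item in enumerate(items):
        for key, new_key in mapping.items():
            if key not in item:
                continue
            while len(out) <= index:
                out.append({})  # make sure slot `index` exists
            out[index][new_key] = item[key]
    return out


products = [
    {"code": "PROD-A", "value": 10},
    {"code": "PROD-B", "value": 20},
]
result = {"products": shift_list(products, {"code": "code", "value": "price"})}
print(result)
```

Without the shared slot, all codes and all prices would end up in the same place and the pairing between a code and its price would be lost.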

`|` (pipe)

Allows referencing multiple fields or objects of an input JSON so that, regardless of the name of the field or object, its value is allocated to the same destination in the output JSON.

Usage: LHS
Operation: shift

Example

We have an input JSON containing customer data:

{
  "customer": {
    "fullName": "Customer Example",
    "email": "customer-example@email.com"
  }
}

And we need a JSON with the following structure:

{
  "customer": {
    "name": "Customer Example",
    "email": "customer-example@email.com"
  }
}

However, in the input JSON there is the possibility that the field fullName comes as customerName, so we need the transformation to be prepared to recognize both possibilities:

[
  {
    "operation": "shift",
    "spec": {
      "customer": {
        "fullName|customerName": "customer.name",
        "email": "customer.&"
      }
    }
  }
]
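A plain-Python way to read the pipe is "try each alternative name and send whichever one exists to the same destination". The sketch below follows that reading for this example; `shift_customer` is a hypothetical helper, and if both names were present the sketch would simply keep the last one.

```python
def shift_customer(payload):
    """Map fullName or customerName to name, and email to email."""
    rules = {"fullName|customerName": "name", "email": "email"}
    source = payload["customer"]
    out = {}
    for pattern, target in rules.items():
        for candidate in pattern.split("|"):
            if candidate in source:
                out[target] = source[candidate]
    return {"customer": out}


with_full_name = shift_customer(
    {"customer": {"fullName": "Customer Example", "email": "customer-example@email.com"}}
)
with_customer_name = shift_customer(
    {"customer": {"customerName": "Customer Example", "email": "customer-example@email.com"}}
)
print(with_full_name)
print(with_customer_name)
```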

Operations `modify-default-beta` and `modify-overwrite-beta`

As mentioned in the explanation of the wildcard @, these operations allow us to dynamically reference values. While modify-default-beta assigns a value to a field only if it does not already exist, modify-overwrite-beta overwrites the value even if the field already exists.

Both operations also allow us to apply functions to our JSON. The example below uses modify-overwrite-beta.

The available functions are:

String
- toLower
- toUpper
- concat
- join
- split
- substring
- trim
- leftPad
- rightPad
Number
- min
- max
- abs
- avg
- intSum
- doubleSum
- longSum
- intSubtract
- doubleSubtract
- longSubtract
- divide
- divideAndRound
Type
- toInteger
- toDouble
- toLong
- toBoolean
- toString
- recursivelySquashNulls
- squashNulls
List
- firstElement
- lastElement
- elementAt
- toList
- sort
- size

Input JSON

{
  "STRING": {
    "product": "Product A",
    "company": "company a",
    "value": "100",
    "measureWithSpaces": "  10 meters "
  },
  "NUMBER": {
    "array": [3, 5, 2, 7, 1],
    "negativeValue": -100,
    "positiveValue": 50
  },
  "TYPE": {
    "value": 10.5,
    "stringBoolean": "true",
    "objectWithNull": {
      "fieldWithValue": "ABC",
      "nullField": null
    }
  },
  "LIST": {
    "array": ["c", "t", "m", "a"],
    "stringField": "123"
  }
}

Transformation

[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "STRING": {
        "product": "=toLower(@(1,product))",
        "company": "=toUpper(@(1,company))",
        "product_company": "=concat(@(1,product),'_',@(1,company))",
        "joinProductCompany": "=join(' - ',@(1,product),@(1,company))",
        "splitProductCompany": "=split('[-]',@(1,joinProductCompany))",
        "substringProduct": "=substring(@(1,product),0,4)",
        "value": "=leftPad(@(1,value),6,'A')",
        "measure": "=trim(@(1,measureWithSpaces))"
      },
      "NUMBER": {
        "minArray": "=min(@(1,array))",
        "maxArray": "=max(@(1,array))",
        "absoluteValue": "=abs(@(1,negativeValue))",
        "averageArray": "=avg(@(1,array))",
        "sumArray": "=intSum(@(1,array))",
        "subtrArray": "=intSubtract(@(1,positiveValue),20)",
        "division": "=divide(@(1,positiveValue),2)",
        "divisionRound": "=divideAndRound(3,@(1,positiveValue),3)"
      },
      "TYPE": {
        "integerValue": "=toInteger(@(1,value))",
        "booleano": "=toBoolean(@(1,stringBoolean))",
        "stringValue": "=toString(@(1,value))",
        "stringBoolean": "=size",
        "objectWithNull": "=recursivelySquashNulls"
      },
      "LIST": {
        "arrayFirstItem": "=firstElement(@(1,array))",
        "arrayLastItem": "=lastElement(@(1,array))",
        "arrayElement": "=elementAt(@(1,array),2)",
        "fieldToList": "=toList(@(1,stringField))",
        "orderedArray": "=sort(@(1,array))"
      }
    }
  }
]

Output JSON

{
  "STRING": {
    "product": "product a",
    "company": "COMPANY A",
    "value": "AAA100",
    "measureWithSpaces": "  10 meters ",
    "product_company": "product a_COMPANY A",
    "joinProductCompany": "product a - COMPANY A",
    "splitProductCompany": [
      "product a ",
      " COMPANY A"
    ],
    "substringProduct": "prod",
    "measure": "10 meters"
  },
  "NUMBER": {
    "array": [3, 5, 2, 7, 1],
    "negativeValue": -100,
    "positiveValue": 50,
    "minArray": 1,
    "maxArray": 7,
    "absoluteValue": 100,
    "averageArray": 3.6,
    "sumArray": 18,
    "subtrArray": 30,
    "division": 25,
    "divisionRound": 16.667
  },
  "TYPE": {
    "value": 10.5,
    "stringBoolean": 4,
    "objectWithNull": {
      "fieldWithValue": "ABC"
    },
    "integerValue": 10,
    "booleano": true,
    "stringValue": "10.5"
  },
  "LIST": {
    "array": ["c", "t", "m", "a"],
    "stringField": "123",
    "arrayFirstItem": "c",
    "arrayLastItem": "a",
    "arrayElement": "m",
    "fieldToList": ["123"],
    "orderedArray": ["a", "c", "m", "t"]
  }
}
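Several of the output values above can be cross-checked by hand. The snippet below recomputes a few of them with plain Python equivalents; the Python calls are stand-ins chosen for this check, not the JOLT functions themselves.

```python
import re

array = [3, 5, 2, 7, 1]

checks = {
    "averageArray": sum(array) / len(array),   # avg
    "sumArray": sum(array),                    # intSum
    "divisionRound": round(50 / 3, 3),         # divideAndRound(3, 50, 3)
    "value": "100".rjust(6, "A"),              # leftPad to width 6 with 'A'
    "substringProduct": "product a"[0:4],      # substring(0, 4)
    "measure": "  10 meters ".strip(),         # trim
    "splitProductCompany": re.split("[-]", "product a - COMPANY A"),  # split on '[-]'
}

for name, value in checks.items():
    print(name, "=", repr(value))
```

The split result keeps the spaces around the hyphen because the pattern `[-]` matches only the hyphen itself.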

Note: Some functions were not included as isolated examples because they follow the same application pattern as others already shown. For example, doubleSum and longSum are applied the same way as intSum.

Regarding recursivelySquashNulls and squashNulls, both are applicable only to objects and lists and are used to remove fields with null values. However:

recursivelySquashNulls looks at all levels below the object or list
squashNulls looks only 1 level below
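The difference between the two is easiest to see on a nested object. The plain-Python sketch below models both behaviors as described above; the function names are hypothetical, and the `inner` object is made up for the illustration.

```python
def squash_nulls(value):
    """Drop nulls one level below the given object or list."""
    if isinstance(value, dict):
        return {k: v for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [v for v in value if v is not None]
    return value


def recursively_squash_nulls(value):
    """Drop nulls at every level below the given object or list."""
    if isinstance(value, dict):
        return {k: recursively_squash_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [recursively_squash_nulls(v) for v in value if v is not None]
    return value


nested = {
    "fieldWithValue": "ABC",
    "nullField": None,
    "inner": {"keep": 1, "drop": None},
}
shallow = squash_nulls(nested)
deep = recursively_squash_nulls(nested)
print(shallow)
print(deep)
```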

Cascade behavior

modify-overwrite-beta executes in cascade. That is, each new transformation is impacted by previous transformations.

To understand this behavior, let’s take a snippet from the previous example:

"STRING": {
  "product": "=toLower(@(1,product))",
  "company": "=toUpper(@(1,company))",
  "product_company": "=concat(@(1,product),'_',@(1,company))"
}

In:

"product": "=toLower(@(1,product))"

we change the value of product to lowercase.

In:

"company": "=toUpper(@(1,company))"

we change the value of company to uppercase.

So, when we execute:

"product_company": "=concat(@(1,product),'_',@(1,company))"

we are already using the transformed values of product and company, not their original values from the input JSON.

This means the result is influenced by everything that came before it in the same operation, which is why understanding execution order is important.
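The cascade can be modeled in plain Python as a list of steps that each read from, and write to, the same evolving state. The sketch below is only an illustration of why order matters: `run_steps` and the step tuples are hypothetical, and the reordered variant describes the sketch, not a tested JOLT spec.

```python
def run_steps(state, steps):
    """Apply steps in order; each step sees the results of the previous ones."""
    state = dict(state)
    for field, compute in steps:
        state[field] = compute(state)
    return state


lower = ("product", lambda s: s["product"].lower())
upper = ("company", lambda s: s["company"].upper())
concat = ("product_company", lambda s: s["product"] + "_" + s["company"])

source = {"product": "Product A", "company": "company a"}

cascaded = run_steps(source, [lower, upper, concat])      # same order as above
concat_first = run_steps(source, [concat, lower, upper])  # concat runs too early

print(cascaded["product_company"])
print(concat_first["product_company"])
```

In the first run the concatenation uses the already transformed values; in the second it captures the original ones, because the other two steps have not run yet.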

Final Considerations

JOLT is simple in concept, but not always simple in practice.

The more complex the transformation, the more important it becomes to understand:

JSON navigation
LHS and RHS behavior
wildcard semantics
level counting
list grouping
execution order

Once these concepts are clear, JOLT becomes a very powerful way to transform JSON without writing imperative code.

For a more technical reading about JOLT, see the JOLT GitHub documentation.
