Aquiles

JOLT: A Complete Guide to JSON Transformation (Basics to Advanced)

JOLT is a powerful tool for transforming JSON, but understanding how it works—especially wildcards and nested structures—can be challenging.

This guide walks from basic concepts to advanced transformations with detailed examples.

Introduction

JOLT (JsOn Language for Transform) is a language used to transform JSON structures and is commonly found in integration layers where payloads need to be normalized, reshaped, enriched, or reduced before being consumed by downstream systems.

This article was written as a detailed learning reference for developers with little or no prior experience with JOLT. The objective is to start from the basics and gradually move to more advanced behavior, preserving the details that are usually necessary to truly understand how JOLT works in practice.

To test the examples in this article, use:

JOLT playground


JOLT — JsOn Language for Transform

JOLT is a language used for the transformation of JSON. It uses the following basic structure:

[
  {
    "operation": "",
    "spec": {}
  }
]

Where:

  • operation: defines the type of transformation that will be applied
  • spec: the object in which the transformation itself is described
  • []: the outer structure is a JSON list, so multiple operations can be chained inside it; they run in order, each one receiving the output of the previous one

The transformation details will always depend on the input JSON.
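
As a minimal sketch of chaining, the spec below runs a shift and then a default (both operations are explained in the sections that follow). The field names are illustrative assumptions. Given the input {"name": "Sample Client"}, the shift moves name into customer.fullName, and the default then receives that result and adds status:

```json
[
  {
    "operation": "shift",
    "spec": {
      "name": "customer.fullName"
    }
  },
  {
    "operation": "default",
    "spec": {
      "customer": {
        "status": "ACTIVE"
      }
    }
  }
]
```

The expected output is {"customer": {"fullName": "Sample Client", "status": "ACTIVE"}}.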


Operations

There are several types of operations in JOLT.

They are:

  • shift
  • default
  • remove
  • sort
  • cardinality
  • modify-default-beta
  • modify-overwrite-beta

Each operation has its own set of rules and works in different ways.

However, they all follow the same principle for transformations: navigation through the input JSON structure.


Shift

Used to change the structure of a JSON while preserving values contained in that same JSON.

To use it, we navigate the JSON structure to the field or object whose value we want to extract and then specify where this value should be placed in the new JSON.

Let’s see the example below.

We have an input JSON containing information about clients:

{
  "client": {
    "name": "Sample Client",
    "email": "sample-client@email.com",
    "ssn": "123.456.789.10",
    "birthDate": "02/15/1985",
    "address": "Sample Client street, 123",
    "country": "United States",
    "number": "8888-8888"
  }
}

And we want a new JSON with the following structure:

{
  "customer": {
    "fullName": "Sample Client",
    "birthDate": "02/15/1985",
    "phoneNumber": "8888-8888",
    "mobileNumber": "8888-8888",
    "address": {
      "street": "Sample Client street, 123",
      "country": "United States"
    }
  }
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "client": {
        "name": "customer.fullName",
        "birthDate": "customer.birthDate",
        "address": "customer.address.street",
        "country": "customer.address.country",
        "number": ["customer.phoneNumber", "customer.mobileNumber"]
      }
    }
  }
]

What we did above was navigate to the fields of interest and specify where the value of each one should be placed.

Through the dot (.) notation, we are able to define levels in the new JSON we want to create.

With:

"name": "customer.fullName"

we take the value of the field name and place it into the field fullName inside the object customer.

And in:

"address": "customer.address.street"

we take the value of the field address and place it into the field street inside the object address, which is also contained inside the object customer.

Tip: We can take the same value and place it into more than one field at the same time.

In:

"number": ["customer.phoneNumber", "customer.mobileNumber"]

we take the value of the field number and place it into the fields phoneNumber and mobileNumber, both contained in customer. This approach allows us to transpose one value into n new fields.

Important: In this operation, only the fields explicitly mapped in the spec are carried over to the output. Any data in the input JSON that is not mapped is discarded. In the previous example, ssn and email are not mapped, so they do not appear in the output.

Final JSON:

{
  "customer": {
    "fullName": "Sample Client",
    "birthDate": "02/15/1985",
    "phoneNumber": "8888-8888",
    "mobileNumber": "8888-8888",
    "address": {
      "street": "Sample Client street, 123",
      "country": "United States"
    }
  }
}

Default

Used to add new fields or objects to a JSON if they do not already exist.

Its usage consists of navigating the JSON structure to the desired level and inserting the field or object with its respective value.

Important: If the field declared in the transformation already exists in the input JSON, the transformation has no effect.

Let’s see the example below.

We have an input JSON containing information about a customer:

{
  "customer": {
    "name": "Customer Default",
    "ssn": "123.456.789.10"
  }
}

However, we need a JSON that, in addition to the name and ssn, also contains the customer’s date of birth:

{
  "customer": {
    "name": "Customer Default",
    "ssn": "123.456.789.10",
    "birthDate": "01/01/1970"
  }
}

Our transformation will be:

[
  {
    "operation": "default",
    "spec": {
      "customer": {
        "birthDate": "01/01/1970"
      }
    }
  }
]

Above, we navigate to the object customer and add the field birthDate with a default value.
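
To illustrate the rule that existing fields are left untouched, here is a sketch with a hypothetical input in which birthDate is already present: {"customer": {"name": "Customer Default", "birthDate": "05/20/1992"}}. The country field is an assumption added for illustration.

```json
[
  {
    "operation": "default",
    "spec": {
      "customer": {
        "birthDate": "01/01/1970",
        "country": "Unknown"
      }
    }
  }
]
```

With this input, birthDate keeps the value 05/20/1992 because it already exists, while country is added with the value Unknown.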


Remove

Used to remove fields or objects from a JSON.

To use it, we navigate the input JSON structure to the desired level and name the field to be removed.

Let’s see the example below.

We have an input JSON containing information about a customer:

{
  "customer": {
    "name": "Customer Default",
    "ssn": "123.456.789.10",
    "birthDate": "01/01/1970"
  }
}

However, we need a JSON that only contains the customer’s name and ssn:

{
  "customer": {
    "name": "Customer Default",
    "ssn": "123.456.789.10"
  }
}

Our transformation will be:

[
  {
    "operation": "remove",
    "spec": {
      "customer": {
        "birthDate": ""
      }
    }
  }
]

What we did above was navigate to the field birthDate and assign it an empty string.

The field to be removed must always be assigned an empty string; otherwise the spec is invalid and the transformation does not run.
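
The same spec can list more than one field. As a sketch using the same input as above, the following removes both ssn and birthDate. The nested object (customer) is used only for navigation, and each field to remove is assigned an empty string:

```json
[
  {
    "operation": "remove",
    "spec": {
      "customer": {
        "ssn": "",
        "birthDate": ""
      }
    }
  }
]
```

The expected output is {"customer": {"name": "Customer Default"}}.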


Sort

Used to sort fields and objects in a JSON in alphabetical order.

Important: The sort order cannot be configured, and the operation always applies to the entire JSON. Only field and object names (keys) are sorted, not their values.

Let’s see the example below.

We have an input JSON containing information about an employee:

{
  "employee": {
    "phone": "9 9999-9999",
    "name": "Employee Sort",
    "birthDate": "01/01/1980",
    "role": "JOLT Analyst"
  }
}

We need all fields contained in the input JSON to be ordered alphabetically:

{
  "employee": {
    "birthDate": "01/01/1980",
    "name": "Employee Sort",
    "phone": "9 9999-9999",
    "role": "JOLT Analyst"
  }
}

For the sort operation, we do not need to define the spec object.

Our transformation will be:

[
  {
    "operation": "sort"
  }
]

Cardinality

Used to transform simple fields and objects into lists of objects and vice versa.

Important: When we transform a list of objects into a simple field or object, only the first element of the list is considered.

Let’s see the example below.

We have an input JSON containing information about products:

{
  "products": {
    "name": "Product A",
    "id": "123-A",
    "value": 10
  }
}

We need to transform the products object into a list:

{
  "products": [
    {
      "name": "Product A",
      "id": "123-A",
      "value": 10
    }
  ]
}

Our transformation will be:

[
  {
    "operation": "cardinality",
    "spec": {
      "products": "MANY"
    }
  }
]

In case we have a list of products:

{
  "products": [
    {
      "name": "Product A",
      "id": "123-A",
      "value": 10
    },
    {
      "name": "Product B",
      "id": "456-B",
      "value": 20
    }
  ]
}

And we need to transform that list into a simple object:

{
  "products": {
    "name": "Product A",
    "id": "123-A",
    "value": 10
  }
}

Our transformation will be:

[
  {
    "operation": "cardinality",
    "spec": {
      "products": "ONE"
    }
  }
]

Advanced Concepts

Until now, we have seen essential concepts to understand JOLT and how to use it. Before moving to the remaining operations, we need to learn a few more elaborate concepts.

As a prerequisite for moving forward, we will see two concepts that are used in future explanations:

  • LHS (Left Hand Side): the left side of an entry in the spec.
  • RHS (Right Hand Side): the right side of an entry in the spec.

That is, everything before the colon (:) is the LHS, and everything after it is the RHS. In a shift, the LHS matches keys of the input JSON and the RHS is the path written in the output JSON.

Transformation example, where "name" is an LHS and "client.fullName" is its RHS:

[
  {
    "operation": "shift",
    "spec": {
      "customer": {
        "name": "client.fullName",
        "birthDate": "client.dateOfBirth",
        "address": "client.address.street",
        "country": "client.address.country"
      }
    }
  }
]

Now we can move forward.

The great power of JOLT lies in its ability to handle transformations dynamically. For this, we use wildcards: special characters that can be placed in different positions in a spec, each with its own function.

One wildcard can have different functions depending on its usage (LHS or RHS), and we can also combine different wildcards in the same transformation.

Below we will see their definitions and some usage examples.


&

It uses the content declared in the LHS to compose the structure of the output JSON, without needing to make this content explicit in the transformation. This wildcard is based on the navigation performed during the transformation.

  • Usage: RHS (the JOLT documentation also allows & on the LHS, which this guide does not cover)
  • Operation: shift

Example

We have a JSON that contains customer data:

{
  "name": "Client Example",
  "email": "client-example@email.com"
}

And we need this data inside an object called client:

{
  "client": {
    "name": "Client Example",
    "email": "client-example@email.com"
  }
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "name": "client.&",
      "email": "client.&"
    }
  }
]

With &, we take the values of the fields name and email and assign them to fields with the same names inside a client object. Thus, we create a new JSON while preserving the input field names.
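
& can also take a number that says how many levels to go up in the LHS navigation: & alone is the same as &0 (the current key), and &1 is the key one level above. A small sketch, assuming a hypothetical input {"client": {"name": "Client Example", "email": "client-example@email.com"}}:

```json
[
  {
    "operation": "shift",
    "spec": {
      "client": {
        "name": "&1.fullName",
        "email": "&1.&"
      }
    }
  }
]
```

Here &1 resolves to client, so the expected output is {"client": {"fullName": "Client Example", "email": "client-example@email.com"}}.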


* (asterisk)

References all fields and objects in a JSON without needing to explicitly declare their names in the transformation.

  • Usage: LHS
  • Operations: shift, remove, cardinality, modify-default-beta, modify-overwrite-beta

Example

We have an input JSON containing customer data:

{
  "name": "Customer Example",
  "email": "client-example@email.com",
  "document": "1234567890",
  "birthDate": "10/31/1990",
  "address": "Customer Example Street"
}

And we need this data inside an object named customer, but we need to change the field document to a field named ssn:

{
  "customer": {
    "name": "Customer Example",
    "email": "client-example@email.com",
    "ssn": "1234567890",
    "birthDate": "10/31/1990",
    "address": "Customer Example Street"
  }
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "*": "customer.&",
      "document": "customer.ssn"
    }
  }
]

In the line:

"*": "customer.&"

we take every field that exists in the input JSON and place it in an object named customer, keeping the original field name and value.

The field document is the exception: because it is declared explicitly, its entry takes precedence over *, so its value is assigned to a field named ssn inside the same object instead of being copied as document.

Using the wildcard * next to & means that for each field that * finds, & keeps its name and value. This combined usage is very useful because it allows us to manipulate a JSON without needing to know and declare all of its content beforehand.
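
* can also match part of a field name, and & can then reuse just the part that * matched. The sketch below assumes a hypothetical input {"tag-red": 1, "tag-blue": 2, "other": 3}. In &(0,1), the 0 refers to the current level and the 1 to the first * in that key:

```json
[
  {
    "operation": "shift",
    "spec": {
      "tag-*": "tags.&(0,1)"
    }
  }
]
```

The expected output is {"tags": {"red": 1, "blue": 2}}. The field other does not match tag-* and, as in any shift, is discarded.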


@

References the value of a field or object contained in the input JSON, but produces different effects depending on how it is used.

  • Usage: LHS and RHS
  • Operations: shift (LHS and RHS), modify-default-beta (RHS), modify-overwrite-beta (RHS)

Shift example

We have a JSON containing isolated product information:

{
  "key": "code",
  "value": "123-ABC"
}

And we need to group this into a product object, relating the key field to the value field:

{
  "product": {
    "code": "123-ABC"
  }
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "value": "product.@(1,key)"
    }
  }
]

In:

"@(1,key)"

we take the value of the field key and use it as the name of the field that receives the value of the field value.

Using @ this way involves declaring how many levels to move up before looking for the field. Level 0 is the matched entry itself (here, value), and level 1 is the object that contains it.

In this case, the field key is a sibling of value, that is, it lives in the object one level above the matched entry, so we use the number 1.

On the LHS, @ follows the same principle, but it is written as a key: the referenced value is what gets written to the path on the RHS.
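
As a sketch of @ on the LHS, assume the input {"product": {"name": "Product A"}, "manufacturer": "Company A"}. Written as a key inside product, @(1,manufacturer) counts from product itself (level 0), so level 1 is the root object where manufacturer lives:

```json
[
  {
    "operation": "shift",
    "spec": {
      "product": {
        "name": "product.name",
        "@(1,manufacturer)": "product.company"
      }
    }
  }
]
```

The expected output is {"product": {"name": "Product A", "company": "Company A"}}.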

modify-default-beta and modify-overwrite-beta example

We have a JSON containing a product and, outside it, its manufacturer:

{
  "product": {
    "name": "Product A",
    "price": 10
  },
  "manufacturer": "Company A"
}

And we need the product object to contain a company field with the value of the field manufacturer, which is outside product:

{
  "product": {
    "name": "Product A",
    "price": 10,
    "company": "Company A"
  },
  "manufacturer": "Company A"
}

The transformation will be:

[
  {
    "operation": "modify-default-beta",
    "spec": {
      "product": {
        "company": "@(2,manufacturer)"
      }
    }
  }
]

For modify-overwrite-beta, we would have the same transformation structure.

What we did was create a field named company and assign to it the value of the field manufacturer. For that, we moved up to level 2 in order to be able to see the field manufacturer and retrieve its value.

The difference between modify-default-beta and modify-overwrite-beta is that in modify-default-beta the inclusion of the field company only happens if there is no other field named company inside product.

For modify-overwrite-beta, the field company is included even if the field company already exists inside product. As the name suggests, if the content already exists in the input JSON, it is overwritten.
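
To make the difference concrete, assume a hypothetical input in which company already exists: {"product": {"name": "Product A", "company": "Company B"}, "manufacturer": "Company A"}. A sketch with modify-overwrite-beta:

```json
[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "product": {
        "company": "@(2,manufacturer)"
      }
    }
  }
]
```

With this spec, company is expected to become Company A. Replacing the operation with modify-default-beta leaves company as Company B, because the field already exists.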


$

References the name of a field or object contained in the input JSON so that this name can be used as the value of a field or object in the output JSON.

  • Usage: LHS
  • Operation: shift

Example

We have an input JSON containing product data:

{
  "product": {
    "name": "Product Example",
    "value": 10,
    "category": "CATEG-1",
    "weight": 25
  }
}

And we need a JSON that lists which product fields are being provided:

{
  "product": [
    "name",
    "value",
    "category",
    "weight"
  ]
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "product": {
        "*": {
          "$": "product[]"
        }
      }
    }
  }
]

What we did was select all (*) the fields of the object product, then take the name ($) of each field and assign them to a list named product.

That way, we obtain the field names rather than their values.
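
$ is often combined with @ to turn an object into a list of name/value pairs. The sketch below assumes the input {"product": {"name": "Product Example", "weight": 25}}; the output names fields, name, and value are illustrative choices. The [#2] notation is explained in the # section that follows:

```json
[
  {
    "operation": "shift",
    "spec": {
      "product": {
        "*": {
          "$": "fields[#2].name",
          "@": "fields[#2].value"
        }
      }
    }
  }
]
```

The expected output is {"fields": [{"name": "name", "value": "Product Example"}, {"name": "weight", "value": 25}]}.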


#

On the LHS, it writes a literal (hard-coded) value to the output JSON.

On the RHS, it applies only when building lists: it groups content from the input JSON into the correct item of the list being created.

  • Usage: LHS and RHS
  • Operation: shift

LHS example

We have an input JSON with product information:

{
  "product": {
    "name": "Product Example",
    "value": 10,
    "weight": 25
  }
}

And we need a JSON that contains name, value, weight, and category:

{
  "product": {
    "name": "Product Example",
    "value": 10,
    "category": "CATEG-1",
    "weight": 25
  }
}

However, the input JSON never provides category, so we need to add this field manually, with the value CATEG-1:

[
  {
    "operation": "shift",
    "spec": {
      "product": {
        "*": "product.&",
        "#CATEG-1": "product.category"
      }
    }
  }
]

The value contained after the wildcard # will always be assigned to the field declared in the RHS, which in our case is the field category inside the object product.

RHS example

We have an input JSON containing a list of products:

{
  "products": [
    {
      "code": "PROD-A",
      "value": 10
    },
    {
      "code": "PROD-B",
      "value": 20
    }
  ]
}

And we need only to change the name of the field value to price:

{
  "products": [
    {
      "code": "PROD-A",
      "price": 10
    },
    {
      "code": "PROD-B",
      "price": 20
    }
  ]
}

Our transformation will be:

[
  {
    "operation": "shift",
    "spec": {
      "products": {
        "*": {
          "code": "products[#2].&",
          "value": "products[#2].price"
        }
      }
    }
  }
]

The use of # in the RHS involves declaring the level at which we are seeking information. The declaration [#2] represents:

  • the creation of a list ([])
  • and that it must group (#) all the information found 2 levels above

We need this declaration in order to guarantee the correct grouping of each product with its respective code and price.

That is, in:

"code": "products[#2].&"

we are taking the value from the field code and placing it into another field code inside a list named products.

And in:

"value": "products[#2].price"

we take the value from the field value and place it into a field named price inside the same list.

When creating the list products, we look 2 levels above (the level of the list products) and the way each item is grouped in the input list is preserved in the new list.

This level-based grouping is one of the most important details to understand when working with JOLT lists.
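
To see why the grouping matters, here is a sketch of the same transformation written without [#2], applied to the same list of products:

```json
[
  {
    "operation": "shift",
    "spec": {
      "products": {
        "*": {
          "code": "products.code",
          "value": "products.price"
        }
      }
    }
  }
]
```

Without the list index, every code is written to the same path and every value to the same path, so JOLT collects them into arrays and the pairing between each code and its price is lost: {"products": {"code": ["PROD-A", "PROD-B"], "price": [10, 20]}}.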


| (pipe)

Allows referencing multiple fields or objects of an input JSON so that, regardless of the name of the field or object, its value is allocated to the same destination in the output JSON.

  • Usage: LHS
  • Operation: shift

Example

We have an input JSON containing customer data:

{
  "customer": {
    "fullName": "Customer Example",
    "email": "customer-example@email.com"
  }
}

And we need a JSON with the following structure:

{
  "customer": {
    "name": "Customer Example",
    "email": "customer-example@email.com"
  }
}

However, in the input JSON there is the possibility that the field fullName comes as customerName, so we need the transformation to be prepared to recognize both possibilities:

[
  {
    "operation": "shift",
    "spec": {
      "customer": {
        "fullName|customerName": "customer.name",
        "email": "customer.&"
      }
    }
  }
]

Operations modify-default-beta and modify-overwrite-beta

As mentioned in the explanation of the wildcard @, these operations allow us to dynamically reference values. While modify-default-beta assigns a value to a field only if it does not already exist, modify-overwrite-beta overwrites the value even if the field already exists.

These operations also allow us to apply functions to our JSON. The examples below use modify-overwrite-beta, since it can both create new fields and replace existing values.

The functions are:

  • String
    • toLower
    • toUpper
    • concat
    • join
    • split
    • substring
    • trim
    • leftPad
    • rightPad
  • Number
    • min
    • max
    • abs
    • avg
    • intSum
    • doubleSum
    • longSum
    • intSubtract
    • doubleSubtract
    • longSubtract
    • divide
    • divideAndRound
  • Type
    • toInteger
    • toDouble
    • toLong
    • toBoolean
    • toString
    • recursivelySquashNulls
    • squashNulls
  • List
    • firstElement
    • lastElement
    • elementAt
    • toList
    • sort
    • size

Input JSON

{
  "STRING": {
    "product": "Product A",
    "company": "company a",
    "value": "100",
    "measureWithSpaces": "  10 meters "
  },
  "NUMBER": {
    "array": [3, 5, 2, 7, 1],
    "negativeValue": -100,
    "positiveValue": 50
  },
  "TYPE": {
    "value": 10.5,
    "stringBoolean": "true",
    "objectWithNull": {
      "fieldWithValue": "ABC",
      "nullField": null
    }
  },
  "LIST": {
    "array": ["c", "t", "m", "a"],
    "stringField": "123"
  }
}

Transformation

[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "STRING": {
        "product": "=toLower(@(1,product))",
        "company": "=toUpper(@(1,company))",
        "product_company": "=concat(@(1,product),'_',@(1,company))",
        "joinProductCompany": "=join(' - ',@(1,product),@(1,company))",
        "splitProductCompany": "=split('[-]',@(1,joinProductCompany))",
        "substringProduct": "=substring(@(1,product),0,4)",
        "value": "=leftPad(@(1,value),6,'A')",
        "measure": "=trim(@(1,measureWithSpaces))"
      },
      "NUMBER": {
        "minArray": "=min(@(1,array))",
        "maxArray": "=max(@(1,array))",
        "absoluteValue": "=abs(@(1,negativeValue))",
        "averageArray": "=avg(@(1,array))",
        "sumArray": "=intSum(@(1,array))",
        "subtrArray": "=intSubtract(@(1,positiveValue),20)",
        "division": "=divide(@(1,positiveValue),2)",
        "divisionRound": "=divideAndRound(3,@(1,positiveValue),3)"
      },
      "TYPE": {
        "integerValue": "=toInteger(@(1,value))",
        "booleano": "=toBoolean(@(1,stringBoolean))",
        "stringValue": "=toString(@(1,value))",
        "stringBoolean": "=size",
        "objectWithNull": "=recursivelySquashNulls"
      },
      "LIST": {
        "arrayFirstItem": "=firstElement(@(1,array))",
        "arrayLastItem": "=lastElement(@(1,array))",
        "arrayElement": "=elementAt(@(1,array),2)",
        "fieldToList": "=toList(@(1,stringField))",
        "orderedArray": "=sort(@(1,array))"
      }
    }
  }
]

Output JSON

{
  "STRING": {
    "product": "product a",
    "company": "COMPANY A",
    "value": "AAA100",
    "measureWithSpaces": "  10 meters ",
    "product_company": "product a_COMPANY A",
    "joinProductCompany": "product a - COMPANY A",
    "splitProductCompany": [
      "product a ",
      " COMPANY A"
    ],
    "substringProduct": "prod",
    "measure": "10 meters"
  },
  "NUMBER": {
    "array": [3, 5, 2, 7, 1],
    "negativeValue": -100,
    "positiveValue": 50,
    "minArray": 1,
    "maxArray": 7,
    "absoluteValue": 100,
    "averageArray": 3.6,
    "sumArray": 18,
    "subtrArray": 30,
    "division": 25,
    "divisionRound": 16.667
  },
  "TYPE": {
    "value": 10.5,
    "stringBoolean": 4,
    "objectWithNull": {
      "fieldWithValue": "ABC"
    },
    "integerValue": 10,
    "booleano": true,
    "stringValue": "10.5"
  },
  "LIST": {
    "array": ["c", "t", "m", "a"],
    "stringField": "123",
    "arrayFirstItem": "c",
    "arrayLastItem": "a",
    "arrayElement": "m",
    "fieldToList": ["123"],
    "orderedArray": ["a", "c", "m", "t"]
  }
}

Note: Some functions were not included as isolated examples because they follow the same application pattern as others already shown. For example, doubleSum and longSum are applied the same way as intSum.

Regarding recursivelySquashNulls and squashNulls, both are applicable only to objects and lists and are used to remove fields with null values. However:

  • recursivelySquashNulls looks at all levels below the object or list
  • squashNulls looks only 1 level below
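
A sketch of the difference, assuming a hypothetical input with nulls at two levels: {"order": {"id": 1, "note": null, "customer": {"name": "A", "phone": null}}}.

```json
[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "order": "=squashNulls"
    }
  }
]
```

With squashNulls, only note is expected to be removed, since phone sits one level deeper. Using =recursivelySquashNulls instead would remove both note and phone.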

Cascade behavior

modify-overwrite-beta executes in cascade. That is, the entries in the spec are evaluated in order, and each one sees the results of the entries before it.

To understand this behavior, let’s take a snippet from the previous example:

"STRING": {
  "product": "=toLower(@(1,product))",
  "company": "=toUpper(@(1,company))",
  "product_company": "=concat(@(1,product),'_',@(1,company))"
}

In:

"product": "=toLower(@(1,product))"

we change the value of product to lowercase.

In:

"company": "=toUpper(@(1,company))"

we change the value of company to uppercase.

So, when we execute:

"product_company": "=concat(@(1,product),'_',@(1,company))"

we are already using the transformed values of product and company, not their original values from the input JSON.

This means the result is influenced by everything that came before it in the same operation, which is why understanding execution order is important.
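
As a sketch of why order matters, here is the same snippet wrapped in a full spec with product_company declared first, applied to the STRING object of the input above. It assumes, as the cascade behavior implies, that entries are evaluated in the order they are written:

```json
[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "STRING": {
        "product_company": "=concat(@(1,product),'_',@(1,company))",
        "product": "=toLower(@(1,product))",
        "company": "=toUpper(@(1,company))"
      }
    }
  }
]
```

Because product_company is evaluated before the other two entries, it is expected to be built from the original values, giving "Product A_company a", while product and company are still changed to "product a" and "COMPANY A" afterwards.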


Final Considerations

JOLT is simple in concept, but not always simple in practice.

The more complex the transformation, the more important it becomes to understand:

  • JSON navigation
  • LHS and RHS behavior
  • wildcard semantics
  • level counting
  • list grouping
  • execution order

Once these concepts are clear, JOLT becomes a very powerful way to transform JSON without writing imperative code.

For a more technical reading about JOLT:

JOLT GitHub documentation
