DEV Community: Brilian Firdaus

Understanding Database Locks: Managing Concurrency in Databases

Brilian Firdaus — Wed, 23 Apr 2025 15:25:21 +0000

Consistency in a database is crucial when building a system that serves many users. Imagine maintaining a concert booking system, and a very famous artist has just started selling assigned seats to the concert. Many people start purchasing seats as soon as the sale opens. What you don't want to happen in this case is for seats to be oversold, resulting in more than one person being assigned to a single seat. This could cause a disaster at the event as people might fight for their seats, ultimately damaging your reputation as the person managing the booking system.

In this article, we will explore how concurrency works in databases and how to prevent the scenario mentioned earlier from happening. We'll cover:

Why Databases Use Locks: We'll delve more thoroughly into the previous example to understand what's happening and why database locks can prevent such issues.
Shared Lock and Exclusive Lock: There are two types of locks based on how they lock processes. We'll learn about them.
Read Phenomenon and Isolation Levels: The more consistency you want in your database, the stricter the database lock process will be. Inconsistent reading in a database is known as a read phenomenon, and it can be managed by adjusting the isolation level of a database.
Types of Locks: We'll check out different types of database locks, how they behave, and when to use them.

ℹ️ This article focuses on SQL databases. Not all concepts are applicable to their NoSQL counterparts.

Why Databases Use Locks

For starters, let's explore our concert booking system and break down how race conditions can occur. Suppose we have a simplified schema: a seats table in which each row has an id and a remainingQty column.

Let's write a simplified query for booking a seat:

BEGIN;
SELECT * FROM seats WHERE id = '1';
UPDATE seats SET remainingQty = remainingQty - 1 WHERE id = '1';
COMMIT;

💡BEGIN; and COMMIT; are SQL commands that mark where a transaction starts and ends. The SQL database will ensure that the operations between them are executed atomically.

Now let's see how the transaction would work if there are concurrent processes operating on the same row:

As you can see, there are two different processes working on the same row. As a result, both processes will indicate that their respective booking is successful. The seat that only had one available spot now has multiple bookings.
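The interleaving described above can be replayed deterministically. The following is a minimal sketch in Go, not the database's actual behavior: the seat struct and the simulateUnsafeBooking function are hypothetical stand-ins for the seats row and the two transactions, and the steps are executed in a fixed order to mimic both transactions reading before either one writes.

```go
package main

import "fmt"

// seat is a hypothetical in-memory stand-in for one row of the seats table.
type seat struct {
	remainingQty int
}

// simulateUnsafeBooking replays the problematic interleaving: both
// transactions read the row before either of them writes to it.
// It returns the number of bookings reported as successful and the
// final remainingQty.
func simulateUnsafeBooking() (successes int, finalQty int) {
	row := seat{remainingQty: 1}

	// Step 1: transaction A and transaction B both run the SELECT.
	seenByA := row.remainingQty
	seenByB := row.remainingQty

	// Step 2: each transaction trusts its own (now stale) read and
	// runs the UPDATE.
	if seenByA >= 1 {
		row.remainingQty--
		successes++
	}
	if seenByB >= 1 {
		row.remainingQty--
		successes++
	}
	return successes, row.remainingQty
}

func main() {
	successes, finalQty := simulateUnsafeBooking()
	fmt.Printf("successful bookings: %d, remainingQty: %d\n", successes, finalQty)
}
```

With a single seat available, both bookings are reported as successful and remainingQty ends up below zero, which is exactly the overselling we want to avoid.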

This issue is called a race condition, and it's a common problem in multi-threaded applications. Thankfully, databases have a built-in locking mechanism to prevent this from happening.

Moving forward, we'll explore how the database locks work. At the end of the article, we'll figure out how to solve the concert booking example (note: it's actually quite simple).

Shared and Exclusive Lock

There are two types of database locks related to their functionality: Shared Lock and Exclusive Lock.

When a row is locked with a Shared Lock, other transactions are prevented from writing to that row. This lock is typically used when a transaction is reading a specific row and needs to ensure that the data remains consistent throughout the transaction. Multiple transactions can hold a shared lock on the same row simultaneously, but no transaction can modify the row while it's locked in shared mode.

On the other hand, an Exclusive Lock locks the row for both reading and writing processes. Typically, when a transaction is writing to a row, it will place an exclusive lock on that row. This lock ensures that other transactions cannot read or write to the row while it is being modified. The exclusive lock ensures complete isolation of the transaction, maintaining data integrity by preventing concurrent access.
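As an analogy only (this is not how a database implements row locks), Go's sync.RWMutex follows the same shared/exclusive rules: RLock behaves like a shared lock and Lock like an exclusive one. The sketch below assumes Go 1.18 or later for TryLock and TryRLock, and the lockDemo function and rowLock variable are hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// lockDemo reports whether a second party can acquire a lock while
// another lock is already held on the same (hypothetical) row.
func lockDemo() (secondReaderOK, writerWhileSharedOK, readerWhileExclusiveOK bool) {
	var rowLock sync.RWMutex

	// Shared mode: a first reader holds the lock.
	rowLock.RLock()
	secondReaderOK = rowLock.TryRLock()     // another reader may join
	writerWhileSharedOK = rowLock.TryLock() // a writer may not
	if secondReaderOK {
		rowLock.RUnlock()
	}
	rowLock.RUnlock()

	// Exclusive mode: a writer holds the lock.
	rowLock.Lock()
	readerWhileExclusiveOK = rowLock.TryRLock() // readers are shut out as well
	rowLock.Unlock()

	return
}

func main() {
	r, w, x := lockDemo()
	fmt.Println("second reader under shared lock:", r)
	fmt.Println("writer under shared lock:", w)
	fmt.Println("reader under exclusive lock:", x)
}
```

A real database blocks (or times out) instead of returning false immediately, but the compatibility rules are the same: shared locks coexist with each other, and an exclusive lock coexists with nothing.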

Read Phenomenon and Isolation Level

When reading data from a database, you can encounter inconsistent data due to concurrent processes working on the same data. This is known as a Read Phenomenon. Various read phenomena range from simple to more complex. To tackle this issue, databases allow you to set the Isolation Level for each transaction. The stricter the isolation level, the fewer read phenomena that can occur, but this may also slow down your database. This is because stricter isolation levels involve more complex locking mechanisms.

It is important to note that isolation levels can vary between SQL databases. To make the topic easier to understand, we will explore the simplest isolation level mechanisms, which use only Pessimistic Locking (we'll also explore the others in the next section).

Read Uncommitted

Read Uncommitted is the most lenient isolation level. At this level, there is no locking mechanism at all. This means that, as the name suggests, a transaction can read data from another transaction that has not yet been committed.

Although this is the fastest isolation level, it can cause issues when a transaction reads data from an uncommitted transaction that is later rolled back. This phenomenon is known as Dirty Reads.

The previously mentioned race condition example can also occur at this level of isolation. Because of the high inconsistency that can happen with this isolation level, some databases choose not to support it at all. PostgreSQL, for example, accepts the setting but treats it as Read Committed.

Read Committed

The next isolation level is Read Committed. As the name suggests, at this level, the database will only allow transactions to read data that has been committed.

Starting at this isolation level, your transaction is prevented from reading data from another unfinished transaction. To achieve this, the database places an exclusive lock when a transaction is writing to a row, ensuring that other transactions cannot read or write to it until the transaction is completed.

Now, let's consider what could happen when you read from the same row twice within the same transaction:

This phenomenon is called a Non-Repeatable Read. It occurs when a transaction reads the same row twice and gets different results each time, because another transaction has committed changes to the row in between the reads.

Repeatable Read

To combat the Non-Repeatable Read phenomenon, we can use the Repeatable Read isolation level. In this isolation level, your transaction will also place a shared lock on the rows you're working on. This means that if another transaction attempts to update the data before your transaction is completed, it will be blocked. This isolation level ensures that your transaction always reads the same value from a row throughout its execution.

However, this isolation level cannot handle a read phenomenon called Phantom Read. Phantom reads occur when you read a set of rows, and then another transaction inserts new rows that satisfy the criteria of your previous query. As a result, if you rerun the same query, the new rows (phantoms) will appear in the result set.

Serializable

The last isolation level we'll explore is Serializable, which is the strictest isolation level. In the Serializable isolation level, the database ensures that transactions run as if they were executed sequentially, one after the other. This means that effectively, only one transaction can modify data at a time, eliminating any chance of concurrency-related issues.

By ensuring transactions run in this highly isolated manner, the Serializable isolation level guarantees consistency. Transactions cannot interfere with each other, preventing all types of read phenomena.

However, this isolation level has a significant drawback: the elimination of concurrency. As a result, the database operations become much slower. The performance trade-off is substantial, as the database throughput is dramatically reduced when ensuring that transactions operate in a serial order.

Types of Database Locks

We've explored isolation levels and read phenomena, but we have only scratched the surface of what database locks are. The examples we used before rely on a mechanism called Pessimistic Locking. However, there are two other approaches to handling database concurrency: Optimistic Locking and MVCC (Multi-Version Concurrency Control). Databases and application frameworks usually use a combination of these locks to handle concurrent processes effectively.

Pessimistic Locking

Pessimistic locking is the locking mechanism we explored previously. It behaves such that when a transaction is currently modifying a specific row, another transaction must wait for the change to finish before it can modify the same row.

One important aspect of pessimistic locking is the potential for deadlocks. Deadlocks occur when two transactions are each waiting for the other to release a lock on different rows, resulting in both transactions being blocked indefinitely. To handle this, databases typically detect the deadlock or rely on a lock timeout. In either case, one of the transactions is rolled back and an error is generated, which lets the other transaction proceed.

Optimistic Locking

Optimistic locking is a type of locking that will fail a transaction if there's another transaction that has already modified the data.

You can implement optimistic locking by adding a version field to your data. When you want to update a row, you'll first retrieve the row to get the version number. Then, you'll include the version number in your update filter and increment it as part of the update. If no row is updated, it means another transaction has already modified that row, and you can conclude that your transaction has failed.

This also works for consistently reading a row twice (Repeatable Read). First, you need to snapshot the version number from the initial query. For subsequent queries, you'll include the version number or timestamp. If the query doesn't return anything, it means another transaction has already updated the row, and you should roll back the transaction.

Unlike pessimistic locking, optimistic locking can't cause deadlocks, so it is considered safer. However, you'll need a retry mechanism for writing to your database if you expect concurrent updates on the same row.
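The version check can be sketched in a few lines of Go. This is an illustrative in-memory model, not a database client: the store type, the seatRow struct, and the updateIfVersion method are hypothetical names, and the mutex only protects the in-memory row so that the compare-and-update step is atomic, as a single UPDATE statement would be.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errVersionConflict = errors.New("row was modified by another transaction")

type seatRow struct {
	remainingQty int
	version      int
}

// store is a hypothetical in-memory table holding a single row.
type store struct {
	mu  sync.Mutex
	row seatRow
}

func (s *store) read() seatRow {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.row
}

// updateIfVersion mimics an UPDATE whose filter includes the version that
// was read earlier. It fails when the stored version no longer matches.
func (s *store) updateIfVersion(newQty, expectedVersion int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.row.version != expectedVersion {
		return errVersionConflict
	}
	s.row = seatRow{remainingQty: newQty, version: expectedVersion + 1}
	return nil
}

// demo lets two transactions read the same version and then update in turn.
func demo() (firstErr, secondErr error, final seatRow) {
	s := &store{row: seatRow{remainingQty: 5, version: 1}}
	a := s.read()
	b := s.read()
	firstErr = s.updateIfVersion(a.remainingQty-1, a.version)
	secondErr = s.updateIfVersion(b.remainingQty-1, b.version)
	return firstErr, secondErr, s.read()
}

func main() {
	first, second, final := demo()
	fmt.Println("first update error:", first)
	fmt.Println("second update error:", second)
	fmt.Printf("final row: %+v\n", final)
}
```

The second update fails because it still carries the old version number. In a real application, that is the point where the retry mechanism would re-read the row and try again.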

Multi-Version Concurrency Control

In Multi-Version Concurrency Control, or MVCC for short, the database stores multiple versions of a row. When you read from the row, your transaction takes a snapshot of the row's version. This process ensures that you will always get the same result, even if other transactions successfully update the row, because they will create a new version.

Unlike optimistic locking, your query won't actually return an error when you try to read an outdated version. The database can still return data because it stores multiple versions of the row.

MVCC handles write concurrency similarly to optimistic locking: it throws an error and rolls back the transaction that tries to write to a row with a newer version or timestamp.
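To make the snapshot idea concrete, here is a deliberately simplified Go sketch. It assumes logical integer timestamps and keeps every committed version in a slice; the versionedRow type and its commit and readAt methods are hypothetical, and real databases use far more elaborate visibility rules and clean up old versions.

```go
package main

import "fmt"

// rowVersion is one committed version of a row.
type rowVersion struct {
	committedAt int // logical commit timestamp
	value       int
}

// versionedRow keeps every committed version, oldest first.
type versionedRow struct {
	versions []rowVersion
}

// commit appends a new version instead of overwriting the old one.
func (r *versionedRow) commit(ts, value int) {
	r.versions = append(r.versions, rowVersion{committedAt: ts, value: value})
}

// readAt returns the newest version that is visible to a snapshot
// taken at snapshotTS.
func (r *versionedRow) readAt(snapshotTS int) (int, bool) {
	for i := len(r.versions) - 1; i >= 0; i-- {
		if r.versions[i].committedAt <= snapshotTS {
			return r.versions[i].value, true
		}
	}
	return 0, false
}

// demo reads a row twice from the same snapshot while another
// transaction commits a newer version in between.
func demo() (before, after, latest int) {
	row := &versionedRow{}
	row.commit(1, 10)

	snapshot := 1 // the reading transaction's snapshot
	before, _ = row.readAt(snapshot)

	row.commit(2, 9) // a concurrent transaction commits a new version

	after, _ = row.readAt(snapshot) // same snapshot, same result
	latest, _ = row.readAt(2)       // a newer snapshot sees the update
	return before, after, latest
}

func main() {
	before, after, latest := demo()
	fmt.Println(before, after, latest)
}
```

The reader sees the same value on both reads, while a transaction that starts later sees the updated value.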

Solving the Concert Booking Issue

Let's revisit our first example: how do we solve the issue of overbooking seats in a concert? The solution is quite simple. The easiest way to keep the query performant is to use the Read Committed isolation level with some additions to the query.

The Read Committed isolation level ensures that two transactions cannot write to the same row at the same time: the second writer waits until the first one finishes. Next, we'll add a filter to our query, remainingQty >= 1, so the update only applies while at least one seat is still available.

UPDATE seats SET remainingQty = remainingQty - 1 WHERE id = '1' AND remainingQty >= 1;

To validate whether the transaction is successful, we can count how many rows are affected by the query. If it’s 1, then the query is successful. If it’s 0, then it’s not, and we should throw an error.
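The effect of the conditional update can be simulated in Go. This sketch runs against an in-memory stand-in rather than a real database: the seatsTable type and the bookSeat and runConcurrentBookings functions are hypothetical, and the mutex plays the role of the exclusive lock the database takes on the row while it is being written.

```go
package main

import (
	"fmt"
	"sync"
)

// seatsTable is a hypothetical in-memory stand-in for the seats table.
type seatsTable struct {
	mu           sync.Mutex // plays the role of the row's exclusive lock
	remainingQty int
}

// bookSeat mimics the conditional UPDATE and returns the number of
// affected rows (0 or 1).
func (t *seatsTable) bookSeat() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.remainingQty >= 1 {
		t.remainingQty--
		return 1
	}
	return 0
}

// runConcurrentBookings starts one goroutine per buyer and returns how
// many bookings succeeded and the final remainingQty.
func runConcurrentBookings(seats, buyers int) (booked int, finalQty int) {
	table := &seatsTable{remainingQty: seats}
	results := make(chan int, buyers)

	var wg sync.WaitGroup
	for i := 0; i < buyers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- table.bookSeat()
		}()
	}
	wg.Wait()
	close(results)

	for affected := range results {
		booked += affected
	}
	return booked, table.remainingQty
}

func main() {
	booked, finalQty := runConcurrentBookings(1, 10)
	fmt.Printf("successful bookings: %d, remainingQty: %d\n", booked, finalQty)
}
```

No matter how the goroutines are scheduled, only one of the ten buyers gets the single seat. When talking to a real database through Go's database/sql package, the affected-row count is available from the RowsAffected method of the sql.Result returned by Exec.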

Takeaways

In this article, we've explored the types of database locks and how they work. One important reminder is that we have only covered the concepts. Every database system has its own interpretation and implementation of these concepts, and understanding them is crucial if you want to master that specific database. However, the concepts discussed in this article should provide a solid foundation and make it easier for you to understand these implementations.

A Developer’s Guide to Circuit Breakers: Protect Your Service from Failure

Brilian Firdaus — Mon, 14 Apr 2025 06:00:06 +0000

When you're building complex microservices that handle a lot of traffic, failure becomes a normal part of the system—especially in environments where teams push changes weekly, or even multiple times a week.

Resiliency plays a critical role in software design. A resilient system can isolate failures and recover gracefully. Take a travel commerce example: if one airline's server goes down, that failure shouldn't impact other unrelated flows. Ideally, the system should recover on its own.

A Circuit Breaker is a commonly used mechanism to improve a system’s resiliency. Inspired by electrical circuit breakers, it behaves similarly—in electricity, when a breaker detects a voltage spike, it trips and stops the current to protect appliances. In software, when a downstream service behaves abnormally, a circuit breaker can trip and stop requests from going through.

This behavior provides two main benefits: containing the incident to only the affected part of the system and supporting self-healing. In this article, we’ll cover:

Understanding the Circuit Breaker: A deeper look into what it is and how it prevents cascading failures.
Why Use a Circuit Breaker: When and why to implement one in your architecture.
How Circuit Breakers Work: The internal states and configurations.
Implementing a Circuit Breaker: A hands-on example using Hystrix.
Best Practices: Practical tips and gotchas to watch out for.

Understanding The Circuit Breaker

As mentioned, a circuit breaker works like an electrical circuit breaker—it stops traffic to a failing downstream dependency. This is especially helpful when your service has multiple downstream dependencies and flows that are unrelated. If one dependency fails, you don’t want the others to be affected. Circuit breakers help by cutting off the connection entirely, saving resources.

Think of a building’s electrical system: if there’s a voltage spike, the breaker trips to prevent damage—or even a fire. In software, the equivalent would be your service crashing because it’s overloaded with slow or failing requests from a downstream service. Without a circuit breaker, a single failure could spiral into a full outage.

When a circuit breaker trips:

It goes into the Open state, rejecting all requests to the downstream service. The error is returned immediately.
After a timeout, it moves into the Half-Open state, allowing a limited number of requests to test if the downstream service has recovered.
If those requests succeed, it transitions back to Closed, and all traffic is allowed again. If they fail, it trips back to Open.

Why Use a Circuit Breaker?

Now that we know what a circuit breaker is, why use it? What happens if we don’t?

Anytime your service calls another service, it uses resources—CPU, memory, threads, connections. If the downstream service is down or very slow, your service will waste resources waiting. In high-traffic environments, this can lead to resource exhaustion and eventually cause your own service to go down.

Imagine a travel commerce site where one airline's API is down. Users should still be able to see and purchase flights from other airlines. But if your system keeps making slow calls to the failing airline API, it could eat up all your resources and take down the whole service. This leads to cascading failures across systems.

Even if your service doesn’t crash, using a circuit breaker can help with:

Not overloading the downstream service during incidents.
Failing fast, which improves user experience and prevents unnecessary load on your own service.

Let’s say your service has a circuit breaker and the client calling your service doesn’t. By returning a fast error instead of making a slow failing call, you help prevent cascading failures further upstream. Bonus: your users won’t have to wait several seconds for a timeout—they’ll get an immediate error, which is often a better experience.

How Circuit Breakers Work

Now that we’ve seen the value of a circuit breaker, let’s look at how it works in more detail.

Circuit breakers typically have three states:

Closed – Everything is working. All requests pass through to the downstream dependency.
Open – The breaker has tripped. No requests are allowed through.
Half-Open – A trial state after some timeout. Some requests are allowed to test whether the downstream is back.

You can refer to this diagram to better understand how the state changes:

Circuit Breaker State Change Diagram

Key Configuration Options

Failure Rate Threshold: Determines when the breaker should trip. You might configure it to trip if more than 50% of the last 100 requests fail.
Reset Timeout: How long the breaker should stay open before trying to move to half-open. Set it too low and you risk overloading the service again. Set it too high and recovery might be delayed unnecessarily.
Half-Open Test Requests: The number of requests to allow during half-open state before deciding whether to close or re-open the breaker.

These configuration options may differ slightly depending on the library you use, but they’re commonly supported. Many circuit breaker libraries also let you configure timeouts and concurrency limits. Make sure to check your library’s documentation for full capabilities.
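To see how the three states and the configuration options fit together, here is a minimal state machine in Go. It is a sketch under simplifying assumptions, not how Hystrix or any particular library works: it trips after a number of consecutive failures instead of tracking a failure rate, it allows a single trial request in half-open, it is not safe for concurrent use, and the breaker type, its fields, and the injectable now clock are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type breakerState string

const (
	closed   breakerState = "closed"
	open     breakerState = "open"
	halfOpen breakerState = "half-open"
)

var errOpen = errors.New("circuit open")

// breaker trips after maxFailures consecutive failures and allows a
// trial request once resetTimeout has elapsed.
type breaker struct {
	state        breakerState
	failures     int
	maxFailures  int
	resetTimeout time.Duration
	openedAt     time.Time
	now          func() time.Time // injectable clock, useful for tests
}

func (b *breaker) call(fn func() error) error {
	if b.state == open {
		if b.now().Sub(b.openedAt) < b.resetTimeout {
			return errOpen // fail fast without calling downstream
		}
		b.state = halfOpen // timeout elapsed: let a trial request through
	}
	if err := fn(); err != nil {
		b.failures++
		if b.state == halfOpen || b.failures >= b.maxFailures {
			b.state = open
			b.openedAt = b.now()
		}
		return err
	}
	b.state = closed
	b.failures = 0
	return nil
}

// runDemo trips the breaker, shows a fast rejection, then lets it recover.
func runDemo() (afterFailures breakerState, rejected bool, afterRecovery breakerState) {
	clock := time.Unix(0, 0)
	b := &breaker{
		state:        closed,
		maxFailures:  2,
		resetTimeout: 5 * time.Second,
		now:          func() time.Time { return clock },
	}
	failing := func() error { return errors.New("downstream error") }
	healthy := func() error { return nil }

	b.call(failing)
	b.call(failing) // second consecutive failure trips the breaker
	afterFailures = b.state

	rejected = b.call(healthy) == errOpen // still inside the reset timeout

	clock = clock.Add(6 * time.Second) // let the reset timeout elapse
	b.call(healthy)                    // the trial request succeeds
	afterRecovery = b.state
	return
}

func main() {
	afterFailures, rejected, afterRecovery := runDemo()
	fmt.Println(afterFailures, rejected, afterRecovery)
}
```

Injecting the clock keeps the example deterministic: the demo can move time forward past the reset timeout without sleeping.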

Circuit Breaker in Action (with Hystrix in Go)

Let's walk through an example using the Hystrix library in Go. Hystrix was originally developed by Netflix for Java, and ports such as hystrix-go make it available in other languages. A Hystrix dashboard can also be used to visualize the circuit breaker's state.

This is a simple diagram showing what we will build in this section:

Circuit Breaker Example

Setup

We’ll create two APIs:

A mock API that always returns an error (to simulate a faulty dependency).
An invoke API that calls the mock API with a Hystrix circuit breaker.

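hystrix.ConfigureCommand("mock_api_call", hystrix.CommandConfig{
    Timeout:                1000, // how long to wait for the command to complete, in milliseconds
    MaxConcurrentRequests:  100,  // how many commands of this kind may run at the same time
    RequestVolumeThreshold: 5,    // minimum number of requests before the error rate is evaluated
    SleepWindow:            5000, // how long to stay open before testing for recovery, in milliseconds
    ErrorPercentThreshold:  50,   // error percentage that trips the breaker
})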

The "mock_api_call" identifier allows multiple breakers with different configs in a service.

Then we start the Hystrix stream handler, which will allow us to connect a dashboard to visualize circuit breaker behavior:

// Start Hystrix stream handler  
hystrixStreamHandler := hystrixgo.NewStreamHandler()  
hystrixStreamHandler.Start()  
go func() {  
    log.Println("Starting Hystrix stream handler on port 8081...")  
    if err := stdhttp.ListenAndServe(":8081", hystrixStreamHandler); err != nil {  
       log.Printf("Error starting Hystrix stream handler: %v", err)  
    }  
}()

Mock API

// HandleMockAPI handles requests to the mock API endpoint
func (h *Handler) HandleMockAPI(w http.ResponseWriter, r *http.Request) {
    log.Println("Mock API called")  
    http.Error(w, "Internal Server Error", http.StatusInternalServerError)  
}

This API always returns a 500 error to simulate failure.

Invoke API

// HandleInvoke handles requests to the invoke endpoint
func (h *Handler) HandleInvoke(w http.ResponseWriter, r *http.Request) {
    output := make(chan string, 1)
    log.Println("Invoke API called")
    errors := hystrix.Go("mock_api_call", func() error {  
       log.Println("Calling mock API")  
       result, err := h.apiClient.CallMockEndpoint()  
       if err != nil {  
          return err  
       }  
       output <- result  
       return nil  
    }, func(err error) error {  
       // Fallback function  
       output <- fmt.Sprintf("Fallback: %v", err)  
       return nil  
    })  

    select {  
    case result := <-output:  
       fmt.Fprintln(w, result)  
    case err := <-errors:  
       log.Printf("Error: %v", err)  
       http.Error(w, "Error processing request", http.StatusInternalServerError)  
    }  
}

Here is what this API does:

It handles an HTTP request when someone calls the endpoint.
It tries to call the mock API using h.apiClient.CallMockEndpoint().
It wraps the API call in Hystrix, using the mock_api_call command we set up earlier.
If the API call works, it sends the result to the user.
If the API call fails, it runs a fallback function and sends a fallback message instead. The message contains the error that occurred, so we can tell whether it came from the mock API or from the tripped circuit breaker.

When you hit the invoke API a few times, you’ll start to see this response:

Fallback: hystrix: circuit open

This means the circuit breaker has tripped and is blocking further calls to the mock API.

In the log, you will see:

2025/04/03 14:25:18 Invoke API called
2025/04/03 14:25:18 Calling mock API
2025/04/03 14:25:18 Mock API called
2025/04/03 14:25:18 Invoke API called
2025/04/03 14:25:18 Calling mock API
2025/04/03 14:25:18 Mock API called
2025/04/03 14:25:19 Invoke API called
2025/04/03 14:25:19 Calling mock API
2025/04/03 14:25:19 Mock API called
2025/04/03 14:25:19 Invoke API called
2025/04/03 14:25:19 Calling mock API
2025/04/03 14:25:19 Mock API called
2025/04/03 14:25:19 Invoke API called
2025/04/03 14:25:19 Calling mock API
2025/04/03 14:25:19 Mock API called
2025/04/03 14:25:20 Invoke API called

Notice that when the circuit is open, the actual call to the mock API is skipped entirely, proving that the breaker is doing its job.

Monitoring with Hystrix Dashboard

You can visualize the circuit breaker's behavior by running the Hystrix dashboard in Docker:

docker run --rm -p 7979:7979 --name hystrix-dashboard steeltoeoss/hystrix-dashboard

Then point the dashboard to:

http://host.docker.internal:8081 (use this if your app is running outside Docker).

Hystrix Dashboard

The red number indicates the count of errors returned by the Invoke API (or more specifically, by the code wrapped in the Hystrix function). The purple number indicates how many requests were short-circuited. In this case, I hit the Invoke API 8 times—5 of which reached the mock API and returned an internal server error (ISE), while 3 were short-circuited and returned an error without hitting the mock API.

You can check the full code in: https://github.com/brilianfird/hystrix-circuit-breaker-demo

Best Practices When Using Circuit Breakers

You now understand how a circuit breaker works and how to configure it. But how should you tune it in a real-world system?

Check your monitoring to determine the right circuit breaker configuration

The best way to set the circuit breaker values is by understanding how your service interacts with its downstream dependencies. Look at metrics like the typical error rate, response time, and the volume of requests sent to the downstream service. You can get these insights from your existing monitoring tools. If you don’t have monitoring in place, you can still implement a circuit breaker and use its built-in dashboard to gather this data.

Allow Breathing Room in the Circuit Breaker Configuration—Especially at First

If you're implementing Hystrix (or any circuit breaker) for the first time, it's a good idea to start with a more lenient configuration. Since you might not yet have monitoring in place, you won't know how often the downstream service fails or how it behaves under load. A strict configuration could unintentionally disrupt existing processes, so give your system some breathing room while you observe and learn.

Don't hardcode the configuration

It's best not to hardcode your circuit breaker settings. Instead, make them configurable via properties, a database, or an external configuration service. This way, if the settings need to be adjusted, you won’t have to redeploy your application—making it much easier to recover quickly and minimize business impact.

Account for App Instances in Your Configuration

When using per-instance circuit breaker settings, it’s important to think about how your application behaves at scale. Circuit breakers often use percentage-based thresholds, which makes them more adaptable than fixed-count mechanisms like rate limiters. Still, scaling can introduce unexpected issues.

For example, if you set a high request volume threshold, it might work fine in a single-instance setup. But once the app scales horizontally, each instance may receive fewer requests—potentially not enough to reach the threshold and trip the circuit breaker, even if the overall system is under stress. Always factor in your scaling strategy when defining these thresholds to avoid silent failures or delayed responses.

Takeaway

In this article, we explored how the Circuit Breaker pattern helps improve system resiliency.

However, it's just one piece of the puzzle. Other techniques, such as retry mechanisms, the bulkhead pattern, and timeouts, also play a key role in building robust systems.


Why You Should Use Caching - Improve User Experience and Reduce Costs

Brilian Firdaus — Fri, 10 May 2024 02:44:46 +0000

Today, we're diving into the world of caching. Caching is a secret weapon for building scalable, high-performance systems. There are many types of caching, but in this article, we'll focus on backend object caching (backend caching). Mastering it will help you build high-performance, reliable software.

In this article, we'll be exploring:

What is Caching? We'll explore caching and explain how it temporarily stores data for faster access.
Benefits of Caching: Discover how caching boosts speed, reduces server load, improves user experience, and can even cut costs.
Caching Pattern: In this section, we'll dive into different ways to use the cache. Remember, there are pros and cons to each approach, so make sure to pick the right pattern for your needs!
Caching Best Practice: Now you know how to store and retrieve cached data. But how do you ensure your cached data stays up-to-date? And what happens when the cache reaches its capacity?
When Not To Cache: While caching offers many benefits, there are times when it's best avoided. Implementing caching in the wrong system can increase complexity and potentially even slow down performance.

What is Caching

Creating a high-performance and scalable application is all about removing bottlenecks and making the system more efficient. Databases often bottleneck system performance due to their storage and processing requirements. This makes them a costly component because they need to be scaled up often.

Thankfully, there's a component that can help offload database resource usage while improving data retrieval speed – that component is called cache.

A cache is temporary storage designed for fast reads and writes. It uses low-latency memory storage and optimized data structures for quick operations. Chances are you've already used Redis or Memcached, or at least heard their names. These are two of the most popular distributed caching systems for backend services. Redis can even act as a primary database, but that's a topic for another article!

Benefits of Caching

Latencies every developer should know

The main benefit of caching is its speed. Reading data from a cache is significantly faster than retrieving it from a database (like SQL or Mongo). This speed comes from caches using dictionary (or HashMap) data structures for rapid operations and storing data in high-speed memory instead of on disk.

Secondly, caching reduces the load on your database. This allows applications to get the data they need from the cache instead of constantly hitting the database. This dramatically decreases hardware resource usage; instead of searching for data on disk, your system simply accesses it from fast memory.

These benefits directly improve user experience and can lead to cost savings. Your application responds much faster, creating a smoother and more satisfying experience for users.

Caching reduces infrastructure costs. While a distributed system like Redis requires its own resources, the overall savings are often significant. Your application accesses data more efficiently, potentially allowing you to downscale your database. However, this comes with a trade-off: if your cache system fails, ensure your database is prepared to handle the increased load.

Cache Patterns

Now that you understand the power of caching, let's dive into the best ways to use it! In this section, we'll explore two essential categories of patterns: Cache Writing Patterns and Cache Miss Patterns. These patterns provide strategies to manage cache updates and handle situations when the data you need isn't yet in the cache.

Writing Patterns

Writing patterns dictate how your application interacts with both the cache and your database. Let's look at three common strategies: Write-back, Write-through, and Write-around. Each offers unique advantages and trade-offs:

Write Back

Write-back Cache Pattern

How it works:

Your application interacts only with the cache.
The cache confirms the write instantly.
A background process then copies the newly written data to the database.

Ideal for: Write-heavy applications where speed is critical, and some inconsistency is acceptable for the sake of performance. Examples include metrics and analytics applications.

Advantages:

Faster reads: Data is always in the cache for quick access, bypassing the database entirely.
Faster writes: Your application doesn't wait for database writes, resulting in faster response times.
Less database strain: Batched writes reduce database load and can potentially extend the lifespan of your database hardware.

Disadvantages:

Risk of data loss: If the cache fails before data is saved to the database, information can be lost. Redis mitigates this risk with persistent storage, but this adds complexity.
Increased complexity: You'll need a middleware to ensure the cache and database eventually stay in sync.
Potential for high cache usage: All writes go to the cache first, even if the data isn't frequently read. This can lead to high storage consumption.
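The write-back flow can be sketched as follows. This is a single-goroutine illustration in Go in which plain maps stand in for the cache and the database; the writeBackCache type and its write and flush methods are hypothetical names, and a real setup would run the flush from a background worker and handle failures.

```go
package main

import "fmt"

// writeBackCache is a single-goroutine sketch: the maps stand in for a
// real cache and a real database.
type writeBackCache struct {
	cache    map[string]string
	dirty    map[string]bool // keys written to the cache but not yet persisted
	database map[string]string
}

func newWriteBackCache() *writeBackCache {
	return &writeBackCache{
		cache:    map[string]string{},
		dirty:    map[string]bool{},
		database: map[string]string{},
	}
}

// write stores the value in the cache only and returns immediately.
func (c *writeBackCache) write(key, value string) {
	c.cache[key] = value
	c.dirty[key] = true
}

// flush is what a background process would run periodically: it copies
// every dirty entry to the database in one batch and returns the batch size.
func (c *writeBackCache) flush() int {
	flushed := 0
	for key := range c.dirty {
		c.database[key] = c.cache[key]
		delete(c.dirty, key)
		flushed++
	}
	return flushed
}

// demo writes three times to two keys and flushes once.
func demo() (rowsBeforeFlush, flushed int, persisted string) {
	c := newWriteBackCache()
	c.write("page:home:views", "1")
	c.write("page:home:views", "2") // overwrites the pending value
	c.write("page:about:views", "1")

	rowsBeforeFlush = len(c.database) // nothing has reached the database yet
	flushed = c.flush()
	return rowsBeforeFlush, flushed, c.database["page:home:views"]
}

func main() {
	rowsBeforeFlush, flushed, persisted := demo()
	fmt.Println(rowsBeforeFlush, flushed, persisted)
}
```

Three writes turn into two database writes, which illustrates both the reduced database load and the risk: anything still marked dirty is lost if the cache fails before the flush.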

Write Through

Write-through Cache policy

How it works:

Your application writes to both the cache and the database simultaneously.
To reduce wait time, you can write to the cache asynchronously. This allows your application to signal successful writes before the cache operation is completely finished.

Advantages:

Faster reads: Like Write-Back, data is always in the cache, eliminating the need for database reads.
Reliability: Your application only confirms a write after it's saved in the database, guaranteeing data persistence even if a crash occurs immediately afterward.

Disadvantages:

Slower writes: Compared to Write-Back, this policy has some overhead because the application waits for both the database and cache to write. Asynchronous writes improve this but remember, there's always the database wait time.
High cache usage: All writes go to the cache, potentially consuming storage even if the data isn't frequently accessed.

Write Around

Write-around Cache Pattern

With Write-Around, your application writes data directly to the database, bypassing the cache during the write process. To populate the cache, it employs a strategy called the cache-aside pattern:

Read request arrives: The application checks the cache.
Cache miss: If the data isn't found in the cache, the application fetches it from the database and then stores it in the cache for future use.

Advantages:

Reliable writes: Data is written directly to the database, ensuring consistency.
Efficient cache usage: Only frequently accessed data is cached, reducing memory consumption.

Disadvantages:

Higher read latency (in some cases): If data isn't in the cache, the application must fetch it from the database, adding a roundtrip compared to policies where the cache is always pre-populated.

Cache Miss Pattern

A cache miss occurs when the data your application needs isn't found in the cache. Here are two common strategies to tackle this:

Cache-Aside
- The application checks the cache.
- On a miss, it fetches data from the database and then updates the cache.
- Key point: The application is responsible for managing the cache.

Using the Cache-Aside pattern means your application manages the cache itself. This is the most common approach because it's simple and doesn't require development anywhere other than the application.
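The cache-aside steps above can be sketched in a few lines of Go. This is a minimal illustration, not a production cache: the `Store` interface, `mapStore`, and `CacheAside` are hypothetical names invented here, and an in-memory map stands in for both the cache and the database.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a stand-in for the primary database (hypothetical interface).
type Store interface {
	Get(key string) (string, bool)
}

// mapStore is a toy database that counts how often it is read.
type mapStore struct {
	data  map[string]string
	reads int
}

func (s *mapStore) Get(key string) (string, bool) {
	s.reads++
	v, ok := s.data[key]
	return v, ok
}

// CacheAside keeps the caching logic inside the application.
type CacheAside struct {
	mu    sync.Mutex
	cache map[string]string
	db    Store
}

func NewCacheAside(db Store) *CacheAside {
	return &CacheAside{cache: make(map[string]string), db: db}
}

// Get checks the cache first. On a miss it reads the database and
// stores the value in the cache for later reads.
func (c *CacheAside) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.cache[key]; ok {
		return v, true // cache hit
	}
	v, ok := c.db.Get(key) // cache miss: go to the database
	if ok {
		c.cache[key] = v
	}
	return v, ok
}

func main() {
	db := &mapStore{data: map[string]string{"seat:1": "available"}}
	c := NewCacheAside(db)
	c.Get("seat:1") // miss, reads the database
	c.Get("seat:1") // hit, served from the cache
	fmt.Println("database reads:", db.reads)
}
```

Holding the lock while reading the database keeps the sketch short; a real implementation would likely avoid blocking other readers during the database call.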

Read-Through
- The application makes a request, unaware of the cache.
- A specialized mechanism checks the cache and fetches data from the database if needed.
- The cache is updated transparently.

The Read-Through pattern reduces application complexity but increases infrastructure complexity. It offloads work from the application to the middleware.

Overall, the write-around pattern with cache-aside is the most commonly used because it is easy to implement. However, I recommend also using the write-through pattern for any data that will be read immediately after it's written. This will provide a slight benefit to read performance.

Caching Best Practice

In this section, we'll explore best practices for using a cache. Following these practices will ensure your cache maintains fresh data and manages its storage effectively.

Cache Invalidation

Imagine you've stored data in the cache, and then the database is updated. This causes the data in the cache to differ from the database version. We call this type of cache data "stale." Without a cache invalidation technique, your cached data could remain stale after database updates. To keep data fresh, you can use the following techniques:

Cache Invalidation on Update: When you update data in the database, update the corresponding cache entry as well. Write-through and write-back patterns inherently handle this, but write-around/cache-aside requires explicit deletion of the cached data. This strategy prevents your application from retrieving stale data.
Time To Live (TTL): TTL is a policy you can set when storing data in the cache. With TTL, data is automatically deleted after a specified time. This helps clear unused data and provides a failsafe against stale data in case of missed invalidations.

Cache Replacement Policies

If you cache a large amount of data, your cache storage could fill up. Cache systems typically use memory, which is often smaller than your primary database storage. When the cache is full, it needs to delete some data to make room. Cache replacement policies determine which data to remove:

Least Recently Used (LRU): This common policy evicts data that hasn't been used (read or written) for the longest time. LRU is suitable for most real-world use cases.
Least Frequently Used (LFU): Similar to LRU, but focuses on access frequency. Newly written data might be evicted, so consider adding a warm-up period during which data cannot be deleted.

Other replacement policies like FIFO (First-In, First-Out), Random Replacement, etc., exist, but are less common.
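To make LRU eviction concrete, here is a minimal sketch built on Go's `container/list`. The `LRU` type and its methods are hypothetical names for illustration, the capacity is assumed to be at least 1, and the structure is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
)

type pair struct{ key, value string }

// LRU evicts the least recently used entry once capacity is reached.
type LRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> node in order
}

func NewLRU(capacity int) *LRU {
	return &LRU{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the value and marks the key as most recently used.
func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(pair).value, true
}

// Put inserts or updates a key, evicting the oldest entry when full.
func (c *LRU) Put(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value = pair{key, value}
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(pair).key)
	}
	c.items[key] = c.order.PushFront(pair{key, value})
}

func main() {
	c := NewLRU(2)
	c.Put("a", "1")
	c.Put("b", "2")
	c.Get("a")      // "a" is now the most recently used
	c.Put("c", "3") // cache is full, so "b" is evicted
	_, ok := c.Get("b")
	fmt.Println("b still cached:", ok)
}
```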

When Not To Cache

Before diving into cache implementation, it's important to know when it might not be the best fit. Caching often improves speed and reduces database load, but it might not make sense if:

Low traffic: If your application has low traffic and the response time is still acceptable, you likely don't need caching yet. Adding a cache increases complexity, so it's best implemented when you face performance bottlenecks or anticipate a significant increase in traffic.
Your system is write-heavy: Caching is most beneficial in read-heavy applications. This means data in your database is updated infrequently or read multiple times between updates. If your application has a high volume of writes, caching could potentially add overhead and slow things down.

Takeaways

In this article, we've covered the basics of caching and how to use it effectively. Here's a recap of the key points:

Confirm the Need: Ensure your system is read-heavy and requires the latency reduction caching offers.
Choose Patterns Wisely: Select cache writing and cache miss patterns that align with how your application uses data.
Data Freshness: Implement cache invalidation strategies to prevent serving stale data.
Manage Replacement Policy: Choose a cache replacement policy (like LRU) to handle deletions when the cache reaches its capacity.

Designing A Retry Mechanism For Reliable Systems

Brilian Firdaus — Tue, 03 Jan 2023 02:00:10 +0000

A retry mechanism is a critical component of many modern software systems. It allows our system to automatically retry failed operations to recover from transient errors or network outages. By automatically retrying failed operations, retry mechanisms can help software systems recover from unexpected failures and continue functioning correctly.

Today, we'll take a look at these topics:

What is A Retry Pattern : What is a retry pattern? What is it for, and why do we need to implement it in our system?
When to Retry Your Request : Only some requests should be retried. It's important to understand what kind of errors from the downstream service can be retried to avoid problems with business logic.
Retry Backoff Period : When we retry the request to the downstream service, how long should we wait to send the request again after it fails?
How to Retry? : We'll look at ways to retry from the basic to more complex.

What is A Retry Pattern

Retrying is the act of sending the same request again when a request to a downstream service fails. By using a retry pattern, you improve the downstream resiliency of your system. When an error happens while calling a downstream service, our system tries to call it again instead of returning an error to the upstream service.

So, why do we need to do it, exactly? Microservices architecture has been gaining popularity over the past decade. While this approach has many benefits, one of its downsides is that it introduces network communication between services. Additional network communication leads to the possibility of network errors while services are communicating with each other (read Fallacies of distributed computing). Every call to another service has a chance of hitting those errors.

In addition, whether you're using a monolith or a microservices architecture, there is a good chance that you still need to call services outside your company's internal network. Calling a service in a different network means your request goes through more network layers and has more chances to fail.

Other than network errors, you can also get system errors like rate-limit errors, service down, and processing timeout. The errors you get may or may not be suitable to be retried. Let's head to the next section to explore it in more detail.

When to Retry Your Request

Although adding a retry mechanism in your system is generally a good idea, not every request to the downstream service should be retried. As a simple baseline, things you should consider when you want to retry are:

Is it a transient error? You'll need to consider whether the type of errors you're getting is transient (temporary). For example, you can retry a connection timeout error because it's usually only temporary but not a bad request error because you need to change the request.
Is it a system error? When you're getting an error message from the downstream service, it can be categorized as either: system error or application error. System error is generally okay to be retried because your request hasn't been processed by the downstream service yet. On the other hand, an application error usually means that something is wrong with your request, and you should not retry it. For example, if you're getting a bad request error from the downstream service, you'll always get the same error no matter how many times you've retried.
Idempotency. Even when you get an error from the downstream service, there is still a chance it has processed your request. The downstream service could return the error after completing the main process because a later sub-process failed. An idempotent API means that even if the API gets the same request twice, it only processes it once. We can achieve this by adding an id that is unique to the request so the downstream service can determine whether it has already processed it. Usually, you can tell from the request method: GET, DELETE, and PUT are usually idempotent, and POST is not. But you need to confirm the API's idempotency with the service owner.
The cost of retrying. When you retry your request to the downstream service, there will be additional resource usage. The additional resource usage can be in the form of additional CPU usage, blocked Thread, additional memory usage, additional bandwidth usage, etc. You need to consider this, especially if your service expects large traffic.
The implementation cost of the retry mechanism. Many programming languages already have libraries that implement a retry mechanism, but you still need to determine which requests to retry. You can also build your own retry mechanism for every system if you want to, but of course, this means a high implementation cost.

Idempotency

✍️ Many libraries have already implemented the retry mechanism gracefully. For example, if you're using the Spring Mongo library in Java Spring Boot and the connection between your apps and MongoDB is severed, it will try to reconnect.

⚠️ Some libraries also implement a retry mechanism by default. It's sometimes dangerous because you can be unaware that the library will retry your request.

I've also compiled some common errors and whether or not they're suitable for retrying:

Error	Retry(idempotent)	Retry(not idempotent)
Connection Timeout	Yes	Yes
Read Timeout	Yes	No
Circuit Breaker Tripped	Yes	Yes
400: Bad Request	No	No
401: Unauthorized	No	No
404: Not Found	No	No
429: Too Many Requests	Yes (Longer backoff)	Yes (Longer backoff)
500: Internal Server Error	Yes	Yes
503: Service Unavailable	Yes	Yes

Let's briefly describe the errors one by one:

Connection timeout : Your app failed to connect to the downstream service. Hence the downstream service isn't aware of your request, and you can retry it.
Read timeout : The downstream service received your request but did not return a response in time. It may already have processed the request, which is why retrying is only safe for idempotent requests.
Circuit breaker tripped : This error occurs if you use a circuit breaker in your service. You can retry this kind of error because your service hasn't sent its request to the downstream service.
400 - Bad Request : This error means the downstream service validated your request and rejected it as invalid. You shouldn't retry this error because it will always return the same error if the request is the same.
401 - Unauthorized : You need to authenticate before sending the request. Whether you can retry this error depends on the authentication method and the error. But generally, you will always get the same error if your request is the same.
429 - Too many requests : Your request is rate limited by the downstream service. You can retry this error, although you should confirm with the downstream service's owner how long your request will be rate limited.
500 - Internal Server Error : This means the downstream service had started processing your request but failed in the middle of it. Usually, it's okay to retry this error.
503 - Service Unavailable : The downstream service is unavailable due to downtime. It is okay to retry this kind of error.
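The table and descriptions above can be condensed into a small decision function. This is only a sketch of that table: the `Failure` type and `ShouldRetry` function are hypothetical names, and a real service would adjust the rules per downstream API.

```go
package main

import (
	"fmt"
	"net/http"
)

// Failure describes what went wrong with a downstream call.
type Failure int

const (
	ConnectionTimeout Failure = iota
	ReadTimeout
	CircuitBreakerOpen
	HTTPStatus // the call returned an HTTP error status
)

// ShouldRetry mirrors the table above. status is only used when
// kind is HTTPStatus.
func ShouldRetry(kind Failure, status int, idempotent bool) bool {
	switch kind {
	case ConnectionTimeout, CircuitBreakerOpen:
		// The downstream service never saw the request.
		return true
	case ReadTimeout:
		// The request may have been processed already.
		return idempotent
	case HTTPStatus:
		switch status {
		case http.StatusTooManyRequests, // 429: retry with a longer backoff
			http.StatusInternalServerError, // 500
			http.StatusServiceUnavailable: // 503
			return true
		}
	}
	// 400, 401, 404 and anything unlisted: do not retry.
	return false
}

func main() {
	fmt.Println("read timeout, POST:", ShouldRetry(ReadTimeout, 0, false))
	fmt.Println("503, POST:", ShouldRetry(HTTPStatus, http.StatusServiceUnavailable, false))
	fmt.Println("400, GET:", ShouldRetry(HTTPStatus, http.StatusBadRequest, true))
}
```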

Retry Backoff Period

When your request fails to the downstream service, your system will need to wait for some time before trying again. This period is called the retry backoff period.

Generally, there are three strategies for wait time between calls: Fixed Backoff, Exponential Backoff, and Random Backoff. All three of them have their advantages and disadvantages. Which one you use should depend on your API and service use case.

Fixed Backoff. Fixed backoff means that every time you retry your request, the delay between requests is always the same. For example, if you do a retry twice with a backoff of 5 seconds, then if the first call fails, the second request will be sent 5 seconds after. If it fails again, the third call will be sent 5 seconds after the failure.

A fixed backoff period is suitable for a request coming directly from the user and needs a quick response. If the request is important and you need it to come back ASAP, then you can set the backoff period to none or close to 0.

Fixed backoff

Exponential Backoff. When downstream service is having a problem, it doesn't always recover quickly. What you don't want to do when the downstream service is trying to recover is to hit it multiple times in a short interval. Exponential backoff works by adding some additional backoff time every time our service attempts to call the downstream service.

For example, we can configure our retry mechanism with a 5-second initial backoff and a multiplier of two for every attempt. This means that when our first call to the downstream service fails, our service waits 5 seconds before the next call. If the second call fails again, the service waits 10 seconds instead of 5 seconds before the next call.

Due to its longer interval nature, exponential backoff is unsuitable for retrying a user request. But it will be perfect for a background process like notification, sending email, or webhook system.

Exponential backoff

Random backoff is a backoff strategy introducing randomness in its backoff interval calculation. Suppose that your service is getting a burst of traffic. Your service then calls a downstream service for every request, and then you get errors from it because the downstream service gets overwhelmed by your request. Your service implements a retry mechanism and will retry the requests in 5 seconds. But there is a problem: when it's time to retry the requests, all of them will be retried at once, and you might get an error from the downstream service again. With the randomness introduced by the random backoff mechanism, you can avoid this.

A random backoff strategy helps your service spread out its requests to the downstream service by introducing a random wait before each retry. Let's say you configure the retry mechanism with a 5-second interval and two retries. If the first call fails, the second one could be attempted after 500ms; if it fails again, the third one could be attempted after 3.8 seconds. If many requests to the downstream service fail at once, they won't be retried simultaneously.

Random backoff

Where to store the retry state?

When doing a retry, you'll need to store the state of the retry somewhere. The state includes how many retries have been made, the request to be retried, and the additional metadata you want to save. Generally, there are three places you can use to store the retry state, which are:

Thread is the most common place to store the retry state. If you're using a library with a built-in retry mechanism, it will most likely use the Thread to store the state. The simplest way to do this is to sleep the Thread. Let's see an example in Java:

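int retryCount = 0;

while (retryCount < 3) {
    try {
        thirdPartyOutboundService.getData();
        break; // success: stop retrying
    } catch (Exception e) {
        retryCount += 1;
        // Fixed backoff: block this thread for 3 seconds before the next attempt.
        // Thread.sleep throws InterruptedException, so the enclosing method
        // must declare or handle it.
        Thread.sleep(3000);
    }
}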

The code above sleeps the Thread when it gets an exception and then calls the process again. While this is simple, it has the disadvantage of blocking the Thread, so other processes can't use it in the meantime. This method is suitable for a fixed backoff strategy with a short interval, such as processes that respond directly to the user and need a response as soon as possible.

Messaging. We could use a popular messaging broker like RabbitMQ (delayed queue) to save a retry state. When you're getting a request from the upstream, and you fail to process it (it can be because of downstream service or not), you can publish the message to the delayed queue to consume it later (depending on your backoff).

Using messaging to save the retry state is suitable for background processes because the upstream service can't directly get the response of the retry process. The advantage of this approach is that it's usually easy to implement because the broker or library already supports the retry function. Messaging as a store for the retry state also works well with distributed systems. One problem that can happen is your service suddenly going down while waiting for the next retry. By saving the retry state in the messaging broker, your service can continue the retry after the issue has been resolved.

Retry using message broker

Database is the most customizable solution to store the retry state, either by using a persistent storage or an in-memory KV store like Redis. When the request to the downstream service fails, you can save the data in the Database and use a cron job to check the Database every second or minute to retry failed messages.

While this is the most customizable solution, the implementation cost will be very high because you'll need to implement your retry mechanism. You can either create the mechanism in your service with the downside of sacrificing a bit of performance when a retry is happening or make an entirely new service for retry purposes.

Takeaways

This article has explored what a retry pattern is and what aspects to consider when implementing it.

You need to know what request and how to retry it.

If you implement the retry mechanism correctly, you'll improve the user experience and reduce the operational burden of the service you're building. But if you do it incorrectly, you risk worsening the user experience and causing business errors. You need to understand when a request can be retried and how to retry it so you can implement the mechanism correctly.

There is much more.

In this article, we've covered the retry pattern. This pattern increases the downstream resiliency of a system, but there is more to downstream resiliency. We can combine the retry pattern with a timeout (which we explored in this article) and a circuit breaker to make our system more resilient to downstream failure. If you're interested, subscribe to the newsletter because we plan to write about that too.

How To Better Store Password In Database

Brilian Firdaus — Tue, 02 Aug 2022 13:00:43 +0000

How Would You Store Your User's Password?

How would you store a password if you were asked to create an authentication system? The easiest solution would be to store the plain text entered by the user. For example, if a user enters their email as codecurated@codecurated.com and their password as weakpassword, then we can insert both into our users table, right?

When the user tries to log in, we can query the table with SELECT * FROM users WHERE email = {email} AND password = {password}. If the query returns a result, then we can authenticate the user. Task done? Well, not quite.

Problem With Storing Plain Text Password

Even though the solution above will work, it is not secure and prone to many attacks. The most direct attack that might not be obvious is an internal attack. In this attack, the people or employees who have access to your database (including yourself) can easily see the user's password and get their credentials.

A data breach is another reason why plain text passwords are terrible. Someone you don't intend to might acquire your database, and it happens all the time, even to large companies. With plain text passwords, the attacker can get all of your users' credentials by querying your users table.

Better Way of Storing User's Password?

The first step you want to take is to hash your users' passwords with a hashing function before storing them in the database. Unlike encryption, a hashing function only goes one way, and the result of hashing a specific string is always the same. This makes a hashing function well suited to password storage.

One of the most popular and secure hashing functions is SHA256. If we try to hash weakpassword with SHA256, the result would be:

9b5705878182ccecf493b6c5ef3d2c723082141d0af33432c997b52dcc9f3e71

Hashing function only goes one way, so we can't convert the hash result back to weakpassword. Also, every time weakpassword is hashed using SHA256, the result will always be the same.

Looks good. Now the attackers wouldn't be able to know the users' passwords even if they can access your database. But, it's not good enough.

A Rainbow table attack is an attack with a table of a precomputed hash of common passwords. With a rainbow table attack, the attacker will be able to get the credential of the users with weak passwords in your database. To combat this, salt is usually used when hashing a password. salt is a randomly generated value that you can combine with the password before hashing it. For example, if we generated jvFJ4 as a salt and combine it with weakpassword and hash it (sha256("jvFJ4weakpassword")) it will produce:

b104c5bf49e2e4937ac2419e94864f7209014a96cae582302f6e5f891e426e22

This is a completely different result from the hash of plain weakpassword.

Now with salt, our table will become:

What do we need to do when the user logs in?

The user will submit their email and plain password
The system needs to query the users table by email, e.g., SELECT * FROM users WHERE email='codecurated@codecurated.com' to get the hashed password and salt
Compute SHA-256(salt + inputted password)
Compare the result with the hashed password. If it's the same, it means the user has inputted the correct password, and the system can authenticate the user.
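The login steps above can be sketched in Go as follows. This only illustrates the salted SHA-256 design described so far, which the rest of the article argues is not strong enough on its own. The `User` struct, `Register`, and `Verify` are hypothetical names, and an in-memory struct stands in for the users table.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// User mirrors a row in the users table (hypothetical layout).
type User struct {
	Email        string
	Salt         string
	PasswordHash string // hex of SHA-256(salt + password)
}

func hashPassword(salt, password string) string {
	sum := sha256.Sum256([]byte(salt + password))
	return hex.EncodeToString(sum[:])
}

// newSalt generates a random 16-byte salt, hex encoded.
func newSalt() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// Register creates the record that would be inserted into the table.
func Register(email, password string) (User, error) {
	salt, err := newSalt()
	if err != nil {
		return User{}, err
	}
	return User{Email: email, Salt: salt, PasswordHash: hashPassword(salt, password)}, nil
}

// Verify recomputes SHA-256(salt + input) and compares it with the
// stored hash in constant time.
func Verify(u User, password string) bool {
	got := hashPassword(u.Salt, password)
	return subtle.ConstantTimeCompare([]byte(got), []byte(u.PasswordHash)) == 1
}

func main() {
	u, err := Register("codecurated@codecurated.com", "weakpassword")
	if err != nil {
		fmt.Println("could not generate salt:", err)
		return
	}
	fmt.Println("correct password:", Verify(u, "weakpassword"))
	fmt.Println("wrong password:", Verify(u, "letmein"))
}
```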

We have mitigated many attacks with this design, but is it enough? Well...

The approaches we created previously can mitigate a lot of attacks, but not a dictionary attack. What if the attacker gets the salt, combines it with a common password, hashes the combination, and compares it with the password hash in the database? Only time separates the attacker from your user's password. And actually, time is one variable that we can tune.

SHA256 is unsuitable for password hashing because it is designed to be computed quickly and works best on sufficiently complex input, which a password often is not. I tried hashing weakpassword 10 million times on my AMD Ryzen 5 3600 (6 cores, 12 threads @ 3.6GHz) CPU, and it finished very quickly.

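h := sha256.New()
start := time.Now()
// Note: Write only feeds bytes into the running hash state. No digest is
// finalized per iteration, so this times 10 million 12-byte writes.
for i := 0; i < 10000000; i++ {
   h.Write([]byte("weakpassword"))
}
elapsed := time.Since(start)
log.Printf("SHA256 took %s \n", elapsed)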

The result is:

2022/07/28 16:57:00 SHA256 took 337.2232ms

With SHA-256, the attacker would be able to hash and compare many passwords quickly with it. We need a slower hashing function.

Bcrypt

This is where Bcrypt comes in. Bcrypt is a password hashing function based on Blowfish in which you can determine its cost to run. This trait, in particular, is perfect for password hashing because it will future-proof the hashing function when a faster machine comes up.

Let's see Bcrypt in action:

for i := 10; i < 21; i++ {  
   start := time.Now()  
   bcrypt.GenerateFromPassword([]byte("weakpassword"), i)  
   elapsed := time.Since(start)  
   fmt.Printf("cost: %d Elapsed time: %s\n", i, elapsed)  
}

The result is:

cost: 10 Elapsed time: 56.233ms
cost: 11 Elapsed time: 113.1521ms
cost: 12 Elapsed time: 213.0455ms
cost: 13 Elapsed time: 447.572ms
cost: 14 Elapsed time: 877.3284ms
cost: 15 Elapsed time: 1.8126554s
cost: 16 Elapsed time: 3.375513s
cost: 17 Elapsed time: 6.5935858s
cost: 18 Elapsed time: 13.3655301s
cost: 19 Elapsed time: 27.0033831s
cost: 20 Elapsed time: 53.8954938s

We can see that the time roughly doubles with every increment of the cost, so it grows exponentially. A higher cost means better security but a worse user experience. Just imagine: if you set the cost to 20, the user would need to wait 53 seconds when logging in. But set it too low, and it will be easier for the attacker to steal your users' credentials.

Let's do some math. Suppose you have ten million users in your database, and the attacker has a dictionary of 1000 most common passwords. How long would it take for the attacker to calculate the password hash with SHA256 compared to Bcrypt with the cost of 12?

First, we need to calculate how many hash operations the attacker has to perform, which we get by multiplying the number of users by the number of common passwords the attacker tries. So, 10,000,000 * 1,000 = 10,000,000,000. Next, we calculate the time the attacker needs to compute ten billion hashes. For SHA256, the benchmark above ran ten million iterations in 337.2232ms. So all of the hashes take 10,000,000,000 / 10,000,000 * 337.2232ms = 337,223.2ms, which is under 6 minutes.

Next, let's try Bcrypt with cost 12: 10,000,000,000 * 213.0455ms = 2.130455e+12ms, which is roughly 67.6 years on the same machine. As you can see, using Bcrypt as your password hashing function makes a lot of difference. If a data breach happens to your database, it will buy you a lot of time to notice it and ask your users to change their passwords.
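This kind of estimate is easy to get wrong by a unit, so it can help to let a few lines of code do the conversion. The sketch below plugs in the benchmark figures reported above (ten million SHA256 iterations in 337.2232ms and one Bcrypt cost-12 hash in 213.0455ms). The `attackSeconds` helper is a hypothetical name, and it assumes a single machine running at those measured speeds and a 365-day year.

```go
package main

import "fmt"

const (
	users          = 10_000_000
	dictionarySize = 1_000
	secondsPerYear = 365 * 24 * 3600
)

// attackSeconds returns how long totalHashes take at msPerHash
// milliseconds per hash, in seconds.
func attackSeconds(totalHashes, msPerHash float64) float64 {
	return totalHashes * msPerHash / 1000
}

func main() {
	total := float64(users * dictionarySize) // ten billion hashes

	// Benchmark figures taken from the measurements in this article.
	shaMsPerHash := 337.2232 / 10_000_000
	bcryptMsPerHash := 213.0455

	sha := attackSeconds(total, shaMsPerHash)
	bc := attackSeconds(total, bcryptMsPerHash)

	fmt.Printf("SHA256: %.1f minutes\n", sha/60)
	fmt.Printf("Bcrypt (cost 12): %.1f years\n", bc/secondsPerYear)
}
```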

Besides determining cost, Bcrypt also uses salt by default, which means the attacker won't be able to do a rainbow table attack we discussed previously. Let's see what the result is if we hash weakpassword with Bcrypt:

hashedPassword,_ := bcrypt.GenerateFromPassword([]byte("weakpassword"), 10)  
fmt.Printf("hashed password: %s", hashedPassword)


hashed password: $2a$10$.krQtTcne8xlhG2rJONbKu9KZepUpwl8tyC/fFIB6lRmNufvPfge2

If we break the result down, we will get:

Bcrypt breakdown

alg: The hash algorithm identifier; $2a means Bcrypt
cost: The cost of the Bcrypt, remember we set this as 10 in the code
salt (22 characters): Random salt for password hashing generated by the Bcrypt hashing function
hashed password (31 characters): The hashing function result.

Lastly, let's see how to validate the password hashed by Bcrypt:

hashedPassword,_ := bcrypt.GenerateFromPassword([]byte("weakpassword"), 10)  
err := bcrypt.CompareHashAndPassword(hashedPassword, []byte("weakpassword"))  

if err != nil {  
   fmt.Print(err)  
} else {  
   fmt.Printf("Password true")  
}

As we can see, we don't need to send the cost, alg, and salt when comparing the hash and password because every required input has been added to the Bcrypt hashing result itself.

Let's review, there are two essential traits of Bcrypt that make it suitable for password hashing:

Bcrypt lets us determine the cost of calculating the hash result, which makes it future-proof against faster machines.
Bcrypt calculates its hashes using a salt, which makes a rainbow table attack impractical.

Next Step

We've discussed storing your users' passwords correctly so attackers can't figure them out quickly. But storing passwords securely doesn't mean an attacker can't get them. For example, the attacker can watch your user type their password to figure it out. There is also a chance that the attacker can mount a man-in-the-middle attack if your website doesn't use HTTPS.

If you want to understand more about how to secure the authentication process, I urge you to read about:

MFA
Passwordless Login (Login by OTP, Magic Link, and WebAuthn)
TLS Protocol

How to Implement JSON Web Token (JWT) in Java Spring Boot

Brilian Firdaus — Sun, 12 Jun 2022 10:58:28 +0000

JSON Web Token (JWT) is a popular way to communicate securely between services. There are two forms of JWT: JWS and JWE. The difference between them is that a JWS payload is not encrypted, while a JWE payload is.

This article will explore the implementation of the JWT in Java Spring Boot. If you want to learn more about the JWT itself, you can visit my other article here.

The code in this article is hosted on the following GitHub repository: https://github.com/brilianfird/jwt-demo.

Library

For this article, we will use the jose4j library. jose4j is one of the popular JWT libraries in Java and is full-featured. If you want to check out other libraries (whether for Java or not), jwt.io has compiled a list of them.

<dependency>  
    <groupId>org.bitbucket.b_c</groupId>  
    <artifactId>jose4j</artifactId>  
    <version>0.7.12</version>  
</dependency>

Implementing JWS in Java

JSON Web Signature (JWS) consists of three parts:

JOSE Header
Payload
Signature

Let's see an example of the JOSE header:

{
    "alg": "HS256"
}

The JOSE header stores metadata about how to handle the JWS.

alg stores information about which signing algorithm the JWT uses.

Next, let's check the payload:

{
  "sub": "1234567890",
  "name": "Brilian Firdaus",
  "iat": 1651422365
}

JSON payload stores the data that we want to transmit to the client. It also stores some JWT claims for information purposes that we can verify.

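In the example above, the payload carries three claims. sub and iat are registered claims in the JWT specification (RFC 7519), while name is not one of the claims that specification registers.

sub indicates the user's unique id
name indicates the name of the user
iat indicates when we created the JWT, as a Unix epoch timestamp in seconds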

The last part is the signature, which is what makes a JWS secure. The signature itself is a sequence of bytes. Let's see an example of a Base64URL-encoded signature:

qsg3HKPxM96PeeXl-sMrao00yOh1T0yQfZa-BsrtjHI

Now, if we see the three parts above, you might wonder how to transfer those three parts seamlessly to the consumer. The answer is with compact serialization. Using compact serialization, we can easily share the JWS with the consumer because the JWS will become one long string.

Base64Url.encode(JOSE Header) + "." + Base64Url.encode(Payload) + "." + Base64Url.encode(signature)

The result will be:

eyJhbGciOiJIUzI1NiIsImtpZCI6IjIwMjItMDUtMDEifQ.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkJyaWxpYW4gRmlyZGF1cyIsImlhdCI6MTY1MTQyMjM2NX0.qsg3HKPxM96PeeXl-sMrao00yOh1T0yQfZa-BsrtjHI

The compact serialization part is also mandatory in the JWT specification. So for a JWS to be considered a JWT, we must do a compact serialization.

Unprotected

The first type of JWS we will explore is an unprotected JWS. People rarely use this type of JWS (it's basically just regular JSON), but let's explore it first to understand the basics of the implementation.

Let's start by creating the header. Unlike the previous example where we used the HS256 algorithm, now we will use no algorithm.

Producing Unprotected JWS

@Test  
public void JWS_noAlg() throws Exception {  

  JwtClaims jwtClaims = new JwtClaims();  
  jwtClaims.setSubject("7560755e-f45d-4ebb-a098-b8971c02ebef"); // set sub
  jwtClaims.setIssuedAtToNow(); // set iat
  jwtClaims.setExpirationTimeMinutesInTheFuture(10080); // set exp
  jwtClaims.setIssuer("https://codecurated.com"); // set iss
  jwtClaims.setStringClaim("name", "Brilian Firdaus"); // set name
  jwtClaims.setStringClaim("email", "brilianfird@gmail.com");//set email  
  jwtClaims.setClaim("email_verified", true); //set email_verified

  JsonWebSignature jws = new JsonWebSignature();  
  jws.setAlgorithmConstraints(AlgorithmConstraints.NO_CONSTRAINTS);  
  jws.setAlgorithmHeaderValue(AlgorithmIdentifiers.NONE);  
  jws.setPayload(jwtClaims.toJson());  

  String jwt = jws.getCompactSerialization(); //produce eyJ.. JWT
  System.out.println("JWT: " + jwt);  
}

Let's see what we did in the code.

We set a bunch of claims (sub, iat, exp, iss, name, email, email_verified)
We set the signing algorithm to NONE and the algorithm constraints to NO_CONSTRAINTS; without the latter, jose4j throws an exception because the none algorithm provides no security
We packaged the JWS with compact serialization, which produces one string containing the JWS. The result is a JWT-compliant string.

Let's see what output we get by calling the jws.getCompactSerialization():

eyJhbGciOiJub25lIn0.eyJzdWIiOiI3NTYwNzU1ZS1mNDVkLTRlYmItYTA5OC1iODk3MWMwMmViZWYiLCJpYXQiOjE2NTI1NTYyNjYsImV4cCI6MTY1MzE2MTA2NiwiaXNzIjoiaHR0cHM6Ly9jb2RlY3VyYXRlZC5jb20iLCJuYW1lIjoiQnJpbGlhbiBGaXJkYXVzIiwiZW1haWwiOiJicmlsaWFuZmlyZEBnbWFpbC5jb20iLCJlbWFpbF92ZXJpZmllZCI6dHJ1ZX0.

If we try to decode it, we'll get the JWS with fields that we set before:

{
  "header": {
    "alg": "none"
  },
  "payload": {
    "sub": "7560755e-f45d-4ebb-a098-b8971c02ebef",
    "iat": 1652556266,
    "exp": 1653161066,
    "iss": "https://codecurated.com",
    "name": "Brilian Firdaus",
    "email": "brilianfird@gmail.com",
    "email_verified": true
  }
}
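To see where the trailing dot in the output comes from, here is a minimal sketch that assembles the same kind of string by hand with only the JDK. The class name UnsecuredJwtSketch and the shortened payload are made up for this illustration; jose4j does this assembly for us in the test above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class UnsecuredJwtSketch {

    // Base64URL without padding, the encoding used by the compact serialization.
    static String base64Url(String json) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    // header.payload.signature, where the signature part is empty for alg "none".
    static String compactSerialization(String headerJson, String payloadJson) {
        return base64Url(headerJson) + "." + base64Url(payloadJson) + ".";
    }

    public static void main(String[] args) {
        String header = "{\"alg\":\"none\"}";
        // Shortened payload, for illustration only.
        String payload = "{\"iss\":\"https://codecurated.com\"}";
        System.out.println(compactSerialization(header, payload));
    }
}
```

The first part is always eyJhbGciOiJub25lIn0 for this header, and the third part is empty because there is nothing to sign with.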

We've successfully created a JWT with Java's jose4j library! Now, let's proceed to the JWT-consuming process.

To consume the JWT, we can use the JwtConsumer class in the jose4j library. Let's see an example:

@Test  
public void JWS_consume() throws Exception {  
  String jwt = "eyJhbGciOiJub25lIn0.eyJzdWIiOiI3NTYwNzU1ZS1mNDVkLTRlYmItYTA5OC1iODk3MWMwMmViZWYiLCJpYXQiOjE2NTI1NTYyNjYsImV4cCI6MTY1MzE2MTA2NiwiaXNzIjoiaHR0cHM6Ly9jb2RlY3VyYXRlZC5jb20iLCJuYW1lIjoiQnJpbGlhbiBGaXJkYXVzIiwiZW1haWwiOiJicmlsaWFuZmlyZEBnbWFpbC5jb20iLCJlbWFpbF92ZXJpZmllZCI6dHJ1ZX0.";  

  JwtConsumer jwtConsumer = new JwtConsumerBuilder()  
          // required for NONE alg  
          .setJwsAlgorithmConstraints(AlgorithmConstraints.NO_CONSTRAINTS) 
          // disable signature requirement  
          .setDisableRequireSignature()
          // require the JWT to have iat field  
          .setRequireIssuedAt() 
          // require the JWT to have exp field 
          .setRequireExpirationTime()  
          // expect the iss to be https://codecurated.com  
          .setExpectedIssuer("https://codecurated.com") 
          .build();  

  // process JWT to jwt context  
  JwtContext jwtContext = jwtConsumer.process(jwt); 
  // get JWS object
  JsonWebSignature jws = (JsonWebSignature)jwtContext.getJoseObjects().get(0);
  // get claims  
  JwtClaims jwtClaims = jwtContext.getJwtClaims(); 

  // print claims as map  
  System.out.println(jwtClaims.getClaimsMap()); 
}

By using JwtConsumer, we can easily make rules about what to validate when processing incoming JWT. It also provides an easy way to get the JWS Object and the claims by using .getJoseObjects() and getJwtClaims(), respectively.

Now that we know how to produce and consume a JWT without a signing algorithm, it will be much easier to understand the one with it. The difference is that we need to set the algorithm and create one or more keys to generate and validate the JWT.

HMAC SHA-256

HMAC SHA-256 (HS256) is a MAC function with a symmetric key. To keep it secure, we need to generate a secret key of at least 32 bytes and feed it to the HmacKey class in the jose4j library.

We'll use Java's SecureRandom class to make sure the key is random.

byte[] key = new byte[32];  

SecureRandom secureRandom = new SecureRandom();  
secureRandom.nextBytes(key);

HmacKey hmacKey = new HmacKey(key);

The secret key should be treated as a credential, so it should be stored in a secure environment. For example, you can store it as an environment variable or in Vault.
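As a rough sketch of the environment variable option, the snippet below reads a Base64-encoded secret and rejects anything shorter than 32 bytes. The variable name JWT_HS256_SECRET and the class name HmacSecretLoader are assumptions made for this example, not something jose4j defines.

```java
import java.util.Base64;

final class HmacSecretLoader {

    // Hypothetical variable name; use whatever your deployment defines.
    static final String ENV_NAME = "JWT_HS256_SECRET";

    // Decodes a Base64-encoded secret and rejects keys shorter than 32 bytes.
    static byte[] decodeSecret(String base64Secret) {
        if (base64Secret == null || base64Secret.isBlank()) {
            throw new IllegalStateException("HS256 secret is not configured");
        }
        byte[] key = Base64.getDecoder().decode(base64Secret.trim());
        if (key.length < 32) {
            throw new IllegalStateException(
                    "HS256 secret must be at least 32 bytes, got " + key.length);
        }
        return key;
    }

    public static void main(String[] args) {
        byte[] key = decodeSecret(System.getenv(ENV_NAME));
        System.out.println("Loaded a " + key.length + "-byte secret");
        // With jose4j on the classpath, the bytes can then be wrapped: new HmacKey(key)
    }
}
```

Failing fast at startup when the secret is missing or too short is a design choice of this sketch; it keeps a misconfigured service from silently signing tokens with a weak key.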

Let's see how to create and consume the JWT signed with HS256:

@Test  
public void JWS_HS256() throws Exception {  

  // generate key  
  byte[] key = new byte[32];  
  SecureRandom secureRandom = new SecureRandom();  
  secureRandom.nextBytes(key);  
  HmacKey hmacKey = new HmacKey(key);  

  JwtClaims jwtClaims = new JwtClaims();  
  jwtClaims.setSubject("7560755e-f45d-4ebb-a098-b8971c02ebef"); // set sub  
  jwtClaims.setIssuedAtToNow(); // set iat  
  jwtClaims.setExpirationTimeMinutesInTheFuture(10080); // set exp  
  jwtClaims.setIssuer("https://codecurated.com"); // set iss  
  jwtClaims.setStringClaim("name", "Brilian Firdaus"); // set name  
  jwtClaims.setStringClaim("email", "brilianfird@gmail.com");//set email  
  jwtClaims.setClaim("email_verified", true); //set email_verified  

  JsonWebSignature jws = new JsonWebSignature();  
  // Set alg header as HMAC_SHA256  
  jws.setAlgorithmHeaderValue(AlgorithmIdentifiers.HMAC_SHA256);  
  // Set key to hmacKey  
  jws.setKey(hmacKey);  
  jws.setPayload(jwtClaims.toJson());  

  String jwt = jws.getCompactSerialization(); //produce eyJ.. JWT  

  // we don't need NO_CONSTRAINT and disable require signature anymore 
  JwtConsumer jwtConsumer = new JwtConsumerBuilder()  
          .setRequireIssuedAt()  
          .setRequireExpirationTime()  
          .setExpectedIssuer("https://codecurated.com")  
          // set the verification key  
          .setVerificationKey(hmacKey)  
          .build();  

  // process JWT to jwt context  
  JwtContext jwtContext = jwtConsumer.process(jwt);  
  // get JWS object  
  JsonWebSignature consumedJWS = (JsonWebSignature)jwtContext.getJoseObjects().get(0);  
  // get claims  
  JwtClaims consumedJWTClaims = jwtContext.getJwtClaims();  

  // print claims as map  
  System.out.println(consumedJWTClaims.getClaimsMap());  

  // Assert header, key, and claims  
  Assertions.assertEquals(jws.getAlgorithmHeaderValue(), consumedJWS.getAlgorithmHeaderValue());  
  Assertions.assertEquals(jws.getKey(), consumedJWS.getKey());  
  Assertions.assertEquals(jwtClaims.toJson(), consumedJWTClaims.toJson());  
}

There isn't much difference in the code compared to creating a JWS without a signing algorithm. We first made the key using SecureRandom and HmacKey classes. Since HS256 uses a symmetric key, we only need one key that we will use to sign and verify the JWT.

We also set the algorithm header value to HS256 by using jws.setAlgorithmHeaderValue(AlgorithmIdentifiers.HMAC_SHA256) and the key with jws.setKey(hmacKey).

In the JWT consumer, we only need to set the HMAC key by using .setVerificationKey(hmacKey) on the JwtConsumerBuilder. jose4j will automatically determine which algorithm is used in the JWS by parsing its JOSE header.

ES256

Unlike HS256, which only needs one key, the ES256 algorithm needs two keys: a private key and a public key.

We use the private key to create (sign) the JWT, while the public key can only be used to verify it. Because of that, the private key is usually stored as a credential, while the public key can be hosted publicly as a JWK so that consumers of the JWT can query the host and get the key themselves.

The jose4j library provides a simple API to generate private and public keys as a JWK.

EllipticCurveJsonWebKey ellipticCurveJsonWebKey = EcJwkGenerator.generateJwk(EllipticCurves.P256);

// get private key
ellipticCurveJsonWebKey.getPrivateKey();

// get public key
ellipticCurveJsonWebKey.getECPublicKey();

Now that we know how to generate the key, creating the JWT with the ES256 algorithm is almost the same as creating a JWT with the HS256 algorithm.

...
JsonWebSignature jws = new JsonWebSignature();  
// Set alg header as ECDSA_USING_P256_CURVE_AND_SHA256  
jws.setAlgorithmHeaderValue(AlgorithmIdentifiers.ECDSA_USING_P256_CURVE_AND_SHA256);  
// Set key to the generated private key  
jws.setKey(ellipticCurveJsonWebKey.getPrivateKey());  
jws.setPayload(jwtClaims.toJson());
...
JwtConsumer jwtConsumer = new JwtConsumerBuilder()  
        .setRequireIssuedAt()  
        .setRequireExpirationTime()  
        .setExpectedIssuer("https://codecurated.com")  
        // set the verification key as the public key  
        .setVerificationKey(ellipticCurveJsonWebKey.getECPublicKey())  
        .build();
...

The only differences are:

We set the algorithm header as ECDSA_USING_P256_CURVE_AND_SHA256
We use the private key when creating the JWT
We use the public key for verifying the JWT

Hosting JWK

We can easily create JSON Web Key Set using the JsonWebKeySet class.

@GetMapping("/jwk")  
public String jwk() throws JoseException {  
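  // Create public key and private key pair.
  // Note: in a real service, generate (or load) the key once and reuse it;
  // a new key per request would not match tokens signed earlier.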
  EllipticCurveJsonWebKey ellipticCurveJsonWebKey = EcJwkGenerator.generateJwk(EllipticCurves.P256);  

  // Create JsonWebkeySet object
  JsonWebKeySet jsonWebKeySet = new JsonWebKeySet();  

  // Add the public key to the JsonWebKeySet object
  jsonWebKeySet.addJsonWebKey(ellipticCurveJsonWebKey);  

  // By default, toJson() does not include the private key in its output
  return jsonWebKeySet.toJson();  
}

On the consumer side, we also need to replace the fixed verification key with a key resolver:

// Define verification key resolver
HttpsJwks httpsJkws = new HttpsJwks("http://localhost:8080/jwk");  
HttpsJwksVerificationKeyResolver verificationKeyResolver =  
    new HttpsJwksVerificationKeyResolver(httpsJkws);  

JwtConsumer jwtConsumer = new JwtConsumerBuilder()  
    .setRequireIssuedAt()  
    .setRequireExpirationTime()  
    .setExpectedIssuer("https://codecurated.com")  
    // set verification key resolver
    .setVerificationKeyResolver(verificationKeyResolver)  
    .build();

Since we hosted the JSON Web Key Set, the consumer needs to query the host. jose4j also provides a simple way to do this with HttpsJwksVerificationKeyResolver.

Implementing JWE in Java

JSON Web Encryption (JWE), unlike JWS, is a type of JWT that is encrypted so that no one can see its content except the holder of the private key. First, let's see an example of it.

eyJhbGciOiJFQ0RILUVTK0EyNTZLVyIsImVuYyI6IkExMjhDQkMtSFMyNTYiLCJlcGsiOnsia3R5IjoiRUMiLCJ4IjoiMEdxMEFuWUk1RVFxOUVZYjB4dmxjTGxKanV6ckxhSjhUYUdHYzk5MU9sayIsInkiOiJya1Q2cjlqUWhjRU1xaGtubHJ6S0hVemFKMlhWakFpWGpIWGZYZU9aY0hRIiwiY3J2IjoiUC0yNTYifX0.DUrC7Y_ejpt1n9c8wXetwU65sxkEYxG6RBsCUdokVODJBtwypL9VjQ.ydZx-UDWDN7jbGeESXvPHg.6ksHUeeGgGj0txFNXmsSQUCnAv52tJuGR5vgrX54vnLkryPFv2ATdLwYXZz3mAjeDes4s9otz4-Fzg1IBZ4qsfCVa6_3CVdkb8BTU4OvQx23SFEgtj8zh-8ZrqZbpKIT.p-E09mQIleNCCmwX3YL-uQ

The structure of the JWE is:

BASE64URL(UTF8(JWE Protected Header)) || '.' ||
BASE64URL(JWE Encrypted Key) || '.' ||
BASE64URL(JWE Initialization Vector) || '.' ||
BASE64URL(JWE Ciphertext) || '.' ||
BASE64URL(JWE Authentication Tag)

And if we decrypt the JWE, we will get the following claims:

{
    "iss":"https://codecurated.com",
    "exp":1654274573,
    "iat":1654256573,
    "sub":"12345"
}

Now, let's see how we create the JWE:

@Test  
public void JWE_ECDHES256() throws Exception {  
  // Choose the key management algorithm and the content encryption algorithm
  String alg = KeyManagementAlgorithmIdentifiers.ECDH_ES_A256KW;  
  String encryptionAlgorithm = ContentEncryptionAlgorithmIdentifiers.AES_128_CBC_HMAC_SHA_256;  

  // Generate EC JWK  
  EllipticCurveJsonWebKey ecJWK = EcJwkGenerator.generateJwk(EllipticCurves.P256);  

  // Create the claims
  JwtClaims jwtClaims = new JwtClaims();  
  jwtClaims.setIssuer("https://codecurated.com");  
  jwtClaims.setExpirationTimeMinutesInTheFuture(300);  
  jwtClaims.setIssuedAtToNow();  
  jwtClaims.setSubject("12345");  

  // Create JWE  
  JsonWebEncryption jwe = new JsonWebEncryption();  
  jwe.setPlaintext(jwtClaims.toJson());  

  // Set the JWE's key management algorithm and content encryption algorithm
  jwe.setAlgorithmHeaderValue(alg);  
  jwe.setEncryptionMethodHeaderParameter(encryptionAlgorithm);  

  // Unlike JWS, to create the JWE we use the public key  
  jwe.setKey(ecJWK.getPublicKey());  
  String compactSerialization = jwe.getCompactSerialization();  
  System.out.println(compactSerialization);  

  // Create JWT Consumer  
  JwtConsumer jwtConsumer =  
      new JwtConsumerBuilder()  
          // We set the private key as decryption key  
          .setDecryptionKey(ecJWK.getPrivateKey())  
          // JWE doesn't have signature, so we disable it  
          .setDisableRequireSignature()  
          .build();  

  // Get the JwtContext of the JWE  
  JwtContext jwtContext = jwtConsumer.process(compactSerialization);  

  System.out.println(jwtContext.getJwtClaims());  
}

The main differences between creating and consuming a JWE compared to a JWS are:

We use a public key as the encryption key and a private key as the decryption key
We don't have a signature in JWE, so the consumer will need to skip the signature requirement

Conclusion

In this article, we've learned to create both JWS and JWE in Java using jose4j. Hopefully, this article is useful to you. If you want to learn more about the concept of JWT, you can visit my other article.

Introduction to JWT (Also JWS, JWE, JWA, JWK)

Brilian Firdaus — Mon, 02 May 2022 06:02:02 +0000

The security and privacy of users' data have been a growing concern for the past few years. At the same time, JWT, as one technology to combat it, has been used more and more. Understanding JWT will give you an edge over the other software engineers. JWT might seem simple at first, but it is pretty hard to understand.

In this article, we will explore mainly JWT and JWS. In addition, we'll also go through JWE, JWA, and JWK quickly. This article aims to make the reader understand the concept of JWT without diving too deep into the topic.

What Are They?

Well, let's take a look at what JWT, JWS, JWE, JWA, and JWK are.

JSON Web Token (JWT)

First, let's see the definition of JWT defined in the RFC 7519.

JSON Web Token (JWT) is a compact claims representation format intended for space constrained environments such as HTTP Authorization headers and URI query parameters. JWTs encode claims to be transmitted as a JSON object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure, enabling the claims to be digitally signed or integrity protected with a Message Authentication Code (MAC) and/or encrypted. JWTs are always represented using the JWS Compact Serialization or the JWE Compact Serialization.

From the text, we can understand that JWT is not a structure but a set of claims in the shape of either JWS or JWE as its way of securing itself. In the most basic form, the difference between JWS and JWE is that everyone can see the payload of JWS while the JWE one is encrypted.

In this article, we will explore more about JWS than JWE.

JSON Web Algorithms (JWA)

JWA (RFC 7518), which stands for JSON Web Algorithms, is a specification defining which signing, MAC, and encryption algorithms can be used to make a JWT.

For example, the following are the hashing algorithms we can use to create a JWT with a JWS structure.

JSON Web Key (JWK)

JWK (RFC 7517) stands for JSON Web Key. A JWK is a JSON data structure that represents a cryptographic key. It's a way to store your signing or verification key in JSON format.

{
    "kty":"EC",
    "crv":"P-256",
    "x":"f83OJ3D2xF1Bg8vub9tLe1gHMzV76e8Tus9uPHvRVEU",
    "y":"x_FEzRu9m36HLN_tue659LNpXW6pCyStikYjKIWI5a0",
    "kid":"Public key used in JWS spec Appendix A.3 example"
}

A JWK is usually used to host the public key of an algorithm with an asymmetric key (private key and public key), so consumers can get the key themselves.

JSON Web Signature (JWS)

JWS (RFC 7515), which stands for JSON Web Signature, is one of the structures used by JWT. It's the most common implementation of the JWT. JWS consists of 3 parts: the JOSE header, payload, and signature.

The following is an example of JWS in compact serialization.

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

JWS compact serialization example

If we decode the JWS, we will get the JOSE (JSON Object Signing and Encryption) header:

{
  "alg": "HS256",
  "typ": "JWT"
}

The payload:

{
  "sub": "1234567890",
  "name": "John Doe",
  "iat": 1516239022
}

The next part is the signature. We won't decode it because it is raw bytes rather than JSON.

SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

The most common question related to JWT (in JWS structure) is what makes it secure, since everyone can decode the JWT and see its data. The JWT is secure because only someone who holds the secret key can create a valid one.

JSON Web Encryption (JWE)

JWE (RFC 7516), unlike JWS, encrypts its content using an encryption algorithm. The only one that can see what is inside the JWT is the one with the key.

The structure of JWE compact serialization is as follows:

BASE64URL(UTF8(JWE Protected Header)) || '.' ||
BASE64URL(JWE Encrypted Key) || '.' ||
BASE64URL(JWE Initialization Vector) || '.' ||
BASE64URL(JWE Ciphertext) || '.' ||
BASE64URL(JWE Authentication Tag)

Let's take a look at one example:

eyJhbGciOiJSU0EtT0FFUC0yNTYiLCJlbmMiOiJBMTI4Q0JDLUhTMjU2In0.RD09fEltrYPVNoGt2KY1Odv_5eDxkU4VX1f__P8b9zl9uzh5bmvvJy35dL-hYlUib1g63qnWBEfeSyDk5cAIQiMt6PZCBQzuWQJQlQtuo2UPLZznmLPqah37uHKB4a57q_lWf_W9soyZbO7Zj7QRNz4ZR4s5ozRHArSZcc1pAL-pYuHKyeh6Ey8t4bk66wkthjjfOjXvIfOlgbemhibegmE4GpQL6F-m0teqcAE-OxkaBRTmmb4AD5HdrCJWCIIuC52fzuWrhcoNmHM74ggtWUUjlHaKpwcVE-IWINTFaz5Pi9u4U3vnVNOZwDwB0TLSQvqnPwTZ-bYWNj8vH4TS_w.Pjo5QK1u1otxgcuBR7e8ew._OElhHugS2L6Kp04HhbFt6dLij_KXhO654RmT4JKyswYBX0wqRWt7ZzAE6eCHfJSJdMQYxqVSNloGb4OSIzYcTEo174lBZBINkHW-w2K6E0.QBDgBFizm80HLVkZvfBPCg

The example uses the RSA_OAEP_256 key management algorithm and AES_128_CBC_HMAC_SHA_256 as its encryption algorithm. If we decrypt the JWE, we will get:

{
  "iss": "https://codecurated.com",
  "exp": 1651417524,
  "iat": 1651417224
}
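Only the payload of a JWE is encrypted. The protected header is just Base64URL-encoded, so we can read it without any key. The following is a small sketch of that using only the JDK; the class name JweHeaderPeek is made up, and the last four parts of the sample string are placeholders because only the first part is decoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class JweHeaderPeek {

    // Splits a compact JWE and returns the decoded protected header JSON.
    static String protectedHeader(String compactJwe) {
        // The -1 limit keeps empty parts (the encrypted key part can be empty).
        String[] parts = compactJwe.split("\\.", -1);
        if (parts.length != 5) {
            throw new IllegalArgumentException(
                    "A compact JWE has 5 parts, got " + parts.length);
        }
        byte[] header = Base64.getUrlDecoder().decode(parts[0]);
        return new String(header, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Header part taken from the example above; the rest are placeholders.
        String jwe = "eyJhbGciOiJSU0EtT0FFUC0yNTYiLCJlbmMiOiJBMTI4Q0JDLUhTMjU2In0"
                + ".key.iv.ciphertext.tag";
        // prints {"alg":"RSA-OAEP-256","enc":"A128CBC-HS256"}
        System.out.println(protectedHeader(jwe));
    }
}
```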

Decoding the JWT

Now let's take a deeper look at the JWT by using the following example:

eyJhbGciOiJIUzI1NiIsImtpZCI6IjIwMjItMDUtMDEifQ.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkJyaWxpYW4gRmlyZGF1cyIsImlhdCI6MTY1MTQyMjM2NX0.qsg3HKPxM96PeeXl-sMrao00yOh1T0yQfZa-BsrtjHI

The JWT in the example is a JWT with the JWS compact serialization structure. To look at what's inside, we need to split the JWT by . (dot) and Base64URL-decode the parts.
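If you want to try the splitting and decoding yourself, here is a minimal sketch using only the JDK. The class name JwsDecodeSketch is made up for this example, and the code only decodes the token; it does not verify the signature.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class JwsDecodeSketch {

    // Base64URL-decodes one part of the token into a UTF-8 string.
    static String decodePart(String part) {
        return new String(Base64.getUrlDecoder().decode(part), StandardCharsets.UTF_8);
    }

    // Returns { headerJson, payloadJson }. The signature is left alone
    // because it is raw bytes, not JSON.
    static String[] decodeHeaderAndPayload(String compactJws) {
        String[] parts = compactJws.split("\\.", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "A compact JWS has 3 parts, got " + parts.length);
        }
        return new String[] { decodePart(parts[0]), decodePart(parts[1]) };
    }

    public static void main(String[] args) {
        String jwt = "eyJhbGciOiJIUzI1NiIsImtpZCI6IjIwMjItMDUtMDEifQ"
                + ".eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkJyaWxpYW4gRmlyZGF1cyIsImlhdCI6MTY1MTQyMjM2NX0"
                + ".qsg3HKPxM96PeeXl-sMrao00yOh1T0yQfZa-BsrtjHI";
        String[] decoded = decodeHeaderAndPayload(jwt);
        System.out.println(decoded[0]); // prints {"alg":"HS256","kid":"2022-05-01"}
        System.out.println(decoded[1]);
    }
}
```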

Breaking down the JWT

After splitting and decoding the JWT, we will get three parts:

JOSE (JSON Object Signing and Encryption) Header
Payload
Signature

JOSE Header

eyJhbGciOiJIUzI1NiIsImtpZCI6IjIwMjItMDUtMDEifQ

After Base64URL-decoding the string, we will get:

{
  "alg": "HS256",
  "kid": "2022-05-01"
}

This part of JWT is called the JOSE header. With the JOSE header, the JWT can inform the client how to handle the JWT.

Let's break down the two fields we have in our JWT:

alg: contains the information regarding the signing algorithm of the JWT.
kid: contains the information of the id of the key used for verifying the JWT. We will explore more about this in the JWK section.

alg is the only mandatory header, and it's the only one needed for most cases, but there are many more headers you can check here.

Payload

With the JOSE header out of the way, let's take a look at the second part, the payload.

eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkJyaWxpYW4gRmlyZGF1cyIsImlhdCI6MTY1MTQyMjM2NX0

After we decode it:

{
  "sub": "1234567890",
  "name": "Brilian Firdaus",
  "iat": 1651422365
}

This part of the JWT is called the payload. Its fields (sub, name, iat) are called JWT claims. So, now seems like a good time to talk about JWT claims.

There are three types of JWT claims:

Registered Claims
Public Claims
Private Claims

Let's break them down one by one, starting with registered claims. Registered claims are the claims defined in RFC 7519 itself, for example:

iss (Issuer): indicates who is the issuer of the JWT.
sub (Subject): identifies the subject of the JWT, usually the ID of the user the JWT was issued for.
aud (Audience): shows who's the intended consumer of the JWT.
exp (Expiration): the expiration time of the JWT.

You can check the complete list here. None of these claims are mandatory, as dictated in the RFC 7519, but they are essential in securing your JWT.
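To make the "essential in securing your JWT" part concrete, here is a rough sketch of how a consumer might check iss and exp after the signature has already been verified. The class name RegisteredClaimsCheck and the use of a plain Map for the decoded claims are assumptions for this example; a JWT library would normally do these checks for you.

```java
import java.time.Instant;
import java.util.Map;

final class RegisteredClaimsCheck {

    // Accepts the claims only if iss matches the expected issuer and
    // exp is present and still in the future.
    static boolean isAcceptable(Map<String, Object> claims, String expectedIssuer, Instant now) {
        Object iss = claims.get("iss");
        Object exp = claims.get("exp");
        if (!expectedIssuer.equals(iss)) {
            return false; // missing or unexpected issuer
        }
        if (!(exp instanceof Number)) {
            return false; // this sketch treats a missing exp as invalid
        }
        long expSeconds = ((Number) exp).longValue(); // exp is in seconds since the epoch
        return now.getEpochSecond() < expSeconds;
    }

    public static void main(String[] args) {
        Map<String, Object> claims = Map.of(
                "iss", "https://codecurated.com",
                "exp", 1653161066L);
        System.out.println(isAcceptable(claims, "https://codecurated.com", Instant.now()));
    }
}
```

Passing the current time in as a parameter is only there to keep the check easy to test; it is not required by the RFC.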

The next type of claim we'll explore is private claims. These claims can be anything. It's up to the JWT creator or consumer to determine the claims name and function.

Note: When specifying private claims, you need to be careful not to cause any collision in the name.

The last one is public claims, which are claims publicly registered in the IANA JSON Web Token Claims registry. You can check the list of public claims at: https://www.iana.org/assignments/jwt/jwt.xhtml#claims.

Most people don't have any use case needing them to register their claims. For best practice, you can search the list of Public Claims and use the one suitable for your use case.

Signature

If you've come to this part, you might wonder what makes the JWT secure since everyone can see its content. Well, the payload of the JWS is intended for everyone to read. What makes the JWT safe is that the consumer can verify who issued it.

The signature part of a JWT is created with an algorithm built on a hash function. If you're not familiar with hash functions, a hash function is an algorithm that maps input of any size to a fixed-size output. A hash function has two crucial traits that make it suitable for securing the JWT:

It only works one way: you can't recover the input from the output
The same input always produces the same output
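The second trait is easy to observe with a small sketch. It uses SHA-256 from the JDK's MessageDigest; the class name Sha256Demo is made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Sha256Demo {

    // Hashes the input with SHA-256 and returns the digest as lowercase hex.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc")); // same input, same output every time
        System.out.println(sha256Hex("abc"));
        System.out.println(sha256Hex("abd")); // one character changed, very different output
    }
}
```

The output is always 32 bytes (64 hex characters) no matter how long the input is, and there is no method to turn the digest back into the input.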

In this article, we'll explore two of the most common algorithms used to sign a JWT:

HMAC SHA-256
ECDSA256

Now, let's break apart the signature we use in the example:

qsg3HKPxM96PeeXl-sMrao00yOh1T0yQfZa-BsrtjHI

It was generated using the HS256 MAC algorithm with the secret key (in Base64) 7TgIAQCcYUA27bCI5+m7InRwp/mzQ+ArnFW/4c0Q51U=.

The HS256 MAC algorithm receives raw bytes as its secret key parameter and produces raw bytes as its output. So, unlike the header and the payload, we will get raw bytes instead of JSON if we try to decode the signature.
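To make this concrete, here is a minimal sketch of how an HS256 signature can be computed and checked with the JDK's javax.crypto.Mac. The class name Hs256Sketch is made up, and this is an illustration rather than a replacement for a JWT library. The MAC is computed over the first two parts joined by a dot, and the resulting bytes are Base64URL-encoded without padding.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class Hs256Sketch {

    // Computes the Base64URL-encoded HMAC SHA-256 of "header.payload".
    static String sign(String signingInput, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] tag = mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(tag);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recomputes the MAC with the same key and compares it to the third part.
    static boolean verify(String compactJws, byte[] secret) {
        int lastDot = compactJws.lastIndexOf('.');
        if (lastDot < 0) {
            return false;
        }
        String expected = sign(compactJws.substring(0, lastDot), secret);
        String actual = compactJws.substring(lastDot + 1);
        // MessageDigest.isEqual compares in constant time.
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.US_ASCII),
                actual.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        // Key and signing input taken from the example above.
        byte[] secret = Base64.getDecoder().decode("7TgIAQCcYUA27bCI5+m7InRwp/mzQ+ArnFW/4c0Q51U=");
        String signingInput = "eyJhbGciOiJIUzI1NiIsImtpZCI6IjIwMjItMDUtMDEifQ"
                + ".eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkJyaWxpYW4gRmlyZGF1cyIsImlhdCI6MTY1MTQyMjM2NX0";
        System.out.println(sign(signingInput, secret));
    }
}
```

You can run it and compare the printed value with the signature in the example. I would treat a match as something to check rather than a guarantee, since it depends on the key above being exactly the one used.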

Let's explore the digital signature or MAC algorithm more thoroughly.

HMAC SHA-256

The first algorithm we'll explore is HMAC SHA-256 (HS256), a MAC algorithm that combines a hash function with a symmetric key. A symmetric key means there is only one key, so the producer and the consumer of the JWT use the same key to sign and verify it. The advantage of this algorithm is that it doesn't need many CPU resources to create the signature.

Note: The minimum recommended length for the HS256 secret key is 32 bytes. The secret key must be generated with a cryptographically secure pseudorandom number generator to ensure its randomness.

HS256 Example

ECDSA-256

ECDSA-256 (ECDSA256, registered in JWA as ES256), unlike HMAC, is an algorithm that combines a hash function with an asymmetric key. An asymmetric key means we need to generate two keys. One is called the private key, which is used to sign the JWT (its holder can also verify the signature). The other is called the public key, which can only be used to verify the JWT signature.

Private and Public Key in ECDSA-256

As their name implies, a public key can be stored in a public space (usually as a JWK), so the client that needs to verify your JWT signature can get it quickly. In contrast, a private key must be secured and treated as a credential.
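The split between the two keys can be sketched with the JDK's own ECDSA support. The class name EcdsaSketch is made up for this example. One caveat: Java's SHA256withECDSA returns a DER-encoded signature, while JWS expects the raw R and S values concatenated, so this shows the sign/verify roles rather than a byte-exact ES256 JWS signature.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.ECGenParameterSpec;

final class EcdsaSketch {

    // Generates a key pair on the P-256 curve (named secp256r1 in the JDK).
    static KeyPair generateP256KeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
            generator.initialize(new ECGenParameterSpec("secp256r1"));
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Only the private key can produce a signature.
    static byte[] sign(String data, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withECDSA");
            signer.initSign(privateKey);
            signer.update(data.getBytes(StandardCharsets.US_ASCII));
            return signer.sign(); // DER-encoded, not the R||S format used by JWS
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The public key is enough to check a signature, but cannot create one.
    static boolean verify(String data, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(publicKey);
            verifier.update(data.getBytes(StandardCharsets.US_ASCII));
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keyPair = generateP256KeyPair();
        byte[] signature = sign("header.payload", keyPair.getPrivate());
        System.out.println(verify("header.payload", signature, keyPair.getPublic()));
        System.out.println(verify("header.tampered", signature, keyPair.getPublic()));
    }
}
```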

Which Hashing Function to Choose?

We have learned two algorithms, HS256 and ECDSA256, but when should you choose one over the other? You can decide easily by asking whether the producer and the consumer of the JWT are the same component.

If the producer and the consumer of the JWT are the same component, you can use the HS256 algorithm. The most common use case is an authentication system that both issues and checks its own JWTs.

Note: Some people consider HS256 a kind of anti-pattern because JWT is supposedly used to increase security, but most of the use cases that use this algorithm decrease it.

For example, suppose you plan to use a JWT as an authentication session instead of storing sessions in a database. In that case, your system is less secure because you can't revoke a JWT before it expires, so you can't kick sessions.

On the other hand, if the JWT is produced and consumed by different components, you can use ECDSA256. This way, you can secure your private key and ensure that no one else (even a public key holder) can create a JWT on your behalf.

Storing Public Key as JWKs (JSON Web Key Set)

If you plan to let the public consume and verify your JWT, it's recommended to host the public key as a JWKs on a URL. This way, if the consumer wants to verify your JWT, they can query a specific URL hosting the JWKs and get the public key.

{
  "keys": [
    {
      "kty": "EC",
      "kid": "2022-05-01",
      "x": "g_pYyqY7Htj8Aa989Ura0_mwRdqJPEnhknKzaUrztj8",
      "y": "MwOFYLE-VYre92hU0iDjNx36dk7cX6xdGgdgLIPt6Ts",
      "crv": "P-256"
    },
    {
      "kty": "EC",
      "kid": "2020-01-01",
      "x": "6bw04ZlSMjxVzC7gXv75XAposOVTONh45ZPR0AeYaoU",
      "y": "vYyCSIt0m5k4Q5A_uW8h3nEYJvgA8PgREErLcaiAHgQ",
      "crv": "P-256"
    }
  ]
}

You might've noticed more than one key in the JSON, which is why it's called JSON Web Key Set. If your product uses more than one key, you can host every key in the JWKs. To identify which key to use, you will need to add the kid or key id field in your JOSE Header.

{
    "alg":"ES256",
    "kid":"2022-05-01"
}

This way, the client will know which key to get by comparing the kid in the JWT's JOSE header to the one in the JWK. The other fields, combined, make up the public key.
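The lookup itself is simple. Here is a small sketch that assumes the JWKs has already been fetched and parsed into a list of maps; the class name JwksLookup and that in-memory representation are choices made for this example, and a real consumer would usually let its JWT library handle both the fetching and the matching.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class JwksLookup {

    // Returns the first key whose kid equals the kid from the JOSE header.
    static Optional<Map<String, String>> findByKid(List<Map<String, String>> keys, String kid) {
        return keys.stream()
                .filter(key -> kid.equals(key.get("kid")))
                .findFirst();
    }

    public static void main(String[] args) {
        // The two keys from the JWKs example above.
        List<Map<String, String>> keys = List.of(
                Map.of("kty", "EC", "kid", "2022-05-01", "crv", "P-256",
                        "x", "g_pYyqY7Htj8Aa989Ura0_mwRdqJPEnhknKzaUrztj8",
                        "y", "MwOFYLE-VYre92hU0iDjNx36dk7cX6xdGgdgLIPt6Ts"),
                Map.of("kty", "EC", "kid", "2020-01-01", "crv", "P-256",
                        "x", "6bw04ZlSMjxVzC7gXv75XAposOVTONh45ZPR0AeYaoU",
                        "y", "vYyCSIt0m5k4Q5A_uW8h3nEYJvgA8PgREErLcaiAHgQ"));

        // kid taken from the JOSE header {"alg":"ES256","kid":"2022-05-01"}
        Optional<Map<String, String>> match = findByKid(keys, "2022-05-01");
        System.out.println(match.map(key -> key.get("x")).orElse("no matching key"));
    }
}
```

If no key matches, the consumer should reject the JWT instead of falling back to some other key.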

Takeaway

In this article, we've learned that:

JWT is an abstract concept about how to allow one or more parties to exchange information securely. The implementation of JWT comes in the form of JWS or JWE.
The functional difference between JWS and JWE is that JWS allows everyone to see its payload, while JWE hides it by encrypting it.
What makes JWS secure even though everyone can see its payload is that the creator of the JWS can be verified through its signature, using a MAC or digital signature algorithm. This way, the consumer can be sure that the JWT was created by the intended party.
We've explored two types of algorithms, HS256 and ECDSA256. HS256 is suitable when the producer and the consumer of the JWT are the same components, while ECDSA256 is suitable when the producer and the consumer are different components.
We can use JSON Web Key Set to host a public key for hashing function with an asymmetric key. You need to set the kid field in the JOSE header of your JWT so the consumer can compare it to the one in the JWKs to get a compatible key.

Finally, I want to thank you for reading until the end. The next step you can take is to learn how to implement it (I intend to write about it too). There are many libraries for popular programming languages that you can check here.

Getting Started With Elasticsearch in Java Spring Boot

Brilian Firdaus — Wed, 31 Mar 2021 16:54:58 +0000

Both Java and Elasticsearch are popular technologies that companies use. Java is a programming language that was released back in 1996. Currently, Java is owned by Oracle and is still in active development.

Elasticsearch is a young technology compared to Java; it was only released in 2010 (14 years after Java). It’s gaining popularity quickly and is now used in many companies as a search engine.

Seeing how popular they are, I’m sure that many people and companies want to connect Java with Elasticsearch to develop their own search engine. In this article, I want to teach you how to connect Java Spring Boot 2 with Elasticsearch. We will learn how to create an API that will call Elasticsearch to produce results.

Connecting Java with Elasticsearch

The first thing we must do is connect our Spring Boot project with Elasticsearch. The easiest way to do this is to use the client library provided by Elasticsearch, which we can add through a package manager like Maven or Gradle.

For this article, we’ll use the spring-data-elasticsearch library provided by Spring Data, which also includes Elasticsearch’s high level client library.

Starting our project

Let’s start by creating our Spring Boot project with Spring Initializr. I’ll configure my project as in the picture below. Since we’re going to use the high-level client, we can use a convenient library provided by Spring, “Spring Data Elasticsearch”:

Adding dependency to Spring Data Elasticsearch

If you followed my Spring Initializr configuration in the previous section, then you should already have the Elasticsearch client dependency in your project. If you don’t, you can add:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

Creating Elasticsearch client's bean

There are two ways to initialize the bean: you can either use the beans defined in the Spring Data Elasticsearch library or create your own bean.

The first and easier one is to use the bean configured by Spring Data Elasticsearch.

For example, you can add these properties in your application.properties:

spring.elasticsearch.rest.uris=localhost:9200
spring.elasticsearch.rest.connection-timeout=1s
spring.elasticsearch.rest.read-timeout=1m
spring.elasticsearch.rest.password=
spring.elasticsearch.rest.username=

The second method is to create your own bean. You can configure the settings by creating a RestHighLevelClient bean. If the bean exists, Spring Data will use it as its configuration.

@Configuration
@RequiredArgsConstructor
public class ElasticsearchConfiguration extends AbstractElasticsearchConfiguration {

  private final ElasticsearchProperties elasticsearchProperties;

  @Override
  @Bean
  public RestHighLevelClient elasticsearchClient() {
    final ClientConfiguration clientConfiguration = ClientConfiguration.builder()
        .connectedTo(elasticsearchProperties.getHostAndPort())
        .withConnectTimeout(elasticsearchProperties.getConnectTimeout())
        .withSocketTimeout(elasticsearchProperties.getSocketTimeout())
        .build();

    return RestClients.create(clientConfiguration).rest();
  }
}

Testing the connection from our Spring Boot application to Elasticsearch

Your Spring Boot app and Elasticsearch should be connected now that you’ve configured the bean. Since we’re going to test the connection, make sure that your Elasticsearch is up and running!

To test it, we can add a bean to the DemoApplication class that creates an index in Elasticsearch. The class would look like:

@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }

    @Bean
    public boolean createTestIndex(RestHighLevelClient restHighLevelClient) throws Exception {
        try {
            DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest("hello-world");
            restHighLevelClient.indices().delete(deleteIndexRequest, RequestOptions.DEFAULT); // 1
        } catch (Exception ignored) {
        }

        CreateIndexRequest createIndexRequest = new CreateIndexRequest("hello-world");
        createIndexRequest.settings(
                Settings.builder().put("index.number_of_shards", 1)
                        .put("index.number_of_replicas", 0));
        restHighLevelClient.indices().create(createIndexRequest, RequestOptions.DEFAULT); // 2

        return true;
    }
}

Okay, in that code we called Elasticsearch twice with the RestHighLevelClient, which we will learn more about later in this article. The first call deletes the index if it already exists. We wrapped it in a try-catch because if the index doesn’t exist, Elasticsearch throws an error, which would fail our app’s startup.

The second call creates an index. Since I’m only running a single-node Elasticsearch, I configured the shards to be 1 and replicas to be 0.

If everything went fine, then you should see the indices when you check your Elasticsearch. To check it, just go to http://localhost:9200/_cat/indices?v and you can see the list of the indexes in your Elasticsearch:

health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open hello-world 0NgzXS5gRxmj1eFTPMCynQ 1 1 0 0 208b 208b

Congrats! You’ve just connected your application to Elasticsearch!

Other ways to connect

I recommend using the spring-data-elasticsearch library if you want to connect to Elasticsearch with Java. But in case you can’t use the library, there are other ways to connect your apps to Elasticsearch.

High level client

As we saw in the previous section, the spring-data-elasticsearch library we use also includes Elasticsearch’s high level client. If you’ve already imported spring-data-elasticsearch, then you can already use Elasticsearch’s high level client.

If you want to, it’s also possible to use the high level client library directly without the Spring Data dependency. You just need to add this dependency in your dependency manager:

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>8.0.0</version>
</dependency>

We’ll also use this client in our examples because it covers more of Elasticsearch’s functionality than spring-data-elasticsearch.

For more information, you can read Elasticsearch documentation.

Low level client

Elasticsearch also provides a low level client. It is harder to work with than the high level client, but it gives you more room for customization. To use it, add the following dependency:

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-client</artifactId>
    <version>8.0.0</version>
</dependency>

For more information, you can read Elasticsearch documentation about this.

Transport Client

Elasticsearch also provides a transport client, which makes your application identify itself as one of the nodes of the Elasticsearch cluster. I don’t recommend this method because the transport client is deprecated.

If you’re interested, you can read about transport client here.

REST call

The last way to connect to Elasticsearch is by making REST calls. Since Elasticsearch exposes a REST API to its clients, you can use any HTTP client to connect your apps to Elasticsearch, for example OkHttp, Feign or WebClient.

I also don’t recommend this method because it’s a hassle. Since Elasticsearch already provides client libraries, it’s better to use them instead. Only use this method if you don’t have any other way to connect.
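
If you do end up going this route, here is a minimal sketch of what a raw REST call could look like. It uses the JDK’s built-in java.net.http.HttpClient (Java 11+) instead of the libraries mentioned above, which is my own assumption to keep the example dependency-free. The class and method names are hypothetical, and the URL is the same _cat/indices endpoint we opened in the browser earlier.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of any Elasticsearch library.
final class ElasticsearchRestCallSketch {

    // Builds a GET request for the _cat/indices endpoint.
    static HttpRequest catIndicesRequest(String baseUrl) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/_cat/indices?v"))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    // Sends the request and returns the plain-text body.
    // This needs a running Elasticsearch node at baseUrl.
    static String fetchIndices(String baseUrl) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(catIndicesRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Calling ElasticsearchRestCallSketch.fetchIndices("http://localhost:9200") should return the same table we saw in the browser. Notice that everything else (JSON bodies, error handling, retries) would be on us, which is exactly why the client libraries are the more comfortable choice.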

Using Spring Data Elasticsearch

First, let’s learn how to use spring-data-elasticsearch in our Spring project. It is an easy-to-use, high level library for accessing Elasticsearch.

Creating entity and configuring our index

After we’re done connecting our apps to Elasticsearch, it’s time to create an entity! With spring data, we can add metadata to our entity, which will be read by the repository bean we create. This way the code is much cleaner and faster to develop, since we don’t need to write any mapping logic at the service level.

Let’s create an entity called Product:

@Data
@AllArgsConstructor
@NoArgsConstructor
@Builder
@Document(indexName = "product", shards = 1, replicas = 0, refreshInterval = "5s", createIndex = true)
public class Product {
    @Id
    private String id;

    @Field(type = FieldType.Text)
    private String name;

    @Field(type = FieldType.Keyword)
    private Category category;

    @Field(type = FieldType.Long)
    private double price;

    public enum Category {
        CLOTHES,
        ELECTRONICS,
        GAMES;
    }
}

So let me explain what’s going on in the code block above. First, I won’t explain @Data, @AllArgsConstructor, @NoArgsConstructor and @Builder. They’re annotations from the Lombok library that generate constructors, getters, setters, builders and other boilerplate. If you don’t know about them yet, I urge you to check them out.

Now, let’s talk about the first spring data annotation in the entity, @Document. The @Document annotation shows that the class is an entity containing the metadata of the Elasticsearch index’s setup. To use the spring data repository, which we’ll learn about later on, the @Document annotation is mandatory.

The only mandatory parameter of @Document is indexName. As the name suggests, we should fill it with the name of the index we want to use for the entity. In this article, we’ll use the same name as the entity, product.

The second parameter of @Document to talk about is createIndex. If you set createIndex to true, your app will create the index automatically on startup if the index doesn’t exist yet.

The shards, replicas and refreshInterval parameters determine the index settings when the index is created. If you change the value of those parameters after the index has been created, the new settings won’t be applied. In other words, the parameters are only used when the index is created for the first time.

If you want to use a custom id in Elasticsearch, you can use the @Id annotation. With @Id, spring data will tell Elasticsearch to store the id both as the document id and in the document source.

The @Field type determines the mapping of the field. Like shards, replicas and refreshInterval, the @Field type only affects Elasticsearch when the index is first created. If you add a new field or change a type after the index has been created, spring data won’t update the existing mapping.
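
If you’re curious how a library can turn annotations into a mapping, here is a small sketch of the idea using plain Java reflection. To be clear, this is not how spring data is implemented internally. The @MappedField annotation and the SketchProduct class are hypothetical stand-ins I made up for @Field and Product, just to show that annotation metadata can be read at runtime.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for Spring Data's @Field annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface MappedField {
    String type();
}

final class MappingSketch {

    // Hypothetical entity that mirrors the Product fields from this article.
    static final class SketchProduct {
        @MappedField(type = "text") String name;
        @MappedField(type = "keyword") String category;
        @MappedField(type = "long") long price;
    }

    // Reads the annotations reflectively and returns field name -> mapping type.
    static Map<String, String> mappingOf(Class<?> entity) {
        Map<String, String> mapping = new TreeMap<>();
        for (Field field : entity.getDeclaredFields()) {
            MappedField meta = field.getAnnotation(MappedField.class);
            if (meta != null) {
                mapping.put(field.getName(), meta.type());
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        // prints {category=keyword, name=text, price=long}
        System.out.println(mappingOf(SketchProduct.class));
    }
}
```

Since the annotations are only read when the mapping is built, it also makes sense that changing them later has no effect on an index that already exists.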

Now that we have configured the entity, let’s try out the automatic index creation by spring data! When createIndex is set to true, spring data checks whether the index exists in Elasticsearch. If it doesn’t, spring data creates the index with the configuration we defined in the entity.

Let’s start our app. Once it is running, let’s check the settings and see if they’re correct:

curl --request GET \
  --url http://localhost:9200/product/_settings

The result is:

{
  "product": {
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "refresh_interval": "5s",
        "number_of_shards": "1",
        "provided_name": "product",
        "creation_date": "1607959499342",
        "store": {
          "type": "fs"
        },
        "number_of_replicas": "0",
        "uuid": "iuoO8lE6QyWVSoECxa0I8w",
        "version": {
          "created": "7100099"
        }
      }
    }
  }
}

Everything is as we configured! The refresh_interval is set to 5s, the number_of_shards is 1 and number_of_replicas is 0.

Now let's check the mappings:

curl --request GET \
  --url http://localhost:9200/product/_mappings

The result is:

{
  "product": {
    "mappings": {
      "properties": {
        "category": {
          "type": "keyword"
        },
        "name": {
          "type": "text"
        },
        "price": {
          "type": "long"
        }
      }
    }
  }
}

The mappings are also as we expected. It’s the same as we configured in the entity class!

Basic CRUD with spring data repository interface

Now that we have the entity, we have everything we need to create a repository interface in Spring Boot. Let’s create a repository called ProductRepository. When you create the interface, make sure it extends ElasticsearchRepository<T, U>, where T is your entity type and U is the type you want to use for the document id. In our case, we’ll use the Product entity we created earlier as T and String as U.

public interface ProductRepository extends ElasticsearchRepository<Product, String> {

}

Now your repository interface is done. You don’t need to write the implementation because spring takes care of that. You can call every method declared in the interfaces that your repository extends.

For the examples of CRUD, you can check the codes below:

@Service
@RequiredArgsConstructor
public class SpringDataProductServiceImpl implements SpringDataProductService {

  private final ProductRepository productRepository;

  public Product createProduct(Product product) {
    return productRepository.save(product);
  }

  public Optional<Product> getProduct(String id) {
    return productRepository.findById(id);
  }

  public void deleteProduct(String id) {
    productRepository.deleteById(id);
  }

  public Iterable<Product> insertBulk(List<Product> products) {
    return productRepository.saveAll(products);
  }

}

In the code block above, we created a service class called SpringDataProductServiceImpl, which gets the ProductRepository we created before injected through its constructor.

There are 4 basic CRUD functions in it. The first one, createProduct, creates a new product in the product index, as its name suggests. The second one, getProduct, gets an indexed product by its id. The deleteProduct function deletes a product from the index by id. The insertBulk function lets you insert multiple products into Elasticsearch at once.

All done! I won’t write about API testing in this article because I want to focus on how our apps can interact with Elasticsearch. But if you want to try the API, I left a GitHub link at the end of the article so you can clone and try this project after you’re done reading.

Custom query methods in the spring data

In the previous section, we only took advantage of the basic methods that are already defined in the interfaces our repository extends. But we can also create custom query methods. What’s very convenient about spring data is that you can declare a method in the repository interface without coding any implementation. The spring data library reads the method name and automatically creates the implementation for it.

Let’s try searching for products by the name field:

public interface ProductRepository extends ElasticsearchRepository<Product, String> {

  List<Product> findAllByName(String name);
}

Yes, that’s all you need to do to create a function in spring data repository interface.
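
If this feels like magic, the sketch below shows the general idea with a plain JDK dynamic proxy. It is only an illustration under my own assumptions: ItemRepository and createRepository are hypothetical names, the data lives in an in-memory list instead of Elasticsearch, and the filter is a simple exact comparison rather than a real search query.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DerivedQuerySketch {

    // Hypothetical repository interface; note that no class implements it.
    interface ItemRepository {
        List<Map<String, String>> findAllByName(String name);
    }

    // Creates a runtime implementation that derives the filter from the method name.
    static ItemRepository createRepository(List<Map<String, String>> items) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            String prefix = "findAllBy";
            if (!method.getName().startsWith(prefix)) {
                throw new UnsupportedOperationException(method.getName());
            }
            // "findAllByName" -> property "name"
            String property = method.getName().substring(prefix.length()).toLowerCase();
            List<Map<String, String>> matches = new ArrayList<>();
            for (Map<String, String> item : items) {
                if (methodArgs[0].equals(item.get(property))) {
                    matches.add(item);
                }
            }
            return matches;
        };
        return (ItemRepository) Proxy.newProxyInstance(
                ItemRepository.class.getClassLoader(),
                new Class<?>[] {ItemRepository.class},
                handler);
    }

    public static void main(String[] args) {
        ItemRepository repository = createRepository(
                List.of(Map.of("name", "shirt"), Map.of("name", "phone")));
        // prints [{name=shirt}]
        System.out.println(repository.findAllByName("shirt"));
    }
}
```

The point is only that the method name carries enough information to build the query, so an implementation can be generated at runtime.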

You can also define a custom query with @Query annotation and insert a JSON query in the parameters.

public interface ProductRepository extends ElasticsearchRepository<Product, String> {

  List<Product> findAllByName(String name);

  @Query("{\"match\":{\"name\":\"?0\"}}")
  List<Product> findAllByNameUsingAnnotations(String name);
}

Both of the methods we’ve created search the name field with the given parameter; the second one spells out the match query explicitly. If you try them with our data, you’ll get the same results.
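
The ?0 in the @Query string is a placeholder for the first method parameter (?1 would be the second, and so on). As a rough mental model, you can picture the binding as a string substitution like the sketch below. The bind helper is hypothetical and much simpler than what spring data really does, so treat it as an illustration only.

```java
final class QueryPlaceholderSketch {

    // Replaces ?0, ?1, ... with the matching argument.
    // Backslashes and double quotes are escaped so the result stays valid JSON.
    static String bind(String template, String... args) {
        String result = template;
        // Go from the highest index down so "?1" never eats the start of "?10".
        for (int i = args.length - 1; i >= 0; i--) {
            String escaped = args[i].replace("\\", "\\\\").replace("\"", "\\\"");
            result = result.replace("?" + i, escaped);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {"match":{"name":"shirt"}}
        System.out.println(bind("{\"match\":{\"name\":\"?0\"}}", "shirt"));
    }
}
```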

Using ElasticsearchRestTemplate

If you want to do a more advanced query, like aggregations, highlighting or suggestions, you can use the ElasticsearchRestTemplate provided by the spring data library. With it, you can make your queries as complex as you want.

For example, let’s create a function for doing a match query to the name field like before:

  public List<Product> getProductsByName(String name) {
    Query query = new NativeSearchQueryBuilder()
        .withQuery(QueryBuilders.matchQuery("name", name))
        .build();
    SearchHits<Product> searchHits = elasticsearchRestTemplate.search(query, Product.class);

    return searchHits.get().map(SearchHit::getContent).collect(Collectors.toList());
  }

You’ll notice that the code above is more complex than the method we defined in the ElasticsearchRepository. It is recommended to use the spring data repository if you can. But for more advanced queries like aggregations, highlighting or suggestions, you must use the ElasticsearchRestTemplate.

For example, let’s write code that runs a terms aggregation on a field:

  public Map<String, Long> aggregateTerm(String term) {
    Query query = new NativeSearchQueryBuilder()
        .addAggregation(new TermsAggregationBuilder(term).field(term).size(10))
        .build();

    SearchHits<Product> searchHits = elasticsearchRestTemplate.search(query, Product.class);
    Map<String, Long> result = new HashMap<>();
    searchHits.getAggregations().asList().forEach(aggregation -> {
      ((Terms) aggregation).getBuckets()
          .forEach(bucket -> result.put(bucket.getKeyAsString(), bucket.getDocCount()));
    });

    return result;
  }
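
If you’ve never used a terms aggregation, it helps to picture what comes back: one bucket per distinct value, with the number of documents in that bucket. The sketch below computes the same kind of Map in plain Java over an in-memory list. It is only an analogy I’m adding for illustration (countByTerm is a hypothetical name), since the real aggregation runs inside Elasticsearch.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

final class TermsAggregationSketch {

    // Counts how often each value occurs, like bucket key -> doc count.
    static Map<String, Long> countByTerm(List<String> values) {
        return values.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // prints {CLOTHES=2, ELECTRONICS=1, GAMES=1}
        System.out.println(countByTerm(List.of("CLOTHES", "GAMES", "CLOTHES", "ELECTRONICS")));
    }
}
```

So if we called aggregateTerm("category") on an index with those four products, we would expect a result of the same shape.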

Elasticsearch RestHighLevelClient

If you’re not using spring, or your spring version doesn’t support spring-data-elasticsearch, you can use a Java library developed by Elasticsearch, RestHighLevelClient.

RestHighLevelClient is a library you can use for everything from basic things like CRUD to managing your Elasticsearch. Even though the name says high level, it’s actually more low level if you compare it to spring-data-elasticsearch.

The advantage of this library over spring data is that you can also manage your Elasticsearch with it. It exposes index and Elasticsearch configuration, which gives you more flexibility than spring data. It also covers more of the functions that interact with Elasticsearch. The disadvantage compared to spring data is that this library is more low level, which means you must write more code.

CRUD with RestHighLevelClient

Let’s see how we can write a simple create function with the library so we can compare it to the previous methods we’ve used:

@Service
@RequiredArgsConstructor
@Slf4j
public class HighLevelClientProductServiceImpl implements HighLevelClientProductService {

  private final RestHighLevelClient restHighLevelClient;
  private final ObjectMapper objectMapper;

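  public Product createProduct(Product product) {
    try {
      IndexRequest indexRequest = new IndexRequest("product");
      indexRequest.id(product.getId());
      // Serialize the entity to JSON first; source() can't take the entity object as-is.
      indexRequest.source(objectMapper.writeValueAsString(product), XContentType.JSON);

      IndexResponse indexResponse = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
      // The status is CREATED for a new document and OK for an overwritten one.
      if (indexResponse.status() == RestStatus.CREATED || indexResponse.status() == RestStatus.OK) {
        return product;
      }

      throw new RuntimeException("Wrong status: " + indexResponse.status());
    } catch (Exception e) {
      log.error("Error indexing, product: {}", product, e);
      return null;
    }
  }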


}

As you can see, it’s now more complicated and harder to implement. You need to handle the exceptions and also convert between your entity and JSON yourself. It’s recommended to use spring data instead for basic CRUD because RestHighLevelClient is more complicated.

I’ve included the other CRUD functions in the GitHub project. If you’re interested, you can check it out. The link is at the end of this article.

Index Creation

This is where RestHighLevelClient holds a clear advantage over spring data elasticsearch. When we created an index with its mappings and settings in the previous section, we only used annotations. That’s very easy to do, but you can’t do much with it.

With RestHighLevelClient, you can write methods for index management, or for almost anything else that the Elasticsearch REST API allows.

For example, let’s write code that creates the product index with the settings and mappings we used before:

  public boolean createProductIndex() {
    CreateIndexRequest createIndexRequest = new CreateIndexRequest("product");
    createIndexRequest.settings(Settings.builder()
        .put("number_of_shards", 1)
        .put("number_of_replicas", 0)
        .put("index.requests.cache.enable", false)
        .build());

    Map<String, Map<String, String>> mappings = new HashMap<>();
    mappings.put("name", Collections.singletonMap("type", "text"));
    mappings.put("category", Collections.singletonMap("type", "keyword"));
    mappings.put("price", Collections.singletonMap("type", "long"));
    createIndexRequest.mapping(Collections.singletonMap("properties", mappings));

    try {
      CreateIndexResponse createIndexResponse = restHighLevelClient.indices()
          .create(createIndexRequest, RequestOptions.DEFAULT);
      return createIndexResponse.isAcknowledged();
    } catch (Exception e) {
      e.printStackTrace();
    }
    return false;
  }

So let’s see what we did in the code:

We initialized the createIndexRequest and set the index name.
We added the settings to the request by calling createIndexRequest.settings. In the settings, we also configured index.requests.cache.enable, which is not possible with the @Document annotation parameters we used before.
We made a Map containing the properties and mappings of the fields in the index.
We called Elasticsearch with restHighLevelClient.indices().create.

As you can see, with RestHighLevelClient we can make a more customized index creation call to Elasticsearch than with the annotations in the spring data entity. There are also many more functions in RestHighLevelClient that don’t exist in the spring data library. You can read Elasticsearch’s documentation for more information about the library.

Conclusion

In this article, we’ve learned two ways to connect to Elasticsearch: spring data and the Elasticsearch client. Both are powerful libraries, but you should stick to spring data if it covers your use case. The code with spring data elasticsearch is more readable and easier to use.

If you want a more powerful library that can do basically anything Elasticsearch allows, though, you can use the Elasticsearch high level client. You can also use the low level client, which we didn’t cover in this article, if you need even more control.

I’d also like to say that this article is meant to help you get started with Elasticsearch in Java Spring Boot. If you want to learn more about the libraries, you can check out the spring data elasticsearch documentation and Elasticsearch’s high level client documentation.

Finally, thank you for reading until the end!

Previously published at Code Curated!

Create a Simple Autocomplete With Elasticsearch

Brilian Firdaus — Tue, 22 Dec 2020 05:17:47 +0000

Autocomplete is a feature to predict the rest of a word a user is typing. It is an important feature to implement that can improve the user’s experience of your product.

Creating an autocomplete might sound daunting at first if you’ve never created one. But with the help of the features in Elasticsearch, it’s actually a simple thing to do.

Things You Should Know

If you have little knowledge of Elasticsearch, I suggest that you read my other articles first. This isn’t required, but knowing how an analyzer and a text field work will definitely help you understand this article.

The article “Basics of Elasticsearch for Developer” will introduce you to Elasticsearch. The article “Elasticsearch: Text vs. Keyword” will teach you the difference between text and keyword in Elasticsearch and also will explain how Elasticsearch’s analyzer works.

Setup

Creating the index

First, let’s create an index called autocomplete-example. We will use this index for the examples in this article.

Request:

curl --request PUT \
  --url http://localhost:9200/autocomplete-example/ \
  --header 'content-type: application/json'

Response:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "autocomplete-example"
}

Defining a mapping

Before indexing a document, let’s first define a mapping. We will only need one field, simple_autocomplete, with field data type text and will use a standard analyzer.

Since Elasticsearch uses the standard analyzer as default, we need not define it in the mapping.

Request:

curl --request PUT \
  --url http://localhost:9200/autocomplete-example/_mapping \
  --header 'content-type: application/json' \
  --data '{
 "properties": {
  "simple_autocomplete" : {
   "type":"text"
  }
 }
}'

Response:

{
  "acknowledged": true
}

Indexing a document

Let’s index a document. For the examples in this article, we will only need one document, containing the text “Hong Kong.”

Request:

curl --request POST \
  --url http://localhost:9200/autocomplete-example/_doc \
  --header 'content-type: application/json' \
  --data '{
 "simple_autocomplete": "Hong Kong"
}'

Response:

{
  "_index": "autocomplete-example",
  "_type": "_doc",
  "_id": "aFAbznQBPNT8JhPaDhND",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}

Querying the Index With match Query

Let’s start with the query that we normally use, the match query.

The standard analyzer lowercases your indexed text and splits it into tokens on word boundaries (such as whitespace and punctuation) before storing it in the inverted index.

By default, the match query analyzes the search text with the same analyzer as the field it searches, which here is the standard analyzer.

Let’s see how our “Hong Kong” text looks in the inverted index with the API provided by the Elasticsearch:

Request:

curl --request GET \
  --url 'http://localhost:9200/_analyze?pretty=' \
  --header 'content-type: application/json' \
  --data '{
 "analyzer": "standard",
 "text": "Hong Kong"
}'

Response:

{
  "tokens": [
    {
      "token": "hong",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "kong",
      "start_offset": 5,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
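
To get a feel for what the analyzer does, here is a rough approximation in plain Java. It is only a sketch under my own assumptions: the real standard analyzer follows the Unicode text segmentation rules, while this hypothetical analyze method just lowercases the text and splits on anything that is not a letter or a digit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class StandardLikeAnalyzerSketch {

    // Lowercases the text and splits it on anything that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [hong, kong]
        System.out.println(analyze("Hong Kong"));
    }
}
```

For our simple text, the output lines up with the two tokens in the response above.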

When we do a search query to the index with match query, we will only get a result when we type text containing either “Hong” or “Kong.” This is because Elasticsearch only returns a result when the analyzed query is an exact match with a token in the inverted index.

Request:

curl --request POST \
  --url 'http://localhost:9200/autocomplete-example/_doc/_search?pretty=' \
  --header 'content-type: application/json' \
  --data '{
 "query": {
  "match": {
   "simple_autocomplete": "Hong"
  }
 }
}'

Response:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "autocomplete-example",
        "_type": "_doc",
        "_id": "aFAbznQBPNT8JhPaDhND",
        "_score": 0.5753642,
        "_source": {
          "simple_autocomplete": "Hong Kong"
        }
      }
    ]
  }
}

If the user types “Ho” or “Kon” or “Hon Kon,” Elasticsearch won’t return any hits.

For an autocomplete, that isn’t very helpful to the user, right? At the very least, an autocomplete needs to show something even if we haven’t typed the full word.

Request:

curl --request POST \
  --url 'http://localhost:9200/autocomplete-example/_doc/_search?pretty=' \
  --header 'content-type: application/json' \
  --data '{
 "query": {
  "match": {
   "simple_autocomplete": "Hon"
  }
 }
}'

Response:

{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
    ]
  }
}
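
The behavior above is easier to see with a toy inverted index. The sketch below is my own simplified model, not Elasticsearch code: InvertedIndexSketch is a hypothetical class that stores lowercase tokens and only finds a document when a query token equals a stored token exactly.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class InvertedIndexSketch {

    // token -> ids of the documents that contain the token
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    private static String[] tokens(String text) {
        return text.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    void index(int docId, String text) {
        for (String token : tokens(text)) {
            postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
        }
    }

    // Mimics a match query: a document is a hit if it contains
    // at least one of the query tokens exactly.
    Set<Integer> match(String query) {
        Set<Integer> hits = new TreeSet<>();
        for (String token : tokens(query)) {
            hits.addAll(postings.getOrDefault(token, Set.of()));
        }
        return hits;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.index(1, "Hong Kong");
        System.out.println(index.match("Hong")); // [1]
        System.out.println(index.match("Hon"));  // []
    }
}
```

“Hon” is simply not a key in the token map, so there is nothing to return.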

To fix it, we can use a match_phrase_prefix query provided by Elasticsearch.

Using match_phrase_prefix Query

The match_phrase_prefix query allows the user to get a result without typing the whole word. With the usual match query, we won’t get any result from Elasticsearch if we type “Hon” or “Kon,” but with match_phrase_prefix, we can.

Request:

curl --request POST \
  --url 'http://localhost:9200/autocomplete-example/_doc/_search?pretty=' \
  --header 'content-type: application/json' \
  --data '{
 "query": {
  "match_phrase_prefix": {
   "simple_autocomplete": {
    "query": "Hon"
   }
  }
 }
}'


curl --request POST \
  --url 'http://localhost:9200/autocomplete-example/_doc/_search?pretty=' \
  --header 'content-type: application/json' \
  --data '{
 "query": {
  "match_phrase_prefix": {
   "simple_autocomplete": {
    "query": "Kon"
   }
  }
 }
}'

Response:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "autocomplete-example",
        "_type": "_doc",
        "_id": "aFAbznQBPNT8JhPaDhND",
        "_score": 0.2876821,
        "_source": {
          "simple_autocomplete": "Hong Kong"
        }
      }
    ]
  }
}

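There is still a shortcoming of this autocomplete: If the user types “Hon Kon,” it still won’t return any result. This is because match_phrase_prefix treats only the last term as a prefix. Every term before it must match a token exactly, and “hon” is not a token of “Hong Kong.”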

Request:

curl --request POST \
  --url 'http://localhost:9200/autocomplete-example/_doc/_search?pretty=' \
  --header 'content-type: application/json' \
  --data '{
 "query": {
  "match_phrase_prefix": {
   "simple_autocomplete": {
    "query": "Hon Kon"
   }
  }
 }
}'

Response:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
    ]
  }
}
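
The rule behind these results can be sketched in a few lines of Java. This is a simplified model I’m adding for illustration, with a hypothetical matchPhrasePrefix method. It ignores options of the real query such as slop and max_expansions, and it splits on whitespace instead of using a real analyzer.

```java
import java.util.Locale;

final class PhrasePrefixSketch {

    private static String[] tokens(String text) {
        return text.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    // All query tokens except the last must equal consecutive document tokens;
    // the last query token only has to be a prefix of the next document token.
    static boolean matchPhrasePrefix(String document, String query) {
        String[] doc = tokens(document);
        String[] q = tokens(query);
        for (int start = 0; start + q.length <= doc.length; start++) {
            boolean matches = true;
            for (int i = 0; i < q.length - 1 && matches; i++) {
                matches = doc[start + i].equals(q[i]);
            }
            if (matches && doc[start + q.length - 1].startsWith(q[q.length - 1])) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchPhrasePrefix("Hong Kong", "Hon"));     // true
        System.out.println(matchPhrasePrefix("Hong Kong", "Kon"));     // true
        System.out.println(matchPhrasePrefix("Hong Kong", "Hon Kon")); // false
    }
}
```

With this model, “Hong Kon” would match because “hong” is an exact token and “kon” is a prefix of “kong”, while “Hon Kon” fails on its first term.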

The Pros and Cons

An autocomplete with a text field data type and the standard analyzer is very simple, but it has pros and cons that you can consider before using this type of autocomplete.

Pros

Little to no setup : You don’t even have to define a mapping because by default, if you index a text value into Elasticsearch, it is mapped as a text field with a keyword sub-field.
Fast index time : Because this type of autocomplete uses the standard analyzer, your text needs little processing before it is saved to the inverted index, which translates to fast index time.
Enough most of the time : Most of the time, you don’t need a complex autocomplete. This autocomplete type will be enough.

Cons

Can’t handle typos : This type of autocomplete can’t handle typos, so if the user types one wrong word, it won’t return any result.
The query can’t start from the middle of a word : The text queried with this type of autocomplete can’t start from the middle of a word. In the previous example of “Hong Kong,” if we do a query with the text “ong kong,” Elasticsearch won’t return anything.
Can’t handle a missing space character : If we had mistakenly typed “HongKong” in the previous example, Elasticsearch wouldn’t have returned anything with this type of autocomplete.

When to Use

I recommend an autocomplete with only the standard analyzer when you only need a simple autocomplete. You can also use this type of autocomplete if the index you want to create an autocomplete of is already in production and indexed with documents. Since this autocomplete uses the default analyzer and default mapping for text, it will work for most text documents.

Conclusion

Creating an autocomplete with the text field data type and standard analyzer is the simplest and easiest autocomplete that we can build with Elasticsearch. It requires almost no setup and can usually create an autocomplete for an existing index.

Even if it’s enough for most use cases, it still has many weaknesses because it can only handle simple queries. To overcome that, we can use a custom-defined analyzer or the Suggesters feature in Elasticsearch, which I plan to write about. Please wait for it!

Finally, I want to thank you for reading this article until the end. I hope it will help you with your project.

References

https://opster.com/elasticsearch-glossary/elasticsearch-auto-complete-guide/

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query-phrase-prefix.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-standard-analyzer.html

Functional Programming in Java, Explained

Brilian Firdaus — Tue, 15 Dec 2020 11:13:02 +0000

If you’re a Java developer, I’m sure that you have seen code similar to the featured image snippet above at least once. The code in the snippet above is an example of functional programming paradigm implementation in Java, which will filter and transform the List<String> in the request to another List<String>.

In this article, I will write about how to write code using Java’s API for functional programming. In the end, we will write our own stream API so we can understand how to implement a functional programming style in Java.

Functional Programming in Java

Functional programming in Java has been around for a long time. When Oracle released Java 8 back in 2014, they introduced lambda expressions, which are the core feature for functional programming in Java.

Let’s see an example of the difference between using a sequence of imperative statements and using a functional style in Java.

        List<String> stringList = Arrays.asList("Hello", "World", "How", "Are", "You", "Today");

        // imperative declaration
        List<String> filteredList = new ArrayList<>();

        for (String string: stringList) {
            if (string.equals("Hello") || string.equals("Are")) {
                filteredList.add(string);
            }
        }

        List<String> mappedList = new ArrayList<>();
        for (String string: filteredList) {
            mappedList.add(string + " String");
        }

        for (String string: mappedList) {
            System.out.println(string);
        }

Imperative Style


        List<String> stringList = Arrays.asList("Hello", "World", "How", "Are", "You", "Today");

        //functional style
        stringList.stream()
                .filter(s -> s.equals("Hello") || s.equals("Are"))
                .map(s -> s + " String")
                .forEach(System.out::println);

Functional Style

As we can see, even though both pieces of code achieve the same result, the difference is significant. The imperative code has many curly braces and is much longer, which makes it harder to read than the functional style code.

Functional Interface Annotation

To understand how functional programming works in Java, first we will need to look at the annotation included in Java 8 SDK, @FunctionalInterface. We can look at it on the Java API documentation site.

From the API documentation, we can see that the behaviors of a functional interface annotation in Java are:

It has exactly one abstract method in it.
It can have more than one method, as long as there is only one abstract method.
We can only add it to an interface type.
We can create an instance of a functional interface with a lambda expression, a method reference, or a constructor reference.
We don’t need to define @FunctionalInterface because the compiler will treat any interface meeting the definition of a functional interface as a functional interface.
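
To make the list above a bit more concrete, here is a small sketch. Transformer and shout are hypothetical names I made up for this example. It shows an interface with one abstract method plus a default method, created in the three ways mentioned above.

```java
import java.util.Locale;

final class FunctionalInterfaceSketch {

    // Hypothetical functional interface: exactly one abstract method.
    @FunctionalInterface
    interface Transformer {
        String apply(String input);

        // A default method does not count as an abstract method.
        default String describe() {
            return "Transformer";
        }
    }

    static String shout(String input) {
        return input.toUpperCase(Locale.ROOT) + "!";
    }

    public static void main(String[] args) {
        Transformer fromLambda = s -> s + " String";                  // lambda expression
        Transformer fromMethodRef = FunctionalInterfaceSketch::shout; // method reference
        Transformer fromConstructorRef = String::new;                 // constructor reference

        System.out.println(fromLambda.apply("Hello"));         // Hello String
        System.out.println(fromMethodRef.apply("Hello"));      // HELLO!
        System.out.println(fromConstructorRef.apply("Hello")); // Hello
        System.out.println(fromLambda.describe());             // Transformer
    }
}
```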

Creating a Functional Interface Class

Now that we know what a functional interface is all about, we can create one ourselves.

Let’s first create a model called Person.

package com.example.functional.programming.model;

public class Person {

    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Person(String name) {
        this.name = name;
    }

    public static Person createClassExampleFromMethodReference(String name) {
        return new Person(name);
    }
}

For the functional interface, we’ll create the PersonFunctionalInterface interface.


package com.example.functional.programming.intf;

import com.example.functional.programming.model.Person;

@FunctionalInterface
public interface PersonFunctionalInterface {

    Person createPerson(String name);

    default String getDefaultMethodString() {
        return "Default Method";
    }
}

Note that there are two methods in the interface, but since there is only one abstract method, PersonFunctionalInterface is valid as a functional interface.

But suppose we define more than one abstract method, like so:

package com.example.functional.programming.intf;

import com.example.functional.programming.model.Person;

@FunctionalInterface
public interface PersonFunctionalInterface {

    Person createPerson(String name);

    String mapStringToObject(String str);

    default String getDefaultMethodString() {
        return "Default Method";
    }
}

It will produce an error:

[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /D:/Project/functional/src/main/java/com/example/functional/programming/intf/PersonFunctionalInterface.java:[5,1] Unexpected @FunctionalInterface annotation
  com.example.functional.programming.intf.PersonFunctionalInterface is not a functional interface
    multiple non-overriding abstract methods found in interface com.example.functional.programming.intf.PersonFunctionalInterface
[INFO] 1 error
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 5.105 s
[INFO] Finished at: 2020-09-19T10:34:45+07:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (default-compile) on project functional-programming: Compilation failure
[ERROR] /D:/Project/functional/src/main/java/com/example/functional/programming/intf/PersonFunctionalInterface.java:[5,1] Unexpected @FunctionalInterface annotation
[ERROR] com.example.functional.programming.intf.PersonFunctionalInterface is not a functional interface
[ERROR] multiple non-overriding abstract methods found in interface com.example.functional.programming.intf.PersonFunctionalInterface

Using a Functional Interface

Anonymous class

Let’s first learn about the anonymous class. Java documentation says that:

“Anonymous classes enable you to make your code more concise. They enable you to declare and instantiate a class at the same time. They are like local classes except that they do not have a name. Use them if you need to use a local class only once.”

Basically, with an anonymous class, we don’t have to define a class that implements the interface we made. We can create a class without a name and store it in a variable.

Let’s declare an anonymous class as an example.


    @Test
    void declareAnonymousClass() {
        PersonFunctionalInterface anonClassExample = new PersonFunctionalInterface() {
            @Override
            public Person createPerson(String name) {
                return new Person(name);
            }
        };

        assert (anonClassExample.createPerson("Hello, World").getName().equals("Hello, World"));
    }

What we’ve done here is create an anonymous class that implements PersonFunctionalInterface and store the instance in a variable named anonClassExample.

We override the createPerson abstract method so that, when we call it, it returns a new Person object with the given name.

When we call anonClassExample.createPerson("Hello, World"), we simply create a new Person object with "Hello, World" as its name.

Creating an Anonymous Class With a Functional Interface

We can start creating the anonymous class of PersonFunctionalInterface for the functional interface we made.

    @Test
    void interfaceExample() {
        PersonFunctionalInterface normalAnonymousClass = new PersonFunctionalInterface() { // create normal anonymous class
            @Override
            public Person createPerson(String name) {
                return new Person(name);
            }
        };

        PersonFunctionalInterface interfaceExampleLambda = 
                name -> new Person(name); // create anonymous class by lambda
        PersonFunctionalInterface interfaceExampleMethodReference = 
                Person::createClassExampleFromMethodReference; // create anonymous class by method reference
        PersonFunctionalInterface interfaceExampleConstructorReference = 
                Person::new; // create anonymous class by constructor reference

        // assert that every implementation behaves the same
        assert(normalAnonymousClass
                .createPerson("Hello, World").getName().equals("Hello, World"));
        assert(interfaceExampleLambda
                .createPerson("Hello, World").getName().equals("Hello, World"));
        assert(interfaceExampleMethodReference
                .createPerson("Hello, World").getName().equals("Hello, World"));
        assert(interfaceExampleConstructorReference
                .createPerson("Hello, World").getName().equals("Hello, World"));
        assert(normalAnonymousClass.getDefaultMethodString().equals("Default Method"));
        assert(interfaceExampleLambda.getDefaultMethodString().equals("Default Method"));
        assert(interfaceExampleMethodReference.getDefaultMethodString().equals("Default Method"));
        assert(interfaceExampleConstructorReference.getDefaultMethodString().equals("Default Method"));
    }

We’ve just implemented the functional interface!

In the code above, we implemented the interface in four different ways: a normal anonymous class, a lambda expression, a method reference, and a constructor reference. The last three shorter forms are only available because PersonFunctionalInterface is a functional interface, that is, it has exactly one abstract method.

To make sure we created anonymous classes that behave the same, we assert every method in the interface.

Built-In Functional Interface in Java 8

Java 8 has many built-in functional interface classes in the java.util.function package that we can see in its documentation.

In this article, I will only explain four of the most commonly used functional interfaces, but if you’re interested in more, you can read it in the Java API documentation noted above.

Consumer<T>: A functional interface that accepts an object and returns nothing.
Supplier<T>: A functional interface that accepts nothing and returns an object.
Predicate<T>: A functional interface that accepts an object and returns a boolean.
Function<T, R>: A functional interface that accepts an object and returns another object.
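
To make the four shapes concrete, here is a minimal sketch that uses each of them once. The class name BuiltInInterfacesSketch and the constants are made up for illustration; only the java.util.function types come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical demo class, not part of the article's repository.
class BuiltInInterfacesSketch {

    // Supplier<T>: accepts nothing, returns a value.
    static final Supplier<String> GREETING = () -> "Hello";

    // Predicate<T>: accepts a value, returns a boolean.
    static final Predicate<String> IS_SHORT = s -> s.length() <= 5;

    // Function<T, R>: accepts a value of one type, returns a value of another.
    static final Function<String, Integer> LENGTH = String::length;

    public static void main(String[] args) {
        List<String> sink = new ArrayList<>();

        // Consumer<T>: accepts a value and returns nothing; it works through side effects.
        Consumer<String> addToSink = sink::add;

        String greeting = GREETING.get();
        if (IS_SHORT.test(greeting)) {
            addToSink.accept(greeting + " has length " + LENGTH.apply(greeting));
        }
        System.out.println(sink); // prints [Hello has length 5]
    }
}
```

Each interface has a single abstract method (get, test, apply, and accept respectively), which is what lets us write them as lambdas or method references.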

Common Usage

If you’ve been developing with Java a lot, then it’s likely you’ve met the concept of functional interface already.

Stream and optional API

Java’s Stream API uses functional interfaces a lot, as we can see in the code below.

    @Test
    void commonFunctionalInterface() {
        Stream.of("Hello", "World", "How", "Are", "you")
                .filter(s -> s.equals("Hello") || s.equals("Are"))
                .map(s -> s + " String")
                .forEach(System.out::println);

        Optional.of("Hello")
                .filter(s -> s.equals("Hello") || s.equals("Are"))
                .map(s -> s + " String")
                .ifPresent(System.out::println);
    }

The filter method takes a Predicate<T> functional interface as its parameter. As we can see, the lambda accepts a String and produces a boolean.

The map method uses Function<T, R> as its parameter. Here it accepts a String and also returns a String.

The forEach method in Stream and ifPresent method in Optional accept Consumer<T>, accepting a String and not returning anything.

Reactive library

Both of the most popular Java reactive libraries, RxJava and Reactor, expose stream-like operators (map, filter, and so on) in the same style as the Java 8 Stream API, which means they also use functional interfaces heavily in their code.

If we look at Reactor’s Flux API documentation and RxJava’s Observable API documentation, we can see many of their methods accept a functional interface.

Creating Our Own Stream API

Now that we know how to create and use a functional interface, let’s try creating our own streaming API so we can understand how we can implement the functional interface.

Of course, our streaming API is much simpler than Java’s.

package com.example.functional.intf;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

public class SimpleStream<T> {

    private List<T> values;

    public SimpleStream(T... values) {
        this.values = Arrays.asList(values);
    }

    public SimpleStream(List<T> values) {
        this.values = values;
    }

    public SimpleStream<T> filter(Predicate<T> filter) {
        List<T> returnValueList = new ArrayList<>();
        for (T value : values) {
            if (filter.test(value)) {
                returnValueList.add(value);
            }
        }
        this.values = returnValueList;
        return this;
    }

    public SimpleStream<T> map(Function<T, T> function) {
        List<T> returnValueList = new ArrayList<>();
        for (T value : values) {
            returnValueList.add(function.apply(value));
        }
        this.values = returnValueList;
        return this;
    }

    public void forEach(Consumer<T> consumer) {
        for (T value : values) {
            consumer.accept(value);
        }
    }

    public List<T> toList() {
        return this.values;
    }

}

And test class:

    @Test
    void implementingFunctionalInterface() {
        List<String> stringsFromSimpleStream = new SimpleStream<>("Hello", "World", "How", "Are", "you")
                .filter(s -> s.equals("Hello") || s.equals("Are"))
                .map(s -> s + " String")
                .toList();

        assert(stringsFromSimpleStream.size() == 2);
        assert(stringsFromSimpleStream.get(0).equals("Hello String"));
        assert(stringsFromSimpleStream.get(1).equals("Are String"));

        new SimpleStream<>(stringsFromSimpleStream)
                .forEach(System.out::println);
    }

Okay, let’s discuss the methods one by one.

Constructor

We made two constructors, one constructor imitating the Stream.of() API and one constructor to convert List<T> to SimpleStream<T>.

Filter

In this method, we accept Predicate<T> as a parameter because Predicate<T> has a single abstract method named test that accepts an object and produces a boolean.

Let’s look at the test class, where we wrote:

.filter(s -> s.equals("Hello") || s.equals("Are"))

This means we wrote an anonymous class implementing Predicate<T>:


Predicate<String> filter = new Predicate<String>() {
            @Override
            public boolean test(String s) {
                return s.equals("Hello") || s.equals("Are");
            }
        };

So, with the lambda inlined, the filter method in the SimpleStream<T> class effectively behaves like this:


    public SimpleStream<T> filter(Predicate<T> filter) {
        List<T> returnValueList = new ArrayList<>();
        for (T value : values) {
            if (value.equals("Hello") || value.equals("Are")) {
                returnValueList.add(value);
            }
        }
        this.values = returnValueList;
        return this;
    }

Map

In the map method, we accept a Function as its parameter (Function<T, T> in our simplified class, so the input and output types are the same), which means the map method accepts a functional interface that takes an object and also produces an object.

We wrote the following in the test class:

.map(s -> s + " String")

It’s the same as creating an anonymous class implementing Function<T, R>:


        Function<String, String> map = new Function<String, String>() {
            @Override
            public String apply(String s) {
                return s + " String";
            }
        };

And with the lambda inlined, the map method in the SimpleStream<T> class effectively behaves like this (with T being String):

    public SimpleStream<T> map(Function<T, T> function) {
        List<T> returnValueList = new ArrayList<>();
        for (T value : values) {
            returnValueList.add(value + " String");
        }
        this.values = returnValueList;
        return this;
    }
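
Our simplified map keeps the element type unchanged. If we wanted it to behave more like Java's own Stream.map, which can change the element type, one possible sketch is shown below. TypedSimpleStream is a hypothetical variant written for this illustration, not the class from the repository, and it returns a new stream instead of mutating the current one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical variant of SimpleStream whose map can change the element type.
class TypedSimpleStream<T> {

    private final List<T> values;

    TypedSimpleStream(List<T> values) {
        this.values = values;
    }

    // R is a second type parameter, so the output type may differ from T.
    <R> TypedSimpleStream<R> map(Function<T, R> function) {
        List<R> mapped = new ArrayList<>();
        for (T value : values) {
            mapped.add(function.apply(value));
        }
        // Return a new stream rather than overwriting this one.
        return new TypedSimpleStream<>(mapped);
    }

    List<T> toList() {
        return values;
    }

    public static void main(String[] args) {
        List<Integer> lengths = new TypedSimpleStream<String>(Arrays.asList("Hello", "Are"))
                .map(String::length)
                .toList();
        System.out.println(lengths); // prints [5, 3]
    }
}
```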

forEach

The forEach method accepts Consumer<T> as its parameter, meaning that it will accept an object and return nothing.

We wrote the following in the test class:

.forEach(System.out::println);

This translates to creating an anonymous class implementing Consumer<T>:


Consumer<String> forEach = new Consumer<String>() {
            @Override
            public void accept(String s) {
                System.out.println(s);
            }
        };

In SimpleStream<T>, with the method reference inlined, the forEach method effectively behaves as below:


public void forEach(Consumer<T> consumer) {
        for (T value : values) {
            System.out.println(value);
        }
    }

Conclusion

With the release of Java 8 back in 2014, we can use a functional programming style in Java. It has many benefits, one of which is making your code shorter and more readable. Given those benefits, understanding how functional programming is implemented in Java is well worth it for any Java developer!

Thanks for reading this article!

You can find the GitHub repository used for this article here:
(https://github.com/brilianfird/java-functional-programming)


Avoiding the Null Pointer Exception With Optional in Java

Brilian Firdaus — Tue, 08 Dec 2020 14:03:58 +0000

In 1965, British computer scientist Tony Hoare invented the null reference.

Null references were later adopted by many programming languages, including C, C++, C#, JavaScript, Java, and more, and the errors they cause (the Null Pointer Exception in Java) have become one of the most common sources of bugs in production.

The loss of financial resources, time, and human resources to fix it prompted Hoare to call it a “billion-dollar mistake.”

Java is one of the programming languages that implement Null Pointer References. If you’ve been developing with Java, I’m sure that you’ve seen them a lot. It doesn’t matter if you are new to Java or have ten years of experience. There is always a chance that you’ll encounter a Null Pointer Exception bug.

Optional in Java

Optional is an API that was introduced in Java 8. If used right, it can solve the problem of the Null Pointer Exception.

The Optional API follows a functional programming style and relies on functional interfaces.

If you want to know more about Functional Programming in Java you can read my other article, Functional Programming in Java, Explained.

Before we proceed any further, please note that I’m using Java 11 for the examples in this article. If you’re using a different version of Java, some methods might not exist or behave exactly the same.

Empty Optional

An empty optional is the main way to avoid the Null Pointer Exception when using the Optional API.

In Optional’s flow, a null will be transformed into an empty Optional. The empty Optional won’t be processed any further. This is how we can avoid a NullPointerException when using Optional.

We will learn further about how an empty Optional behaves later in this article.
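
As a first taste of that flow, here is a minimal sketch. The findNickname lookup and the class name are hypothetical and only exist to produce a value that may be null.

```java
import java.util.Optional;

// Hypothetical demo class showing how a null turns into an empty Optional.
class EmptyOptionalSketch {

    // Made-up lookup that returns null for unknown users.
    static String findNickname(String userId) {
        return "42".equals(userId) ? "  brilian  " : null;
    }

    static String displayName(String userId) {
        return Optional.ofNullable(findNickname(userId)) // null becomes an empty Optional
                .map(String::trim)                       // skipped when the Optional is empty
                .map(String::toUpperCase)                // skipped when the Optional is empty
                .orElse("ANONYMOUS");                    // fallback for the empty case
    }

    public static void main(String[] args) {
        System.out.println(displayName("42"));      // prints BRILIAN
        System.out.println(displayName("unknown")); // prints ANONYMOUS
    }
}
```

Neither trim nor toUpperCase is ever called on null, so no NullPointerException is thrown for the unknown user.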

Creating Optional object

There are three ways to initiate an Optional object:

Optional.of(T)
Optional.ofNullable(T)
Optional.empty()

Optional.of

Optional.of accepts any type with a non-nullable value in its parameter. To create an Optional object with Optional.of, we just have to pass a value in its parameter.

    @Test
    public void initializeOptional_optionalOf() {
        Optional<String> helloWorldOptional = Optional.of("Hello, world");
        assert helloWorldOptional.isPresent();
        assert "Hello, world".equals(helloWorldOptional.get());
    }

Be very careful when you are passing a value to Optional.of. Remember that Optional.of doesn’t accept null values in its parameter. If you try to pass a null value, it will throw a NullPointerException.

    @Test
    public void initializeOptional_optionalOf_null() {
        try {
            Optional.of(null);
        } catch (Exception e) {
            assert e instanceof NullPointerException;
        }
    }

Optional.ofNullable

Optional.ofNullable is similar to Optional.of. It accepts any type. The difference is, with Optional.ofNullable, you can pass a null value to its parameter.

    @Test
    public void initializeOptional_optionalOfNullable() {
        Optional<String> helloWorldOptional = Optional.ofNullable("Hello, world");
        assert helloWorldOptional.isPresent();
        assert "Hello, world".equals(helloWorldOptional.get());
    }

When Optional.ofNullable is called with a null value, it returns an empty Optional.

    @Test
    public void initializeOptional_optionalOfNullable_null() {
        Optional<String> helloWorldOptional = Optional.ofNullable(null);
        assert !helloWorldOptional.isPresent();
        try {
            helloWorldOptional.get();
        } catch (Exception e) {
            assert e instanceof NoSuchElementException;
        }
    }

Optional.empty

An empty Optional can be initialized by using Optional.empty().

    @Test
    public void initializeOptional_optionalEmpty() {
        Optional<String> helloWorldOptional = Optional.empty();
        assert !helloWorldOptional.isPresent();
    }

Accessing Optional

There are some ways to get the value of an Optional.

get

A pretty straightforward method. The get method will return the value of Optional if it is present and throw a NoSuchElementException if the value doesn’t exist.

    @Test
    public void get_test() {
        Optional<String> helloWorldOptional = Optional.of("Hello, World");
        assert "Hello, World".equals(helloWorldOptional.get());
    }

    @Test
    public void get_null_test() {
        Optional<String> helloWorldOptional = Optional.empty();
        try {
            helloWorldOptional.get();
        } catch (Exception e) {
            assert e instanceof NoSuchElementException;
        }
    }

orElse

If you want to use a default value if the Optional is empty, you can use the orElse method.

    @Test
    public void orElse_test() {
        Optional<String> helloWorldOptional = Optional.of("Hello, World");
        assert "Hello, World".equals(helloWorldOptional.orElse("default"));
    }

    @Test
    public void orELseNull_test() {
        Optional<String> helloWorldOptional = Optional.empty();
        assert "default".equals(helloWorldOptional.orElse("default"));
    }

orElseGet

orElseGet is very similar to the orElse method. The difference is that orElseGet accepts a Supplier<T> as its parameter, and the Supplier is only invoked when the Optional is empty.

    @Test
    public void orElseGet_test() {
        Optional<String> helloWorldOptional = Optional.of("Hello, World");
        assert "Hello, World".equals(helloWorldOptional.orElseGet(() ->"default"));
    }

    @Test
    public void orELseGet_Null_test() {
        Optional<String> helloWorldOptional = Optional.empty();
        assert "default".equals(helloWorldOptional.orElseGet(() ->"default"));
    }
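
The practical difference between orElse and orElseGet shows up when computing the default is costly. The sketch below counts how often a made-up expensiveDefault method runs; the class and method names are assumptions for illustration only.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class comparing eager and lazy defaults.
class OrElseVersusOrElseGet {

    static final AtomicInteger CALLS = new AtomicInteger();

    // Made-up "expensive" default; it records every invocation.
    static String expensiveDefault() {
        CALLS.incrementAndGet();
        return "default";
    }

    public static void main(String[] args) {
        Optional<String> present = Optional.of("Hello, World");

        // orElse: the argument is evaluated before the call, even though it ends up unused.
        String a = present.orElse(expensiveDefault());
        System.out.println(a + " / calls = " + CALLS.get()); // prints Hello, World / calls = 1

        // orElseGet: the Supplier is not invoked because a value is present.
        String b = present.orElseGet(OrElseVersusOrElseGet::expensiveDefault);
        System.out.println(b + " / calls = " + CALLS.get()); // prints Hello, World / calls = 1
    }
}
```

For a constant default, orElse is fine. When the default requires a database call or another expensive computation, orElseGet avoids doing that work for nothing.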

orElseThrow

orElseThrow will return the value of the Optional, or throw the exception created by the given Supplier if the Optional is empty.

    @Test
    public void orElseThrow_test() {
        Optional<String> helloWorldOptional = Optional.of("Hello, World");
        assert "Hello, World".equals(helloWorldOptional.orElseThrow(NullPointerException::new));
    }

    @Test
    public void orELseThrow_Null_test() {
        Optional<String> helloWorldOptional = Optional.empty();
        try {
            helloWorldOptional.orElseThrow(NullPointerException::new);
        } catch (Exception e) {
            assert e instanceof NullPointerException;
        }
    }

Processing an Optional

There are many ways to process and transform an Optional. In this section, we will learn the common methods that are used.

As I wrote at the beginning of the article, an empty Optional won’t be processed in the flow. We can see that from the examples in this section.

Map

map is the most used method when processing an Optional object. It accepts Function<? super T, ? extends U> as its parameter and returns an Optional<U>. This means you can use a Function with any return type, and the map method will wrap the return value in an Optional.

    @Test
    public void processingOptional_map_test() {
        Optional<String> stringOptional = Optional.of("Hello, World")
                .map(a -> a + ", Hello");

        assert stringOptional.isPresent();
        assert "Hello, World, Hello".equals(stringOptional.get());
    }

If you try to return a null value in Function<? super T, ? extends U>, the map method will return an empty Optional.

    @Test
    public void processingOptional_map_empty_test() {
        Optional<String> stringOptional = Optional.of("Hello, World")
                .map(a -> null);

        assert !stringOptional.isPresent();
    }

An empty optional won’t be processed by map. We can confirm this with the following test:

    @Test
    public void processingOptional_map_empty_notProcessed_test() {
        AtomicBoolean atomicBoolean = new AtomicBoolean(false);
        Optional<String> stringOptional = Optional.of("Hello, World")
                .map(a -> null)
                .map(a -> {
                    atomicBoolean.set(true);
                    return "won't be processed";
                });

        assert !stringOptional.isPresent();
        assert atomicBoolean.get() == false;
    }

FlatMap

This is similar to map, but flatMap won’t wrap the return value of the Function in an Optional. The flatMap method accepts Function<? super T, ? extends Optional<? extends U>> as its parameter. This means that you’ll need to define a Function that accepts any type and returns an Optional.

You will usually use the flatMap method when your code calls another method that returns an Optional object.


    @Test
    public void processingOptional_flatmap_test() {
        Optional<String> stringOptional = Optional.of("Hello, World")
                .flatMap(this::getString);

        assert "Hello, World, Hello".equals(stringOptional.get());
    }

    @Test
    public void processingOptional_flatmap_randomString_test() {
        Optional<String> stringOptional = Optional.of(UUID.randomUUID().toString())
                .flatMap(this::getString);

        assert !stringOptional.isPresent();
    }

    public Optional<String> getString(String s) {
        if ("Hello, World".equals(s)) {
            return Optional.of("Hello, World, Hello");
        }
        return Optional.empty();
    }
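
To see why flatMap is the better fit here, compare it with what map does to a method that already returns an Optional. This is a small sketch; the class name is made up, and getString mirrors the helper from the test above.

```java
import java.util.Optional;

// Hypothetical demo class contrasting map and flatMap.
class MapVersusFlatMap {

    // Same shape as the getString helper used in the tests above.
    static Optional<String> getString(String s) {
        if ("Hello, World".equals(s)) {
            return Optional.of("Hello, World, Hello");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // map wraps the returned Optional again, which gives a nested Optional.
        Optional<Optional<String>> nested =
                Optional.of("Hello, World").map(MapVersusFlatMap::getString);

        // flatMap uses the returned Optional directly.
        Optional<String> flat =
                Optional.of("Hello, World").flatMap(MapVersusFlatMap::getString);

        System.out.println(nested.get().get()); // prints Hello, World, Hello
        System.out.println(flat.get());         // prints Hello, World, Hello
    }
}
```

With map, an outer Optional can be present while the inner one is empty, so isPresent on the outer value no longer tells us whether there is a usable result.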

Filter

In the previous example of flatMap, we used an imperative style (an if statement) to decide the return value of the getString method. But we can actually use a functional style for that with the filter method.

    @Test
    public void processingOptional_filter_test() {
        Optional<String> stringOptional = Optional.of("Hello, World")
                .filter(helloWorldString -> "Hello, World".equals(helloWorldString))
                .map(helloWorldString -> helloWorldString + ", Hello");

        assert "Hello, World, Hello".equals(stringOptional.get());
    }

    @Test
    public void processingOptional_filter_randomString_test() {
        Optional<String> stringOptional = Optional.of(UUID.randomUUID().toString())
                .filter(helloWorldString -> "Hello, World".equals(helloWorldString))
                .map(helloWorldString -> helloWorldString + ", Hello");

        assert !stringOptional.isPresent();
    }

If Present

The ifPresent method accepts a Consumer that will only be executed if the Optional is not empty.

    @Test
    public void processingOptional_ifPresent_test() {
        AtomicBoolean atomicBoolean = new AtomicBoolean(false);
        Optional.of("Hello, World")
            .ifPresent(helloWorldString -> atomicBoolean.set(true));
        assert atomicBoolean.get();
    }

    @Test
    public void processingOptional_ifPresent_empty_test() {
        AtomicBoolean atomicBoolean = new AtomicBoolean(false);
        Optional.empty()
                .ifPresent(helloWorldString -> atomicBoolean.set(true));
        assert !atomicBoolean.get();
    }

Things to avoid

There are some critical things that you need to avoid if you want to use Optional in your code.

Don’t create a method that accepts Optional

Creating a method that accepts Optional as a parameter might reintroduce the very problem Optional is meant to solve: the NullPointerException.

If a person using the method with the Optional parameter is not aware of it, they might pass a null to the method instead of Optional.empty(). Processing a null will produce a NullPointerException.

    @Test
    public void optionalAsParameter_test() {
        try {
            isPhoneNumberPresent(null);
        } catch (Exception e) {
            assert e instanceof NullPointerException;
        }
    }

    public boolean isPhoneNumberPresent(Optional<String> phoneNumber) {
        return phoneNumber.isPresent();
    }
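
One possible alternative, sketched below, is to accept a plain value that may be null and wrap it inside the method. The class name is made up, and treating a blank string as "not present" is an extra assumption of this sketch, not something the original method does.

```java
import java.util.Optional;

// Hypothetical demo class: Optional is used inside the method, not in its signature.
class PhoneNumberSketch {

    // Callers pass a plain String; passing null is safe here.
    static boolean isPhoneNumberPresent(String phoneNumber) {
        return Optional.ofNullable(phoneNumber)
                .map(String::trim)
                .filter(s -> !s.isEmpty()) // assumption: blank input counts as absent
                .isPresent();
    }

    public static void main(String[] args) {
        System.out.println(isPhoneNumberPresent(null));          // prints false
        System.out.println(isPhoneNumberPresent("  "));          // prints false
        System.out.println(isPhoneNumberPresent("+62 812 000")); // prints true
    }
}
```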

Getting value without checking

If you’re using Optional, then you should avoid using the get method if you can. If you still want to use it for some reason, make sure that you check it with the isPresent method first, because if you use get on an empty Optional, it will throw a NoSuchElementException.

    @Test
    public void getWithIsPresent_test() {
        Optional<String> helloWorldOptional = Optional.ofNullable(null);
        if (helloWorldOptional.isPresent()) {
            System.out.println(helloWorldOptional.get());
        }
    }

    @Test
    public void getWithoutIsPresent_error_test() {
        Optional<String> helloWorldOptional = Optional.ofNullable(null);
        try {
            System.out.println(helloWorldOptional.get());
        } catch (Exception e) {
            assert e instanceof NoSuchElementException;
        }
    }

Conclusion

Thank you for reading until the end! Optional is a powerful feature that every Java developer should know about. If you use Optional correctly from end to end, you can avoid most NullPointerExceptions in your code.

The same ideas also show up in other big libraries like Reactor and RxJava, so knowing how Optional works will help you understand them too.

You can find the repository with the examples in this article below:

https://github.com/brilianfird/java-optional


How to Handle Typos in Elasticsearch Using Fuzzy Query

Brilian Firdaus — Thu, 03 Dec 2020 12:41:14 +0000

Typos happen often and can hurt the user experience. Fortunately, Elasticsearch can handle them easily with Fuzzy Query.

Handling typos is a must if you’re building an advanced autocomplete system with Elasticsearch.

If you want to create a simple one instead, you can read my other article, “Create a Simple Autocomplete With Elasticsearch”.

What is fuzzy logic

Fuzzy logic is a form of mathematical logic in which the truth value of a variable can be any number between 0 and 1. This differs from Boolean logic, where the truth value is either 0 or 1.

In Elasticsearch, a fuzzy query means the terms in the query don’t have to match the terms in the inverted index exactly.

To calculate the distance between a query term and an indexed term, Elasticsearch uses the Levenshtein distance algorithm.

How to calculate distance using Levenshtein Distance Algorithm

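Calculating a distance with the Levenshtein distance algorithm is easy.

The distance between two words is the minimum number of single-character edits (insertions, deletions, or substitutions) needed to turn one word into the other.

When two words have the same length and differ only by substituted characters, as in the example below, you can compare them character by character and add one to the distance for every position where the characters differ.

Let’s see an example of how to calculate the distance between the common typo “Gppgle” and the correct word “Google”.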

[Figure: Elasticsearch fuzzy query, Levenshtein distance]

After we calculate the distance between “Gppgle” and “Google” with Levenshtein Distance Algorithm, we can see that the distance is 2.
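
If you’d like to check the number yourself, here is a minimal Java sketch of the classic dynamic-programming version of Levenshtein distance. The class and method names are made up for this article, and it is not how Elasticsearch implements fuzzy matching internally. It also leaves out transpositions, which Elasticsearch’s fuzzy matching counts as a single edit by default.

```java
// Hypothetical demo class: plain Levenshtein distance with two rolling rows.
class LevenshteinSketch {

    static int distance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];

        // Turning the empty string into the first j characters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i characters of a into the empty string takes i deletions.
            current[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int substitution = previous[j - 1] + cost;
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            // Reuse the two arrays for the next row.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("Gppgle", "Google")); // prints 2
        System.out.println(distance("gogle", "google"));  // prints 1
    }
}
```

The second call shows a case that a simple position-by-position comparison would get wrong: one missing letter is a single insertion, so the distance is 1.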

Fuzzy Query in Elasticsearch

Handling typos in Elasticsearch with Fuzzy Query is also simple.

Let’s start with an example using the typo “Gppgle”.

Request

curl --request POST \
  --url http://localhost:9200/fuzzy-query/_doc/_search \
  --header 'content-type: application/json' \
  --data '{
    "query": {
        "match" : {
            "text": {
                "query": "gppgle"
            }
        }
    }
}'

Response

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

When we use a normal Match Query, Elasticsearch first analyzes the query “gppgle” and then searches for the resulting term in the inverted index.

The only term in the inverted index is “google”, and it doesn’t match the term “gppgle”. Therefore, Elasticsearch won’t return any result.

Now, let’s try fuzziness in Match Query.

Request

curl --request POST \
  --url http://localhost:9200/fuzzy-query/_doc/_search \
  --header 'content-type: application/json' \
  --data '{
    "query": {
        "match" : {
            "text": {
                "query": "gppgle",
                "fuzziness": "AUTO"
            }
        }
    }
}'

Response

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.19178805,
    "hits": [
      {
        "_index": "fuzzy-query",
        "_type": "_doc",
        "_id": "w8YOCXUBHf9qB4Apc0Cz",
        "_score": 0.19178805,
        "_source": {
          "text": "google"
        }
      }
    ]
  }
}

As you can see, with fuzziness enabled, Elasticsearch returned a result.

We learned earlier that “gppgle” and “google” have a distance of 2.

In the query, we set "fuzziness": "AUTO" instead of a number. Why does it work?

If we use the “AUTO” value in the “fuzziness” field, Elasticsearch determines an appropriate edit distance based on the length of the term.

For a term of 6 characters, Elasticsearch allows an edit distance of 2 by default.

“AUTO” fuzziness is preferable, but you can tune it with an exact number if you want to.
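
The default AUTO rule can be sketched in a few lines of Java. This reflects the thresholds described in the Elasticsearch documentation for the default setting (terms of 0 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two). The class and method names are made up, and this is only an illustration of the rule, not Elasticsearch’s own code.

```java
// Hypothetical demo class: maximum edits allowed by the default AUTO fuzziness.
class AutoFuzzinessSketch {

    static int maxEdits(String term) {
        int length = term.length();
        if (length <= 2) {
            return 0; // very short terms must match exactly
        }
        if (length <= 5) {
            return 1; // short terms allow one edit
        }
        return 2;     // longer terms allow two edits
    }

    public static void main(String[] args) {
        System.out.println(maxEdits("gppgle")); // prints 2
        System.out.println(maxEdits("hong"));   // prints 1
        System.out.println(maxEdits("go"));     // prints 0
    }
}
```

This is why “gppgle”, with its 6 characters, is allowed the 2 edits it needs to reach “google”.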

Now, let’s try with an exact number to prove that “gppgle” and “google” have a distance of 2.

gppgle and google with fuzziness 1

Request

curl --request POST \
  --url 'http://localhost:9200/fuzzy-query/_doc/_search?explain=true' \
  --header 'content-type: application/json' \
  --data '{
    "query": {
        "match" : {
            "text": {
                "query": "gppgle",
                "fuzziness": "1"
            }
        }
    }
}'

Response

No results

gppgle and google with fuzziness 2

Request

curl --request POST \
  --url http://localhost:9200/fuzzy-query/_doc/_search \
  --header 'content-type: application/json' \
  --data '{
    "query": {
        "match" : {
            "text": {
                "query": "gppgle",
                "fuzziness": "2"
            }
        }
    }
}'

Response

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.19178805,
    "hits": [
      {
        "_index": "fuzzy-query",
        "_type": "_doc",
        "_id": "w8YOCXUBHf9qB4Apc0Cz",
        "_score": 0.19178805,
        "_source": {
          "text": "google"
        }
      }
    ]
  }
}

When we use "fuzziness": "1", Elasticsearch returns no result.

With "fuzziness": "2", though, Elasticsearch returns the document “google”.

This confirms our earlier calculation with the Levenshtein distance algorithm: the distance between “gppgle” and “google” is 2.

Two types of fuzzy query in Elasticsearch

In the previous example, we used fuzziness as a parameter inside Match Query.

But there is another way to use the fuzzy feature: Fuzzy Query.

They seem to be the same! So, what’s the difference between them?

Before continuing, if you want to understand more about analyzers, you can read my other article, “Elasticsearch: Text vs. Keyword”.

Fuzzy Query

Fuzzy Query works just like Term Query: the query sent to Elasticsearch is not analyzed and is used as-is to search the inverted index.

For example, let’s index one more document, “Hong Kong”.

Request

curl --request POST \
  --url http://localhost:9200/fuzzy-query/_doc \
  --header 'content-type: application/json' \
  --data '{
    "text":"Hong Kong"
}'

Response

{
  "_index": "fuzzy-query",
  "_type": "_doc",
  "_id": "5sbKDXUBHf9qB4ApJUDr",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}

Let’s look at what terms the analyzer produces, using Elasticsearch’s Analyze API.

Request

curl --request POST \
  --url http://localhost:9200/_analyze \
  --header 'content-type: application/json' \
  --data '{
    "analyzer": "standard",
    "text": "Hong Kong"
}'

Response

{
  "tokens": [
    {
      "token": "hong",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "kong",
      "start_offset": 5,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

As you can see, the standard analyzer produces two terms, “hong” and “kong”.

If you read my other article, “Elasticsearch: Text vs. Keyword”, you’d know that if we use a Term Query to search for “Hong Kong”, we won’t get any result. The same applies to Fuzzy Query, so let’s try it with the typo “Hpng Kpng”.

Request

curl --request POST \
  --url http://localhost:9200/fuzzy-query/_doc/_search \
  --header 'content-type: application/json' \
  --data '{
    "query": {
        "fuzzy" : {
            "text": {
                "value": "Hpng Kpng",
                "fuzziness": "2"
            }
        }
    }
}'

Response

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

This is because no term in the index is within an edit distance of 2 of “Hpng Kpng”: the query text is compared as a single term against “hong” and “kong”.

Now, let’s try a Fuzzy Query with “Hpng”.

Request

curl --request POST \
  --url http://localhost:9200/fuzzy-query/_doc/_search \
  --header 'content-type: application/json' \
  --data '{
    "query": {
        "fuzzy" : {
            "text": {
                "value": "Hpng",
                "fuzziness": "2"
            }
        }
    }
}'

Response

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.60996956,
    "hits": [
      {
        "_index": "fuzzy-query",
        "_type": "_doc",
        "_id": "5sbKDXUBHf9qB4ApJUDr",
        "_score": 0.60996956,
        "_source": {
          "text": "Hong Kong"
        }
      }
    ]
  }
}

The term “Hpng” in the query and the term “hong” in the index have an edit distance of 2.

Remember that a Fuzzy Query compares the query term with the terms in the inverted index case-sensitively: the distance of 2 comes from the difference between “Hp” and “ho”.
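The effect of case on the distance can be reproduced with a short Python sketch. It is only an illustration under the assumption that the index holds the two lowercase terms “hong” and “kong”; the helper `levenshtein` is a textbook implementation, not Elasticsearch code.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook Levenshtein distance (case-sensitive comparison)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Assumed content of the inverted index for the document "Hong Kong".
index_terms = ["hong", "kong"]

print(levenshtein("Hpng", "hong"))  # 2: "H" vs "h" and "p" vs "o"
print(levenshtein("hpng", "hong"))  # 1: only "p" vs "o"

# The unanalyzed value "Hpng Kpng" is far from every single term in the index.
print(min(levenshtein("Hpng Kpng", t) for t in index_terms) > 2)  # True
```

Lowercasing the query before comparing, which is what an analyzer would do, removes one of the two edits.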

Match Query with Fuzziness parameter

A Match Query with the fuzziness parameter is usually preferable to a Fuzzy Query, because the analyzer processes your query text before it is looked up in the inverted index.

Let’s try queries similar to the ones in the Fuzzy Query section.

Request

curl --request POST \
  --url http://localhost:9200/fuzzy-query/_doc/_search \
  --header 'content-type: application/json' \
  --data '{
    "query": {
        "match": {
            "text": {
                "query": "Hpng Kong",
                "fuzziness": 2
            }
        }
    }
}'


curl --request POST \
  --url http://localhost:9200/fuzzy-query/_doc/_search \
  --header 'content-type: application/json' \
  --data '{
    "query": {
        "match" : {
            "text": {
                "query": "Hpng",
                "fuzziness": "2"
            }
        }
    }
}'

Response

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.762462,
    "hits": [
      {
        "_index": "fuzzy-query",
        "_type": "_doc",
        "_id": "5sbKDXUBHf9qB4ApJUDr",
        "_score": 0.762462,
        "_source": {
          "text": "Hong Kong"
        }
      }
    ]
  }
}

As expected, both queries returned a result!

The first query, “Hpng Kong”, is analyzed into the terms “hpng” and “kong”, which are compared against the terms “hong” and “kong” in the inverted index.

“hpng” and “hong” match with a distance of 1, while “kong” and “kong” match exactly.

One thing to note if you plan to use a Match Query is that fuzziness is applied to each term in the query separately.

We can try querying with “hggg kggg”, which has a total edit distance of 4 from “hong kong”, using “fuzziness”:2.

Request

curl --request POST \
  --url http://localhost:9200/fuzzy-query/_doc/_search \
  --header 'content-type: application/json' \
  --data '{
    "query": {
        "match": {
            "text": {
                "query": "hggg kggg",
                "fuzziness": "2"
            }
        }
    }
}'

Response

{
    "_index": "fuzzy-query",
    "_type": "_doc",
    "_id": "5sbKDXUBHf9qB4ApJUDr",
    "_score": 1.2330425,
    "_source": {
        "text": "Hong Kong"
    }
}

As you can see, Elasticsearch returned a result.

This is because the analyzer splits the query “hggg kggg” into the terms “hggg” and “kggg”.

“hggg” and “kggg” have an edit distance of 2 from “hong” and “kong” respectively, so each term is within the allowed fuzziness.
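The per-term behaviour can be sketched in Python. This is a simplified model, not how Elasticsearch is implemented: `simple_analyze` is a hypothetical stand-in for the standard analyzer (lowercase plus whitespace split), and `fuzzy_term_hits` is an illustrative helper.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def simple_analyze(text: str) -> list[str]:
    # Assumption: a rough approximation of the standard analyzer.
    return text.lower().split()


def fuzzy_term_hits(query: str, index_terms: list[str], fuzziness: int) -> dict[str, list[str]]:
    """For each analyzed query term, list the index terms within `fuzziness` edits."""
    return {
        q: [t for t in index_terms if levenshtein(q, t) <= fuzziness]
        for q in simple_analyze(query)
    }


index_terms = simple_analyze("Hong Kong")
print(fuzzy_term_hits("hggg kggg", index_terms, fuzziness=2))
# {'hggg': ['hong'], 'kggg': ['kong']}
```

The fuzziness budget is spent per term, so a query that is 4 edits away from the document as a whole can still match.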

Tuning the Fuzzy Query in Elasticsearch

You can tune the Fuzzy Query to match your use case.

In this section, I will write about the parameters that we can change in the query.

Fuzziness

Fuzziness is the heart of Fuzzy Query.

The value that we pass to this parameter is the maximum edit distance allowed.

There are two types of value that we can pass: an integer (0, 1, or 2) for an exact maximum distance, or “AUTO”.

The “AUTO” value makes the fuzziness depend on the length of each term.

We can tune two parameters in the “AUTO” value by writing it as “AUTO:[low],[high]”. If the term length is below the low value, the query sets the fuzziness to 0. If the term length is at least the low value but below the high value, the query sets the fuzziness to 1. Finally, if the term length is equal to or greater than the high value, the query sets the fuzziness to 2.

Elasticsearch uses 3 and 6 as the defaults if the low and high values are not specified.

Let’s use an example with a document “Fuzzy Query in Elasticsearch allows you to handle typos”.

We can try some queries to prove the mechanism of AUTO we described earlier.

“tp”: edit distance of 1 from “to”.
“Fyzzy”: edit distance of 1 from “Fuzzy”.
“Fyzyy”: edit distance of 2 from “Fuzzy”.
“Elastissearcc”: edit distance of 2 from “Elasticsearch”.
“Elestissearcc”: edit distance of 3 from “Elasticsearch”.

These queries returned the document:

“Fyzzy”
“Elastissearcc”

These queries did not:

“tp”
“Fyzyy”
“Elestissearcc”
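The outcome of these five queries can be reproduced with a small Python model of “AUTO”. It is a sketch under a few assumptions: the default thresholds 3 and 6, a lowercase-and-split stand-in for the analyzer, and illustrative helper names (`auto_fuzziness`, `matches`) that are not part of any Elasticsearch API.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def auto_fuzziness(term: str, low: int = 3, high: int = 6) -> int:
    """Allowed edits for a term under AUTO:[low],[high]."""
    if len(term) < low:
        return 0
    if len(term) < high:
        return 1
    return 2


def matches(query_term: str, index_terms: list[str]) -> bool:
    # Assumption: the query term is lowercased, as an analyzer would do.
    q = query_term.lower()
    return any(levenshtein(q, t) <= auto_fuzziness(q) for t in index_terms)


doc = "Fuzzy Query in Elasticsearch allows you to handle typos"
index_terms = doc.lower().split()

for q in ["tp", "Fyzzy", "Fyzyy", "Elastissearcc", "Elestissearcc"]:
    print(q, auto_fuzziness(q.lower()), matches(q, index_terms))
```

“tp” has only 2 characters, so it must match exactly, while “Fyzyy” has 5 characters and is therefore allowed only 1 edit even though it needs 2.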

Transpositions

The transpositions parameter makes the query count the swap of two adjacent characters (ab -> ba) as a distance of 1.

For example, if we set transpositions to true, we will get a result if we query with “leasticsearcc”.

But if we set it to false, Elasticsearch returns no result.

Request

{
  "query": {
    "fuzzy": {
      "text": {
        "value": "leasticsearcc",
        "fuzziness": "AUTO",
        "transpositions": true
      }
    }
  }
}

Response

{
    "_index": "fuzzy-query",
    "_type": "_doc",
    "_id": "AsawDnUBHf9qB4ApNUFh",
    "_score": 0.5491282,
    "_source": {
        "text": "Fuzzy Query in Elasticsearch allows you to handle typos"
    }
}

Elasticsearch sets transpositions to true by default.

The Match Query does not accept the transpositions parameter under this name. It exposes the equivalent setting as fuzzy_transpositions, which also defaults to true.
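To see why the flag changes the result for “leasticsearcc”, here is a Python sketch of an edit distance with an optional transposition step (the optimal string alignment variant). It illustrates the idea only and is not claimed to be the algorithm Elasticsearch uses internally; `edit_distance` is a name chosen for this example.

```python
def edit_distance(a: str, b: str, transpositions: bool = True) -> int:
    """Levenshtein distance, optionally counting an adjacent swap as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            # Adjacent swap: "ab" in one string lines up with "ba" in the other.
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(edit_distance("leasticsearcc", "elasticsearch", transpositions=True))   # 2
print(edit_distance("leasticsearcc", "elasticsearch", transpositions=False))  # 3
```

With transpositions counted, the swap “le” to “el” costs 1 and the final “c” to “h” costs 1, which fits in the maximum fuzziness of 2. Without them, the swap costs 2 and the total of 3 is out of reach.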

Max Expansions

max_expansions sets the maximum number of terms in the index that the fuzzy term is expanded to, which indirectly limits the results you get from your query.

If you set max_expansions to 1 and two documents in Elasticsearch match your query through two different terms, Elasticsearch will only return the document that contains the one term it expanded to.

Note that max_expansions applies at the shard level. So if you have many shards in Elasticsearch, the query might return more results even if you set max_expansions to 1.

The default value for max_expansions is 50.

Prefix Length

prefix_length is the number of leading characters that must match exactly and are excluded from the fuzzy matching.

For example, if we set prefix_length to 1, we won’t get any result if we query “llasticsearch”, because its first character differs from “elasticsearch”.

The prefix_length setting defaults to 0.
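A rough Python model of the prefix rule is shown below. The helper `fuzzy_term_matches` is hypothetical and only mimics the behaviour described above: the first `prefix_length` characters must be identical before the edit distance is even considered.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_term_matches(query_term: str, index_term: str,
                       fuzziness: int, prefix_length: int = 0) -> bool:
    """True if the terms share the required prefix and are within `fuzziness` edits."""
    if query_term[:prefix_length] != index_term[:prefix_length]:
        return False  # the prefix must match exactly, no edits allowed there
    return levenshtein(query_term, index_term) <= fuzziness


# Fuzziness 2 is what "AUTO" would allow for a 13-character term.
print(fuzzy_term_matches("llasticsearch", "elasticsearch", 2, prefix_length=0))  # True
print(fuzzy_term_matches("llasticsearch", "elasticsearch", 2, prefix_length=1))  # False
```

A non-zero prefix_length trades recall for fewer candidate terms, since typos in the first characters are no longer tolerated.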

Rewrite

You can change the rewrite parameter if you want to change how the results are scored.

You can find more information about the rewrite parameter in the Elasticsearch documentation.

Conclusion

Handling typos in Elasticsearch is very easy and can improve the user’s experience.

The simplest way to handle a typo is to add “fuzziness”:"AUTO" to your Match Query.

If you want to tune the query, there are several parameters that you can change, with “fuzziness” being the most important.

Thank you for reading until the end!

Stay tuned for my other articles about Elasticsearch!