In a microservice architecture, it’s common for one service to call several downstream services.
What many developers overlook is how those calls are executed: sequentially or in parallel.
Let’s walk through a simple example.
🧩 Problem Statement
Suppose Service A needs data from:
- Service B → takes 1 second
- Service C → takes 1 second
If Service A calls them one after another, what will be the total response time?
👉 About 2 seconds (plus a little network overhead).
Why? Because it waits for B to finish before calling C.
Let’s implement this in Spring Boot.
🔴 Part 1 — Sequential Execution
Fake downstream endpoints
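    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    @RequestMapping("/api")
    public class TestController {

        // Simulates a downstream service that takes about 1 second to respond.
        @GetMapping("/service-b")
        public ResponseEntity<String> serviceB() throws InterruptedException {
            Thread.sleep(1000);
            return ResponseEntity.ok("Response from service B");
        }

        // Simulates a second downstream service with the same 1-second delay.
        @GetMapping("/service-c")
        public ResponseEntity<String> serviceC() throws InterruptedException {
            Thread.sleep(1000);
            return ResponseEntity.ok("Response from service C");
        }
    }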
Sequential call
    // Assumes `restTemplate` is a RestTemplate field available in this controller
    // (its declaration is not shown here).
    @GetMapping("/sequential")
    public ResponseEntity<String> sequential() {
        long start = System.currentTimeMillis();
        String baseUrl = "http://localhost:8080/api";

        // Each call blocks until its response arrives, so C starts only after B finishes.
        String serviceBResponse = restTemplate.getForObject(baseUrl + "/service-b", String.class);
        String serviceCResponse = restTemplate.getForObject(baseUrl + "/service-c", String.class);

        long end = System.currentTimeMillis();
        String response = "Result: " + serviceBResponse + " & " + serviceCResponse + " | Time Taken: " + (end - start) + " ms";
        return ResponseEntity.ok(response);
    }
⏱ Output
Result: Response from service B & Response from service C | Time Taken: 2065 ms
🟢 Part 2 — Making It Asynchronous
Instead of waiting for one call to finish, we can trigger both calls at the same time using CompletableFuture.
Parallel Implementation
    @GetMapping("/parallel")
    public ResponseEntity<String> parallel() {
        long start = System.currentTimeMillis();
        String baseUrl = "http://localhost:8080/api";

        // Both calls start immediately on background threads.
        CompletableFuture<String> responseB = CompletableFuture.supplyAsync(
            () -> restTemplate.getForObject(baseUrl + "/service-b", String.class)
        );
        CompletableFuture<String> responseC = CompletableFuture.supplyAsync(
            () -> restTemplate.getForObject(baseUrl + "/service-c", String.class)
        );

        // Block the request thread until both futures have completed.
        CompletableFuture.allOf(responseB, responseC).join();

        long end = System.currentTimeMillis();
        // join() returns immediately here because both futures are already done,
        // and it avoids the checked exceptions that get() declares.
        String response = "Result: " + responseB.join() + " & " + responseC.join() + " | Time taken: " + (end - start) + " ms";
        return ResponseEntity.ok(response);
    }
⏱ Output
Result: Response from service B & Response from service C | Time taken: 1061 ms
Now the total time is roughly that of the slowest service, not the sum of both.
Mathematically:
Total Time ≈ max(Service B, Service C)
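To see the sum-versus-max behaviour without starting a Spring Boot app, here is a minimal plain-Java sketch. The class `LatencyDemo` and the helper `fakeCall` are hypothetical names made up for this illustration, and `Thread.sleep` stands in for the real HTTP calls, so treat the timings as indicative only.

```java
import java.util.concurrent.CompletableFuture;

public class LatencyDemo {

    // Hypothetical stand-in for a downstream HTTP call: sleeps, then returns a label.
    static String fakeCall(String name, long delayMs) {
        try {
            Thread.sleep(delayMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "Response from " + name;
    }

    // Runs both calls one after another and returns the elapsed time in ms.
    static long timeSequential(long delayB, long delayC) {
        long start = System.nanoTime();
        fakeCall("service B", delayB);
        fakeCall("service C", delayC);
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Starts both calls at once, waits for both, and returns the elapsed time in ms.
    static long timeParallel(long delayB, long delayC) {
        long start = System.nanoTime();
        CompletableFuture<String> b =
                CompletableFuture.supplyAsync(() -> fakeCall("service B", delayB));
        CompletableFuture<String> c =
                CompletableFuture.supplyAsync(() -> fakeCall("service C", delayC));
        CompletableFuture.allOf(b, c).join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Expect roughly 300 + 200 ms for the first line and roughly max(300, 200) ms for the second.
        System.out.println("Sequential: " + timeSequential(300, 200) + " ms");
        System.out.println("Parallel:   " + timeParallel(300, 200) + " ms");
    }
}
```

With unequal delays like these, the parallel figure should track the slower call (about 300 ms) rather than the 500 ms total.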
🎯 Why This Matters
Sequential execution:
- Increases API latency
- Blocks threads longer
- Reduces throughput under load
Parallel execution:
- Reduces response time
- Improves user experience
- Frees the request thread sooner (each blocking call still occupies a worker thread while it waits)
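One caveat on thread usage: when no executor is passed, `CompletableFuture.supplyAsync` by default runs tasks on the shared `ForkJoinPool.commonPool()`, which is sized for CPU-bound work. For blocking HTTP calls, a common approach is to pass a dedicated pool as the second argument. The sketch below assumes a hypothetical `blockingCall` helper in place of `restTemplate`, and the pool size of 4 is an arbitrary choice for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DedicatedPoolDemo {

    // Hypothetical stand-in for a blocking HTTP call; reports which thread ran it.
    static String blockingCall(String name) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "Response from " + name + " on " + Thread.currentThread().getName();
    }

    // Runs both calls on the supplied pool instead of the common ForkJoinPool.
    static List<String> fetchBoth(ExecutorService ioPool) {
        CompletableFuture<String> b =
                CompletableFuture.supplyAsync(() -> blockingCall("service B"), ioPool);
        CompletableFuture<String> c =
                CompletableFuture.supplyAsync(() -> blockingCall("service C"), ioPool);
        return List.of(b.join(), c.join());
    }

    public static void main(String[] args) {
        ExecutorService ioPool = Executors.newFixedThreadPool(4); // size is an assumption
        try {
            fetchBoth(ioPool).forEach(System.out::println);
        } finally {
            ioPool.shutdown(); // let the JVM exit once the tasks are done
        }
    }
}
```

In a Spring Boot app the pool would more likely be defined once as a bean and injected, rather than created per request.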
🧠 Key Takeaways
- Sequential calls add up execution time.
- Parallel execution reduces total latency.
- Asynchronous programming is essential in microservice communication.
- Using CompletableFuture helps coordinate multiple async tasks cleanly.
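As one more illustration of that coordination, when there are exactly two futures, `thenCombine` can merge their results in a single step instead of `allOf` followed by separate reads. This is a minimal sketch, not the approach used in the controller above; `CombineDemo` and `fakeCall` are hypothetical names, and the sleep stands in for a real downstream call.

```java
import java.util.concurrent.CompletableFuture;

public class CombineDemo {

    // Hypothetical stand-in for a downstream call.
    static String fakeCall(String name) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "Response from " + name;
    }

    static String fetchCombined() {
        CompletableFuture<String> b = CompletableFuture.supplyAsync(() -> fakeCall("service B"));
        CompletableFuture<String> c = CompletableFuture.supplyAsync(() -> fakeCall("service C"));
        // thenCombine runs the merge function once both futures have completed.
        return b.thenCombine(c, (rb, rc) -> "Result: " + rb + " & " + rc).join();
    }

    public static void main(String[] args) {
        // prints "Result: Response from service B & Response from service C"
        System.out.println(fetchCombined());
    }
}
```

For three or more calls, `allOf` is usually the more natural fit, since `thenCombine` only pairs two futures at a time.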