Java 8 was released in 2014, bringing with it a heap of new features now praised as essential by modern developers, such as the lambda expression, concurrency API improvements, Functional Interfaces, and improvements to bulk data handling. While many years have passed since then, Java 8 still remains one of the most used versions of Java.
As a result, whether you're switching jobs or just starting out, proficiency in Java 8 is an essential skill to have in today's tech world. If you're just switching Java versions now, you may feel like you're a bit late to the party, but worry not! The features of Java 8 are easier to pick up than you'd think, and today, we'll get you familiar with one of the most critical Java 8 features: the Stream API.
Today, we’ll go over:
- What is a Stream in Java 8?
- Features of the Stream
- Comparing Streams and Loops
- Java 8 Stream API Pipeline: Intermediate and Terminal Operations
- What should you learn next?
What is a Stream in Java 8?
In the words of the all-powerful Oracle Interface Package Summary, a stream is “a sequence of elements supporting sequential and parallel aggregate operations”. While a great tongue-twister, it’s not the most digestible definition. Allow me to translate.
A stream is an abstraction added in Java 8 that lets developers easily manipulate collections of objects or primitives. It is not a data structure, as it does not store data; rather, it serves as a transformative medium between the data source and its destination.
Aggregate operations, sometimes called stream operations, are simply operations unique to streams that we can use to transform some (or all) of the stream at once. We’ll see examples of these later on.
Finally, sequential vs. parallel refers to the ability to implement concurrency when completing stream operations. The operation can either be applied one at a time by a single thread, or it can be split among multiple threads, each applying the operation concurrently.
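To make the sequential versus parallel distinction concrete, here is a minimal sketch. The class name SequentialVsParallel and its two helper methods are made up for illustration; stream(), parallelStream(), mapToInt(), and sum() are standard library calls. Both versions compute the same result, and only the way the work is scheduled differs.

```java
import java.util.Arrays;
import java.util.List;

class SequentialVsParallel {
    // Applies the operation one element at a time on a single thread.
    static int sumOfSquaresSequential(List<Integer> numbers) {
        return numbers.stream()
                .mapToInt(n -> n * n)
                .sum();
    }

    // Same pipeline, but the work may be split among multiple threads.
    static int sumOfSquaresParallel(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToInt(n -> n * n)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sumOfSquaresSequential(numbers)); // 55
        System.out.println(sumOfSquaresParallel(numbers));   // 55
    }
}
```

For a list this small, the parallel version is unlikely to be any faster, since splitting the work among threads has a cost of its own.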
The Stream is often visualized as a pipeline because it sits between the source of data and its destination: it takes the data in, transforms it in some way, then outputs it in a new form downstream. We'll revisit this metaphor when we explore Intermediate and Terminal Operations below.
Using the Stream API requires a fundamentally different style of coding. Most Java code is written in an imperative style, where the developer spells out both what to do and how to do it. Operations from the Stream API instead call for a declarative, functional style similar to queries in SQL, where you describe the result you want and let the library handle the iteration.
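As a rough illustration of the two styles, the sketch below solves the same small task both ways: keep the names of three letters or fewer and convert them to upper case. The class and method names are hypothetical, and the task itself is only an example. filter() and map() are discussed later in this article; collect(Collectors.toList()) is a standard way to gather stream results into a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ImperativeVsDeclarative {
    // Imperative: we spell out how to iterate, test, and accumulate.
    static List<String> shortNamesImperative(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.length() <= 3) {
                result.add(name.toUpperCase());
            }
        }
        return result;
    }

    // Declarative: we describe what we want and the stream handles the iteration.
    static List<String> shortNamesDeclarative(List<String> names) {
        return names.stream()
                .filter(name -> name.length() <= 3)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Dave", "Joe", "Ryan", "Iyan", "Ray");
        System.out.println(shortNamesImperative(names));  // [JOE, RAY]
        System.out.println(shortNamesDeclarative(names)); // [JOE, RAY]
    }
}
```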
Features of the Stream
Now that we know what the stream is, here’s a quick look at a few of the qualities you can take advantage of when using a stream.
- Created using a source collection or array
- Can transform the group but cannot change the data within
- Easily allows manipulation of entire collections at once
- Streams can be processed declaratively
- Neither stores data nor adjusts the data it handles. Therefore, it is not a data structure.
- Adjustable via lambda expressions
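A few of these qualities can be seen in a short sketch. The class name StreamSources and the helper doubled() are assumptions made for this example; Collection.stream() and Arrays.stream() are the standard ways to create a stream from a collection or an array. Note that the source list is left untouched after the stream has run.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamSources {
    // Produces a new list of doubled values; the source list is not changed.
    static List<Integer> doubled(List<Integer> source) {
        return source.stream()
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A stream created from a collection.
        List<Integer> list = Arrays.asList(1, 2, 3);
        Stream<Integer> fromList = list.stream();
        System.out.println(fromList.count()); // 3

        // A stream created from an array.
        String[] array = {"a", "b", "c"};
        Stream<String> fromArray = Arrays.stream(array);
        System.out.println(fromArray.collect(Collectors.joining(","))); // a,b,c

        System.out.println(doubled(list)); // [2, 4, 6]
        System.out.println(list);          // [1, 2, 3]
    }
}
```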
Comparing Streams and Loops
Streams are often compared to loops, as both are used to create iterative behavior in a program. Compared to loops, streams appear much cleaner in-line due to the cutting of cluttered loop syntax. Streams are arguably easier to understand at a glance, thanks to their declarative style.
Below we have two snippets of Java code that both achieve the same task of printing a data collection, one by using a stream and one with a loop. Take a look to see how they differ! We will break this down a bit more below.
Stream:
import java.util.stream.*;

class StreamDemo {
    public static void main(String[] args) {
        Stream<Integer> stream = Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9);
        stream.forEach(p -> System.out.println(p));
    }
}
-->
1
2
3
4
5
6
7
8
9
Loop:
class ArrayTraverse {
    public static void main(String[] args) {
        int[] myArray = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        for (int i : myArray) {
            System.out.print(i + " ");
        }
    }
}
-->
1 2 3 4 5 6 7 8 9
Notice that when we use the auto-iterating forEach() stream operation, we're able to cut down lines and make our code more readable.
With the for loop, part of the code is devoted to setting up the iteration rather than to printing the array, which is extra work for the developer. Since the stream operations of the API library handle iteration, your code can plainly express the logic of the computation instead of managing every little detail of the control flow.
The downside of adopting this style, however, is that transitioning can be difficult for industry veterans who have used the imperative style for years. If you’re struggling to pick it up, you’re not alone!
Another downside to note is that streams, mainly parallel streams, cost considerably more overhead than loops. Keep this in mind if overhead is a concern in your program.
Advantages and Disadvantages of a Stream
Advantages:
- Less visual clutter in code
- No need to write an iterator
- Can write the “what” rather than the “how” to be understandable at a glance
- Performance can be comparable to for-loops on large data sets (or better with parallel operations)
- Great for large lists
Disadvantages:
- Large overhead cost
- Overkill for small collections
- Hard to pick-up if you're used to traditional imperative style coding
Java 8 Stream API Pipeline: Intermediate and Terminal Operations
Aggregate operations come in two types: intermediate and terminal. Each stream pipeline has zero or more intermediate operations and one terminal operation, as well as a data source at the farthest point upstream, such as an array or list.
Intermediate operations take a stream as input and return a stream after completion, meaning several operations can be done in a row.
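Because each intermediate operation hands a stream to the next one, calls can be chained. The sketch below is one possible illustration, with a made-up class name and method name. It also uses sorted(), another intermediate operation from the Stream API that this article does not cover, and ends with collect(), a terminal operation that gathers the results into a list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ChainedOperations {
    // Three intermediate operations in a row, followed by one terminal operation.
    static List<Integer> largeValuesDoubled(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n > 10)           // intermediate: returns a stream
                .map(n -> n * 2)               // intermediate: returns a stream
                .sorted()                      // intermediate: returns a stream
                .collect(Collectors.toList()); // terminal: returns a List
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(45, 1, 23, 6, 12);
        System.out.println(largeValuesDoubled(numbers)); // [24, 46, 90]
    }
}
```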
In our metaphor from before, these operations are like pipe segments with water-like data entering and exiting without interrupting the stream's flow. While they may redirect the stream in a new direction or change its form, it can still flow. Common examples of intermediate operations are filter() and map(). Let's discuss them below.
filter()
This method takes a stream and selects a portion of it based on the passed criteria. In our metaphor, filter() would be either a pipe junction or a valve, directing part of the stream to go a separate way.
In the example below, we use filter() on a stream of integers to return a stream containing only the integers greater than 10.
import java.util.ArrayList;
import java.util.List;
import java.util.stream.*;

class StreamDemo {
    public static void main(String[] args) {
        // Create a list of integers
        List<Integer> list = new ArrayList<>();
        list.add(1);
        list.add(12);
        list.add(23);
        list.add(45);
        list.add(6);

        list.stream()                          // Create a stream from the list
            .filter(num -> num > 10)           // Keep only numbers greater than 10
            .forEach(System.out::println);     // Print each number that passed the filter

        // Print the list again to show that the original list is not modified
        System.out.println("Original list is not modified");
        list.stream()
            .forEach(System.out::println);
    }
}
-->
12
23
45
Original list is not modified
1
12
23
45
6
It's important to note that filter() does not alter the source list; it only controls which elements continue downstream. This makes it effective for search tasks over data that should stay unchanged, like searching a database.
map()
This method takes a function as input and applies it to each element in the stream. The results are then used to populate a new stream that is sent downstream.
For example, below we use map() on a stream of names to apply the toUpperCase() method to each name. This makes a new stream with each person's name in upper case. This is only one type of map operation. Others are mapToInt(), which converts the stream into an IntStream of primitive int values, and flatMap(), which maps each element to a stream of its own and then merges those streams into a single stream.
import java.util.ArrayList;
import java.util.List;
import java.util.stream.*;

class StreamDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("Dave");
        list.add("Joe");
        list.add("Ryan");
        list.add("Iyan");
        list.add("Ray");

        // map() is used to convert each name to upper case.
        // Note: The map() method does not modify the original list.
        list.stream()
            .map(name -> name.toUpperCase())   // map() takes an input of Function<T, R> type.
            .forEach(System.out::println);     // forEach() takes an input of Consumer type.
    }
}
-->
DAVE
JOE
RYAN
IYAN
RAY
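The other two map variants mentioned above, mapToInt() and flatMap(), can be sketched in the same spirit. The class name MapVariants and its helper methods are hypothetical; sum() is a method on the IntStream that mapToInt() returns, and collect(Collectors.toList()) gathers the flattened stream into a list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MapVariants {
    // mapToInt() yields an IntStream of primitive ints, which offers sum().
    static int totalLength(List<String> names) {
        return names.stream()
                .mapToInt(String::length)
                .sum();
    }

    // flatMap() turns each inner list into a stream and merges them into one stream.
    static List<String> flatten(List<List<String>> groups) {
        return groups.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Dave", "Joe", "Ryan");
        System.out.println(totalLength(names)); // 11

        List<List<String>> groups = Arrays.asList(
                Arrays.asList("Dave", "Joe"),
                Arrays.asList("Ryan"));
        System.out.println(flatten(groups)); // [Dave, Joe, Ryan]
    }
}
```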
Terminal operations return something other than a stream, such as a primitive or an object. This means that while many intermediate operations can be done in series, there can be only one terminal operation.
In our metaphor, these operations are visualized as the end-of-the-line; a stream comes in, but its flow is stopped, leaving the data as a different type. While this data could be put back into stream form, it would not be the same stream as the input from our terminal operation.
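The following sketch illustrates that end-of-the-line behavior under a couple of assumptions: the class and method names are invented for the example, and collect(Collectors.toList()) stands in for any terminal operation. Once a terminal operation has run, the stream is used up; the collected data can be streamed again, but that is a new stream.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SingleUseStream {
    // Ends one stream with a terminal operation, then starts a NEW stream from the result.
    static List<Integer> collectThenRestream() {
        Stream<Integer> original = Stream.of(1, 2, 3);
        List<Integer> collected = original.collect(Collectors.toList()); // terminal
        return collected.stream()          // a different stream than 'original'
                .map(n -> n * 10)
                .collect(Collectors.toList());
    }

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean reuseIsRejected() {
        Stream<Integer> stream = Stream.of(1, 2, 3);
        stream.forEach(n -> { });          // first terminal operation consumes the stream
        try {
            stream.forEach(n -> { });      // the stream has already been operated upon
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(collectThenRestream()); // [10, 20, 30]
        System.out.println(reuseIsRejected());     // true
    }
}
```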
The most common example of a terminal operation is the forEach() function.
forEach()
This method takes a stream as input, iterates through it, and performs an action on each element within. The key difference between forEach() and map() is that the former does not return a stream (it returns nothing at all), making it terminal.
Let’s look at a previous example again. We can now note that the output is a group of individually printed integers, not a stream of collected integers.
import java.util.stream.*;

class StreamDemo {
    public static void main(String[] args) {
        Stream<Integer> stream = Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9);
        stream.forEach(p -> System.out.println(p));
    }
}
-->
1
2
3
4
5
6
7
8
9
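forEach() is not the only terminal operation. As a hedged sketch of a few others from the standard Stream API that this article does not cover, the example below uses count(), collect(), and reduce(), each of which returns a primitive or an object rather than a stream. The class name TerminalOperations and its helper methods are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class TerminalOperations {
    // count() returns a primitive long.
    static long countGreaterThan(List<Integer> numbers, int threshold) {
        return numbers.stream().filter(n -> n > threshold).count();
    }

    // collect() returns an object, here a List.
    static List<Integer> collectGreaterThan(List<Integer> numbers, int threshold) {
        return numbers.stream().filter(n -> n > threshold).collect(Collectors.toList());
    }

    // reduce() folds all elements into a single value, starting from 0.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 12, 23, 45, 6);
        System.out.println(countGreaterThan(numbers, 10));   // 3
        System.out.println(collectGreaterThan(numbers, 10)); // [12, 23, 45]
        System.out.println(sum(numbers));                    // 87
    }
}
```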
What should you learn next?
Development teams across the industry have made it clear: Java 8 is well-liked, and it is here to stay. If you're on a journey to master Java 8, this is just the beginning, with so many exciting features left to explore.
To help you along that journey, we’re excited to offer Java 8 for Experienced Developers: Lambdas, Stream API & Beyond, a course authored by veteran Java developer Saurav Aggarwal filled with tips and interactive examples on topics like augmenting your streams with lambda expressions, diving deep into advanced Stream API, collection management, CompletableFuture concurrency, and more.
Wherever you go next, I wish you luck as you continue your Java 8 journey!
Happy learning!
Start a discussion
What other Java concepts are you excited to start learning? Was this article helpful? Let us know in the comments below!
Top comments (5)
Great article. For me, that "Less visual clutter in code" could be a disadvantage, because of that very weird arrow/lambda stuff you have to catch up with in the syntax/code.
It's heartbreaking to see some devs just tend to pick the easiest way (or somehow, the solution that looks hacky), and tomorrow they will be like "why is this code slow? java bad".
Most cool part of stream API is that parallel loop(which is still possible with classic-loops), which is very handy in situations like performing some blocking(mostly IO) ops for each iteration.
Stream processing has an overhead, yes. One can say there is a for-loop internally and the stream code is a wrapper over that. But, the overhead is not large.
No. It is not an overkill on small collections or a large data source.
Streams and functional interfaces bring functional style programming to Java's object oriented capabilities. Yes, these features are advantageous in terms of writing code in easily readable way and hence highly maintainable.
Another aspect of streams programming is not requiring the usage of state. For example, when you use a for-loop to calculate the sum of numbers from a collection (or an array), the code related with for-loop requires that you define a variable to store the sum. The streams programming doesn't have such requirement. This is a feature typical of functional programming.
Processing large data sources can take the advantage of using parallel streams. Parallel streams perform poorly when used with smaller data sources.
There is significant overhead associated with Java 8 streams. Here is a good explanation of where some of that overhead comes from. Java implements a lambda function via the JVM creating a class at runtime the first time the lambda is called that implements the relevant interface. If that lambda is used enough, the JIT eventually kicks in, and any negative impact on performance will eventually disappear overall. You are necessarily using lambdas with streams, sometimes multiple of them (e.g. one for a filter, another for a map, yet another for the forEach). Each results in a new class created by the JVM when first encountered. If you are doing so for a small collection, especially if that code utilizing streams is called a small number of times, the JIT won't have the opportunity to make up the loss in performance. For very large collections, or if streams are used in a hot spot, the JIT will have the opportunity to compile those lambdas, eliminating any negative performance impact.
Note that I'm not arguing against using streams. But you need to use them wisely, and recognize that there is overhead.
Use whatever makes your code most readable. And do not think about performance or internals... If you really see a performance issue based on measuring, then identify the root cause and fix it (assuming that you have unit tests)...
Apart from that using more recent versions of JDK's like JDK11, 17 (or currently JDK19) ... reduces that "overhead"
The overhead doesn't go away with Java 17 (it also isn't reduced by Java 17 at least not to affect measurements), which I use for my projects. It probably doesn't go away with 19 either, but I stick to LTS versions so can't say from experience.
If you measure (instead of assuming that others have not), you will see that streams are slower (as much as 2 to 4 times slower from my measurements depending upon a variety of factors). If performance doesn't matter for your application then don't worry about that and use streams. If performance does matter, then the root cause you say to look for may well be streams. It depends.
Here is a link to a comparison from Oracle: blogs.oracle.com/javamagazine/post.... The iterative version with a for loop runs in about a third of the time as sequential streams. And surprisingly about half the time as parallel streams. I'm guessing the latter relates to the task in their comparison. Aside from the surprising parallel result, Oracle's comparison using Java 17 is consistent with every other benchmarking I've seen with Java streams vs loops, as well as consistent with my own benchmarking. Java's sequential streams are significantly slower than the equivalent with a loop.