In the last few parts we covered intermediate and terminal operations. Now we will take a deep dive into collectors.
If map() and filter() transform your data,
collect() is what turns it into something meaningful.
In this article, we’ll go deep into:
- What Collectors are
- How collect() works internally
- Built-in collectors (with real-world examples)
- Downstream collectors
- Custom collectors
- Performance considerations
- Best practices
Let’s dive in.
What is a Collector?
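A Collector, introduced in Java 8, is a mechanism used to accumulate the elements of a stream into a final result.
The interface itself is java.util.stream.Collector. Ready-made implementations come from the static factory methods of:
java.util.stream.Collectors
The collect() method is a terminal operation, meaning it produces a result and consumes the stream, so the stream cannot be reused afterwards.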
Example:
List<String> names =
Stream.of("Priyank", "Rahul", "Ram")
.collect(Collectors.toList()); // [Priyank, Rahul, Ram]
Here:
- Stream elements → Collected into a List
- Collectors.toList() → defines how accumulation happens
How collect() Works Internally
The collect() method takes a Collector, which internally consists of:
- Supplier → Creates a new mutable container
- Accumulator → Adds elements into the container
- Combiner → Merges two containers (used in parallel streams)
- Finisher → Final transformation (optional)
- Characteristics → Optimization hints
Conceptually:
<R, A> R collect(Collector<? super T, A, R> collector)
Where:
T = Stream element type
A = Intermediate accumulation type
R = Final result type
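To make the five parts concrete, here is a minimal sketch that assembles a collector by hand with Collector.of(). The class name CollectorPartsDemo and the helper toUnmodifiableListSketch() are hypothetical names chosen for illustration, not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorPartsDemo {

    // Hypothetical collector for illustration: gathers elements into an
    // ArrayList (A = List<T>), then wraps it in an unmodifiable view (R = List<T>).
    static <T> Collector<T, List<T>, List<T>> toUnmodifiableListSketch() {
        return Collector.of(
                ArrayList::new,                // supplier: creates a new mutable container
                List::add,                     // accumulator: adds one element to the container
                (left, right) -> {             // combiner: merges two partial containers
                    left.addAll(right);
                    return left;
                },
                Collections::unmodifiableList  // finisher: final transformation
                // no Characteristics are passed, so none are declared
        );
    }

    static List<String> collectNames() {
        return Stream.of("Priyank", "Rahul", "Ram")
                .collect(toUnmodifiableListSketch());
    }

    public static void main(String[] args) {
        System.out.println(collectNames()); // [Priyank, Rahul, Ram]
    }
}
```

In a sequential stream like this one the combiner is normally not invoked. It matters once the stream runs in parallel and partial containers have to be merged.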
Commonly Used Built-in Collectors
Let’s explore the most important ones.
1 toList()
List<Integer> list =
Stream.of(1, 2, 3)
.collect(Collectors.toList()); // [1, 2, 3]
Note:
Collectors.toList() makes no guarantees about the type, mutability, or thread-safety of the returned List (in practice it is often an ArrayList, but that is not specified).
If you need a specific type:
.collect(Collectors.toCollection(LinkedList::new));
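As a runnable sketch of the toCollection() variant, the snippet below asks for a LinkedList explicitly. The class and method names (ToCollectionDemo, asLinkedList) are made up for this example.

```java
import java.util.LinkedList;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToCollectionDemo {

    // Requests a LinkedList explicitly instead of relying on whatever toList() returns.
    static LinkedList<Integer> asLinkedList() {
        return Stream.of(1, 2, 3)
                .collect(Collectors.toCollection(LinkedList::new));
    }

    public static void main(String[] args) {
        LinkedList<Integer> linked = asLinkedList();
        System.out.println(linked);            // [1, 2, 3]
        System.out.println(linked.getFirst()); // 1
    }
}
```

Because the supplier is a constructor reference, the static return type is the concrete collection, so LinkedList methods such as getFirst() can be called without a cast.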
2 toSet()
Set<String> uniqueNames =
Stream.of("A", "B", "A")
.collect(Collectors.toSet()); // [A, B] (iteration order is not guaranteed)
Removes duplicates automatically.
3 toMap()
Very powerful, and very dangerous if used incorrectly.
Map<String, Integer> map =
Stream.of("Java", "Python", "Go")
.collect(Collectors.toMap(
s -> s,
s -> s.length()
));
Duplicate keys will throw:
IllegalStateException: Duplicate key
Safe version:
Collectors.toMap(
keyMapper,
valueMapper,
(existing, replacement) -> existing
)
This tells Java what to do when a duplicate key occurs.
- existing → value already present in the map
- replacement → new value being added for the same key
Returning existing means:
“Ignore the new value and keep the old one.”
So instead of throwing an exception, Java resolves the conflict gracefully.
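The following sketch shows both behaviours side by side. It keys words by their length, an assumption made here purely so that two elements ("Java" and "Rust") collide on the same key. ToMapMergeDemo and its method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToMapMergeDemo {

    // Key: word length, value: the word. "Java" and "Rust" both map to key 4.
    static Map<Integer, String> keepFirst() {
        return Stream.of("Java", "Rust", "Go")
                .collect(Collectors.toMap(
                        String::length,
                        Function.identity(),
                        (existing, replacement) -> existing)); // keep the old value
    }

    // Same stream without a merge function: the duplicate key 4 is rejected.
    static boolean throwsOnDuplicate() {
        try {
            Stream.of("Java", "Rust", "Go")
                    .collect(Collectors.toMap(String::length, Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Copy into a TreeMap so the keys print in sorted order.
        System.out.println(new TreeMap<>(keepFirst())); // {2=Go, 4=Java}
        System.out.println(throwsOnDuplicate());        // true
    }
}
```

Returning replacement instead of existing would keep the last value seen, and a function such as (a, b) -> a + "," + b would combine both.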
4 joining()
Perfect for String concatenation.
String result =
Stream.of("Java", "is", "awesome")
.collect(Collectors.joining(" "));
Output
Java is awesome
With prefix and suffix:
String result =
Stream.of("Java", "is", "awesome")
.collect(Collectors.joining(", ", "[", "]"));
Output
[Java, is, awesome]
It behaves like:
prefix + element1 + delimiter + element2 + ... + suffix
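A small runnable sketch of that pattern, including the empty-stream case, where only the prefix and suffix remain. JoiningDemo and bracketed() are illustrative names.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class JoiningDemo {

    // Joins the given words with ", " and wraps the result in square brackets.
    static String bracketed(String... words) {
        return Arrays.stream(words)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(bracketed("Java", "is", "awesome")); // [Java, is, awesome]
        System.out.println(bracketed());                        // []
    }
}
```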
5 counting()
Collectors.counting() counts the elements it receives and returns a Long. It is most useful as a downstream collector for groupingBy().
Example: Count students per department
Map<String, Long> result =
students.stream()
.collect(Collectors.groupingBy(
Student::getDepartment,
Collectors.counting()
));
Output
{
"IT"=5,
"HR"=3,
"Finance"=4
}
6 summing/averaging/max/min
int total =
employees.stream()
.collect(Collectors.summingInt(Employee::getSalary));
Other variants:
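- summingLong
- summingDouble
- averagingInt
- maxBy
- minBy
Swap in summingLong, summingDouble, or averagingInt for summingInt as required (the averaging collectors return a Double). maxBy and minBy take a Comparator instead of a mapper function and return an Optional.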
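The variants differ mainly in their argument and return types. The sketch below runs three of them over a plain list of salaries (the same five values as the employee list used later in this article). NumericCollectorsDemo and its method names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class NumericCollectorsDemo {

    static final List<Integer> SALARIES =
            Arrays.asList(50000, 70000, 40000, 80000, 30000);

    // summingInt takes a ToIntFunction and produces an Integer.
    static int total() {
        return SALARIES.stream().collect(Collectors.summingInt(Integer::intValue));
    }

    // averagingInt takes the same kind of mapper but produces a Double.
    static double average() {
        return SALARIES.stream().collect(Collectors.averagingInt(Integer::intValue));
    }

    // maxBy takes a Comparator and produces an Optional (empty for an empty stream).
    static Optional<Integer> highest() {
        return SALARIES.stream().collect(Collectors.maxBy(Comparator.naturalOrder()));
    }

    public static void main(String[] args) {
        System.out.println(total());   // 270000
        System.out.println(average()); // 54000.0
        System.out.println(highest()); // Optional[80000]
    }
}
```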
Grouping and Partitioning (Most Powerful Use Case)
This is where collectors shine.
groupingBy()
Example: Group employees by department.
Map<String, List<Employee>> grouped =
employees.stream()
.collect(Collectors.groupingBy(Employee::getDepartment));
Output
{
"IT" → [emp1, emp2],
"HR" → [emp3]
}
Multi-Level Grouping
Problem statement: For a given list of employees, group the employees of each department by their role.
Map<String, Map<String, List<Employee>>> result =
employees.stream()
.collect(Collectors.groupingBy(
Employee::getDepartment,
Collectors.groupingBy(Employee::getRole)
));
Consider the following Employee list:
Employee("Aman", "IT", "Developer",50000)
Employee("Priya", "IT", "Developer",70000)
Employee("Rohit", "IT", "Manager",40000)
Employee("Neha", "HR", "Recruiter",80000)
Employee("Simran", "HR", "Manager",30000)
Output
{
"IT" = {
"Developer" = [
Employee{name='Aman', department='IT', role='Developer',salary=50000},
Employee{name='Priya', department='IT', role='Developer',salary=70000}
],
"Manager" = [
Employee{name='Rohit', department='IT', role='Manager',salary=40000}
]
},
"HR" = {
"Recruiter" = [
Employee{name='Neha', department='HR', role='Recruiter',salary=80000}
],
"Manager" = [
Employee{name='Simran', department='HR', role='Manager',salary=30000}
]
}
}
Grouping with downstream collectors
Map<String, Long> countByDept =
employees.stream()
.collect(Collectors.groupingBy(
Employee::getDepartment,
Collectors.counting()
));
Output
{
"IT" = 3,
"HR" = 2
}
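To try the grouping examples end to end, here is a self-contained sketch built on the five employees listed above. The Employee class is a minimal stand-in assumed for illustration (the article does not show its definition), and GroupingDemo with its method names is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingDemo {

    // Minimal Employee type assumed for illustration; the real class may differ.
    static final class Employee {
        private final String name;
        private final String department;
        private final String role;
        private final int salary;

        Employee(String name, String department, String role, int salary) {
            this.name = name;
            this.department = department;
            this.role = role;
            this.salary = salary;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
        String getRole() { return role; }
        int getSalary() { return salary; }
    }

    static final List<Employee> EMPLOYEES = Arrays.asList(
            new Employee("Aman", "IT", "Developer", 50000),
            new Employee("Priya", "IT", "Developer", 70000),
            new Employee("Rohit", "IT", "Manager", 40000),
            new Employee("Neha", "HR", "Recruiter", 80000),
            new Employee("Simran", "HR", "Manager", 30000));

    // Downstream counting(): one Long per department.
    static Map<String, Long> countByDept() {
        return EMPLOYEES.stream().collect(
                Collectors.groupingBy(Employee::getDepartment, Collectors.counting()));
    }

    // Downstream summingInt(): total salary per department.
    static Map<String, Integer> salaryByDept() {
        return EMPLOYEES.stream().collect(
                Collectors.groupingBy(Employee::getDepartment,
                        Collectors.summingInt(Employee::getSalary)));
    }

    // Downstream groupingBy(): department -> role -> employees.
    static Map<String, Map<String, List<Employee>>> byDeptAndRole() {
        return EMPLOYEES.stream().collect(
                Collectors.groupingBy(Employee::getDepartment,
                        Collectors.groupingBy(Employee::getRole)));
    }

    public static void main(String[] args) {
        // Copy into a TreeMap so the keys print in sorted order.
        System.out.println(new TreeMap<>(countByDept()));  // {HR=2, IT=3}
        System.out.println(new TreeMap<>(salaryByDept())); // {HR=110000, IT=160000}
        System.out.println(byDeptAndRole().get("IT").get("Developer").size()); // 2
    }
}
```

The single-argument groupingBy() does not promise a particular Map implementation, which is why the sketch copies the results into a TreeMap before printing.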
partitioningBy()
Use it when the classification is a boolean condition (a Predicate), so the stream is split into exactly two groups.
Map<Boolean, List<Employee>> partitioned =
employees.stream()
.collect(Collectors.partitioningBy(
e -> e.getSalary() > 50000
));
Result
true → employees whose salary is greater than 50000
false → all other employees
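A runnable sketch of the partition, using the names and salaries from the employee list above. The two-field Employee class here is a simplified stand-in assumed for illustration, and PartitioningDemo and bySalaryAbove() are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitioningDemo {

    // Minimal Employee type assumed for illustration; toString() prints only the name.
    static final class Employee {
        private final String name;
        private final int salary;

        Employee(String name, int salary) {
            this.name = name;
            this.salary = salary;
        }

        int getSalary() { return salary; }

        @Override
        public String toString() { return name; }
    }

    static final List<Employee> EMPLOYEES = Arrays.asList(
            new Employee("Aman", 50000),
            new Employee("Priya", 70000),
            new Employee("Rohit", 40000),
            new Employee("Neha", 80000),
            new Employee("Simran", 30000));

    // Splits the employees into those above the threshold (true) and the rest (false).
    static Map<Boolean, List<Employee>> bySalaryAbove(int threshold) {
        return EMPLOYEES.stream()
                .collect(Collectors.partitioningBy(e -> e.getSalary() > threshold));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Employee>> parts = bySalaryAbove(50000);
        System.out.println(parts.get(true));  // [Priya, Neha]
        System.out.println(parts.get(false)); // [Aman, Rohit, Simran]

        // No salary exceeds 100000, yet the true key is still present with an empty list.
        System.out.println(bySalaryAbove(100000).get(true)); // []
    }
}
```

Note that Aman lands in the false group because 50000 is not strictly greater than 50000. Unlike groupingBy(), the resulting map contains both keys even when one side is empty.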
What's next?
In part 2 of Collectors in depth, we will see:
- Downstream Collectors (Advanced)
- collectingAndThen()
- Creating a custom collector
- Parallel streams and collectors
- And more