The "Simple" Problem
Let me paint the picture. I've got a stream of customer names, and I want to create a numbered list for a report:
List<String> customers = List.of("John Smith", "Jane Doe", "Bob Johnson");
// I want: ["1. John Smith", "2. Jane Doe", "3. Bob Johnson"]
Seems straightforward, right? Well, not with standard Java Streams.
The Ugly Solutions
Attempt #1: The AtomicInteger Hack
My first instinct was the classic AtomicInteger approach that every Java developer has probably written at least once:
AtomicInteger counter = new AtomicInteger(0);
List<String> numbered = customers.stream()
    .map(name -> (counter.getAndIncrement() + 1) + ". " + name)
    .toList();
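It works, but... ugh. Look at that thing! I'm creating external state just to track where I am in my stream. The counter itself is thread-safe, but if I go parallel there's no guarantee that elements reach the lambda in encounter order, so the numbers can end up attached to the wrong names. It's ugly, and honestly, it makes me feel dirty every time I write it.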
Plus, what if I forget to reset the counter? What if I accidentally use the same counter instance somewhere else? This approach is a bug waiting to happen.
Attempt #2: The IntStream Workaround
Then I remembered the "clever" IntStream approach:
List<String> numbered = IntStream.range(0, customers.size())
    .mapToObj(i -> (i + 1) + ". " + customers.get(i))
    .toList();
This only works if I already have a List (not a Stream), and it completely abandons the stream I was working with. Plus, it relies on cheap random access through get(i), so anything that arrives as a lazy stream has to be collected into a list first.
Attempt #3: The Custom Collector Nightmare
I won't even show you the custom collector I tried to write. Let's just say it involved way too much mutable state and made me question my life choices.
The "Why Is This So Hard?" Moment
I stepped back and thought: "In Kotlin, this would just be list.withIndex().map { (index, value) -> "${index + 1}. $value" }. Why is Java making this so complicated?"
And that's when it hit me. Java Streams are powerful, but they're missing some of the ergonomic features that other functional programming languages take for granted. The standard library was designed conservatively, which is good for stability, but sometimes frustrating for developer experience.
Real-World Index Problems
This isn't just about numbered lists. Here are real scenarios where I've needed indexed operations:
CSV Processing with Error Reporting
// I need to know which LINE failed parsing
try {
    List<Record> records = csvLines.stream()
        .map(this::parseRecord) // But which line threw the exception?
        .toList();
} catch (ParseException e) {
    // "Error parsing CSV" - Thanks, super helpful!
}
Batch Processing with Progress
// Processing 10,000 records, want to show progress
largeDataSet.stream()
    .map(this::expensiveOperation) // How do I show "Processing record 3,247 of 10,000"?
    .toList();
Building a Better Solution: The StreamX Journey
So I decided to solve this correctly. Here's how I approached building withIndex for StreamX, step by step:
Step 1: The Naive First Attempt
My first thought was: "Let me just wrap this in a utility method to hide the ugly AtomicInteger":
public static <T, R> Stream<R> withIndex(Stream<T> stream, BiFunction<T, Integer, R> mapper) {
    AtomicInteger counter = new AtomicInteger(0);
    return stream.map(element -> mapper.apply(element, counter.getAndIncrement()));
}
This cleaned up the calling code, but it still had all the same problems:
- External mutable state
- Not truly parallel-safe: in a parallel stream the indices can drift from encounter order
- The AtomicInteger overhead for every element
Step 2: The "What If I Collect First?" Attempt
Then I thought, maybe I should just collect in a list first:
public static <T, R> Stream<R> withIndex(Stream<T> stream, BiFunction<T, Integer, R> mapper) {
    List<T> elements = stream.collect(Collectors.toList());
    return IntStream.range(0, elements.size())
        .mapToObj(i -> mapper.apply(elements.get(i), i));
}
This worked, but it defeated the whole point of streams! There's no more lazy evaluation: everything gets materialized into memory immediately. For a large stream this could be a performance killer, and for an infinite one the call never returns.
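To see the eagerness concretely, here's a small sketch. The class name EagerWithIndexDemo and the peek-based counter are only for illustration and are not part of StreamX. It counts how many source elements have already been consumed by the time the collect-first withIndex returns, before any terminal operation has run on the result:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class EagerWithIndexDemo {

    // Same shape as the collect-first helper: the source is drained up front.
    static <T, R> Stream<R> withIndex(Stream<T> stream, BiFunction<T, Integer, R> mapper) {
        List<T> elements = stream.collect(Collectors.toList());
        return IntStream.range(0, elements.size())
                .mapToObj(i -> mapper.apply(elements.get(i), i));
    }

    // Returns how many source elements were consumed before the caller
    // ran any terminal operation on the stream that withIndex returned.
    static int consumedBeforeTerminalOp() {
        AtomicInteger consumed = new AtomicInteger();
        Stream<String> source = Stream.of("a", "b", "c")
                .peek(x -> consumed.incrementAndGet());
        Stream<String> indexed = withIndex(source, (value, i) -> i + ":" + value);
        // Note: 'indexed' has not been traversed at this point.
        return consumed.get();
    }

    public static void main(String[] args) {
        System.out.println(consumedBeforeTerminalOp()); // prints 3
    }
}
```

All three elements are consumed inside withIndex itself, which is exactly the materialization described above.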
Step 3: The "I Need to Think Differently" Moment
I realized I was thinking about this wrong. The problem wasn't with Java Streams themselves - it was that I needed to think at the Spliterator level. Streams are built on Spliterators, and that's where the real magic happens.
What if I could create a Spliterator that automatically tracks indices as it processes elements?
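If you haven't worked at this level before, here's a rough sketch of what "thinking at the Spliterator level" looks like. It isn't StreamX code; the class and method names are made up for illustration. Every stream can hand over its underlying Spliterator, and tryAdvance pulls exactly one element at a time, which is the hook an index counter needs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Stream;

final class SpliteratorPeek {

    // Pulls elements one at a time through tryAdvance and numbers them by hand.
    static List<String> numberByHand(Stream<String> stream) {
        Spliterator<String> source = stream.spliterator();
        List<String> out = new ArrayList<>();
        int[] index = {0}; // single-element array so the lambda can update it

        // tryAdvance hands the next element to the lambda and returns true,
        // or returns false once the source is exhausted.
        while (source.tryAdvance(item -> out.add((++index[0]) + ". " + item))) {
            // nothing to do here; the work happens in the lambda
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(numberByHand(Stream.of("John Smith", "Jane Doe", "Bob Johnson")));
        // prints [1. John Smith, 2. Jane Doe, 3. Bob Johnson]
    }
}
```

This version still drains everything into a list, so it's only a stepping stone. The point is that the counter lives next to the traversal rather than in a lambda passed to map.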
Step 4: Building IndexedValue
First, I needed a clean way to represent an element paired with its index:
public record IndexedValue<T>(T value, int index) {
    @Override
    public String toString() {
        return "IndexedValue{value=" + value + ", index=" + index + "}";
    }
}
Simple, immutable, and tells you exactly what it is. No mystery here.
Step 5: The IndexingSpliterator
This is where it gets interesting. I needed a Spliterator that wraps another Spliterator and adds index tracking:
import java.util.Spliterator;
import java.util.function.Consumer;

public class IndexingSpliterator<T> implements Spliterator<IndexedValue<T>> {
    private final Spliterator<T> source;
    private int index = 0; // next index to hand out; private to this spliterator

    public IndexingSpliterator(Spliterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean tryAdvance(Consumer<? super IndexedValue<T>> action) {
        // Pull one element from the source and pair it with the current index
        return source.tryAdvance(item ->
            action.accept(new IndexedValue<>(item, index++)));
    }

    @Override
    public long estimateSize() {
        return source.estimateSize();
    }

    @Override
    public int characteristics() {
        // Drop SORTED: IndexedValue has no natural ordering, and reporting SORTED
        // without overriding getComparator() makes StreamSupport.stream() throw
        return source.characteristics() & ~Spliterator.SORTED;
    }

    @Override
    public Spliterator<IndexedValue<T>> trySplit() {
        // For simplicity, we don't support splitting (no parallel processing)
        // A full implementation would need to handle this properly
        return null;
    }
}
The key insight here is tryAdvance(): every time the underlying Spliterator produces an element, we wrap it with its index and increment our counter. Clean, simple, and the state is encapsulated within the Spliterator itself.
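To make the "encapsulated state" point concrete, here's a minimal sketch that drives the wrapper directly, without any Stream around it. It uses a trimmed copy of the classes so the snippet compiles on its own; the outer class EncapsulatedIndexDemo and the drain helper are hypothetical names for this example only. Each wrapper instance owns its counter, so there's nothing to reset and nothing to share by accident:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;

final class EncapsulatedIndexDemo {

    record IndexedValue<T>(T value, int index) {}

    // Trimmed copy of the indexing wrapper, kept here so the sketch is self-contained.
    static final class IndexingSpliterator<T> implements Spliterator<IndexedValue<T>> {
        private final Spliterator<T> source;
        private int index = 0;

        IndexingSpliterator(Spliterator<T> source) {
            this.source = source;
        }

        @Override
        public boolean tryAdvance(Consumer<? super IndexedValue<T>> action) {
            return source.tryAdvance(item -> action.accept(new IndexedValue<T>(item, index++)));
        }

        @Override
        public Spliterator<IndexedValue<T>> trySplit() {
            return null; // no splitting, as in the simplified version
        }

        @Override
        public long estimateSize() {
            return source.estimateSize();
        }

        @Override
        public int characteristics() {
            // SORTED is dropped because wrapped values have no ordering of their own
            return source.characteristics() & ~Spliterator.SORTED;
        }
    }

    // Wraps the list's spliterator in a fresh IndexingSpliterator and drains it.
    static <T> List<IndexedValue<T>> drain(List<T> items) {
        List<IndexedValue<T>> out = new ArrayList<>();
        new IndexingSpliterator<>(items.spliterator()).forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(List.of("a", "b"))); // indices 0 and 1
        System.out.println(drain(List.of("x")));      // a fresh wrapper starts at 0 again
    }
}
```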
Step 6: Building zipWithIndex
Now I could create the core operation:
public static <T> Stream<IndexedValue<T>> zipWithIndex(Stream<T> stream) {
    return StreamSupport.stream(
        new IndexingSpliterator<>(stream.spliterator()),
        stream.isParallel() // Carry over the parallel flag; trySplit() returns null, so indexing itself stays sequential
    );
}
This gives me a stream where each element is paired with its index. Perfect!
Step 7: The Final withIndex Implementation
And finally, the clean API I originally wanted:
public static <T, R> Stream<R> withIndex(Stream<T> stream, BiFunction<T, Integer, R> mapper) {
    return zipWithIndex(stream)
        .map(indexed -> mapper.apply(indexed.value(), indexed.index()));
}
Now I can write:
List<String> numbered = StreamX.withIndex(customers.stream(),
        (name, index) -> (index + 1) + ". " + name)
    .toList();
Why This Approach Wins
1. No External State
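The index tracking is encapsulated within the Spliterator. The counter is still mutable, but it's private to a single spliterator instance, so there's no counter shared between call sites to reset, reuse by accident, or leak.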
2. Preserves Stream Characteristics
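The operation carries over the source's parallel flag and characteristics such as ORDERED and SIZED. Because trySplit() returns null, the indexing step itself still runs sequentially, but the result is a regular Stream. It's a proper stream citizen.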
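One characteristic needs care whenever a wrapper reports the source's flags: SORTED. A spliterator that reports SORTED is expected to answer getComparator(), and the default implementation of that method throws IllegalStateException. The sketch below uses a hypothetical maskSorted switch (not part of StreamX) to compare a wrapper that forwards the flags unchanged with one that clears SORTED, using a TreeSet as the sorted source:

```java
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class SortedCharacteristicDemo {

    record IndexedValue<T>(T value, int index) {}

    static final class IndexingSpliterator<T> implements Spliterator<IndexedValue<T>> {
        private final Spliterator<T> source;
        private final boolean maskSorted; // illustration-only switch
        private int index = 0;

        IndexingSpliterator(Spliterator<T> source, boolean maskSorted) {
            this.source = source;
            this.maskSorted = maskSorted;
        }

        @Override
        public boolean tryAdvance(Consumer<? super IndexedValue<T>> action) {
            return source.tryAdvance(item -> action.accept(new IndexedValue<T>(item, index++)));
        }

        @Override
        public Spliterator<IndexedValue<T>> trySplit() {
            return null;
        }

        @Override
        public long estimateSize() {
            return source.estimateSize();
        }

        @Override
        public int characteristics() {
            int c = source.characteristics();
            return maskSorted ? c & ~Spliterator.SORTED : c;
        }
    }

    static <T> List<IndexedValue<T>> zipToList(Stream<T> stream, boolean maskSorted) {
        return StreamSupport
                .stream(new IndexingSpliterator<>(stream.spliterator(), maskSorted), false)
                .toList();
    }

    // True if forwarding SORTED unchanged makes stream construction fail.
    static boolean forwardingSortedFails() {
        try {
            zipToList(new TreeSet<>(List.of("b", "a")).stream(), false);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(forwardingSortedFails());
        System.out.println(zipToList(new TreeSet<>(List.of("b", "a")).stream(), true));
    }
}
```

Clearing the flag is a reasonable default here because an IndexedValue has no natural ordering of its own. Overriding getComparator() would be the alternative if you wanted to keep reporting SORTED.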
3. Lazy Evaluation
Elements are only processed when needed. The index calculation happens on demand.
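A quick way to check the laziness claim is to index an infinite stream, which would never finish if anything were collected up front. This is a minimal sketch with a trimmed copy of the wrapper so it compiles on its own; LazyIndexingDemo and firstThreePowersOfTwo are illustrative names, not StreamX API:

```java
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class LazyIndexingDemo {

    record IndexedValue<T>(T value, int index) {}

    static final class IndexingSpliterator<T> implements Spliterator<IndexedValue<T>> {
        private final Spliterator<T> source;
        private int index = 0;

        IndexingSpliterator(Spliterator<T> source) {
            this.source = source;
        }

        @Override
        public boolean tryAdvance(Consumer<? super IndexedValue<T>> action) {
            return source.tryAdvance(item -> action.accept(new IndexedValue<T>(item, index++)));
        }

        @Override
        public Spliterator<IndexedValue<T>> trySplit() {
            return null;
        }

        @Override
        public long estimateSize() {
            return source.estimateSize();
        }

        @Override
        public int characteristics() {
            return source.characteristics() & ~Spliterator.SORTED;
        }
    }

    static <T> Stream<IndexedValue<T>> zipWithIndex(Stream<T> stream) {
        return StreamSupport.stream(
                new IndexingSpliterator<>(stream.spliterator()), stream.isParallel());
    }

    // Indexes an infinite stream of powers of two and keeps only the first three.
    static List<String> firstThreePowersOfTwo() {
        return zipWithIndex(Stream.iterate(1, x -> x * 2))
                .limit(3) // short-circuits, so the infinite source is never drained
                .map(iv -> iv.index() + ":" + iv.value())
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(firstThreePowersOfTwo()); // prints [0:1, 1:2, 2:4]
    }
}
```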
4. Composable
You can chain this with other stream operations naturally:
List<String> result = StreamX.withIndex(customers.stream(),
        (name, index) -> (index + 1) + ". " + name)
    .filter(line -> !line.contains("John")) // Still a normal stream!
    .map(String::toUpperCase)
    .toList();
5. Familiar API
If you've used Kotlin's withIndex() or Scala's zipWithIndex, this feels completely natural.
Real-World Examples
CSV Processing with Error Lines
List<String> errors = StreamX.withIndex(csvLines.stream(),
        (line, index) -> {
            try {
                parseRecord(line);
                return null;
            } catch (ParseException e) {
                return "Line " + (index + 1) + ": " + e.getMessage();
            }
        })
    .filter(Objects::nonNull) // Only keep the errors
    .toList();
Progress Tracking
int totalSize = data.size();
List<Result> results = StreamX.withIndex(data.stream(),
        (item, index) -> {
            if (index % 100 == 0) {
                System.out.printf("Processing %d of %d (%.1f%%)%n",
                    index, totalSize, (index * 100.0) / totalSize);
            }
            return processItem(item);
        })
    .toList();
Conditional Processing by Position
List<String> htmlRows = StreamX.withIndex(tableData.stream(),
        (row, index) -> {
            String cssClass = index % 2 == 0 ? "even-row" : "odd-row";
            return String.format("<tr class='%s'>%s</tr>", cssClass, row);
        })
    .toList();
The Lessons Learned
Building this feature taught me a few things:
Sometimes the standard library isn't enough - and that's okay! Java can't include every possible utility operation.
Good APIs hide complexity - The final withIndex method is simple to use, but the underlying implementation requires understanding Spliterators.
Functional programming patterns are worth stealing - When Kotlin, Scala, and Haskell all have similar operations, there's probably a good reason.
Performance matters - The Spliterator approach maintains lazy evaluation and stream characteristics.
Real problems deserve real solutions - This wasn't an academic exercise; it solved actual day-to-day frustrations.
Check out the full code, documentation, and examples here: StreamX