
The Ultimate Guide to Java Stream API Collectors

Overview

This guide aims to be your friendly neighbourhood complete Stream API Collectors reference, additionally covering changes introduced in separate Java releases.

All methods feature copy-paste’able examples or references to other articles that provide such.

Please also check out my library – parallel-collectors.

Introduction to Stream#collect()

Stream.collect() is one of the most important and commonly used terminal methods of the Stream API.

It allows performing mutable fold operations on elements held in Stream instances. Simply put, it consumes the whole stream and does something with its elements; in most cases, it repackages them into some sort of collection.

To collect the stream’s elements, we need to provide a collecting strategy represented by the Collector interface.

public interface Collector<T, A, R> {
    Supplier<A> supplier();
    BiConsumer<A, T> accumulator();
    BinaryOperator<A> combiner();
    Function<A, R> finisher();
    Set<Characteristics> characteristics();
}

Luckily, in most cases, we don’t need to worry about implementing it ourselves, since the most common implementations are provided for us in the Collectors facade.
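
For the rare case where a custom strategy is needed, the four functions of the interface can be passed to the Collector.of() factory method. The following is a minimal sketch using only the JDK; the class name and the toBracedString() and braced() helpers are hypothetical and exist purely for illustration:

```java
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collector;

public class CustomCollectorSketch {

    // Hypothetical collector: joins strings into the form "{a, b, c}".
    static Collector<String, StringJoiner, String> toBracedString() {
        return Collector.of(
          () -> new StringJoiner(", ", "{", "}"), // supplier: creates the mutable container
          StringJoiner::add,                      // accumulator: folds one element into it
          StringJoiner::merge,                    // combiner: merges two partial containers
          StringJoiner::toString);                // finisher: converts the container to the result
    }

    static String braced(List<String> items) {
        return items.stream().collect(toBracedString());
    }

    public static void main(String[] args) {
        System.out.println(braced(List.of("one", "two", "three"))); // {one, two, three}
    }
}
```

The combiner only comes into play when partial results have to be merged, for example in parallel streams.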

The Collectors Facade

All predefined Collector implementations can be found in the Collectors facade class.

It’s a common practice to use static imports for the sake of increasing readability:

import static java.util.stream.Collectors.toList;
import static java.util.stream.Collectors.toMap;
import static java.util.stream.Collectors.toSet;

Collectors#toList/toSet/toCollection

Simply put, this family of collectors can be used for taking all stream elements and repackaging them into a collection of our choice – in this case, we have three convenience collectors available:

  1. toList()
  2. toSet()
  3. toCollection()

The first two can be used for handy and boilerplate-free creation of List and Set instances from Streams.

The critical thing to remember is the fact that we can’t assume any particular List/Set implementation with these collectors.

If you want to have more control over this, use toCollection instead, which accepts a Supplier instance providing a collection of our choice.

Examples:

@Test
void E_toList() {
    List<Integer> list = List.of(1, 2, 3);

    List<Integer> result = list.stream()
      .collect(Collectors.toList());

    assertThat(result)
      .hasSize(3)
      .containsOnly(1, 2, 3);
}

@Test
void E_toSet() {
    List<Integer> list = List.of(1, 2, 3, 3);

    Set<Integer> result = list.stream()
      .collect(Collectors.toSet());

    assertThat(result)
      .hasSize(3)
      .containsOnly(1, 2, 3);
}

@Test
void E_toCollection() {
    List<Integer> list = List.of(1, 2, 3);

    Collection<Integer> result = list.stream()
      .collect(Collectors.toCollection(LinkedList::new));

    assertThat(result)
      .isInstanceOf(LinkedList.class)
      .hasSize(3)
      .containsOnly(1, 2, 3);
}

Collectors#toUnmodifiableList/toUnmodifiableSet (since JDK10)

If you want to collect all elements into an immutable collection, the classic Collectors#toCollection won’t suffice, since it assumes mutability of the target collection.

To do that, you can leverage dedicated convenience collectors:

  1. toUnmodifiableList()
  2. toUnmodifiableSet()

Examples:

@Test
void E_toUnmodifiableList() {
    List<Integer> list = List.of(1, 2, 3);

    List<Integer> result = list.stream()
      .collect(Collectors.toUnmodifiableList());

    assertThat(result)
      .hasSize(3)
      .containsOnly(1, 2, 3);

    assertThatThrownBy(() -> result.add(42))
      .isExactlyInstanceOf(UnsupportedOperationException.class);
}

@Test
void E_toUnmodifiableSet() {
    List<Integer> list = List.of(1, 2, 3, 3);

    Set<Integer> result = list.stream()
      .collect(Collectors.toUnmodifiableSet());

    assertThat(result)
      .hasSize(3)
      .containsOnly(1, 2, 3);

    assertThatThrownBy(() -> result.add(42))
      .isExactlyInstanceOf(UnsupportedOperationException.class);
}

Collectors#toMap()

The toMap collector can be used to collect Stream elements into a Map instance. To do this, we need to provide two strategies:

  • keyMapper
  • valueMapper

keyMapper will be used for extracting a Map key from a Stream element, and valueMapper will be used for extracting a value associated with a given key.

Example:

@Test
void E_toMap() {
    List<String> list = List.of("one", "two", "three");

    Map<String, Integer> result = list.stream()
      .collect(toMap(e -> e, e -> e.length()));

    assertThat(result)
      .hasSize(3)
      .containsEntry("one", 3)
      .containsEntry("two", 3)
      .containsEntry("three", 5);
}

Collision Handling

In contrast to toSet, toMap doesn’t silently filter duplicates. When two elements map to the same key, the two-argument overload throws an IllegalStateException. It’s understandable: how could it figure out which value to associate with each key?
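
The failure mode of the two-argument toMap() on duplicate keys, an IllegalStateException, can be sketched as follows. The class and the failsOnDuplicateKey() helper are hypothetical names, used here only to make the behaviour observable without a test framework:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapDuplicateKeySketch {

    // Returns true if collecting by string length fails because two elements share a key.
    static boolean failsOnDuplicateKey(List<String> items) {
        try {
            Map<Integer, String> ignored = items.stream()
              .collect(Collectors.toMap(String::length, s -> s));
            return false;
        } catch (IllegalStateException expected) {
            // the two-argument toMap() signals a key collision this way
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsOnDuplicateKey(List.of("one", "two", "three"))); // true
        System.out.println(failsOnDuplicateKey(List.of("one", "three")));        // false
    }
}
```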

In such cases, we can use a different toMap() overload, and provide a strategy for collision resolution:

@Test
void E_toMap_conflict_resolution() {
    List<String> list = List.of("one", "two", "three");

    Map<Integer, String> result = list.stream()
      .collect(toMap(String::length, e -> e, String::concat));

    assertThat(result)
      .hasSize(2)
      .containsEntry(3, "onetwo")
      .containsEntry(5, "three");
}

Custom Map Implementations

Just like with toCollection(), we need to be able to provide a custom Map implementation to our toMap() collector.

This can be achieved using yet another overload accepting a custom Supplier<Map<K, V>>:

@Test
void E_toMap_conflict_resolution_custom_map() {
    List<String> list = List.of("one", "two", "three");

    Map<Integer, String> result = list.stream()
      .collect(toMap(String::length, e -> e, String::concat, TreeMap::new));

    assertThat(result)
      .isExactlyInstanceOf(TreeMap.class)
      .hasSize(2)
      .containsEntry(3, "onetwo")
      .containsEntry(5, "three");
}

Collectors#collectingAndThen()

Whenever we want to apply a custom finisher straight after applying a collector, we can do that by using collectingAndThen()… which applies the provided collector and then (pun intended) applies the provided finisher.

For example, if you have not yet migrated beyond JDK9 but still need to collect a Stream instance into an immutable list, you can achieve that using collectingAndThen():

@Test
void E_collectingAndThen() {
    List<String> list = List.of("one", "two", "three");

    List<String> result = list.stream()
      .collect(
        collectingAndThen(toList(), 
          Collections::unmodifiableList));

    assertThat(result)
      .hasSize(3)
      .containsExactly("one", "two", "three");

    assertThatThrownBy(() -> result.add(""))
      .isExactlyInstanceOf(UnsupportedOperationException.class);
}

Collectors#joining()

Simply put, this collector can be used for collecting all Stream<CharSequence> elements into a String instance.

Additionally, we can provide a custom delimiter, prefix, and suffix:

@Test
void E_joining() {
    List<String> list = List.of("one", "two", "three");

    String result = list.stream()
      .collect(Collectors.joining());

    assertThat(result).isEqualTo("onetwothree");
}

@Test
void E_joining_separator() {
    List<String> list = List.of("one", "two", "three");

    String result = list.stream()
      .collect(Collectors.joining(","));

    assertThat(result).isEqualTo("one,two,three");
}

@Test
void E_joining_separator_prefix_suffix() {
    List<String> list = List.of("one", "two", "three");

    String result = list.stream()
      .collect(Collectors.joining(",", "[", "]"));

    assertThat(result).isEqualTo("[one,two,three]");
}

Collectors#summarizingDouble/Long/Int()

This one derives summary statistics (count, sum, min, max, and average) from the Double/Long/Int values extracted from Stream elements:

@Test
void E_summarizing() {
    List<String> list = List.of("one", "four", "three");

    IntSummaryStatistics result = list.stream()
      .collect(summarizingInt(String::length));

    assertThat(result.getAverage()).isEqualTo(4);
    assertThat(result.getCount()).isEqualTo(3);
    assertThat(result.getMax()).isEqualTo(5);
    assertThat(result.getMin()).isEqualTo(3);
    assertThat(result.getSum()).isEqualTo(12);
}

Collectors#groupingBy()

The groupingBy collector is used for grouping objects by some property and storing results in a Map instance.

For example, assuming givenList holds the strings "a", "bb", "ccc", and "dd", we can group them by string length and store grouping results in Set instances:

Map<Integer, Set<String>> result = givenList.stream()
  .collect(groupingBy(String::length, toSet()));

This will result in the following being true:

assertThat(result)
  .containsEntry(1, newHashSet("a"))
  .containsEntry(2, newHashSet("bb", "dd"))
  .containsEntry(3, newHashSet("ccc"));

Notice that the second argument of the groupingBy method is a Collector and you are free to use any Collector of your choice, or skip the parameter and let it default to a standard toList().

A deep dive into groupingBy() can be found in a separate article.
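
As a self-contained illustration of the overloads, here is a small sketch using only the JDK. The class name, the helper methods, and the sample data are assumptions made for this example; the three-argument overload, which additionally accepts a Map supplier, is shown next to the one-argument form:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupingBySketch {

    // One-argument overload: each group is collected with toList() by default.
    static Map<Integer, List<String>> byLength(List<String> items) {
        return items.stream()
          .collect(Collectors.groupingBy(String::length));
    }

    // Three-argument overload: custom Map implementation plus a downstream collector.
    static Map<Integer, Set<String>> byLengthSorted(List<String> items) {
        return items.stream()
          .collect(Collectors.groupingBy(String::length, TreeMap::new, Collectors.toSet()));
    }

    public static void main(String[] args) {
        List<String> givenList = List.of("a", "bb", "ccc", "dd"); // assumed sample data
        System.out.println(byLength(givenList));       // groups: 1 -> [a], 2 -> [bb, dd], 3 -> [ccc]
        System.out.println(byLengthSorted(givenList)); // same groups as Sets, keys sorted by TreeMap
    }
}
```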

Collectors#partitioningBy()

Collectors#partitioningBy is a specialization of Collectors#groupingBy that groups all Stream elements using a provided Predicate.

It collects results in a Map instance with Boolean values as keys where:

  • true gets associated with a list of values matching the provided condition
  • false gets associated with a list of values not matching the provided condition

@Test
void E_partitioningBy() {
    List<String> list = List.of("one", "two", "three");

    Map<Boolean, List<String>> result = list.stream()
      .collect(partitioningBy(i -> i.length() == 3));

    assertThat(result)
      .hasSize(2)
      .containsEntry(true, List.of("one", "two"))
      .containsEntry(false, List.of("three"));
}

The above functionality can be customised by providing a custom downstream Collector:

@Test
void E_partitioningBy_downstream() {
    List<String> list = List.of("one", "two", "three");

    Map<Boolean, Set<String>> result = list.stream()
      .collect(partitioningBy(i -> i.length() == 3, toSet()));

    assertThat(result)
      .hasSize(2)
      .containsEntry(true, Set.of("one", "two"));
}
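
One detail that sets partitioningBy apart from groupingBy is that the resulting Map contains both the true and the false key, even when one of the partitions is empty. A minimal sketch, with a hypothetical class and helper name:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitioningBySketch {

    // Splits the input into strings of length 3 (true) and all others (false).
    static Map<Boolean, List<String>> partitionByLengthThree(List<String> items) {
        return items.stream()
          .collect(Collectors.partitioningBy(s -> s.length() == 3));
    }

    public static void main(String[] args) {
        Map<Boolean, List<String>> result = partitionByLengthThree(List.of("three"));

        System.out.println(result.size());     // 2
        System.out.println(result.get(true));  // []
        System.out.println(result.get(false)); // [three]
    }
}
```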

Collectors#teeing() (since JDK12)

Simply put, it allows collecting a Stream using two independent collectors and then merging their results using the supplied BiFunction.

// import static java.util.stream.Collectors.*;

Double ev = Stream.of(1, 2, 3, 4, 5, 6) // dice roll
  .collect(teeing(
    summingDouble(i -> i),
    counting(),
    (sum, n) -> sum / n));

System.out.println(ev); // 3.5

More about it can be found here:

A New JDK12 Stream API Collector – Collectors#teeing

Downstream Collectors

At first sight, the Collectors below provide duplicated functionality (all of these are available as Stream instance methods), but their purpose is to be used as downstream Collectors for Collectors#groupingBy and others.

A deep dive into groupingBy() can be found in a separate article.

Collectors#counting()

Counting is a simple collector that counts all Stream elements.

The same functionality is provided by Stream#count.
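
Used as a downstream collector, counting() could look like the sketch below, which counts strings per length. The class and method names are hypothetical:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CountingSketch {

    // Counts how many strings share each length.
    static Map<Integer, Long> countByLength(List<String> items) {
        return items.stream()
          .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Integer, Long> result = countByLength(List.of("one", "two", "three"));

        System.out.println(result.get(3)); // 2
        System.out.println(result.get(5)); // 1
    }
}
```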

Collectors#reducing()

Reducing is a simple collector that reduces all Stream elements to a single element by using a provided strategy.

The same functionality is provided by Stream#reduce.
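
As a downstream collector, reducing() might be applied per group as in this sketch, which concatenates strings of equal length using the overload that takes an identity value and a BinaryOperator. The class and method names are hypothetical:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ReducingSketch {

    // Concatenates all strings of the same length, starting from the empty string.
    static Map<Integer, String> concatByLength(List<String> items) {
        return items.stream()
          .collect(Collectors.groupingBy(
            String::length,
            Collectors.reducing("", String::concat)));
    }

    public static void main(String[] args) {
        Map<Integer, String> result = concatByLength(List.of("one", "two", "three"));

        System.out.println(result.get(3)); // onetwo
        System.out.println(result.get(5)); // three
    }
}
```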

Collectors#maxBy()/minBy()

MaxBy/minBy are simple collectors that allow finding max/min Stream elements according to a given Comparator.

The same functionality is provided by Stream#max/min.
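
A possible downstream use of maxBy() is sketched below: for each string length, it keeps the greatest string in natural (lexicographic) order. Note that the result is wrapped in an Optional. The class and method names are hypothetical:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class MaxBySketch {

    // For each string length, keeps the lexicographically greatest string.
    static Map<Integer, Optional<String>> greatestByLength(List<String> items) {
        return items.stream()
          .collect(Collectors.groupingBy(
            String::length,
            Collectors.maxBy(Comparator.<String>naturalOrder())));
    }

    public static void main(String[] args) {
        Map<Integer, Optional<String>> result = greatestByLength(List.of("one", "two", "three"));

        System.out.println(result.get(3)); // Optional[two]
        System.out.println(result.get(5)); // Optional[three]
    }
}
```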

Collectors#filtering() (since JDK9)

Filtering is a simple collector that allows filtering Stream elements using a provided Predicate.

The same functionality is provided by Stream#filter.
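
When used downstream of groupingBy, filtering() differs from calling Stream#filter first in one observable way: groups whose elements are all rejected still appear in the result, with an empty collection. The sketch below illustrates this; the class and method names are hypothetical:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FilteringSketch {

    // Groups by length, then keeps only strings starting with "t" inside each group.
    static Map<Integer, List<String>> groupThenFilter(List<String> items) {
        return items.stream()
          .collect(Collectors.groupingBy(
            String::length,
            Collectors.filtering(s -> s.startsWith("t"), Collectors.toList())));
    }

    // Filters first, then groups: lengths without any match never become keys.
    static Map<Integer, List<String>> filterThenGroup(List<String> items) {
        return items.stream()
          .filter(s -> s.startsWith("t"))
          .collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        List<String> list = List.of("one", "two", "three", "four");

        System.out.println(groupThenFilter(list).get(4));         // []
        System.out.println(filterThenGroup(list).containsKey(4)); // false
    }
}
```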

Collectors#mapping()

Mapping is a simple collector that allows applying a given Function to all Stream elements.

The same functionality is provided by Stream#map.
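
As a downstream collector, mapping() transforms the elements of each group before they are collected. In the sketch below, strings are grouped by length and only their first letters are kept. The class and method names are hypothetical, and the input strings are assumed to be non-empty:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MappingSketch {

    // Groups by length and stores the first letter of every string in the group.
    static Map<Integer, List<String>> firstLettersByLength(List<String> items) {
        return items.stream()
          .collect(Collectors.groupingBy(
            String::length,
            Collectors.mapping(s -> s.substring(0, 1), Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> result = firstLettersByLength(List.of("one", "two", "three"));

        System.out.println(result.get(3)); // [o, t]
        System.out.println(result.get(5)); // [t]
    }
}
```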

Collectors#flatMapping() (since JDK9)

FlatMapping is a simple collector that allows flattening nested Stream structures.

The same functionality is provided by Stream#flatMap.
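
A downstream flatMapping() could be used as in this sketch, where each string is split into words and all words of a group end up in a single flat list. The class name, method name, and sample data are assumptions made for the example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FlatMappingSketch {

    // Groups strings by length and flattens the words of each group into one list.
    static Map<Integer, List<String>> wordsByLength(List<String> items) {
        return items.stream()
          .collect(Collectors.groupingBy(
            String::length,
            Collectors.flatMapping(s -> Arrays.stream(s.split(" ")), Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> result = wordsByLength(List.of("a b", "c d", "e f g"));

        System.out.println(result.get(3)); // [a, b, c, d]
        System.out.println(result.get(5)); // [e, f, g]
    }
}
```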

Collectors#averagingDouble/Long/Int()

This one allows deriving the average of all Stream elements:

@Test
void E_averaging() {
    List<String> list = List.of("one", "four", "three");

    Double result = list.stream()
      .collect(averagingInt(String::length));

    assertThat(result).isEqualTo(4);
}

The same functionality is provided by IntStream#average:

OptionalDouble result = list.stream()
  .mapToInt(...)
  .average();

Collectors#summingDouble/Long/Int()

This one allows deriving the sum of all Stream elements:

@Test
void E_summing() {
    List<String> list = List.of("one", "four", "three");

    int result = list.stream()
      .collect(summingInt(String::length));

    assertThat(result).isEqualTo(12);
}

The same functionality is provided by IntStream#sum:

int result = list.stream()
  .mapToInt(...)
  .sum();

All examples can be found on GitHub.



