
The Ultimate Guide to the Java Stream API groupingBy() Collector

The groupingBy() collector is one of the most powerful and customizable Stream API collectors.

If you always find yourself not going beyond:

.collect(groupingBy(...));

…or simply want to discover more of its potential uses, then this article is for you.

If you’re looking for a general Collectors API overview, head here.

Overview

Simply put, groupingBy() provides functionality similar to SQL’s GROUP BY clause, but for the Java Stream API.

To use it, we always need to specify the property by which the grouping will be performed. We do this by providing an implementation of a functional interface (a classifier function), usually by passing a lambda expression or a method reference.

For example, if we wanted to group Strings by their lengths, we could do that by passing String::length to the groupingBy():

List<String> strings = List.of("a", "bb", "cc", "ddd"); 

Map<Integer, List<String>> result = strings.stream() 
  .collect(groupingBy(String::length)); 

System.out.println(result); // {1=[a], 2=[bb, cc], 3=[ddd]}
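The classifier doesn't have to be a method reference. As a minimal sketch (the class name, the groupByFirstLetter() method, and the sample words are made up for illustration), here's the same idea with a lambda that groups words by their first character. It also spells out the static import that the shorter snippets in this article assume:

```java
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.groupingBy;

class GroupByFirstLetter {

    // Hypothetical helper: groups words by their first character.
    // Assumes every word is non-empty, otherwise charAt(0) would throw.
    static Map<Character, List<String>> groupByFirstLetter(List<String> words) {
        return words.stream()
          .collect(groupingBy(s -> s.charAt(0)));
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana", "cherry");

        System.out.println(groupByFirstLetter(words)); // {a=[apple, avocado], b=[banana], c=[cherry]}
    }
}
```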

But the collector itself is capable of doing much more than simple groupings like the one above.

Grouping Into a Custom Map Implementation

If you need to provide a custom Map implementation, you can do that by using a provided groupingBy() overload:

List<String> strings = List.of("a", "bb", "cc", "ddd");

TreeMap<Integer, List<String>> result = strings.stream()
  .collect(groupingBy(String::length, TreeMap::new, toList()));

System.out.println(result); // {1=[a], 2=[bb, cc], 3=[ddd]}
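The benefit of a custom Map is easier to see once its behaviour differs from the default. As a sketch (class and method names are illustrative only), a TreeMap created with a reversed comparator keeps the groups ordered from the longest strings to the shortest. The type arguments are written out explicitly just to keep type inference simple:

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toList;

class ReverseOrderGrouping {

    // Hypothetical helper: groups by length, with keys sorted in descending order
    static TreeMap<Integer, List<String>> groupByLengthDescending(List<String> strings) {
        return strings.stream()
          .collect(groupingBy(
            String::length,
            () -> new TreeMap<Integer, List<String>>(Comparator.reverseOrder()),
            toList()));
    }

    public static void main(String[] args) {
        List<String> strings = List.of("a", "bb", "cc", "ddd");

        System.out.println(groupByLengthDescending(strings)); // {3=[ddd], 2=[bb, cc], 1=[a]}
    }
}
```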

Providing a Custom Downstream Collection

If you need to store grouped elements in a custom collection, this can be achieved by using the toCollection() collector.

For example, if you wanted to group elements in TreeSet instances, this could be as easy as:

groupingBy(String::length, toCollection(TreeSet::new))

and a complete example:

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, TreeSet<String>> result = strings.stream()
  .collect(groupingBy(String::length, toCollection(TreeSet::new)));

System.out.println(result); // {1=[a], 2=[bb, cc], 3=[ddd]}
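With the input above, a TreeSet produces the same output as a List would. A slightly different input (made up for this sketch, along with the class and method names) shows what the TreeSet actually contributes: elements within each group end up sorted and de-duplicated.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toCollection;

class TreeSetGrouping {

    // Hypothetical helper: groups by length into sorted, duplicate-free sets
    static Map<Integer, TreeSet<String>> groupIntoSortedSets(List<String> strings) {
        return strings.stream()
          .collect(groupingBy(String::length, toCollection(TreeSet::new)));
    }

    public static void main(String[] args) {
        // unsorted input with a duplicate "bb"
        List<String> strings = List.of("cc", "bb", "a", "bb");

        System.out.println(groupIntoSortedSets(strings)); // {1=[a], 2=[bb, cc]}
    }
}
```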

Grouping and Counting Items in Groups

If you simply want to know the number of elements in each group, this can be as easy as providing the built-in counting() collector as the downstream collector:

groupingBy(String::length, counting())

and a complete example:

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, Long> result = strings.stream()
  .collect(groupingBy(String::length, counting()));

System.out.println(result); // {1=1, 2=2, 3=1}

Grouping and Combining Items as Strings

If you need to group elements and create a single String representation of each group, this can be achieved by using the joining() collector:

groupingBy(String::length, joining(",", "[", "]"))

and in action:

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, String> result = strings.stream()
  .collect(groupingBy(String::length, joining(",", "[", "]")));

System.out.println(result); // {1=[a], 2=[bb,cc], 3=[ddd]}

Grouping and Filtering Items

Sometimes, there might be a need to exclude some items from grouped results. This can be achieved using the filtering() collector (available since Java 9):

groupingBy(String::length, filtering(s -> !s.contains("c"), toList()))

and in action:

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, List<String>> result = strings.stream()
  .collect(groupingBy(String::length, filtering(s -> !s.contains("c"), toList())));

System.out.println(result); // {1=[a], 2=[bb], 3=[ddd]}
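It's worth seeing how this differs from calling filter() on the stream before grouping. In the sketch below (class and method names are illustrative), filtering up front drops a group entirely when none of its elements match, while the filtering() collector keeps the key and maps it to an empty list:

```java
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.filtering;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toList;

class FilterVsFiltering {

    // Filters first, then groups: keys without matching elements never appear
    static Map<Integer, List<String>> filterBeforeGrouping(List<String> strings) {
        return strings.stream()
          .filter(s -> s.length() > 1)
          .collect(groupingBy(String::length));
    }

    // Groups first, then filters inside each group: every key is preserved
    static Map<Integer, List<String>> filterWithinGroups(List<String> strings) {
        return strings.stream()
          .collect(groupingBy(String::length, filtering(s -> s.length() > 1, toList())));
    }

    public static void main(String[] args) {
        List<String> strings = List.of("a", "bb", "cc", "ddd");

        System.out.println(filterBeforeGrouping(strings)); // {2=[bb, cc], 3=[ddd]}
        System.out.println(filterWithinGroups(strings));   // {1=[], 2=[bb, cc], 3=[ddd]}
    }
}
```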

Grouping and Calculating an Average per Group

If there’s a need to derive an average of properties of grouped items, there are a few handy collectors for that:

  • averagingInt()
  • averagingLong()
  • averagingDouble()

and in action:

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, Double> result = strings.stream()
  .collect(groupingBy(String::length, averagingInt(String::hashCode)));

System.out.println(result); // {1=97.0, 2=3152.0, 3=99300.0}

Disclaimer: String::hashCode was used as a placeholder.

Grouping and Calculating a Sum per Group

If you want to derive a sum from properties of grouped elements, there are some options for this as well:

  • summingInt()
  • summingLong()
  • summingDouble()

and in action:

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, Integer> result = strings.stream()
  .collect(groupingBy(String::length, summingInt(String::hashCode)));

System.out.println(result); // {1=97, 2=6304, 3=99300}

Disclaimer: String::hashCode was used as a placeholder.

Grouping and Calculating a Statistical Summary per Group

If you want to group and then derive a statistical summary from properties of grouped items, there are out-of-the-box options for that as well:

  • summarizingInt()
  • summarizingLong()
  • summarizingDouble()

in action:

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, IntSummaryStatistics> result = strings.stream()
  .collect(groupingBy(String::length, summarizingInt(String::hashCode)));

System.out.println(result);

the result (user-friendly reformatted):

{
    1=IntSummaryStatistics{
      count=1, 
      sum=97, 
      min=97, 
      average=97.000000, 
      max=97}, 
    2=IntSummaryStatistics{
      count=2, 
      sum=6304, 
      min=3136, 
      average=3152.000000, 
      max=3168}, 
    3=IntSummaryStatistics{
      count=1, 
      sum=99300, 
      min=99300, 
      average=99300.000000, 
      max=99300}
}

Disclaimer: String::hashCode was used as a placeholder.

Grouping and Reducing Items

If you want to perform a reduction operation on grouped elements, you can use the reducing() collector:

groupingBy(List::size, reducing(List.of(), (l1, l2) -> ...))

in action:

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, List<Character>> result = strings.stream()
  .map(toStringList())
  .collect(groupingBy(List::size, reducing(List.of(), (l1, l2) -> Stream.concat(l1.stream(), l2.stream())
    .collect(Collectors.toList()))));

System.out.println(result); // {1=[a], 2=[b, b, c, c], 3=[d, d, d]}
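The toStringList() helper isn't shown in the snippet above. Judging by the result type, it turns a String into a List<Character>, so the version below is only an assumed reconstruction, not necessarily the original implementation. With that assumption (and illustrative class and method names), a self-contained variant could look like this:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.reducing;

class ReducingExample {

    // Assumed helper (not taken from the article): splits a String into its characters
    static Function<String, List<Character>> toStringList() {
        return s -> s.chars()
          .mapToObj(c -> (char) c)
          .collect(Collectors.toList());
    }

    static Map<Integer, List<Character>> groupAndConcat(List<String> strings) {
        return strings.stream()
          .map(toStringList())
          .collect(groupingBy(List::size,
            // the empty list is the identity; each step concatenates two lists into a new one
            reducing(List.<Character>of(), (l1, l2) ->
              Stream.concat(l1.stream(), l2.stream()).collect(Collectors.toList()))));
    }

    public static void main(String[] args) {
        List<String> strings = List.of("a", "bb", "cc", "ddd");

        System.out.println(groupAndConcat(strings)); // {1=[a], 2=[b, b, c, c], 3=[d, d, d]}
    }
}
```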

Grouping and Calculating Max/Min Item

If you want to derive the max/min element from a group, you can simply use the maxBy()/minBy() collector:

groupingBy(String::length, Collectors.maxBy(Comparator.comparing(String::toUpperCase)))

in action:

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, Optional<String>> result = strings.stream()
  .collect(groupingBy(String::length, Collectors.maxBy(Comparator.comparing(String::toUpperCase))));

System.out.println(result); // {1=Optional[a], 2=Optional[cc], 3=Optional[ddd]}

The fact that the collector returns an Optional is a bit inconvenient in this case – there’s always at least a single element in a group, so usage of Optional increases accidental complexity.

Unfortunately, there’s nothing we can do with the collector itself to prevent it. We can recreate the same functionality using the reducing() collector, though.
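One way to sketch the reducing() approach is to pass an identity value together with BinaryOperator.maxBy(). The class and method names are illustrative, and the empty String as identity is an assumption that only works here because it never compares greater than any other String:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.reducing;

class MaxWithoutOptional {

    static Map<Integer, String> maxPerLength(List<String> strings) {
        Comparator<String> byUpperCase = Comparator.comparing(String::toUpperCase);

        return strings.stream()
          .collect(groupingBy(String::length,
            // assumption: "" is a safe identity because it is never the maximum
            // as long as a group contains at least one non-empty String
            reducing("", BinaryOperator.maxBy(byUpperCase))));
    }

    public static void main(String[] args) {
        List<String> strings = List.of("a", "bb", "cc", "ddd");

        System.out.println(maxPerLength(strings)); // {1=a, 2=cc, 3=ddd}
    }
}
```

For element types without such a natural "smallest" value, this trick doesn't apply as-is, so treat it as a sketch rather than a general recipe.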

Composing Downstream Collectors

The real power of the collector is unleashed once we start combining multiple collectors to define complex downstream operations, which start to resemble standard Stream API pipelines.

Example #1

Let’s say we have a list of strings and want to group them by length, uppercase them, keep only those longer than one character, and collect each group into a TreeSet instance.

We can do that quite easily:

var result = strings.stream()
  .collect(
    groupingBy(String::length,
      mapping(String::toUpperCase,
        filtering(s -> s.length() > 1,
          toCollection(TreeSet::new)))));

//result
{1=[], 2=[BB, CC], 3=[DDD]}

Example #2

Given a list of strings, group them by length, convert each string into a list of its characters, flatten those lists while keeping only the distinct characters of each string, drop empty elements, uppercase the rest, and finally reduce them by applying string concatenation.

We can achieve that as well:

var result = strings.stream()
  .collect(
    groupingBy(String::length,
      mapping(toStringList(),
        flatMapping(s -> s.stream().distinct(),
          filtering(s -> s.length() > 0,
            mapping(String::toUpperCase,
              reducing("", (s, s2) -> s + s2)))))
    ));

//result 
{1=A, 2=BC, 3=D}
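To try this pipeline out, we need a toStringList() helper that isn't shown above. In this example it has to return single-character Strings (String::toUpperCase is applied later on), so the implementation below is an assumption made for the sake of a runnable sketch. The class and method names are illustrative, and the lambda parameter types are written out only to make each step easier to follow:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

import static java.util.stream.Collectors.filtering;
import static java.util.stream.Collectors.flatMapping;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.reducing;
import static java.util.stream.Collectors.toList;

class ComposedDownstream {

    // Assumed helper (not taken from the article): splits a String into single-character Strings
    static Function<String, List<String>> toStringList() {
        return s -> s.chars()
          .mapToObj(c -> String.valueOf((char) c))
          .collect(toList());
    }

    static Map<Integer, String> distinctUppercaseLetters(List<String> strings) {
        return strings.stream()
          .collect(
            groupingBy(String::length,
              mapping(toStringList(),
                // distinct() is applied per source string, not per whole group
                flatMapping((List<String> chars) -> chars.stream().distinct(),
                  filtering((String s) -> !s.isEmpty(),
                    mapping(String::toUpperCase,
                      reducing("", (s, s2) -> s + s2)))))));
    }

    public static void main(String[] args) {
        List<String> strings = List.of("a", "bb", "cc", "ddd");

        System.out.println(distinctUppercaseLetters(strings)); // {1=A, 2=BC, 3=D}
    }
}
```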

Sources

All of the above examples can be found in my GitHub project.

Make sure to check my OSS project with custom parallel Stream API collectors.



