Java 11 String API Updates

It turns out that the upcoming LTS release, JDK 11, brings a few interesting String API updates to the table.

Let’s have a look at them and the interesting facts surrounding them.

String#repeat

One of the coolest additions to the String API is the repeat() method, which concatenates a String with itself a given number of times:

var string = "foo bar ";

var result = string.repeat(2); // foo bar foo bar

But what I was most excited about here were the corner cases to try out. If you try to repeat a String 0 times, you will always get an empty String:

@Test
void shouldRepeatZeroTimes() {
    var string = "foo";

    var result = string.repeat(0);

    assertThat(result).isEqualTo("");
}

The same applies to repeating an empty String:

@Test
void shouldRepeatEmpty() {
    var string = "";

    var result = string.repeat(Integer.MAX_VALUE);

    assertThat(result).isEqualTo("");
}
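
A negative count is a different story: repeat() rejects it with an IllegalArgumentException. Below is a minimal sketch of that check; the class name RepeatNegativeCount and the helper rejects() are made up for illustration and are not part of the JDK:

```java
public class RepeatNegativeCount {

    // Hypothetical helper: returns true when repeat() rejects the given count
    // with an IllegalArgumentException, false when the call succeeds.
    static boolean rejects(String s, int count) {
        try {
            s.repeat(count);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects("foo", -1)); // true
        System.out.println(rejects("foo", 0));  // false
    }
}
```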

It might be tempting to think that it just relies on a StringBuilder underneath, but that’s not the case. The actual implementation is much more resource-efficient:

public String repeat(int count) {
    if (count < 0) {
        throw new IllegalArgumentException("count is negative: " + count);
    }
    if (count == 1) {
        return this;
    }
    final int len = value.length;
    if (len == 0 || count == 0) {
        return "";
    }
    if (len == 1) {
        final byte[] single = new byte[count];
        Arrays.fill(single, value[0]);
        return new String(single, coder);
    }
    if (Integer.MAX_VALUE / count < len) {
        throw new OutOfMemoryError("Repeating " + len + " bytes String " + count +
                " times will produce a String exceeding maximum size.");
    }
    final int limit = len * count;
    final byte[] multiple = new byte[limit];
    System.arraycopy(value, 0, multiple, 0, len);
    int copied = len;
    for (; copied < limit - copied; copied <<= 1) {
        System.arraycopy(multiple, 0, multiple, copied, copied);
    }
    System.arraycopy(multiple, 0, multiple, copied, limit - copied);
    return new String(multiple, coder);
}
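
The interesting part is the copy loop: each pass duplicates everything written so far, so the number of System.arraycopy calls grows logarithmically with count instead of linearly. The sketch below isolates that idea on a plain byte array. DoublingCopySketch and repeatBytes are hypothetical names, and using Math.multiplyExact for the overflow check is my simplification rather than what the JDK does:

```java
import java.nio.charset.StandardCharsets;

public class DoublingCopySketch {

    // Hypothetical helper: repeats `source` `count` times by doubling the
    // already-copied prefix, mirroring the loop in String#repeat.
    static byte[] repeatBytes(byte[] source, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        int len = source.length;
        if (len == 0 || count == 0) {
            return new byte[0];
        }
        // Simplified overflow guard: throws ArithmeticException on int overflow.
        int limit = Math.multiplyExact(len, count);
        byte[] result = new byte[limit];

        // Seed the result with one copy of the source.
        System.arraycopy(source, 0, result, 0, len);
        int copied = len;

        // Double the filled prefix while a full doubling still fits.
        while (copied < limit - copied) {
            System.arraycopy(result, 0, result, copied, copied);
            copied <<= 1;
        }

        // Fill whatever is left with the start of the prefix.
        System.arraycopy(result, 0, result, copied, limit - copied);
        return result;
    }

    public static void main(String[] args) {
        byte[] repeated = repeatBytes("ab".getBytes(StandardCharsets.US_ASCII), 5);
        System.out.println(new String(repeated, StandardCharsets.US_ASCII)); // ababababab
    }
}
```

For "ab" repeated 5 times, the filled prefix grows 2, 4, 8 and the final call copies the remaining 2 bytes.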

From the Compact Strings point of view, the following fragment might look suspicious at first sight (a single-character non-Latin String occupies two bytes), but it’s important to remember that value.length is the size of the internal byte array and not the length of the String itself. A single-character UTF-16 String has value.length equal to 2, so it never enters this branch:

final int len = value.length;
// ...
if (len == 1) {
    final byte[] single = new byte[count];
    Arrays.fill(single, value[0]);
    return new String(single, coder);
}
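
A quick way to convince yourself that both representations behave the same from the outside is to repeat one Latin-1 character and one character that needs UTF-16. This is only a sketch: SingleCharRepeat is a made-up class name, the two characters are arbitrary picks, and the comments about storage assume Compact Strings are enabled (the default):

```java
public class SingleCharRepeat {

    public static void main(String[] args) {
        String latin1 = "\u00E9"; // e with acute accent: one byte per char with Compact Strings
        String utf16 = "\u0436";  // Cyrillic letter: two bytes per char

        System.out.println(latin1.repeat(3).equals("\u00E9\u00E9\u00E9")); // true
        System.out.println(utf16.repeat(3).equals("\u0436\u0436\u0436"));  // true
        System.out.println(utf16.repeat(3).length());                      // 3
    }
}
```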

String#isBlank

That one is super straightforward – now we can check whether a String instance is empty or consists exclusively of whitespace (as defined by Character#isWhitespace(int)):

var result = " ".isBlank(); // true
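
Because the definition comes from Character#isWhitespace(int), a few inputs behave in ways that are easy to get wrong. A small sketch with inputs of my own choosing (the class name IsBlankExamples is hypothetical):

```java
public class IsBlankExamples {

    public static void main(String[] args) {
        System.out.println("".isBlank());        // true: empty counts as blank
        System.out.println(" \t\r\n".isBlank()); // true: ASCII whitespace only
        System.out.println("\u2003".isBlank());  // true: EM SPACE is Unicode whitespace
        System.out.println("\u00A0".isBlank());  // false: NO-BREAK SPACE is excluded
        System.out.println(" a ".isBlank());     // false
    }
}
```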

String#strip

We can easily get rid of all leading and trailing whitespace from a String now:

assertThat("  f oo  ".strip()).isEqualTo("f oo");

This one will come in handy to avoid excessive whitespace once Raw Strings arrive in Java.

Additionally, we can narrow the operation to only leading or only trailing whitespace:

assertThat("  f oo  ".stripLeading()).isEqualTo("f oo  ");

assertThat("  f oo  ".stripTrailing()).isEqualTo("  f oo");

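However, you might be asking yourself how this one differs from String#trim.

It turns out that String#strip is a modern, Unicode-aware alternative that relies on the same definition of whitespace as String#isBlank, whereas String#trim simply removes every leading and trailing character whose code point is less than or equal to U+0020.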

More details about it can be found straight at the source.
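
The difference cuts both ways, which the following sketch tries to show. The class name StripVsTrim and the two sample inputs are mine: an EM SPACE (U+2003) is Unicode whitespace above U+0020, while NUL (U+0000) is a control character below it that Character#isWhitespace does not treat as whitespace:

```java
public class StripVsTrim {

    public static void main(String[] args) {
        String emSpaced = "\u2003foo\u2003"; // EM SPACE on both sides
        System.out.println(emSpaced.trim().length());  // 5: trim() leaves EM SPACE alone
        System.out.println(emSpaced.strip().length()); // 3: strip() removes it

        String nulPadded = "\u0000foo\u0000"; // NUL on both sides
        System.out.println(nulPadded.trim().length());  // 3: trim() removes anything <= U+0020
        System.out.println(nulPadded.strip().length()); // 5: NUL is not whitespace for strip()
    }
}
```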

String#lines

Using this new method, we can easily split a String instance into a Stream<String> of separate lines:

"foo\nbar".lines().forEach(System.out::println);

// foo
// bar
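
The method recognizes \n, \r and \r\n as line terminators, keeps empty lines in the middle, and does not emit an extra empty line for a trailing terminator. A short sketch of those cases; LinesExamples and linesOf are hypothetical names used only here:

```java
import java.util.List;
import java.util.stream.Collectors;

public class LinesExamples {

    // Hypothetical helper: collects the lines of a String into a List.
    static List<String> linesOf(String text) {
        return text.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(linesOf("a\r\nb\rc\n")); // [a, b, c]
        System.out.println(linesOf("a\n\nb"));      // [a, , b]
        System.out.println(linesOf("").size());     // 0
    }
}
```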

What’s really cool is that instead of splitting a String and converting the result into a Stream, specialized Spliterators were implemented (one for Latin-1 and one for UTF-16 Strings) that make it possible to stay lazy. The Latin-1 variant looks like this:

private final static class LinesSpliterator implements Spliterator<String> {
    private byte[] value;
    private int index;        // current index, modified on advance/split
    private final int fence;  // one past last index

    LinesSpliterator(byte[] value) {
        this(value, 0, value.length);
    }

    LinesSpliterator(byte[] value, int start, int length) {
        this.value = value;
        this.index = start;
        this.fence = start + length;
    }

    private int indexOfLineSeparator(int start) {
        for (int current = start; current < fence; current++) {
            byte ch = value[current];
            if (ch == '\n' || ch == '\r') {
                return current;
            }
        }
        return fence;
    }

    private int skipLineSeparator(int start) {
        if (start < fence) {
            if (value[start] == '\r') {
                int next = start + 1;
                if (next < fence && value[next] == '\n') {
                    return next + 1;
                }
            }
            return start + 1;
        }
        return fence;
    }

    private String next() {
        int start = index;
        int end = indexOfLineSeparator(start);
        index = skipLineSeparator(end);
        return newString(value, start, end - start);
    }

    @Override
    public boolean tryAdvance(Consumer<? super String> action) {
        if (action == null) {
            throw new NullPointerException("tryAdvance action missing");
        }
        if (index != fence) {
            action.accept(next());
            return true;
        }
        return false;
    }

    @Override
    public void forEachRemaining(Consumer<? super String> action) {
        if (action == null) {
            throw new NullPointerException("forEachRemaining action missing");
        }
        while (index != fence) {
            action.accept(next());
        }
    }

    @Override
    public Spliterator<String> trySplit() {
        int half = (fence + index) >>> 1;
        int mid = skipLineSeparator(indexOfLineSeparator(half));
        if (mid < fence) {
            int start = index;
            index = mid;
            return new LinesSpliterator(value, start, mid - start);
        }
        return null;
    }

    @Override
    public long estimateSize() {
        return fence - index + 1;
    }

    @Override
    public int characteristics() {
        return Spliterator.ORDERED | Spliterator.IMMUTABLE | Spliterator.NONNULL;
    }
}
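
From the caller’s side, the laziness shows up with short-circuiting operations: the stream stops asking the Spliterator for lines once it has what it needs. The sketch below counts how many lines reach a peek() before findFirst() is satisfied. LazyLinesSketch and linesPulledForFirst are made-up names, and the counter only observes what flows through the pipeline, so it illustrates the behaviour without proving anything about the internals:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyLinesSketch {

    // Hypothetical helper: counts the lines handed downstream
    // before findFirst() short-circuits the pipeline.
    static int linesPulledForFirst(String text) {
        AtomicInteger pulled = new AtomicInteger();
        text.lines()
            .peek(line -> pulled.incrementAndGet())
            .findFirst();
        return pulled.get();
    }

    public static void main(String[] args) {
        System.out.println(linesPulledForFirst("a\nb\nc")); // 1
    }
}
```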

Sources

Code snippets backing this article can be found on GitHub.
