Stream.forEach() overuse
Starting with Java 8, Java programmers can enjoy the merits of
functional programming thanks to lambdas, java.util.function.*, and the
stream API. This is great and can improve readability.
For example, converting a list can be written in a relatively succinct way:
List<String> hosts = urls.stream().map(URL::getHost).collect(Collectors.toList());
That is much nicer and easier (or at least faster) to read than this:
List<String> hosts = new ArrayList<>(urls.size());
for (final URL url : urls) {
    hosts.add(url.getHost());
}
Readability
The stream API solution also avoids (or at least hides) side effects,
while the for-each loop has side effects on the hosts list.
As with any shiny new toy, the temptation is to use it everywhere. Both
the Stream and Iterable interfaces have a forEach(Consumer<? super T>)
method, which can be used instead of a classic for-each loop:
Map<String, Integer> count = new HashMap<>();
Arrays.asList("foo", "bar", "foo", "baz").forEach(s -> {
    if (count.containsKey(s)) {
        count.put(s, count.get(s) + 1);
    } else {
        count.put(s, 1);
    }
});
This is in no way more readable than the imperative version:
Map<String, Integer> count = new HashMap<>();
for (String s : Arrays.asList("foo", "bar", "foo", "baz")) {
    if (count.containsKey(s)) {
        count.put(s, count.get(s) + 1);
    } else {
        count.put(s, 1);
    }
}
There is a small practical difference between the two approaches. The
lambda body can only access effectively final local variables (i.e., they
do not have to be declared final, but they must be assigned exactly
once). Note that in the example above count is not declared final, but
it could be without breaking compilation. This means that a lambda passed
to forEach() cannot reassign local variables declared outside its body,
which rules out updating, for example, a counter of primitive type.
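The restriction on captured variables is easiest to see with a running total. The following is only a sketch: the class and method names are made up for illustration, and the AtomicInteger holder is one common workaround, not the only one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class EffectivelyFinalDemo {

    // Classic loop: a local primitive can be reassigned freely.
    static int totalLengthLoop(List<String> words) {
        int total = 0;
        for (String w : words) {
            total += w.length();
        }
        return total;
    }

    // forEach(): writing `total += w.length()` on a local int here would
    // not compile, because captured locals must be effectively final.
    // A mutable holder object works around that, at the cost of clarity.
    static int totalLengthForEach(List<String> words) {
        AtomicInteger total = new AtomicInteger();
        words.forEach(w -> total.addAndGet(w.length()));
        return total.get();
    }

    // Side-effect-free alternative: let the stream do the summing.
    static int totalLengthStream(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("foo", "bar", "foo", "baz");
        System.out.println(totalLengthLoop(words));    // prints 12
        System.out.println(totalLengthForEach(words)); // prints 12
        System.out.println(totalLengthStream(words));  // prints 12
    }
}
```

Note that the holder variable itself is still effectively final; only the object it refers to is mutated, just as with the count map in the example above.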
Bytecode
The bytecode for these two loops is different. The first version (using forEach()) yields the following bytecode:
L0
NEW java/util/HashMap
DUP
INVOKESPECIAL java/util/HashMap.<init> ()V
ASTORE 1
L1
ICONST_4
ANEWARRAY java/lang/String
...
INVOKESTATIC java/util/Arrays.asList ([Ljava/lang/Object;)Ljava/util/List;
ALOAD 1
INVOKEDYNAMIC accept(Ljava/util/Map;)Ljava/util/function/Consumer; [
// handle kind 0x6 : INVOKESTATIC
java/lang/invoke/LambdaMetafactory.metafactory(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
// arguments:
(Ljava/lang/Object;)V,
// handle kind 0x6 : INVOKESTATIC
MyTest.lambda$main$0(Ljava/util/Map;Ljava/lang/String;)V,
(Ljava/lang/String;)V
]
INVOKEINTERFACE java/util/List.forEach (Ljava/util/function/Consumer;)V
...
private static synthetic lambda$main$0(Ljava/util/Map;Ljava/lang/String;)V
L0
ALOAD 0
ALOAD 1
INVOKEINTERFACE java/util/Map.containsKey (Ljava/lang/Object;)Z
IFEQ L1
L2
ALOAD 0
ALOAD 1
ALOAD 0
ALOAD 1
INVOKEINTERFACE java/util/Map.get (Ljava/lang/Object;)Ljava/lang/Object;
CHECKCAST java/lang/Integer
INVOKEVIRTUAL java/lang/Integer.intValue ()I
ICONST_1
IADD
INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
INVOKEINTERFACE java/util/Map.put (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
POP
GOTO L3
L1
FRAME SAME
ALOAD 0
ALOAD 1
ICONST_1
INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
INVOKEINTERFACE java/util/Map.put (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
POP
L3
FRAME SAME
RETURN
L4
LOCALVARIABLE count Ljava/util/Map; L0 L4 0
LOCALVARIABLE s Ljava/lang/String; L0 L4 1
MAXSTACK = 4
MAXLOCALS = 2
The second version (the classic for loop) yields this:
L0
NEW java/util/HashMap
DUP
INVOKESPECIAL java/util/HashMap.<init> ()V
ASTORE 1
L1
ICONST_4
ANEWARRAY java/lang/String
...
INVOKESTATIC java/util/Arrays.asList ([Ljava/lang/Object;)Ljava/util/List;
INVOKEINTERFACE java/util/List.iterator ()Ljava/util/Iterator;
ASTORE 2
L2
FRAME APPEND [java/util/Map java/util/Iterator]
ALOAD 2
INVOKEINTERFACE java/util/Iterator.hasNext ()Z
IFEQ L3
ALOAD 2
INVOKEINTERFACE java/util/Iterator.next ()Ljava/lang/Object;
CHECKCAST java/lang/String
ASTORE 3
L4
ALOAD 1
ALOAD 3
INVOKEINTERFACE java/util/Map.containsKey (Ljava/lang/Object;)Z
IFEQ L5
L6
ALOAD 1
ALOAD 3
ALOAD 1
ALOAD 3
INVOKEINTERFACE java/util/Map.get (Ljava/lang/Object;)Ljava/lang/Object;
CHECKCAST java/lang/Integer
INVOKEVIRTUAL java/lang/Integer.intValue ()I
ICONST_1
IADD
INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
INVOKEINTERFACE java/util/Map.put (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
POP
GOTO L7
L5
FRAME APPEND [java/lang/String]
ALOAD 1
ALOAD 3
ICONST_1
INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
INVOKEINTERFACE java/util/Map.put (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
POP
L7
FRAME CHOP 1
GOTO L2
L3
The compiler moves the lambda body into a synthetic method
(lambda$main$0) and creates the Consumer via INVOKEDYNAMIC
for the forEach() call, while the classic for loop has
all code inline.
Performance
In my measurements, going through INVOKEDYNAMIC was slightly more
expensive than inline code. On my computer it took on average 819 ms to
call the classic loop version 10 million times (averaged over 100 runs)
vs 845 ms for the forEach() version. Usually, this is a negligible
difference. You should optimize for readability first and only start to
optimize for speed once you hit performance issues (in most cases
anyway).
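For anyone who wants to try a comparison like this, here is a minimal timing sketch. It is not the harness used for the numbers above; the class name, the helper methods, and the iteration counts are assumptions chosen for illustration. Naive timing like this is sensitive to JIT warm-up and dead-code elimination, so a benchmark harness such as JMH would give more trustworthy results.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ForEachTiming {

    static final List<String> WORDS = Arrays.asList("foo", "bar", "foo", "baz");

    // Classic for-each loop version of the counting example.
    static Map<String, Integer> countWithLoop() {
        Map<String, Integer> count = new HashMap<>();
        for (String s : WORDS) {
            if (count.containsKey(s)) {
                count.put(s, count.get(s) + 1);
            } else {
                count.put(s, 1);
            }
        }
        return count;
    }

    // forEach() version of the same counting example.
    static Map<String, Integer> countWithForEach() {
        Map<String, Integer> count = new HashMap<>();
        WORDS.forEach(s -> {
            if (count.containsKey(s)) {
                count.put(s, count.get(s) + 1);
            } else {
                count.put(s, 1);
            }
        });
        return count;
    }

    // Runs the task `iterations` times and returns the elapsed wall time in ms.
    static long timeMillis(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000; // assumed value, smaller than in the text

        // Warm up both variants so the JIT has a chance to compile them.
        timeMillis(ForEachTiming::countWithLoop, 100_000);
        timeMillis(ForEachTiming::countWithForEach, 100_000);

        System.out.println("loop:    " + timeMillis(ForEachTiming::countWithLoop, iterations) + " ms");
        System.out.println("forEach: " + timeMillis(ForEachTiming::countWithForEach, iterations) + " ms");
    }
}
```

The absolute numbers will differ between machines and JVM versions, so only the relative difference between the two lines is of interest.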
Conclusion
The stream API offers great ways to increase readability. However, that
does not mean it should be used all the time. If you use forEach()
where you could have used a classic for loop just the same, you are not
really using it the way it was intended, and you might as well stick to
the classic way altogether.
As a rule of thumb, I prefer the stream API when there are no side effects.
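To illustrate that rule of thumb with the word-count example from earlier, here is one possible side-effect-free formulation, next to a more compact imperative one. This is a sketch, not the only way to write it; the class and method names are made up for illustration. Note that Collectors.counting() produces Long values rather than Integer.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCount {

    // Stream version: no map is mutated from inside a lambda; the
    // collector builds the result.
    static Map<String, Long> countWithStream(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Imperative version: the side effect on the map is explicit, and
    // Map.merge() replaces the containsKey/get/put sequence.
    static Map<String, Integer> countWithLoop(List<String> words) {
        Map<String, Integer> count = new HashMap<>();
        for (String s : words) {
            count.merge(s, 1, Integer::sum);
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("foo", "bar", "foo", "baz");
        System.out.println(countWithStream(words).get("foo")); // prints 2
        System.out.println(countWithLoop(words).get("foo"));   // prints 2
    }
}
```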