Issue
In the class P
below, the method test
seems to return identically false
:
import java.util.function.IntPredicate;
import java.util.stream.IntStream;
public class P implements IntPredicate {
private final static int SIZE = 33;
@Override
public boolean test(int seed) {
int[] state = new int[SIZE];
state[0] = seed;
for (int i = 1; i < SIZE; i++) {
state[i] = state[i - 1];
}
return seed != state[SIZE - 1];
}
public static void main(String[] args) {
long count = IntStream.range(0, 0x0010_0000).filter(new P()).count();
System.out.println(count);
}
}
Combining class P
with IntStream
, however, the method test
can (wrongly) return true
.
The code in the main
method above results in some positive integer, like 716208
.
The result changes after every execution.
This unexpected behavior occurs because the int
array state[]
can be set to zero during the execution.
If a testing code, such as
if (seed == 0xf_fff0){
System.out.println(Arrays.toString(state));
}
is inserted at the tail of the method test
, then the program will output a line like [1048560, 1048560, 1048560, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
.
Question: Why can the int array state[]
be set to zero?
I already know how to avoid this behavior: just replacing int[]
with ArrayList
.
I examined in:
- windows 10 + and debian 10+ with OpenJDK Runtime Environment (build 15.0.1+9-18) OpenJDK 64-Bit Server VM (build 15.0.1+9-18, mixed mode, sharing)
- debian 9 + OpenJDK Runtime Environment AdoptOpenJDK (build 13.0.1+9) OpenJDK 64-Bit Server VM AdoptOpenJDK (build 13.0.1+9, mixed mode, sharing)
Solution
One can reproduce the problem with an even simpler example, namely:
class Main {
private final static int SIZE = 33;
public static boolean test2(int seed) {
int[] state = new int[SIZE];
state[0] = seed;
for (int i = 1; i < SIZE; i++) {
state[i] = state[i - 1];
}
return seed != state[SIZE - 1];
}
public static void main(String[] args) {
long count = IntStream.range(0, 0x0010_0000).filter(Main::test2).count();
System.out.println(count);
}
}
The problem is caused by the JVM
optimization flag that allows the vectorization (SIMD) of loops (i.e., -XX:+AllowVectorizeOnDemand
). Likely arises from applying vectorization on the same array with intersecting ranges (i.e., state[i] = state[i - 1];
). A similar issue would be possible to reproduce if the JVM
would (for some of the elements of the IntStream.range(0, 0x0010_0000)
), optimize the loop:
for (int i = 1; i < SIZE; i++)
state[i] = state[i - 1];
into:
System.arraycopy(state, 0, state, 1, SIZE - 1);
For instance:
class Main {
private final static int SIZE = 33;
public static boolean test2(int seed) {
int[] state = new int[SIZE];
state[0] = seed;
System.arraycopy(state, 0, state, 1, SIZE - 1);
if(seed == 100)
System.out.println(Arrays.toString(state));
return seed != state[SIZE - 1];
}
public static void main(String[] args) {
long count = IntStream.range(0, 0x0010_0000).filter(Main::test2).count();
System.out.println(count);
}
}
output:
[100, 100, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
NEW UPDATE: 01/01/2021
I have sent an email to one of the Developers involved in the implementation/integration of that flag -XX:+AllowVectorizeOnDemandand
received the following reply:
It is known that part of AllowVectorizeOnDemand code is broken.
There was fix (it excluded executing broken code which does incorrect vectorization) which was backported into jdk 11.0.11:
https://hg.openjdk.java.net/jdk-updates/jdk11u-dev/rev/69dbdd271e04
If you can, try build and test latest OpenJDK11u from https://hg.openjdk.java.net/jdk-updates/jdk11u-dev/
From the first link, one can read the following:
@bug 8251994 @summary Test vectorization of Streams$RangeIntSpliterator::forEachRemaining @requires vm.compiler2.enabled & vm.compMode != "Xint"
@run main compiler.vectorization.TestForEachRem test1 @run main compiler.vectorization.TestForEachRem test2 @run main compiler.vectorization.TestForEachRem test3 @run main compiler.vectorization.TestForEachRem test4
From the comments on the JIRA story on that bug, one can read:
I found the cause of the issue. To improve a chance to vectorize a loop, superword tries to hoist loads to the beginning of loop by replacing their memory input with corresponding (same memory slice) loop's memory Phi : http://hg.openjdk.java.net/jdk/jdk/file/8f73aeccb27c/src/hotspot/share/opto/superword.cpp#l471
Originally loads are ordered by corresponding stores on the same memory slice. But when they are hoisted they loose that ordering - nothing enforce the order. In test6 case the ordering is preserved (luckily?) after hoisting only when vector size is 32 bytes (avx2) but they become unordered with 16 (avx=0 or avx1) or 64 (avx512) bytes vectors. (...)
I have simple fix (use original loads ordering indexes) but looking on the code which causing the issue I see that it is bogus/incomplete - it does not help cases listed for JDK-8076284 changes:
https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-April/017645.html
Using unrolling and cloning information to vectorize is interesting idea but as I see it is not complete. Even if pack_parallel() method is able created packs they are all removed by filter_packs() method. And additionally the above cases are vectorized without hoisting loads and pack_parallel - I verified it. That code is useless now and I will put it under flag to not run it. It needs more work to be useful. I reluctant to remove the code because may be in a future we will have time to invest into it.
This might explain why when I was comparing the assembly of the versions with and without the flag -XX:+AllowVectorizeOnDemand
, I notice that the version with the flag for the following code:
for (int i = 1; i < SIZE; i++)
state[i] = state[i - 1];
(that I extract on a method called hotstop
to facilitate looking for it in the assembly), had:
00000001162bacf5: mov %r8d,0x10(%rsi,%r10,4)
0x00000001162bacfa: mov %r8d,0x14(%rsi,%r10,4)
0x00000001162bacff: mov %r8d,0x18(%rsi,%r10,4)
0x00000001162bad04: mov %r8d,0x1c(%rsi,%r10,4)
0x00000001162bad09: mov %r8d,0x20(%rsi,%r10,4)
0x00000001162bad0e: mov %r8d,0x24(%rsi,%r10,4)
0x00000001162bad13: mov %r8d,0x28(%rsi,%r10,4)
0x00000001162bad18: mov %r8d,0x2c(%rsi,%r10,4) ;*iastore {reexecute=0 rethrow=0 return_oop=0}
; - AAAAAA.Main::hotstop@15 (line 21)
Which looks to me like a loop unrolling
, a side from that, the method java.util.stream.Streams$RangeIntSpliterator::forEachRemaining
showed up only in the assembly of the version with the flag.
Answered By - dreamcrash
Answer Checked By - Candace Johnson (JavaFixing Volunteer)