Issue
I'm trying to run a Flink application in Amazon EMR. I'm using the latest versions, so EMR is at 6.7.0 and Flink is at 1.14.2.
I'm using Maven to build my application and its dependencies into a jar for EMR to run. When I run my Flink application as a step in EMR, I get this error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.commons.math3.stat.descriptive.rank.Percentile.withNaNStrategy(Lorg/apache/commons/math3/stat/ranking/NaNStrategy;)Lorg/apache/commons/math3/stat/descriptive/rank/Percentile;
at MyFlinkApp.main(MyFlinkApp.java:85)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
I've added some logging to find out which version of commons-math3 (the library that the Percentile class comes from) is actually in use at runtime. It reports version 3.1.1, which, as the error suggests, doesn't include the withNaNStrategy method my code is trying to call.
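For reference, a check along these lines shows which copy of the library is actually loaded (a simplified sketch, not my exact logging code; the manifest-based version lookup can print null for some builds):

// Print where the Percentile class was loaded from and which implementation
// version its jar manifest reports.
Class<?> percentileClass = org.apache.commons.math3.stat.descriptive.rank.Percentile.class;
System.out.println("Percentile loaded from: "
        + percentileClass.getProtectionDomain().getCodeSource().getLocation());
System.out.println("commons-math3 version: "
        + percentileClass.getPackage().getImplementationVersion());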
My assumption was that I could run with a newer version of commons-math3 that does have that method. In my pom.xml I'm specifying commons-math3 version 3.6.1, which I can verify gets built into my jar:
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-math3</artifactId>
    <version>3.6.1</version>
</dependency>
I still get the same error, and I can see that it's still using version 3.1.1 of that library. Where is this version coming from? Is it from EMR itself? How can I get around this?
I can provide my full list of dependencies from my pom if that would be helpful.
Solution
Unfortunately, you've hit a very common issue when running Apache Flink or Apache Spark applications on a cluster.
There are two different classpaths involved:
- The system classpath on the cluster itself, with all the Flink dependencies. That classpath contains commons-math3:3.1.1 as a transitive dependency of hadoop-common.
- Your application classpath, with commons-math3:3.6.1.
When you run your Flink application on a cluster, your application classpath is simply appended to the existing system classpath, and whichever class appears first wins. There is no further dependency resolution that would figure out which version should effectively be used. Also, the application classpath typically isn't even available yet when the executors are started.
If you run in local mode, everything will typically just work - in most cases at least - as all dependencies are resolved into a single classpath.
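To see that ordering in action, a diagnostic along the following lines (an illustrative sketch, not something from your code) lists every copy of the Percentile class visible to the classloader; the first URL printed is normally the one that wins:

// List all jars on the classpath that contain the Percentile class, in the
// order in which the classloader sees them; the first entry normally wins.
try {
    java.util.Enumeration<java.net.URL> copies = MyFlinkApp.class.getClassLoader()
            .getResources("org/apache/commons/math3/stat/descriptive/rank/Percentile.class");
    while (copies.hasMoreElements()) {
        System.out.println(copies.nextElement());
    }
} catch (java.io.IOException e) {
    e.printStackTrace();
}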
The solution to this is to shade (i.e., relocate the packages of) the conflicting libraries using the Maven Shade Plugin. With this plugin you can build an uber jar that contains the conflicting classes under a different package name, so that they can be used independently of what's already present on the cluster:
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.3.0</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <shadedArtifactAttached>true</shadedArtifactAttached>
                <shadedClassifierName>allinone</shadedClassifierName>
                <artifactSet>
                    <includes>
                        <include>*:*</include>
                    </includes>
                </artifactSet>
                <relocations>
                    <relocation>
                        <pattern>org.apache.commons.math3</pattern>
                        <shadedPattern>org.apache.commons.math3_1_1</shadedPattern>
                    </relocation>
                </relocations>
            </configuration>
        </execution>
    </executions>
</plugin>
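Note that your application source code doesn't change at all: you keep importing the original org.apache.commons.math3 packages, and the shade plugin rewrites those references inside the shaded jar at build time. A minimal sketch (the class name is taken from your stack trace; the Percentile usage is just for illustration):

import org.apache.commons.math3.stat.descriptive.rank.Percentile;
import org.apache.commons.math3.stat.ranking.NaNStrategy;

public class MyFlinkApp {
    public static void main(String[] args) {
        // Source code still references org.apache.commons.math3. In the shaded jar
        // these references are rewritten to org.apache.commons.math3_1_1, so at
        // runtime they resolve to the bundled 3.6.1 classes instead of the 3.1.1
        // copy that hadoop-common puts on the cluster classpath.
        Percentile percentile = new Percentile().withNaNStrategy(NaNStrategy.REMOVED);
    }
}

With shadedArtifactAttached set to true and the allinone classifier, mvn package produces an additional jar named something like <artifactId>-<version>-allinone.jar in target/ - that is the jar to submit as your EMR step.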
Answered By - Moritz