hudi-flink-bundle unable to load the s3-fs-hadoop plugin


When using the hudi-flink-bundle.jar, our Flink SQL jobs are unable to load the s3-fs-hadoop plugin.

Details

We are using Flink 1.17 with Hudi 0.13.1 on top of S3. Following the Hudi documentation, we built our own Flink Docker image and added the hudi-flink-bundle.jar to the flink/lib directory. We also created a folder for the s3-fs-hadoop plugin under flink/plugins and copied the plugin jar into it from the flink/opt directory.
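For reference, the image layout described above looks roughly like this Dockerfile (the base image tag and the exact jar file names are assumptions; substitute the versions you actually use):

```dockerfile
# Sketch only -- base image tag and jar names are assumptions.
FROM flink:1.17

# Hudi bundle goes on the main classpath (flink/lib)
COPY hudi-flink1.17-bundle-0.13.1.jar /opt/flink/lib/

# s3-fs-hadoop must sit in its own subdirectory under flink/plugins;
# the jar ships in flink/opt of the official image
RUN mkdir -p /opt/flink/plugins/s3-fs-hadoop && \
    cp /opt/flink/opt/flink-s3-fs-hadoop-*.jar /opt/flink/plugins/s3-fs-hadoop/
```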

The Flink job jars do not contain the hudi-flink-bundle or the s3-fs-hadoop libraries. When running Flink jobs, we get this exception:
java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found

If we copy the s3-fs-hadoop plugin jar into the flink/lib folder instead, everything works, but that jar bundles many libraries whose versions conflict with the ones in our job jar.

I've read the Flink debugging classloading docs, but they don't explain whether (or how) classes from a plugin jar can be made visible to the application classloader.
