I have a Java application running on Quarkus that reads Delta Lake tables and Parquet files using the Spark 3.5.0 libraries. The following method works fine as far as reading the data goes, but my problem is with logging (java.util.logging).
public static long getParquetFileRowCount(String parquetFilePath) {
    logger.log(Level.INFO, "Starting to read parquet file: " + parquetFilePath);
    long count = 0;
    // SparkSession implements Closeable, so try-with-resources stops it on exit
    try (SparkSession spark = SparkSession
            .builder()
            .appName("Java Spark SQL basic example")
            .config("spark.local.dir", "/tmp")
            .config("spark.master", "local")
            .getOrCreate()) {
        count = parquetRowCount(spark, parquetFilePath);
        spark.stop(); // redundant here: close() already calls stop()
    }
    System.out.println("SOUT Read parquet file: " + parquetFilePath + " finished.");
    System.out.println("SOUT count: " + count);
    logger.log(Level.INFO, "Read parquet file: " + parquetFilePath + " finished.");
    return count;
}
The output:
INFO: Starting to read parquet file: *filepath*
SOUT Read parquet file: *filepath* finished.
SOUT count: *count*
So the only messages displayed are the System.out ones, not the logger's, and the same goes for log output from every other class in my application. Even on the next execution of the method, only the System.out messages appear in the output.
Is there a way to make java.util.logging work with Spark?
Spark invokes org.slf4j.bridge.SLF4JBridgeHandler::removeHandlersForRootLogger when the session is newly created. After all handlers are removed, it installs the SLF4JBridgeHandler on the root JUL logger. The problem is that this code runs after you set up your JUL logger, so your handlers are silently discarded.
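You can confirm this (or work around it for console output specifically) by re-attaching a handler to the root JUL logger after getOrCreate() has run. This is just a diagnostic sketch, assuming your original setup used a plain ConsoleHandler; note that records may then appear twice if the SLF4J backend also writes to the console:

import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

// After SparkSession.getOrCreate(), the root logger typically holds only
// the SLF4JBridgeHandler that Spark installed. Re-add a ConsoleHandler
// so JUL output becomes visible again.
Logger root = Logger.getLogger("");
boolean hasConsoleHandler = false;
for (Handler h : root.getHandlers()) {
    if (h instanceof ConsoleHandler) {
        hasConsoleHandler = true;
        break;
    }
}
if (!hasConsoleHandler) {
    ConsoleHandler handler = new ConsoleHandler();
    handler.setLevel(Level.INFO);
    root.addHandler(handler);
}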
What should work best is routing java.util.logging through SLF4J and letting a single Log4j 2 or Logback configuration file control all of your loggers. Then JUL records are forwarded to SLF4J, and SLF4J binds to whichever provider you prefer.
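A minimal sketch of that setup, assuming you add the jul-to-slf4j artifact plus a backend such as logback-classic or log4j-slf4j2-impl to your dependencies. The class name LoggingBootstrap is just an illustration; the important part is that the bridge is installed once, early in application startup:

import org.slf4j.bridge.SLF4JBridgeHandler;

public final class LoggingBootstrap {
    private LoggingBootstrap() {}

    // Call once at startup, before any JUL loggers are used.
    public static void init() {
        // Remove the default JUL handlers so records are not printed twice.
        SLF4JBridgeHandler.removeHandlersForRootLogger();
        // Route all JUL records to SLF4J; the bound backend (Log4j 2 or
        // Logback) then controls formatting, levels, and destinations.
        SLF4JBridgeHandler.install();
    }
}

With this in place it no longer matters that Spark reinstalls the bridge, because your own logging goes through the same path. If you use Logback, also consider registering a ch.qos.logback.classic.jul.LevelChangePropagator in logback.xml to cut the overhead of disabled JUL log statements.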