Issue downloading/parsing ORC File from S3, or from Local Path


I have an application deployed that is supposed to download and parse an ORC file from an S3 bucket.

I have tried multiple things. One of them is downloading the file locally in the app and then trying to create a Reader via the ORC library's createReader method, passing an org.apache.hadoop.fs.Path that points to the local file. But every time I'm getting:

- Unknown error occurred
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.LocalFileSystem not found

My code is:

final GetObjectRequest objectRequest = GetObjectRequest.builder()
                                                       .bucket(s3Bucket)
                                                       .key(fullPath)
                                                       .build();
    try (final ResponseInputStream<GetObjectResponse> responseInputStream = s3Client.getObject(objectRequest);
        final FileOutputStream fileOutputStream = new FileOutputStream(downloadPath)) {

      IOUtils.copyLarge(responseInputStream, fileOutputStream);

      Configuration conf = new Configuration();
      conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
      // "fs.file.impl" is the key for local file:// paths; I had originally set
      // "fs.https.impl" twice by mistake, so LocalFileSystem was never registered
      conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());

      return createReader(new Path(downloadPath.toString()), readerOptions(conf));
    }

But I am still getting the error. This would have been much easier with a CSV and a BufferedReader, but unfortunately that is not the case. I also don't want to read the file line by line from S3 and copy its contents to a temporary file, as that would hurt the application's performance.
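I have also considered skipping the local download entirely and pointing the reader at an s3a:// path. This is only a sketch: it assumes the hadoop-aws dependency (which provides S3AFileSystem) is on the classpath and that AWS credentials come from the default provider chain; s3Bucket and fullPath are the same variables used above.

```java
// Sketch: requires the hadoop-aws module in the pom.
Configuration conf = new Configuration();
conf.set("fs.s3a.impl", org.apache.hadoop.fs.s3a.S3AFileSystem.class.getName());

Path s3Path = new Path("s3a://" + s3Bucket + "/" + fullPath);
try (Reader reader = OrcFile.createReader(s3Path, OrcFile.readerOptions(conf))) {
  // The ORC reader seeks within the object and fetches stripes on demand,
  // so no temporary file is created.
  System.out.println("Rows: " + reader.getNumberOfRows());
}
```

I haven't been able to verify whether this runs into the same FileSystem lookup problem, though.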

I do have the orc dependency in my pom, as well as the hadoop-common one.
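One thing I now suspect (this is an assumption on my part): if the app is packaged as a shaded/fat jar, shading can overwrite hadoop-common's META-INF/services files, so Hadoop's ServiceLoader-based FileSystem lookup no longer finds LocalFileSystem even though the class is on the classpath. A sketch of the fix I plan to try, assuming the Maven Shade Plugin is doing the packaging:

```xml
<!-- ServicesResourceTransformer merges META-INF/services entries from all
     jars instead of letting one jar's file overwrite another's, so Hadoop's
     FileSystem implementations stay registered in the shaded jar. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <transformers>
      <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
    </transformers>
  </configuration>
</plugin>
```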

Any kind of help would be greatly appreciated. Thanks!
