Apache Atlas and AWS S3


I am working on a project that requires storing scientific data on AWS S3 as raw data, as the starting point for a data lake. We plan to use JSON for application data and S3 object metadata to persist application metadata (JSON schema) and process metadata. At the moment, S3 is the only AWS cloud service available to us on site.
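To make the storage layout concrete, here is a minimal sketch of how one JSON record and its metadata could be prepared for upload. It assumes Python and boto3-style `put_object` arguments. The bucket name, object key, metadata field names, and the helper `build_put_args` are hypothetical. Because S3 limits user-defined metadata to 2 KB per object, the sketch stores a schema identifier rather than the full JSON schema, which is also an assumption.

```python
import json

# S3 limits user-defined metadata to 2 KB per object, measured as the sum of
# the UTF-8 byte lengths of all keys and values.
MAX_USER_METADATA_BYTES = 2 * 1024


def build_put_args(bucket, key, record, schema_id, process_meta):
    """Build keyword arguments for an S3 put_object call (hypothetical helper).

    The record is serialized as the JSON object body; the schema reference and
    process metadata go into user-defined object metadata as strings.
    """
    metadata = {"schema-id": schema_id}
    for name, value in process_meta.items():
        metadata[f"process-{name}"] = str(value)

    size = sum(
        len(k.encode("utf-8")) + len(v.encode("utf-8"))
        for k, v in metadata.items()
    )
    if size > MAX_USER_METADATA_BYTES:
        raise ValueError(f"user metadata is {size} bytes, over the 2 KB limit")

    return {
        "Bucket": bucket,
        "Key": key,
        "Body": json.dumps(record).encode("utf-8"),
        "ContentType": "application/json",
        "Metadata": metadata,
    }


# Hypothetical names and values, for illustration only.
args = build_put_args(
    bucket="example-raw-data",
    key="raw/sample-0001.json",
    record={"sample": 1, "value": 3.14},
    schema_id="sample-schema-v1",
    process_meta={"stage": "ingest", "run": 42},
)

# With boto3 installed and credentials configured, the upload would be:
#   boto3.client("s3").put_object(**args)
```

Keeping the argument construction separate from the upload call means the metadata size check can run without network access.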

The client would like a publishing environment where they can retrieve the raw data as files. We would like to avoid building a custom catalog and security infrastructure.

I don't see anything in Apache Atlas that connects directly to AWS S3. We could put Apache Hive on top of S3 and then put Apache Atlas and Ranger on top of that, but I am not sure whether that would let us publish the raw data from S3, or whether it would work at all, since Hive is more of a processing environment.

Is it possible to use Apache Atlas and Ranger directly on top of AWS S3?
