I've taken a script written by Paul Davies for reingesting failed Splunk logs from the AWS Cloud back into Splunk.
When my logs fail to process in Kinesis Firehose, they are placed in a backup S3 bucket. The current format of the key is the following:
```
Folder/Folder/Year/Month/Day/HH/failedlogs
```

Example:

```
splunk-kinesis-firehose/splunk-failed/2023/01/01/01/failedlogs.gz
```
The key lookup in the script is set like this:

```python
import urllib.parse

key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
```
Is there a way to get all the files within that S3 bucket under the sub-folder `splunk-kinesis-firehose`, or is there a better way of looping through all the folders?
As John Rotenstein says, your Lambda function, if invoked by an S3 trigger, will receive the key as part of the request. You could also invoke the Lambda manually and pass the key in the request.
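For illustration, here is a minimal sketch of both paths: reading the key out of an S3-trigger event inside the handler, and invoking the function manually with a hand-built payload of the same shape. The function name and bucket/key values are placeholders, not taken from the original script:

```python
import json
import urllib.parse

import boto3

def lambda_handler(event, context):
    # When S3 triggers the function, each record carries the bucket name
    # and the (URL-encoded) key of the object that caused the event.
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = urllib.parse.unquote_plus(record['s3']['object']['key'], encoding='utf-8')
        print(f'Processing s3://{bucket}/{key}')

# Manual invocation: build the same event shape yourself and pass the key in.
# 'reingest-function' and the bucket/key below are hypothetical values.
payload = {
    'Records': [{
        's3': {
            'bucket': {'name': 'my-backup-bucket'},
            'object': {'key': 'splunk-kinesis-firehose/splunk-failed/2023/01/01/01/failedlogs.gz'},
        }
    }]
}
boto3.client('lambda').invoke(
    FunctionName='reingest-function',
    Payload=json.dumps(payload).encode('utf-8'),
)
```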
But if, for some reason, you want to do a full (or partial) listing under a path, then please take a look at `s3list()` that I describe in this SO post. It is a fairly general S3 lister. In your case, you would call it with the `splunk-kinesis-firehose` prefix to get all the objects under that path, or with a narrower prefix such as `splunk-kinesis-firehose/splunk-failed/2023/05` to get just the files for the month of May 2023.
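As a sketch of what those two calls might look like, assuming `s3list(bucket, path)` takes a boto3 `Bucket` resource and a path prefix, and yields object summaries with a `.key` attribute (the general shape of the lister described in that post; the bucket name here is a placeholder):

```python
import boto3

# s3list() is the generator from the linked SO post; its signature is
# assumed here to be s3list(bucket, path, ...).
bucket = boto3.resource('s3').Bucket('my-backup-bucket')  # placeholder name

# Everything under the splunk-kinesis-firehose/ prefix:
for obj in s3list(bucket, 'splunk-kinesis-firehose'):
    print(obj.key)

# Just the failed logs for May 2023:
for obj in s3list(bucket, 'splunk-kinesis-firehose/splunk-failed/2023/05'):
    print(obj.key)
```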
Note that `s3list` is a generator: you can start listing a trillion objects and stop whenever you like (internally, it goes in chunks of up to 1000 objects per call to AWS).
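Because it is a generator, you can cap the listing lazily, for example with `itertools.islice`. A small usage sketch under the same assumed signature and placeholder bucket name:

```python
from itertools import islice

import boto3

bucket = boto3.resource('s3').Bucket('my-backup-bucket')  # placeholder name

# Peek at the first 10 objects under the prefix and then stop; since
# s3list is a generator, it only makes as many paginated AWS calls as needed.
for obj in islice(s3list(bucket, 'splunk-kinesis-firehose'), 10):
    print(obj.key)
```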