I have a CloudFront distribution that is backed by an S3 bucket. I'm trying to list the contents of a subdirectory (prefix) within the bucket, but I'm getting a list of the entire bucket contents rather than just the objects with a given prefix.
It's important to note that I have to do this by making an HTTPS request directly to the CloudFront domain (not by using the AWS CLI or the AWS S3 APIs).
I am able to successfully download/upload objects from the CloudFront domain using signed cookies to authenticate. I can also list all the objects in the bucket (so the auth works), but I'm not able to return only the objects with a given prefix.
I also have a bucket policy that allows the following on the bucket itself:
s3:ListBucket, s3:ListBucketMultipartUploads
And the following on the bucket objects:
s3:GetObject, s3:GetObjectAttributes, s3:GetObjectVersion, s3:GetObjectTagging, s3:GetObjectVersionTagging, s3:PutObject, s3:PutObjectVersionTagging, s3:PutObjectTagging, s3:DeleteObject, s3:DeleteObjectTagging, s3:DeleteObjectVersionTagging, s3:AbortMultipartUpload, s3:ListMultipartUploadParts
When I try this query, the response returns the entire contents of the bucket:
import requests
cookies = {...} # cookies needed to authenticate with cloudfront; this is working correctly
res = requests.get("https://example.cloudfront.net/?prefix=myprefix/", cookies=cookies)
... it gives me a big XML document:
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Name>mybucket</Name>
<Prefix/>
<Marker/>
<MaxKeys>1000</MaxKeys>
<IsTruncated>true</IsTruncated>
<Contents>
<Key>some-other-prefix-I-do-not-want/myfile.text</Key>
...
</Contents>
<Contents>
...
</ListBucketResult>
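One detail in this response is worth checking: S3 echoes the prefix it applied in the Prefix element, so an empty <Prefix/> suggests the query string never reached the origin. A minimal sketch for inspecting that value, assuming the response body is a ListBucketResult document like the one above (the helper name echoed_prefix is hypothetical):

```python
import xml.etree.ElementTree as ET

# XML namespace used by S3 list responses, as seen in the document above.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def echoed_prefix(xml_text: str) -> str:
    """Return the prefix S3 reports having applied, or '' if it saw none."""
    root = ET.fromstring(xml_text)
    # An empty <Prefix/> element (or a missing one) yields an empty string.
    return root.findtext("s3:Prefix", default="", namespaces=S3_NS)
```

Calling echoed_prefix(res.text) on the response above would return an empty string, which points at the parameter being dropped somewhere between the client and S3 rather than at a permissions problem.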
Or if I try this query, the params have no effect and I get the entire bucket contents (I think this is using the S3 API, so maybe that's why this particular query doesn't work):
params = {"list-type": "2", "delimiter": "/", "prefix": "myprefix"}
res = requests.get("https://example.cloudfront.net", cookies=cookies, params=params)
When I try this query, I get a 404.
res = requests.get("https://example.cloudfront.net/myprefix/", cookies=cookies)
Does anyone know how I can list only the objects with a given prefix when listing the files from a CloudFront distribution?
I found the answer to this and am posting it for anyone who encounters it. I needed to add a custom Origin Request Policy on my CloudFront distribution. The policy is a copy of the AWS-provided CORS-S3Origin policy, with additional query strings added so that those query parameters pass through CloudFront to the S3 API.
So, for example, under CloudFront > Policies > Origin request > Create origin request policy, I created a policy with all of the headers from the CORS-S3Origin policy, plus query strings for list-type, max-keys, delimiter, prefix, and start-after. Basically, I added any parameters that I wanted to pass through CloudFront to S3, as documented in AWS's S3 API Reference.
Then, I just needed to attach that policy to my CloudFront distribution by going to CloudFront > Distributions > My Distribution and editing my distribution. There, I selected the "Behaviors" tab, edited the behavior, and changed the Origin request policy to the custom policy that I had just created.
Once I made that change, I was able to filter the response as follows:
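A minimal sketch of such a filtered request, reusing the placeholder domain and prefix from the question. The helper names (build_list_params, parse_listing, list_prefix) are hypothetical, and the sketch assumes the origin request policy already forwards these query strings to S3:

```python
import xml.etree.ElementTree as ET

# XML namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def build_list_params(prefix, delimiter="/", max_keys=None, start_after=None):
    """Build the query parameters for a list request limited to `prefix`."""
    params = {"list-type": "2", "prefix": prefix}
    if delimiter:
        params["delimiter"] = delimiter
    if max_keys is not None:
        params["max-keys"] = str(max_keys)
    if start_after:
        params["start-after"] = start_after
    return params


def parse_listing(xml_text):
    """Return (object keys, common prefixes) from a ListBucketResult body."""
    root = ET.fromstring(xml_text)
    keys = [el.text for el in root.findall("s3:Contents/s3:Key", S3_NS)]
    prefixes = [el.text for el in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)]
    return keys, prefixes


def list_prefix(base_url, cookies, prefix):
    """Fetch one page of results for `prefix` through the CloudFront domain."""
    # Third-party dependency; imported here so the helpers above work without it.
    import requests

    res = requests.get(
        base_url, cookies=cookies, params=build_list_params(prefix), timeout=30
    )
    res.raise_for_status()
    return parse_listing(res.text)
```

With the signed cookies from the question, a call such as list_prefix("https://example.cloudfront.net", cookies, "myprefix/") would return only the keys under that prefix, plus any "subdirectories" as common prefixes. Any other query parameter used this way would also need to be allowed in the origin request policy before it reaches S3.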