How To Query S3 Objects with CLI instead of S3 Select?


I need a CLI equivalent of the S3 Select example shown in the console (dashboard), but with JSON as both the input and output serialization type.

I tried running the following for JSON in AWS CloudShell to print the output to the terminal, but I end up with an error.

aws s3api select-object-content --bucket "my-bucket" --key jobs/test.json --expression "SELECT * FROM s3object s LIMIT 5" --expression-type 'SQL' --input-serialization "{"JSON":{"Type": "DOCUMENT"},"CompressionType": "None"}" --output-serialization "{"JSON": {Type: 'DOCUMENT'}}" /dev/stdout

Error: Error parsing parameter '--input-serialization': Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1) JSON received: {JSON:{Type: DOCUMENT},CompressionType: None}

I see a lot of examples for the CSV format, but I am unable to find the same for JSON.

Thank you in advance.

Note: I am running this in AWS CloudShell, which is essentially Linux.

FYI: The following shows the console settings for the input and output serialization that I am trying to reproduce here.

[Screenshot: input and output serialization settings in the console]

Best answer

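Use single quotes ' to enclose the entire JSON string if you are using a Linux/macOS terminal. In PowerShell, escape the inner double quotes with \ as well.

Do not add an extra pair of double quotes around the braces, or the argument is no longer a JSON object. Two details of the JSON itself also matter: CompressionType takes uppercase values (NONE, GZIP, BZIP2), and the JSON output serialization accepts only RecordDelimiter, not Type.

Like this:

aws s3api select-object-content --bucket "my-bucket" --key jobs/test.json --expression "SELECT * FROM s3object s LIMIT 5" --expression-type 'SQL' --input-serialization '{"JSON": {"Type": "DOCUMENT"}, "CompressionType": "NONE"}' --output-serialization '{"JSON": {}}' /dev/stdout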
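
To see why the command in the question produced that error, it can help to print what the shell actually hands to the aws command. The following is a minimal sketch for bash or any POSIX shell. The variable names broken and fixed are only illustrative.

```shell
# Outer double quotes: each inner " closes or reopens the quoting, so the
# shell removes them all before the argument reaches the aws command.
broken="{"JSON":{"Type": "DOCUMENT"},"CompressionType": "None"}"
printf '%s\n' "$broken"
# prints {JSON:{Type: DOCUMENT},CompressionType: None}

# Outer single quotes: the text is passed through unchanged.
fixed='{"JSON": {"Type": "DOCUMENT"}, "CompressionType": "NONE"}'
printf '%s\n' "$fixed"
# prints {"JSON": {"Type": "DOCUMENT"}, "CompressionType": "NONE"}
```

The first printed line matches the "JSON received" text in the error message, which is how you can tell the quotes were stripped by the shell and not by the CLI.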

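Note: In bash, a single-quoted string cannot contain a single quote, not even after a backslash. If your JSON string contains one, close the quoted string, add an escaped quote, and reopen it: '\''.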
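
One caveat for bash and other POSIX shells: a backslash inside single quotes is taken literally, so the usual idiom for embedding a single quote is '\''. Below is a small sketch of that idiom, plus a quoted here-document as an alternative that needs no escaping. The s.status field is hypothetical and only serves to put a quote into the string.

```shell
# s.status is a hypothetical field, used only to get a quote into the string.
# '\'' = close the quoted string, emit one literal ', reopen the quoted string.
expression='SELECT * FROM s3object s WHERE s.status = '\''open'\'' LIMIT 5'
printf '%s\n' "$expression"
# prints SELECT * FROM s3object s WHERE s.status = 'open' LIMIT 5

# A quoted here-document (<<'EOF') passes its body through with no escaping.
input_serialization=$(cat <<'EOF'
{"JSON": {"Type": "DOCUMENT"}, "CompressionType": "NONE"}
EOF
)
printf '%s\n' "$input_serialization"
# prints {"JSON": {"Type": "DOCUMENT"}, "CompressionType": "NONE"}
```

Either variable can then be passed to the aws command in double quotes, for example --expression "$expression" or --input-serialization "$input_serialization".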