I have a JSON object in S3 which follows this structure:
<code> : {
<client>: <value>
}
For example,
{
"code_abc": {
"client_1": 1,
"client_2": 10
},
"code_def": {
"client_2": 40,
"client_3": 50,
"client_5": 100
},
...
}
I am trying to retrieve the numerical value with an S3 Select query, where the "code" and the "client" are populated dynamically with each query.
So far I have tried:
sql_exp = f"SELECT * from s3object[*][*] s where s.{proc}.{client_name} IS NOT NULL"
sql_exp = f"SELECT * from s3object s where s.{proc}[*].{client_name}[*] IS NOT NULL"
as well as without the asterisk inside the square brackets, but nothing works, I get ClientError: An error occurred (ParseUnexpectedToken) when calling the SelectObjectContent operation: Unexpected token found LITERAL:UNKNOWN at line 1, column X
(depending on the length of the query string)
Within the function defining the object, I have:
resp = s3.select_object_content(
Bucket=<bucket>,
Key=<filename>,
ExpressionType="SQL",
Expression=sql_exp,
InputSerialization={'JSON': {"Type": "Document"}},
OutputSerialization={"JSON": {}},
)
Is there something off in the way I define the object serialization? How can I fix the query so I can retrieve the desired numerical value on the fly when I provide ”code” and “client”?
I did some tinkering based on the documentation, and it works!
I need to access the single event in the
EventStream
(resp
) as follows:Now the correct SQL expression is:
where I have gotten (dynamically) my values for
code
andclient
beforehand. For example, based on the dummy JSON structure above, ifcode = "code_abc"
andclient = "client_2"
, I want this S3 Select query to return the value10
.The f-string resolves to
sql_exp = "SELECT s['code_abc']['client_2'] FROM S3Object s"
, and when we callresp
, we retrieveoutput_dict = {'client_2': 10}
(Not sure if there is a clear way to get the value by itself without the client key, this is how it looks like in the documentation as well).So, the final step is to retrieve
value = output_dict['client_2']
, which in our case is equal to10
.