My scenario: I can search for an error message in CloudWatch and get all the results I want. From those results, I want to extract the @requestId (only for results that match the error), and then, for each @requestId, return all of its logs.
I have tried parsing the message, which contains the GUID for the requestId, like this:
parse "Z * Task timed out" as msgId
Then filtering on
filter strcontains(@message, msgId)
But this returns zero results. Likewise, I've also tried:
filter ispresent(msgId)
But this just returns the results for which the parse command produced a non-null value.
Short of adding a dedup command and building my own list to run a separate search on, I can't seem to find a way to achieve this.
Can you do what I'm trying to do here? Or if not, what is your recommendation on the alternative?
While this isn't possible directly in the query, it can be achieved with either the SDK or the AWS CLI, in your language of choice.
After getting the list of matching results, I could iterate through each response and get the log stream, request ID, and timestamp (and, using Python, save them in a df).
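As a rough sketch of the SDK route, assuming Python with boto3: the function below starts a Logs Insights query, polls until it finishes, and flattens each result into a plain dict that can be loaded into a df. The name `run_insights_query`, the example query string, and the polling interval are illustrative assumptions; `client` is expected to be something like `boto3.client("logs")`.

```python
import time

# Hypothetical example query: only the rows that match the error.
EXAMPLE_QUERY = (
    "fields @timestamp, @requestId, @logStream, @message "
    "| filter @message like /Task timed out/"
)


def run_insights_query(client, log_group, query, start_time, end_time, poll_seconds=1.0):
    """Run a Logs Insights query and return its rows as a list of dicts.

    `client` is assumed to be a CloudWatch Logs client, e.g. boto3.client("logs").
    `start_time` and `end_time` are epoch seconds.
    """
    query_id = client.start_query(
        logGroupName=log_group,
        startTime=start_time,
        endTime=end_time,
        queryString=query,
    )["queryId"]

    # Poll until the query reaches a terminal status.
    while True:
        response = client.get_query_results(queryId=query_id)
        if response["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")

    # Each row is a list of {"field": ..., "value": ...} pairs.
    return [{cell["field"]: cell["value"] for cell in row} for row in response["results"]]
```

The returned list of dicts can be passed straight to `pandas.DataFrame(...)` if you want the results in a df.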
Writing a method like this:
Returning only ['events'] gives you just the JSON data of the events, without the extra response data you may not need.
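A minimal sketch of what such a method might look like, assuming boto3's `filter_log_events` call on a CloudWatch Logs client. The function name `get_request_events`, the quoted-request-ID filter pattern, and the 15-minute search window are assumptions for illustration, not necessarily what was used originally.

```python
def get_request_events(client, log_group, log_stream, request_id, start_ms,
                       window_ms=15 * 60 * 1000):
    """Return every log event in one stream that mentions `request_id`.

    `client` is assumed to be boto3.client("logs"); `start_ms` is the epoch
    time in milliseconds of the matching error line.
    """
    kwargs = {
        "logGroupName": log_group,
        "logStreamNames": [log_stream],
        # Assumed window around the error line; adjust to your timeouts.
        "startTime": start_ms - window_ms,
        "endTime": start_ms + window_ms,
        # Double quotes make the pattern match the ID as an exact phrase.
        "filterPattern": f'"{request_id}"',
    }
    events = []
    while True:
        response = client.filter_log_events(**kwargs)
        # Keep only the event payloads, not the surrounding response metadata.
        events.extend(response["events"])
        token = response.get("nextToken")
        if not token:
            return events
        kwargs["nextToken"] = token
```

Following `nextToken` matters here because a single call may return only part of the matching events.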
Then using df['log_data'].apply like this:
Note: the split removes the .000 suffix from the default timestamp so that it can be converted to a clean epoch value. As a prerequisite, the column in your df needs to have the Timestamp datatype.
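The timestamp handling described above could be sketched roughly as follows. This assumes the timestamps are timezone-naive UTC values that render as `YYYY-MM-DD HH:MM:SS.000`; `to_epoch_ms`, `attach_log_data`, and the `fetch_events` callable are hypothetical names, and the column names are assumed to be the Logs Insights field names.

```python
from datetime import datetime, timezone


def to_epoch_ms(timestamp):
    """Convert 'YYYY-MM-DD HH:MM:SS.000' (or a naive Timestamp) to epoch ms."""
    # The split drops the fractional ".000" part before parsing.
    text = str(timestamp).split(".")[0]
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    # Assumption: the value is in UTC.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)


def attach_log_data(rows, fetch_events):
    """Add a 'log_data' entry to each row using a caller-supplied fetcher.

    `fetch_events(log_stream, request_id, epoch_ms)` is a hypothetical
    callable that returns the events for one request.
    """
    for row in rows:
        row["log_data"] = fetch_events(
            row["@logStream"], row["@requestId"], to_epoch_ms(row["@timestamp"])
        )
    return rows


# Rough pandas equivalent (assumes the same column names in df):
# df["log_data"] = df.apply(
#     lambda row: fetch_events(
#         row["@logStream"], row["@requestId"], to_epoch_ms(row["@timestamp"])
#     ),
#     axis=1,
# )
```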
After that, I applied some search queries to find the specific data I needed:
This let me isolate the exact message I was looking for, using another df.apply.
Ultimately, I was able to grab those payloads and resubmit them to an SQS queue, but that's outside the scope of my original question.
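One way the isolating step might look, as a hedged sketch: a small helper that scans the events fetched for a request and returns the first message containing a given marker. The helper name `find_message` and the `"Payload:"` marker are hypothetical; the events are assumed to be dicts with a `message` key, as returned in a CloudWatch Logs `events` list.

```python
def find_message(events, needle):
    """Return the first event message containing `needle`, or None."""
    for event in events or []:
        message = event.get("message", "")
        if needle in message:
            return message
    return None


# Rough pandas usage (the "Payload:" marker is a made-up example):
# df["payload"] = df["log_data"].apply(
#     lambda events: find_message(events, "Payload:")
# )
```

Returning `None` when nothing matches keeps the apply from failing on requests whose logs lack the marker; those rows can then be dropped or inspected separately.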