Athena Write Performance to AWS S3

I'm executing a query in AWS Athena and writing the results to S3. When I run the query from a Lambda function, it seems to take far too long for the result file to become available.

I'm scanning 70MB of data, and the result file is 12MB. I execute the query from the Lambda function like so:

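import boto3

athena_client = boto3.client('athena')
# start_query_execution returns immediately; the query itself runs asynchronously
athena_client.start_query_execution(
    QueryString=query_string,
    ResultConfiguration={
        'OutputLocation': 'location_on_s3',
        # EncryptionConfiguration takes a dict, not a bare string
        'EncryptionConfiguration': {'EncryptionOption': 'SSE_S3'},
    }
)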

If I run the query directly in the Athena console, it takes 2.97 seconds. When I run the same query from the Lambda function, however, the file only appears to be available after about 2 minutes.

Does anyone know the write performance of AWS Athena to AWS S3? I would like to know if this is normal. The docs don't state how quickly the write occurs.

There is 1 answer below

Every query in Athena writes to S3.

If you check the History tab on the Athena page in the console, you'll see a history of all the queries you've run (not only those started from the console, but also those started through the API). Each entry has a link to download its result.

If you click the Settings button a dialog will open asking you to specify an output location. Check that location and you'll find all your query results there.

Why is this taking so much longer from your Lambda function? I'm guessing, but the only suggestion I have is that you're querying across regions: if your data is in one region and your result location is in another, you might see slowness from the cross-region transfer. Even so, 12MB should be fast.
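
One way to narrow this down is to measure where the time actually goes. The sketch below polls `get_query_execution` until the query reaches a terminal state and returns the statistics Athena reports for it. The helper name `wait_for_query` and its `poll_seconds`, `timeout_seconds`, and `sleep` parameters are my own for illustration, not part of boto3, and it assumes you keep the `QueryExecutionId` returned by `start_query_execution`.

```python
import time

# States after which Athena will not change the query status any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(athena_client, query_execution_id,
                   poll_seconds=1.0, timeout_seconds=300.0, sleep=time.sleep):
    """Poll Athena until the query finishes; return (state, statistics).

    athena_client is expected to be a boto3 Athena client, e.g.
    boto3.client('athena'). The helper itself is hypothetical.
    """
    waited = 0.0
    while True:
        response = athena_client.get_query_execution(
            QueryExecutionId=query_execution_id
        )
        execution = response["QueryExecution"]
        state = execution["Status"]["State"]
        if state in TERMINAL_STATES:
            # Statistics may be missing for queries that failed early.
            return state, execution.get("Statistics", {})
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"query {query_execution_id} still {state} "
                f"after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
        waited += poll_seconds
```

In the Lambda function you would pass the `QueryExecutionId` from the `start_query_execution` response to this helper, then compare `QueryQueueTimeInMillis`, `EngineExecutionTimeInMillis`, and `TotalExecutionTimeInMillis` in the returned statistics. If the engine time is still around 3 seconds, the delay is probably in queueing or in how the Lambda function waits for the file, not in the write to S3.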