My AWS account hosts S3-backed tables in the Glue Data Catalog. We share them with other AWS accounts through the Glue Catalog resource policy. We do not currently use Lake Formation in this account.
I want to audit who is accessing our Glue Catalog and how often, but the CloudTrail events are not useful.
I have tried querying CloudTrail in a variety of ways. The best evidence I can find of interactions with the Catalog tables comes from the Glue events:
SELECT *
FROM "default"."cloudtrail"
WHERE y = '2023'
  AND m = '11'
  AND d = '15'
  AND eventtime LIKE '2023-11-15T08:32%'
  AND eventsource = 'glue.amazonaws.com';
I find interactions attributed to the AWS Internal user, but the events show nothing about the source account or principal:
eventversion: 1.09
useridentity: {type=AWSService, principalid=null, arn=null, accountid=null, invokedby=AWS Internal, accesskeyid=null, username=null, sessioncontext=null}
eventtime: 2023-11-15T08:32:59Z
eventsource: glue.amazonaws.com
eventname: BatchGetTable
awsregion: us-east-1
sourceipaddress: AWS Internal
useragent: AWS Internal
requestparameters: {"catalogId":"123123123123","entries":[{"id":"0","databaseName":"my_db","name":"my_table"}]}
additionaleventdata: {"insufficientLakeFormationPermissions":["my_db:my_table"],"TableArns":{"0":"arn:aws:glue:us-east-1:123123123123:table/my_db/my_table"},"LakeFormationTrustedCallerInvocation":"true"}
requestid: 9612333-fb0b-4e09-a009-d012312328c4
eventid: 3a3fbbaa-6e27-4822-bc6c-b46aab739e4b
eventtype: AwsApiCall
readonly: true
recipientaccountid: 123123123123
sharedeventid: 011123d0-d4a5-40af-9ed7-b51231235703
y: 2023
m: 11
d: 15
(errorcode, errormessage, responseelements, resources, apiversion, serviceeventdetails, vpcendpointid and tlsdetails are empty)
Note the insufficientLakeFormationPermissions entry. This is strange, since the query runs successfully in the recipient account and we do not use Lake Formation in the owner account.
I suspect that access through this type of Glue sharing is not being tracked in the CloudTrail logs, but I would like to know whether the community has faced this issue too. Is there any other clever way of tracking who is interacting with Glue databases and tables?
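To make the goal concrete, here is a rough sketch of the tally I am trying to produce. It assumes the events are available as raw CloudTrail JSON records (with the standard userIdentity, eventSource and eventName fields). The function name tally_glue_callers, the sample account 111111111111 and the role name are hypothetical and only there for illustration.

```python
from collections import Counter


def tally_glue_callers(records):
    """Count Glue events per (account, principal, event name).

    `records` is assumed to be a list of raw CloudTrail event dicts.
    """
    counts = Counter()
    for record in records:
        if record.get("eventSource") != "glue.amazonaws.com":
            continue  # only Glue Data Catalog calls are of interest
        identity = record.get("userIdentity") or {}
        account = identity.get("accountId") or "unknown"
        # Fall back to invokedBy when no principal ARN is recorded.
        principal = identity.get("arn") or identity.get("invokedBy") or "unknown"
        counts[(account, principal, record.get("eventName"))] += 1
    return counts


# Hypothetical sample: the first record mirrors the event above, the
# second is what a usefully attributed cross-account call might look like.
HYPOTHETICAL_ARN = "arn:aws:iam::111111111111:role/hypothetical-reader"
sample = [
    {"eventSource": "glue.amazonaws.com", "eventName": "BatchGetTable",
     "userIdentity": {"type": "AWSService", "invokedBy": "AWS Internal"}},
    {"eventSource": "glue.amazonaws.com", "eventName": "GetTable",
     "userIdentity": {"type": "AssumedRole", "accountId": "111111111111",
                      "arn": HYPOTHETICAL_ARN}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": {"type": "AssumedRole", "accountId": "111111111111"}},
]

for key, count in tally_glue_callers(sample).items():
    print(key, count)
```

With the events I actually get, every cross-account read lands in the ("unknown", "AWS Internal", ...) bucket, which is exactly the problem.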