Is there any way to download exported data from a Google Vault Export?


From the documentation at https://developers.google.com/vault/guides/exports, I've been able to create, list, and retrieve exports, but I haven't found any way to download the exported data associated with a specific export. Is there any way to download the exported files via the API, or is this only available through the Vault UI?

There is a cloudStorageSink key in the export metadata, but passing the values it provides to the Cloud Storage API results in a generic permissions error (403).

Example export metadata response:

{
    "status": "COMPLETED",
    "cloudStorageSink": {
        "files": [
            {
                "md5Hash": "da5e3979864d71d1e3ac776b618dcf48",
                "bucketName": "408d9135-6155-4a43-9d3c-424f124b9474",
                "objectName": "a740999b-e11b-4af5-b8b1-6c6def35d677/exportly-41dd7886-fe02-432f-83c-a4b6fd4520a5/Test_Export-1.zip",
                "size": "37720"
            },
            {
                "md5Hash": "d345a812e15cdae3b6277a0806668808",
                "bucketName": "408d9135-6155-4a43-9d3c-424f124b9474",
                "objectName": "a507999b-e11b-4af5-b8b1-6c6def35d677/exportly-41dd6886-fb02-4c2f-813c-a4b6fd4520a5/Test_Export-metadata.xml",
                "size": "8943"
            },
            {
                "md5Hash": "21e91e1c60e6c07490faaae30f8154fd",
                "bucketName": "408d9135-6155-4a43-9d3c-424f124b9474",
                "objectName": "a503959b-e11b-4af5-b8b1-6c6def35d677/exportly-41dd6786-fb02-42f-813c-a4b6fd4520a5/Test_Export-results-count.csv",
                "size": "26"
            }
        ]
    },
    "stats": {
        "sizeInBytes": "46689",
        "exportedArtifactCount": "7",
        "totalArtifactCount": "7"
    },
    "name": "Test Export",
    ...
}

There are 3 best solutions below

1
On

Once all the exports are created, you'll need to wait for them to complete. You can use https://developers.google.com/vault/reference/rest/v1/matters.exports/list to check the status of every export in a matter. In the response, look at the "exports" array and check the value of "status" for each export; any that say "COMPLETED" can be downloaded.
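
As a rough illustration of the status check, here is a minimal sketch that filters an already-parsed list response. The helper name completed_exports and the sample response are made up for this example; only the "exports" and "status" keys come from the description above.

```python
def completed_exports(list_response):
    """Return the exports in a parsed matters.exports.list response that are COMPLETED."""
    return [
        export
        for export in list_response.get("exports", [])
        if export.get("status") == "COMPLETED"
    ]


# Hypothetical response body, trimmed to the fields used here.
sample_response = {
    "exports": [
        {"name": "Test Export", "status": "COMPLETED"},
        {"name": "Another Export", "status": "IN_PROGRESS"},
    ]
}

ready = completed_exports(sample_response)
print([export["name"] for export in ready])
```

In practice you would probably call this in a polling loop, fetching the list again until every export you care about shows up in the result.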

To download a completed export, go to the "cloudStorageSink" object of each export and take the "bucketName" and "objectName" values of each entry in the "files" array. You'll need to pass these two values to the Cloud Storage API to download each file. This page has code examples for the popular languages, as well as for calling the API directly: https://cloud.google.com/storage/docs/downloading-objects#storage-download-object-cpp.
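
To make that concrete, here is a small sketch that pulls the bucket and object names out of one export's metadata and checks downloaded bytes against the "size" and "md5Hash" fields. The function names are invented for this example, and it assumes "md5Hash" is a hex digest, as it appears to be in the question's sample metadata.

```python
import hashlib


def sink_files(export):
    """Return (bucketName, objectName) pairs for every file in an export."""
    files = export.get("cloudStorageSink", {}).get("files", [])
    return [(f["bucketName"], f["objectName"]) for f in files]


def matches_metadata(data, file_entry):
    """Check downloaded bytes against the size and (assumed hex) md5Hash fields."""
    return (
        len(data) == int(file_entry["size"])
        and hashlib.md5(data).hexdigest() == file_entry["md5Hash"]
    )


# Hypothetical export with a single file, shaped like the question's metadata.
payload = b"example file contents"
export = {
    "status": "COMPLETED",
    "cloudStorageSink": {
        "files": [
            {
                "bucketName": "example-bucket",
                "objectName": "example-prefix/Test_Export-1.zip",
                "size": str(len(payload)),
                "md5Hash": hashlib.md5(payload).hexdigest(),
            }
        ]
    },
}

print(sink_files(export))
print(matches_metadata(payload, export["cloudStorageSink"]["files"][0]))
```

The download itself is left out here; each pair returned by sink_files is what you would hand to whichever Cloud Storage client you use.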

Hope it helps.

0
On

There are two approaches that can do the action you require:

The first:
Use OAuth 2.0 refresh and access tokens. This requires the user to step in and approve your app's access. Google provides a handy playground, with more information, here: https://developers.google.com/oauthplayground/.

  1. First, choose the scope for the API you want (in your case, https://www.googleapis.com/auth/devstorage.full_control under the Cloud Storage JSON API v1 section).
  2. Then log in with an admin account and click "Exchange authorization code for tokens" (the "Refresh token" and "Access token" fields will be filled in automatically).
  3. Lastly, choose the right URL for your request. I suggest using "List possible operations" to pick it. Choose "Get Object - Retrieve the object" under Cloud Storage API v1 (note that several options are named "Get Object"; be sure to choose the one under Cloud Storage API v1 and not the one under Cloud Storage JSON API v1). Now enter your bucket and object name in the appropriate placeholders and click "Send the request".
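
Once the playground has given you a refresh token, the token exchange can be repeated outside the browser. The sketch below only builds the refresh request against Google's OAuth 2.0 token endpoint; the function names are made up for this example, and the client ID and secret are assumed to belong to an OAuth client that you created yourself and used when obtaining the refresh token.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id, client_secret, refresh_token):
    """Build the POST request that trades a refresh token for an access token."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "refresh_token",
            "client_id": client_id,
            "client_secret": client_secret,
            "refresh_token": refresh_token,
        }
    ).encode("ascii")
    return urllib.request.Request(TOKEN_URL, data=body, method="POST")


def fetch_access_token(request):
    """Send the request and return the access token (performs a network call)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]


# Placeholder credentials; nothing is sent until fetch_access_token is called.
request = build_refresh_request("my-client-id", "my-client-secret", "my-refresh-token")
print(request.get_method(), request.full_url)
```

The resulting access token goes into an "Authorization: Bearer ..." header on the Cloud Storage request, which is what the playground does for you behind the scenes.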

The second:
Download it programmatically using the Google client libraries. This is the approach suggested by @darkfolcer; however, I believe the documentation provided by Google is insufficient and does not really help. If a Python example would help, you can find one in the answer to the following question: "How to download files from Google Vault export immediately after creating it with Python API?"

0
On

The issue you are seeing arises because the API follows the principle of least privilege.

The implication for you is that, since your objective is to download the files from the export, you are granted permission to download only the files, not to access the whole bucket (even if it contains only those files).

This is why you get the 403 error (permission error) when you request information from the storage bucket. However, you do have permission to download the files inside the bucket. So what you should do is get each object directly, with requests like this (using the information from the question):

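GET https://storage.googleapis.com/storage/v1/b/408d9135-6155-4a43-9d3c-424f124b9474/o/a740999b-e11b-4af5-b8b1-6c6def35d677%2Fexportly-41dd7886-fe02-432f-83c-a4b6fd4520a5%2FTest_Export-1.zip?alt=media

(The slashes in the object name are percent-encoded as %2F because the name is a single path segment in this API, and alt=media asks for the file contents rather than the object's metadata.)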

So, in short, instead of getting the full bucket, get each individual file generated by the export.
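
A minimal sketch of such a per-object request, using only the Python standard library. The function names are invented for this example, and it assumes you already hold an OAuth 2.0 access token that is allowed to read the export's files.

```python
import urllib.parse
import urllib.request


def object_media_url(bucket, object_name):
    """Build the Cloud Storage JSON API URL that returns one object's contents."""
    # safe="" makes quote() encode "/" as %2F, so the object name stays
    # a single path segment.
    encoded_name = urllib.parse.quote(object_name, safe="")
    return (
        f"https://storage.googleapis.com/storage/v1/b/{bucket}/o/"
        f"{encoded_name}?alt=media"
    )


def download_object(bucket, object_name, access_token):
    """Fetch one object's bytes (performs a network call)."""
    request = urllib.request.Request(
        object_media_url(bucket, object_name),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


# Building the URL needs no credentials; bucket and object name are placeholders.
print(object_media_url("example-bucket", "example-prefix/Test_Export-1.zip"))
```

Calling download_object once per entry in the export's "files" array fetches each file without ever listing or reading the bucket itself.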

Hope this helps.