Google Genomics API - Internal Server Error + ReferenceIDs


I'm pretty new to the Google Genomics APIs. I'm trying to create an annotation, and I tried both the web interface and a Python API call:

service.annotations().create(body={'annotationSetId': '101', 'name': 'TestAnnotation', 'referenceName': 'chrM', 'start': '1', 'end': '1'}, fields='id').execute()

Here is a sample annotation:

{
  "annotationSetId": "101",
  "name": "TestAnnotation",
  "referenceName": "chrM",
  "start": "1",
  "end": "1"
}

I get the following error for both cases:

500 Internal Server Error
{
 "error": {
  "code": 500,
  "message": "Unknown Error.",
  "status": "UNKNOWN"
 }
} 

Any suggestions?

One more observation:

We can add a variant set by submitting only datasetId and name, with no need to specify a reference, but we cannot create an annotation set without a referenceSetId. Why?

HTTP/2.0 400
{
 "error": {
  "code": 400,
  "message": "Invalid value for field \"annotationSet.referenceSetId\": empty or not specified",
  "status": "INVALID_ARGUMENT"
 }
}
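For anyone scripting against the API, here is a minimal sketch of pulling the useful fields out of an error body like the one above. The helper name summarize_api_error is my own (hypothetical), and it assumes the body follows the same "error" / "code" / "message" / "status" layout shown in this question.

```python
import json


def summarize_api_error(response_text):
    """Return (code, status, message) from a JSON error body.

    Assumes the layout shown above: a top-level "error" object with
    "code", "status" and "message" keys. Missing keys come back as None.
    """
    error = json.loads(response_text).get("error", {})
    return error.get("code"), error.get("status"), error.get("message")


# The 400 response from the question, pasted as a raw string.
body = r'''{
 "error": {
  "code": 400,
  "message": "Invalid value for field \"annotationSet.referenceSetId\": empty or not specified",
  "status": "INVALID_ARGUMENT"
 }
}'''

code, status, message = summarize_api_error(body)
print(f"{status} ({code}): {message}")
```

The message field is the part worth reading: here it names the exact request field (annotationSet.referenceSetId) that the server rejected.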

BTW, how can I set the WRITE permission for the caller?

Caller must have WRITE permission for the associated annotation set.

Thank you in advance!

6

There are 6 best solutions below

4
On BEST ANSWER

To associate an annotation set with a dataset, you need write permission on that dataset. If you created the dataset, your account already has write permission. If it is a public dataset, you would need to ask the person who loaded it to grant you write permission, or you could reload the data under your own account.

Assuming you created a dataset, you can create an AnnotationSet directly with curl. You will need your API key from the console (please don't post your API key publicly here). Below is the command, with placeholders for the values you need to fill in:

curl -v -X POST -H "Content-Type: application/json" -d '{"datasetId":"YourActualDatasetID", "referenceSetId":"YourActualReferencesetID"}' "https://genomics.googleapis.com/v1/annotationsets?fields=id&key=YOUR_API_KEY"

Let me know if this worked for you, and if there is anything else that I can help you with.

Thanks,

Paul

2
On

To add to Paul's answer:

annotationSetId must be the ID of an annotation set that actually exists. We will work on improving the error message.

We would like every API to require a reference ID. The Variant API does not, because the Reference API did not exist yet when the Variant API was created.
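A minimal sketch of that order of operations with the Python client: create the annotation set first, then pass the ID the server returns when creating the annotation. It assumes `service` was built by the caller (for example with googleapiclient.discovery.build('genomics', 'v1', ...)) and that the client exposes annotationsets() next to the annotations() call used in the question. The function names are hypothetical, and this has not been run against the live API.

```python
def create_annotation_set(service, dataset_id, reference_set_id):
    """Create an annotation set and return the ID assigned by the server."""
    body = {"datasetId": dataset_id, "referenceSetId": reference_set_id}
    response = service.annotationsets().create(body=body, fields="id").execute()
    return response["id"]


def create_annotation(service, annotation_set_id, name, reference_name, start, end):
    """Create an annotation inside an existing annotation set and return its ID."""
    if start > end:
        raise ValueError("start must not be greater than end")
    body = {
        "annotationSetId": annotation_set_id,  # must come from a real annotation set
        "name": name,
        "referenceName": reference_name,
        # The question sends these positions as strings, so do the same here.
        "start": str(start),
        "end": str(end),
    }
    response = service.annotations().create(body=body, fields="id").execute()
    return response["id"]


# Usage, assuming `service` is an authorized Genomics client:
#   set_id = create_annotation_set(service, "YOUR_DATASET_ID", "YOUR_REFERENCE_SET_ID")
#   annotation_id = create_annotation(service, set_id, "TestAnnotation", "chrM", 1, 2)
```

Passing a made-up value such as '101' skips the first step, which is what led to the unhelpful 500 in the question.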

To give a user WRITE permission, add the user as a Project Editor. See https://cloud.google.com/iam/docs/quickstart-roles-members#add_a_project_member_and_grant_them_an_iam_role

1
On

My previous comment didn't get formatted properly, so I'm writing it as an answer instead. Running this specific test myself would require enabling billing on my account, so I am going by the raw description of the Genomics REST API from the Discovery service:

https://www.googleapis.com/discovery/v1/apis/genomics/v1/rest

Based on the REST API, the scopes for creating an AnnotationSet are the following:

"https://www.googleapis.com/auth/cloud-platform", "https://www.googleapis.com/auth/genomics"

Since you are getting an authentication error, it would be good to first check in the console (https://console.cloud.google.com) that the project tied to the API (server) key you used has the Genomics and Cloud APIs enabled.
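If you already know which scopes your credentials were granted, a quick check against the two scopes listed above can be sketched like this. It assumes that either scope on its own is sufficient, which is how I read the Discovery document. The helper name has_required_scope is hypothetical.

```python
# The two scopes listed above for creating an AnnotationSet.
REQUIRED_SCOPES = frozenset({
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/genomics",
})


def has_required_scope(granted_scopes):
    """Return True if at least one of the required scopes was granted.

    `granted_scopes` is any iterable of scope URL strings.
    """
    return not REQUIRED_SCOPES.isdisjoint(granted_scopes)


print(has_required_scope(["https://www.googleapis.com/auth/genomics"]))
```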

~p

0
On

Glad to hear you got everything to work Amir! It was a fun team effort by the three of us, and I'm always happy to help out as I've used and seen the evolution of the API over the past two years :)

Regarding reference IDs, I see you already found some of the same links I am posting here. A reference ID points to a reference, which is a sequence such as a chromosome. A collection of references belongs to a ReferenceSet, which represents a reference assembly, and the bases served by references.bases belong to a single reference. I have not seen a way in the REST API to create or load a new reference genome; those are probably populated and made available by Google on the backend. Maybe Melissa has more information on that.
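As a rough sketch of how that relationship can be used, the helper below looks up the ID of a named reference (such as chrM) inside a reference set. It assumes `service` is an authorized Genomics v1 client, that references().search accepts a body with referenceSetId and pageToken, and that the response carries a "references" list plus an optional "nextPageToken". The name find_reference_id is hypothetical, and I have not run this against the live API.

```python
def find_reference_id(service, reference_set_id, reference_name):
    """Return the ID of the reference called `reference_name`, or None if absent."""
    page_token = None
    while True:
        body = {"referenceSetId": reference_set_id}
        if page_token:
            body["pageToken"] = page_token
        response = service.references().search(body=body).execute()
        for reference in response.get("references", []):
            if reference.get("name") == reference_name:
                return reference["id"]
        # Keep paging until the server stops returning a token.
        page_token = response.get("nextPageToken")
        if not page_token:
            return None


# Usage, assuming `service` is an authorized client:
#   chr_m_id = find_reference_id(service, "YOUR_REFERENCE_SET_ID", "chrM")
```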

Below is a collection of links that may be helpful regarding references. You discovered some of them already; I am listing them together in case others find them useful in the future:

http://googlegenomics.readthedocs.io/en/latest/use_cases/discover_public_data/reference_genomes.html

https://cloud.google.com/genomics/v1/users-guide#references

https://cloud.google.com/genomics/v1/reference-sets#finding-references

https://cloud.google.com/genomics/reference/rest/v1/referencesets

https://cloud.google.com/genomics/reference/rest/v1/references

https://cloud.google.com/genomics/reference/rest/v1/references.bases

Each of the REST resources above has its own methods for searching and for associating with data.

Hope it helps,

~p

0
On

To use the REST API for annotation:

gcloud auth login
TOKEN=$(gcloud auth print-access-token)
curl -v -X POST -H "Authorization: Bearer $TOKEN" -d '{"datasetId": "YOUR_DATA_SET" ,  "referenceSetId": "EMWV_ZfLxrDY-wE" }'  --header "Content-Type: application/json" https://genomics.googleapis.com/v1/annotationsets
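For Python users, roughly the same request can be sketched with only the standard library. The function name build_annotation_set_request is my own, the token is assumed to come from gcloud auth print-access-token as above, and the sketch only builds the request object; sending it is left as a commented-out line.

```python
import json
import urllib.request

ANNOTATION_SETS_URL = "https://genomics.googleapis.com/v1/annotationsets"


def build_annotation_set_request(token, dataset_id, reference_set_id):
    """Build (but do not send) the POST request made by the curl command above."""
    payload = json.dumps(
        {"datasetId": dataset_id, "referenceSetId": reference_set_id}
    ).encode("utf-8")
    return urllib.request.Request(
        ANNOTATION_SETS_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_annotation_set_request("YOUR_TOKEN", "YOUR_DATA_SET", "EMWV_ZfLxrDY-wE")
print(request.get_method(), request.full_url)

# To actually send it (needs a valid token and network access):
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```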