Getting data from the PubMed database by passing multiple keywords at a time


I am trying to pass a list of keywords to PubMed to get the papers that contain any of those words. The following code returns only about 744 papers, which is far fewer than the actual number of matching papers.

import json
import requests

def get_abstract(id_list):
    url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id={}&retmode=" \
          "text&rettype=medline".format(",".join(id_list))  # efetch expects comma-separated IDs
    response = requests.request("GET", url)
    return response.text


def get_id_using_rest_api(query_string, max_papers):
    url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?sort=relevance&db=pubmed&term={}&retmode=JSON&retmax={}".format(
        query_string, max_papers)
    response = requests.request("GET", url)
    data = response.text
    json_response = json.loads(data)
    id_list = json_response["esearchresult"]["idlist"]

    return id_list
 


if __name__ == '__main__':
    keywords_list = ["lncRNA", "long non-coding RNA", "long noncoding RNA", "Aging","senescence"]
    li_final = "+".join(keywords_list)
    ids = get_id_using_rest_api(li_final, 100000)

What I get in ids is a total of 744 paper IDs, which is far fewer than the actual number.

My question
How do I pass the list in the request so that it looks for the papers containing any of the keywords rather than making a request for each keyword separately?

Answer

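As per this video provided by EMU Library that demonstrates keyword searching in PubMed, I had to modify my search criteria. Joining the keywords with + sends them as one space-separated query, and PubMed combines such terms with AND by default, so only papers matching every keyword came back. I joined the keywords with OR instead, like below.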

keywords_list = ["lncRNA", "long non-coding RNA", "long noncoding RNA", "Aging","senescence"]
li_final = " OR ".join(keywords_list)

The final keyword string passed to the request looked like this:

lncRNA OR long non-coding RNA OR long noncoding RNA OR Aging OR senescence
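
For completeness, here is a minimal sketch of how the OR query could be built and sent to ESearch. It is not the exact code I ran. The helper names (build_or_query, build_esearch_url, search_pubmed) are made up for illustration, the standard-library urllib is swapped in for requests so the snippet has no dependencies, and wrapping multi-word keywords in double quotes is my own assumption (it asks PubMed for a phrase search) rather than something the OR fix requires. The default retmax of 10000 is also an assumption: as far as I know the E-utilities documentation caps ESearch at 10,000 PubMed IDs per request, so the function also returns the "count" field to show how many papers matched in total.

```python
import json
import urllib.parse
import urllib.request

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_or_query(keywords):
    """Join keywords with OR; multi-word keywords are quoted (assumption)."""
    terms = []
    for keyword in keywords:
        keyword = keyword.strip()
        if not keyword:
            continue  # skip empty entries
        # Quote phrases so they are searched as a unit
        terms.append('"{}"'.format(keyword) if " " in keyword else keyword)
    return " OR ".join(terms)


def build_esearch_url(query, max_papers):
    """Build the ESearch URL; urlencode escapes spaces and quotes."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmode": "json",
        "retmax": max_papers,
        "sort": "relevance",
    }
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def search_pubmed(keywords, max_papers=10000):
    """Return (total number of matches, list of PubMed IDs returned)."""
    url = build_esearch_url(build_or_query(keywords), max_papers)
    with urllib.request.urlopen(url) as response:
        result = json.load(response)["esearchresult"]
    return int(result["count"]), result["idlist"]
```

Calling search_pubmed(keywords_list) with the list from the question returns the total match count together with the IDs, so comparing the count with len(ids) shows whether retmax cut the result short.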