Using the rentrez package in R, I want to search for each drug in a list and find the date of the earliest publication mentioning it. My strategy is as follows:
library(rentrez)
# Search for PubMed IDs for a drug
drug_name <- "aspirin"
search_query <- paste0(drug_name, "[Title/Abstract]")
search_results <- entrez_search(db = "pubmed", term = search_query, sort = "pub_date", retmax = 1000)
# Get the oldest article ID among those returned (results are newest first, so it is the last one)
oldest_article_id <- tail(search_results$ids, 1)
The problem here is that the function will only sort the results in descending order (most recent first). One option would be to increase 'retmax' to return all of the results and select the last value. However, some of the drugs give more results than the maximum value of retmax.
The rentrez documentation does not give any option for ascending results, though perhaps there is an undocumented way to do this through the API. Otherwise I will need to identify a totally different strategy, such as scraping the website.
You need some coding here. This is an approximate approach.
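You know there are more results than the maximum value of retmax because the total hit count reported by the search (search_results$count) is larger than the number of IDs returned.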
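A minimal sketch of that check, assuming the search object exposes the `count` and `ids` elements that `entrez_search()` returns. The helper name `is_truncated` is hypothetical and not part of rentrez.

```r
# Hypothetical helper (not part of rentrez): TRUE when the search matched
# more records than were actually returned.
# `res` is any list with `count` and `ids` elements, such as the object
# returned by rentrez::entrez_search().
is_truncated <- function(res) {
  as.integer(res$count) > length(res$ids)
}

# Usage (needs network access and the rentrez package):
# res <- rentrez::entrez_search(db = "pubmed",
#                               term = "aspirin[Title/Abstract]",
#                               sort = "pub_date", retmax = 1000)
# is_truncated(res)
```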
Then get the Entrez date of the last element in the returned IDs (2007 in this case).
Repeat the search, restricted to records up to that date, until you have all available hits. (Functions could be written here so that the search runs programmatically.)
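The repeated search could be sketched roughly as below. This is one possible reading of the approach, not a tested recipe. The function names (`find_oldest_id`, `pubmed_search`, `pubmed_year`) are hypothetical. It also makes several assumptions: it windows on publication year (`datetype = "pdat"`) rather than Entrez date so that the window matches the `pub_date` sort, it relies on `entrez_search()` forwarding `mindate`, `maxdate` and `datetype` to the E-utilities API, it assumes the `pubdate` summary field starts with a four-digit year, and the year bounds 1000 and 3000 are arbitrary.

```r
# Hypothetical helper, not part of rentrez.
# `search_fn(term, maxdate)` must return a list with `count` and `ids`
# (newest first); `year_fn(id)` must return one record's publication year.
find_oldest_id <- function(term, search_fn, year_fn, start_year = 3000L) {
  maxdate <- start_year
  repeat {
    res <- search_fn(term, maxdate)
    if (length(res$ids) == 0) return(NA_character_)
    oldest <- res$ids[length(res$ids)]
    # Every hit was returned, so the last ID is the oldest one.
    if (as.integer(res$count) <= length(res$ids)) return(oldest)
    new_max <- year_fn(oldest)
    # Guard against looping forever when one year holds more hits than a page.
    if (is.na(new_max) || new_max >= maxdate) {
      stop("Window did not shrink; use a larger retmax or finer date windows.")
    }
    maxdate <- new_max
  }
}

# Assumed rentrez-backed implementations (need network access).
pubmed_search <- function(term, maxdate, retmax = 1000) {
  rentrez::entrez_search(db = "pubmed", term = term, sort = "pub_date",
                         retmax = retmax, datetype = "pdat",
                         mindate = "1000", maxdate = as.character(maxdate))
}
pubmed_year <- function(id) {
  summary <- rentrez::entrez_summary(db = "pubmed", id = id)
  as.integer(substr(summary$pubdate, 1, 4))
}

# Usage:
# find_oldest_id("aspirin[Title/Abstract]", pubmed_search, pubmed_year)
```

Passing the search and date lookup in as functions keeps the loop itself independent of the network, so it can be checked against fake data before running it over the whole drug list.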