I've explored a variety of options and solutions online, but I can't quite figure this out. I'm new to Entrez, so I don't fully understand how it works, but my attempt is below.
My goal is to print the summary shown on a gene's NCBI page. For Kat2a, for instance, I'd want it to print 'Enables H3 histone acetyltransferase activity; chromatin binding activity; and histone acetyltransferase activity (H4-K12 specific). Involved in several processes' ...etc, from the Summary section on NCBI.
from Bio import Entrez

def get_summary(gene_name):
    Entrez.email = 'x'
    query = f'{gene_name}[Gene Name]'
    handle = Entrez.esearch(db='gene', term=query)
    record = Entrez.read(handle)
    handle.close()
    NCBI_ids = record['IdList']
    for id in NCBI_ids:
        handle = Entrez.esummary(db='gene', id=id)
        record = Entrez.read(handle)
        print(record['Summary'])
    return 0
Using Biopython to fetch all gene IDs associated with a provided gene name¹ and gathering all gene summaries per ID²
¹ Bio.Entrez.esearch
² Bio.Entrez.efetch

You were on the right track. Here is one example that fleshes out the approach you started in your question. The function (which could of course be customized further) accounts for the default Entrez.esearch limit of 20 returned Gene IDs (raising it to 100 by default), and also filters the query by organism (unless the default 'human' is set to None).

Example 1 – Fetching the gene summary for KAT2A
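A minimal sketch of what such a function might look like, given here only as an illustration. The names `build_gene_query` and `fetch_gene_summaries`, the placeholder email address, and the fallback text for genes without a summary are assumptions of this sketch. The parameters `gene_name`, `organism` and `max_gene_ids` follow the description in this answer. The `'Entrezgene_summary'` key is what the parsed efetch XML is assumed to expose, so check it against a real record before relying on it.

```python
def build_gene_query(gene_name, organism="human"):
    """Build an Entrez Gene search term; organism=None disables the organism filter."""
    query = f"{gene_name}[Gene Name]"
    if organism is not None:
        query += f" AND {organism}[Organism]"
    return query


def fetch_gene_summaries(gene_name, organism="human", max_gene_ids=100,
                         email="you@example.com"):
    """Return a list of (gene_id, summary) tuples for all matching Gene IDs."""
    # Imported here so build_gene_query stays usable without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # placeholder; NCBI asks for a real contact address

    # Step 1: esearch for Gene IDs; retmax overrides the default cap of 20.
    handle = Entrez.esearch(db="gene",
                            term=build_gene_query(gene_name, organism),
                            retmax=max_gene_ids)
    gene_ids = Entrez.read(handle)["IdList"]
    handle.close()

    # Step 2: efetch each ID as XML and pull out its summary, if it has one.
    summaries = []
    for gene_id in gene_ids:
        handle = Entrez.efetch(db="gene", id=gene_id, retmode="xml")
        records = Entrez.read(handle)
        handle.close()
        # Assumption: the summary sits under 'Entrezgene_summary';
        # not every gene record carries one.
        summary = records[0].get("Entrezgene_summary", "(no summary available)")
        summaries.append((gene_id, summary))
    return summaries
```

With network access and Biopython installed, `fetch_gene_summaries('KAT2A')` would be the call for this example; looping over the returned tuples and printing each summary gives the output the question asks for.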
A query for KAT2A returns just one gene summary (remember the default is organism='human').

Example 2 – Using wildcards and receiving many genes for a single organism
For example, gene summaries for all human aldehyde dehydrogenase genes can be obtained using the query ALDH* (the asterisk representing a wildcard).

Example 3 – Receiving thousands of genes across all organisms (unfiltered)
Setting organism=None in the provided Python function and max_gene_ids=10000 for the same query (gene_name='ALDH*') results in 9010 returned Gene IDs (i.e., 9,010 ALDH-family genes among all organisms in the Entrez Gene DB, currently).
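With several thousand Gene IDs, one efetch request per ID gets slow, so it may be worth fetching records in batches. The following is only a sketch under assumptions: the helper names `batched` and `fetch_summaries_in_batches`, the batch size of 200 and the placeholder email are hypothetical choices, and the `'Entrezgene_track-info'` / `'Entrezgene_summary'` keys are what the parsed Gene XML is assumed to contain.

```python
def batched(ids, batch_size=200):
    """Yield successive slices of ids, each at most batch_size long."""
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


def fetch_summaries_in_batches(gene_ids, batch_size=200, email="you@example.com"):
    """Return {gene_id: summary} by requesting several Gene records per efetch call."""
    # Imported here so batched() stays usable without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # placeholder; NCBI asks for a real contact address

    summaries = {}
    for batch in batched(list(gene_ids), batch_size):
        # efetch accepts a comma-separated list of IDs in a single request.
        handle = Entrez.efetch(db="gene", id=",".join(batch), retmode="xml")
        records = Entrez.read(handle)
        handle.close()
        for record in records:
            # Assumption: each parsed record carries its Gene ID and,
            # when available, a summary under these keys.
            gene_id = record["Entrezgene_track-info"]["Gene-track"]["Gene-track_geneid"]
            summaries[gene_id] = record.get("Entrezgene_summary", "")
    return summaries
```

The list of IDs would come from an Entrez.esearch call with a raised retmax, as in the unfiltered ALDH* query described here. Fewer, larger requests also make it easier to stay within NCBI's request-rate guidance.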