How to parse a UniProt DAT file to retrieve GO terms in Python?

1.3k Views Asked by Muhammad Zeeshan At 25 June 2025 at 01:17

I have tried Biopython's SeqIO and other parsers, but I couldn't find a good tool for parsing UniProt DAT files.

https://omics.pnl.gov/software/uniprot-dat-file-parser

I tried this one, but it doesn't provide any gene annotations.

http://biopython.org/wiki/SeqIO

The documentation mostly covers FASTA input, not DAT files.

from Bio import SeqIO

for record in SeqIO.parse("Fasta/f002", "fasta"):
    print("%s %i" % (record.id, len(record)))


There are 2 best solutions below

Christian Ebeling On 28 August 2017 at 16:30

Dear Muhammad Zeeshan,

you can use the query functions of the Python library pyuniprot to get the sequence (or many other things).

Install it (with pip or git clone) and update the database. Find out which taxonomy identifiers match your organisms; the example here uses human, mouse, and rat. Don't run a full update for all organisms, as it takes a very long time.

pyuniprot.update(taxids=[9606, 10090, 10116])

Then use the following Python code, assuming 1433E_HUMAN and A4_HUMAN are the identifiers of interest:

import pyuniprot
query = pyuniprot.query() 
entries = query.entry(name=('1433E_HUMAN', 'A4_HUMAN'))  
seqs = [x.sequence.sequence for x in entries]

Peter Cock On 01 August 2017 at 13:48

Those look like what Biopython calls "swiss" format, the plain text format used at SwissProt prior to it being called UniProt. Try:

from Bio import SeqIO

for record in SeqIO.parse("example.dat", "swiss"):
    print("%s %i" % (record.id, len(record)))
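Since the question asks for GO terms specifically: as far as I know, the "swiss" parser copies each DR cross-reference into record.dbxrefs as a "database:accession" string, so GO entries should appear as something like "GO:GO:0005737". Below is a minimal sketch under that assumption. The helper names go_ids_from_dbxrefs and go_ids_per_record are made up for this example and are not part of Biopython.

```python
import re

# A GO identifier is "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def go_ids_from_dbxrefs(dbxrefs):
    """Return the unique GO identifiers found in cross-reference strings, in order.

    Assumes GO cross-references start with the database name "GO:",
    e.g. "GO:GO:0005737" (assumption about how the parser formats them).
    """
    found = []
    for ref in dbxrefs:
        if not ref.startswith("GO:"):
            continue  # skip EMBL, PDB, InterPro, ... cross-references
        match = GO_ID.search(ref)
        if match and match.group(0) not in found:
            found.append(match.group(0))
    return found


def go_ids_per_record(path):
    """Yield (record.id, [GO ids]) for every entry in a UniProt flat file."""
    # Imported here so the helper above can be used without Biopython installed.
    from Bio import SeqIO

    for record in SeqIO.parse(path, "swiss"):
        yield record.id, go_ids_from_dbxrefs(record.dbxrefs)
```

You would then loop over go_ids_per_record("example.dat") and print or store each pair. Note that this only gives you the GO identifiers, not the term names or evidence codes.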

See also the table for formats at http://biopython.org/wiki/SeqIO
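If you also need the GO term names and evidence codes, or cannot install Biopython, the GO annotations can be read straight from the DR lines of the flat file, which look like "DR   GO; GO:0005737; C:cytoplasm; IDA:UniProtKB.". This is a rough standard-library sketch rather than a full parser. The function name parse_go_from_dat and the SAMPLE text are made up for illustration, and it assumes the term name itself contains no semicolon.

```python
def parse_go_from_dat(lines):
    """Map each entry name (from the ID line) to its GO annotations.

    Each annotation is a tuple (go_id, aspect, term, evidence), where
    aspect is C (component), F (function) or P (process).
    """
    go_by_entry = {}
    entry_name = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("ID   "):
            # First token after the line code is the entry name.
            entry_name = line[5:].split()[0]
            go_by_entry.setdefault(entry_name, [])
        elif line.startswith("DR   GO;") and entry_name is not None:
            # Drop the trailing period, then split the ';'-separated fields.
            fields = [f.strip() for f in line[5:].rstrip(".").split(";")]
            if len(fields) < 3:
                continue  # malformed line, skip it
            go_id = fields[1]
            aspect, _, term = fields[2].partition(":")
            evidence = fields[3] if len(fields) > 3 else ""
            go_by_entry[entry_name].append((go_id, aspect, term, evidence))
        elif line.startswith("//"):
            entry_name = None  # end of the current entry
    return go_by_entry


# Tiny hand-written sample in flat-file layout (illustrative, not a full entry).
SAMPLE = """\
ID   1433E_HUMAN             Reviewed;         255 AA.
AC   P62258;
DR   EMBL; U20972; AAC50175.1; -; mRNA.
DR   GO; GO:0005737; C:cytoplasm; IDA:UniProtKB.
DR   GO; GO:0005515; F:protein binding; IPI:IntAct.
//
"""

for name, terms in parse_go_from_dat(SAMPLE.splitlines()).items():
    for go_id, aspect, term, evidence in terms:
        print(name, go_id, aspect, term, evidence)
```

For a real file, pass an open file handle instead of the sample lines, e.g. parse_go_from_dat(open("example.dat")); the function reads line by line, so it should cope with large files.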
