How do I make a Python dictionary from a string?


I am collecting protein sequence ids from this website: https://www.uniprot.org/

I've written this code:


import urllib.parse
import urllib.request

url = 'https://www.uniprot.org/uploadlists/'

params = {
    'from': 'ID',
    'to': 'UPARC',
    'format': 'tab',
    'query': 'P00766    P40925'
}

# Encode the form parameters and send them as the request body
data = urllib.parse.urlencode(params)
data = data.encode('utf-8')
req = urllib.request.Request(url, data)
with urllib.request.urlopen(req) as f:
    response = f.read()
    string_it = response.decode('utf-8')
print(string_it)

When I print the resulting string, I get output that looks like this:

From    To
P00766  UPI000011047C
P40925  UPI0000167B3E

How do I convert this to a dictionary?


There are 2 best solutions below

Best answer

Split the string into lines, then split each line on the tab delimiter and use the two fields as key and value. The code is as follows:

# Split into lines and drop empty entries
string_list = [line for line in string_it.split("\n") if line != ""]
dict_values = {}
# Skip the first line, which is the "From"/"To" header
for line in string_list[1:]:
    fields = line.split("\t")
    dict_values[fields[0]] = fields[1]

dict_values

The output is:

{'P00766': 'UPI000011047C', 'P40925': 'UPI0000167B3E'}

Code walkthrough:

  • First, split the string on newlines.
  • This usually leaves an empty entry (typically from a trailing newline), so remove it.
  • Initialize an empty dictionary.
  • Loop through the lines, skipping the first one because it only contains the headers From and To.
  • Split each line on the \t delimiter and add the two values to the dictionary as key and value.
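
The steps above can also be wrapped in a small helper. This is only a minimal sketch, not the answerer's code: the function name parse_mapping and the sample string are assumptions made for illustration, and it assumes the response is tab-separated with a single header row.

```python
def parse_mapping(text, delimiter="\t"):
    """Turn 'From<TAB>To' text into a dict (hypothetical helper)."""
    # Split into lines and drop empty entries, such as the one
    # left behind by a trailing newline.
    lines = [line for line in text.split("\n") if line != ""]
    mapping = {}
    # Skip the first line, which holds the column headers.
    for line in lines[1:]:
        fields = line.split(delimiter)
        mapping[fields[0]] = fields[1]
    return mapping


# Sample input assumed to mirror the response shown in the question.
sample = "From\tTo\nP00766\tUPI000011047C\nP40925\tUPI0000167B3E\n"
result = parse_mapping(sample)
print(result)
```

Calling parse_mapping(string_it) in place of the loop should give the same dictionary, provided the response really uses tabs between the columns.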

I believe that your string is something like this:

string_it = """
From    To
P00766  UPI000011047C
P40925  UPI0000167B3E
"""

You can use splitlines() to split the string into lines, then use split() on each line to separate it into its fields.

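new_dict = {}
# Drop blank lines, then skip the "From To" header row
lines = [line for line in string_it.splitlines() if line.strip()]
for line in lines[1:]:
    fields = line.split()  # splits on any run of whitespace
    new_dict[fields[0]] = fields[1]

new_dict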
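
One difference between the two answers may be worth illustrating: split("\t") splits only on tab characters, while split() with no argument splits on any run of whitespace. The sketch below uses made-up sample lines (assumptions, not actual UniProt output) to show how each call behaves on tab-separated and space-separated input.

```python
# Hypothetical sample lines: one separated by a tab, one by two spaces.
tab_line = "P00766\tUPI000011047C"
space_line = "P00766  UPI000011047C"

# split("\t") only recognizes tab characters as separators.
tab_by_tab = tab_line.split("\t")
space_by_tab = space_line.split("\t")

# split() with no argument treats any run of whitespace as one separator.
tab_by_any = tab_line.split()
space_by_any = space_line.split()

print(len(tab_by_tab), len(space_by_tab))  # number of fields found with "\t"
print(len(tab_by_any), len(space_by_any))  # number of fields found with split()
```

If the identifiers never contain spaces, split() is the more forgiving choice, since it works whether the columns are separated by tabs or by spaces.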