How to get the sequence that follows a word padded with whitespace

For school I have to parse the text that follows a keyword padded with a lot of whitespace, but I can't get it to work. The file is in GenBank format.

So for example:

BLA                                                                                                             
      1 sjafhkashfjhsjfhkjsfkjakshfkjsjkf
      2 isfshkdfhjksfkhksfhjkshkfhkjsakjfhk
      3 kahsfkjshakjfhksjhfkskjfkaskfksj

//

What I have tried is this.

if line.startswith("BLA"):

       start = line.find("BLA")
       end = line.find("//")
       line = line[:end]
       s_string = ""
       string = list()
       if s_string:
           string.append(line)


        else:
            line = line.strip()
            my_seq += line

But what I get is:

**output**
BLA

That is the only thing it returns. I want the output to look like this:

**output**
BLA 1 sjafhkashfjhsjfhkjsfkjakshfkjsjkf
    2 isfshkdfhjksfkhksfhjkshkfhkjsakjfhk
    3 kahsfkjshakjfhksjhfkskjfkaskfksj

I don't know what to do. I tried to produce that last output, but without success. My teacher told me to do it like this: once BLA is True, keep iterating over the lines, and stop when you see "//". But when I tried it with that True flag, I got nothing.

I searched online, and the advice was to use Biopython's SeqIO (Bio.SeqIO), but the teacher said we can't use that.

Best answer, by codrelphi:

Here is my solution:

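# Sample record: a keyword line, numbered sequence lines, and the "//" terminator.
lines = """BLA
      1 sjafhkashfjhsjfhkjsfkjakshfkjsjkf
      2 isfshkdfhjksfkhksfhjkshkfhkjsakjfhk
      3 kahsfkjshakjfhksjhfkskjfkaskfksj

//"""

# Keep only the text before the "//" terminator.
lines = lines.strip().split("//")
# Split on the keyword; index 1 holds everything after "BLA".
lines = lines[0].split("BLA")
# Remove leading and trailing whitespace from each piece.
lines = [i.strip() for i in lines]
print("BLA", " ", lines[1])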

Output:

BLA   1 sjafhkashfjhsjfhkjsfkjakshfkjsjkf
      2 isfshkdfhjksfkhksfhjkshkfhkjsakjfhk
      3 kahsfkjshakjfhksjhfkskjfkaskfksj
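
If you need to follow the teacher's hint from the question (set a flag to True once the BLA line is seen, keep iterating, stop at "//"), a line-by-line version might look like the sketch below. The function name `extract_section` and its parameters `keyword` and `terminator` are made up for this example, and it assumes the keyword sits at the very start of its line, as in the sample. It is only one possible way to do it, not the required one.

```python
def extract_section(lines, keyword="BLA", terminator="//"):
    """Collect the non-blank lines between a keyword line and the terminator."""
    collected = []
    in_section = False  # flips to True once the keyword line has been seen
    for line in lines:
        if line.startswith(terminator):
            if in_section:
                break  # end of the record: stop reading
            continue  # a terminator before the keyword is ignored
        if in_section:
            if line.strip():  # skip blank lines
                collected.append(line.strip())
        elif line.startswith(keyword):
            in_section = True  # trailing whitespace after the keyword does not matter
    return collected


text = """BLA
      1 sjafhkashfjhsjfhkjsfkjakshfkjsjkf
      2 isfshkdfhjksfkhksfhjkshkfhkjsakjfhk
      3 kahsfkjshakjfhksjhfkskjfkaskfksj

//"""

rows = extract_section(text.splitlines())

# Print in the layout asked for in the question.
print("BLA", "\n    ".join(rows))

# Assumption: the first token on each row is a line number, so drop it
# and join the rest to get the bare sequence.
sequence = "".join("".join(row.split()[1:]) for row in rows)
print(sequence)
```

To read from a file instead of a string, you can pass the open file object directly, for example `with open(path) as handle: rows = extract_section(handle)`, where `path` is whatever your file is called. Iterating over a file yields one line at a time, so the same loop works unchanged.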