So let's say I'm reading a text file in Python that looks something like this:
.. Keywords- key1; key2, key3; key4 Abstract .. ..
Now I want to parse the file until I find the word "Keywords", and then put all the keywords into a list, so the list should look something like this: ["key1", "key2", "key3", "key4"]
So it's basically everything after "Keywords" and before the word "Abstract", and the keywords can be separated by a comma (,), a semicolon (;), or a combination of both.
How do I go about this?
Here's one way using regex
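A minimal sketch of that idea, assuming the keywords sit between the word "Keywords" (optionally followed by "-" or ":") and the word "Abstract". The function name `extract_keywords`, the `sample` string, and the exact pattern are illustrative assumptions, not a definitive implementation:

```python
import re

def extract_keywords(text):
    # Capture everything between "Keywords" (plus an optional "-" or ":")
    # and the word "Abstract"; DOTALL lets the span cross line breaks.
    match = re.search(r"Keywords\s*[-:]?\s*(.*?)\s*Abstract", text, re.DOTALL)
    if match is None:
        return []
    # Split on either separator, trim whitespace, and drop empty pieces.
    return [k.strip() for k in re.split(r";|,", match.group(1)) if k.strip()]

# Hypothetical sample mirroring the file contents from the question.
sample = ".. Keywords- key1; key2, key3; key4 Abstract .. .."
print(extract_keywords(sample))
```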
This will return either an empty list if there are no matches, or a list of matches with whitespace trimmed. So in this example the returned list will be:
['key1', 'key2', 'key3', 'key4']
I use re.split as it supports splitting on multiple separators, so if you had additional separators you could just add them as further pipe-separated options.