I can read one .ann file into a pandas DataFrame as follows:
import pandas as pd
df = pd.read_csv('something/something.ann', sep=r'^([^\s]*)\s', engine='python', header=None).drop(0, axis=1)
df.head()
But I don't know how to read multiple .ann files into one pandas DataFrame. I tried to use concat, but the result is not what I expected.
How can I read many ann files into one pandas dataframe?
It sounds like you need to use glob to pull in all the .ann files from a folder and add them to a list of dataframes. After that you probably want to join/merge/concat etc. as required.
I don't know your exact requirements, but the code below should get you close. As it stands, the script assumes that, relative to where you are running the Python script, you have a subfolder called files, and that you want to pull in all the .ann files in it (it will not look at anything else). Obviously review and change as required; it's commented per line.
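A minimal sketch of that approach, reusing the read_csv call from the question. The function name read_ann_files and the extra source_file column are my own additions for illustration, not something the question asked for, and the default folder name files is just the assumption described above:

```python
import glob
import os

import pandas as pd


def read_ann_files(folder="files"):
    """Read every .ann file in `folder` into a single DataFrame (hypothetical helper)."""
    # Find only the .ann files in the folder; sorted() keeps the order reproducible
    paths = sorted(glob.glob(os.path.join(folder, "*.ann")))
    # One DataFrame per file goes in this list
    frames = []
    for path in paths:
        # Same parsing as in the question: split off the first token, drop the empty column 0
        df = pd.read_csv(path, sep=r'^([^\s]*)\s', engine='python', header=None).drop(0, axis=1)
        # Assumption: record which file each row came from (remove if not needed)
        df["source_file"] = os.path.basename(path)
        frames.append(df)
    # pd.concat raises on an empty list, so return an empty frame if nothing matched
    if not frames:
        return pd.DataFrame()
    # Stack the per-file frames and renumber the index from 0
    return pd.concat(frames, ignore_index=True)
```

Calling read_ann_files("files").head() should then show rows from all files together. If the concat result looked wrong before, a repeated index is one possible cause; ignore_index=True avoids that by renumbering the rows.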