I have the following code:
dfs = []
for f in files_xlsx:
city_name = pd.read_excel(f, "1. City", nrows=1, parse_cols="C", header=None, skiprows=1)
country_code = pd.read_excel(f, "1. City", nrows=1, parse_cols="C", header=None, skiprows=2)
data = pd.read_excel(f, "1. City", parse_cols="B:J", header=None, skiprows=8)
data['City name'] = city_name.iat[0,0]
data['City code'] = country_code.iat[0,0]
dfs.append(data)
df = pd.concat(dfs, ignore_index=True)
I would like to run the loop if and only if each Excel file contains the values 91, 92, 93, 94, 95, 96, 97
in the following location (column and row combination) D8, E8, F8, G8, H8, I8, J8
. The loop should only run if this condition is met across all files.
All my Excel files have the same format in theory. In practice, they often don't so I want to run a check before appending them. It would be great if the code could tell me which file does not meet the above condition. Thank you.
Edit:
In[1]: data
Out[1]:
0 1 2 3 4 5 6 7 8 City name City code
0 x x x x x x x x x x
1 x x x x x x x x x x
2 x x x x x x x x x x
Consider building a helper data frame that resembles Excel data but with values matched to specification (row 8 and columns D-J with those specific values). Then in loop iteratively merge against this helper data frame and if it returns a match, conditionally append to your list of data frames.
NOTE: Adjust columns to actual column names, replacing
list('ABCDEFGHIJ')
with list of names,['col1','col2','col3',...]
, inDataFrame()
andmerge()
calls.Below demonstrates with random data
Output (see 8th and 17th rows with matching criteria)