I'm trying to build a general-purpose routine for cleaning up tabular data from the SEC EDGAR database. Given the table below, I need to remove any column that may contain a value of the form '[any_int]', such as '[1]'.

| Column A | Column B | Column C | Column D |
|---|---|---|---|
| val | val | nan | nan |
| val | val | nan | [1] |
| val | val | nan | nan |

The column names are not known in advance, since each company in the database has its own table structure.

```python
cols_to_drop = df.columns[df.columns.str.contains('\[')]
```

This `str.contains()` call yields no results, though I expected it to select Column D.
I'm looking for a simple approach that produces the desired output: the same table with Column D removed.
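
For reference, `df.columns.str.contains(...)` tests the column labels ("Column A", "Column B", ...), not the cell values, which would explain why nothing matches here. Below is a minimal sketch of one possible way to test the values instead. It assumes pandas is available; the sample `df`, the pattern `footnote_pattern`, and the helper `has_footnote_marker` are hypothetical names made up for illustration, and the pattern assumes the marker is always digits inside square brackets.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the table in the question.
df = pd.DataFrame({
    "Column A": ["val", "val", "val"],
    "Column B": ["val", "val", "val"],
    "Column C": [np.nan, np.nan, np.nan],
    "Column D": [np.nan, "[1]", np.nan],
})

# Assumed marker format: one or more digits in square brackets, e.g. "[1]".
footnote_pattern = r"\[\d+\]"

def has_footnote_marker(col: pd.Series) -> bool:
    """Return True if any cell in the column contains a marker like "[1]"."""
    # Cast to str so non-string cells (NaN, numbers) can be searched safely;
    # na=False treats any value that is still missing as "no match".
    as_text = col.astype(str)
    return bool(as_text.str.contains(footnote_pattern, regex=True, na=False).any())

# Check the values of every column, whatever the columns happen to be called.
cols_to_drop = [name for name in df.columns if has_footnote_marker(df[name])]
cleaned = df.drop(columns=cols_to_drop)
```

Because the check runs over every column's values, it does not depend on knowing the column names up front. If a marker should only count when it is the entire cell, anchoring the pattern as `r"^\[\d+\]$"` would be one way to tighten it.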