I have a very simple question, but Google does not seem to be able to help me here. I want a subsample of a pyfits table... basically just remove 90% of the rows, or something like that. I read the table with:
data_table = pyfits.getdata(base_dir + filename)
I like the pyfits table organization where I access a field with data_table.field(fieldname), so I would like to keep the data structure, but remove rows.
You can use numpy.random.choice to create an array containing several random choices from another array. In your case you want "x" rows from your data_table. You can't use choice directly on the table, but you can pass the len of your table to random.choice to draw random row indices, and then index your table with those indices.
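A minimal sketch of that idea, using a plain NumPy structured array as a hypothetical stand-in for the table (the column names id and flux and the 10% fraction are assumptions for illustration, not taken from your data):

```python
import numpy as np

# Hypothetical stand-in for the FITS table: a structured array with 1000 rows.
data_table = np.zeros(1000, dtype=[("id", "i8"), ("flux", "f8")])
data_table["id"] = np.arange(len(data_table))

# Keep roughly 10% of the rows.
n_keep = len(data_table) // 10

# Passing an integer to choice samples from np.arange(that integer),
# so this yields n_keep distinct row indices.
row_indices = np.random.choice(len(data_table), size=n_keep, replace=False)

# Indexing with an integer array selects those rows and keeps the fields.
subsample = data_table[row_indices]
```

The sampled rows come back in random order; wrapping the indices in np.sort(row_indices) before indexing should preserve the original row order if that matters.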
I'd use astropy for this, because PyFITS isn't developed anymore and has been migrated to astropy.io.fits. Pass replace=False so that each row is picked at most once; if you want to allow getting the same row several times, you can use replace=True instead.