Get astropy .fits file column data from unique ID in another column

I'm trying to retrieve data from a FITS file with astropy, in Python, where the data for many objects, each with a unique ID, have been collated into one continuous record.

By this I mean the table data in the .fits file has shape: (25603520,)

I would like to get all the data from the rows with a preselected ID. E.g. for the .fits file below, I would like to get all 5 columns' worth of data for the rows whose first column value is: 'NGTSJ225342.6+015412'.

  FITS_rec([('NGTSJ225342.6+015412', 2457881.8909375 ,  85.15793 , 10.182051, 0),
          ('NGTSJ225342.6+015412', 2457881.89107639, 109.891716, 10.210967, 0),
          ('NGTSJ225342.6+015412', 2457881.89122685,  87.59581 , 10.136151, 0),
          ...,
          ('NGTSJ225330.3+012025', 2458082.58070602,        nan,       nan, 0),
          ('NGTSJ225330.3+012025', 2458082.58085648,        nan,       nan, 0),
          ('NGTSJ225330.3+012025', 2458082.58099537,        nan,       nan, 0)],
         dtype=(numpy.record, [('SOURCE_ID', 'S20'), ('HJD', '>f8'), ('SYSFLUX', '>f4'), ('FLUX_ERR', '>f4'), ('FLAG', '>i4')]))

Ideally I would like to do this with a quick query as these files in my real dataset are so large that I would like to avoid vstacking and using where statements if I can.

Things I have tried:

f[1].data['SOURCE_ID'=='NGTSJ225445.7+014958']

... returns:

FITS_rec([], shape=(0, 25603520),
     dtype=(numpy.record, [('SOURCE_ID', 'S20'), ('HJD', '>f8'), ('SYSFLUX', '>f4'), ('FLUX_ERR', '>f4'), ('FLAG', '>i4')]))

and...

f[1].data['SYSFLUX'][np.array(f[1].data['SOURCE_ID']) == 'NGTSJ225342.6+015412']

...works, but seems non-optimal.
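
For reference, here is a minimal sketch of what I think is happening in both attempts, using a plain NumPy structured array as a stand-in for the FITS_rec (the three-row `data` array and the names `mask` and `selected` are made up for illustration). In the first attempt, Python evaluates `'SOURCE_ID' == 'NGTSJ225445.7+014958'` to `False` before any indexing happens, so the array is indexed with a scalar `False`, which gives an empty result with an extra axis. Building the boolean mask from the ID column and indexing the whole record with it returns every column for the matching rows in one step:

```python
import numpy as np

# Stand-in for f[1].data: same dtype as the real table, only three rows.
dtype = [('SOURCE_ID', 'S20'), ('HJD', '>f8'), ('SYSFLUX', '>f4'),
         ('FLUX_ERR', '>f4'), ('FLAG', '>i4')]
data = np.array([
    (b'NGTSJ225342.6+015412', 2457881.8909375, 85.15793, 10.182051, 0),
    (b'NGTSJ225342.6+015412', 2457881.89107639, 109.891716, 10.210967, 0),
    (b'NGTSJ225330.3+012025', 2458082.58070602, np.nan, np.nan, 0),
], dtype=dtype)

# First attempt: the comparison of two different Python strings is False,
# so this is data[False], an empty array with a new leading axis.
empty = data['SOURCE_ID' == 'NGTSJ225445.7+014958']

# Boolean mask over the ID column (bytes literal, because this plain
# NumPy column has dtype 'S20'), applied to the whole record array.
mask = data['SOURCE_ID'] == b'NGTSJ225342.6+015412'
selected = data[mask]  # all five columns for the matching rows
```

The mask is a single pass over the ID column, and it can be reused for the same ID instead of being recomputed per column.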


Related package imports:

from astropy.io import fits

Apologies for not being able to provide a more concise and reproducible test script for this. I'm actually not sure how to create a 1D, list-like .fits record with a mix of strings, floats and integers like the file I'm working with.
