I am reading data from a file of "events". For each event, there is some number of "tracks". For each track there are a series of "variables". A stripped down version of the code (using awkward0 as awkward) looks like
f = h5py.File('dataAA/pv_HLT1CPU_MinBiasMagDown_14Nov.h5',mode="r")
afile = awkward.hdf5(f)
pocaz = np.asarray(afile["poca_z"].astype(dtype_X))
pocaMx = np.asarray(afile["major_axis_x"].astype(dtype_X))
pocaMy = np.asarray(afile["major_axis_y"].astype(dtype_X))
pocaMz = np.asarray(afile["major_axis_z"].astype(dtype_X))
In this snippet of code, "pocaz", "pocaMx", etc. are what I have called variables (a physics label, not a Python data type). On rare occasions, pocaz takes on an extreme value, pocaMx and/or pocaMy take on nan values, and/or pocaMz takes on the value inf. I would like to remove these tracks from the events using some syntactically simple method. I am guessing this functionality exists (perhaps in the current version of awkward but not awkward0), but cannot find it described in a transparent way. Is there a simple example anywhere?
Thanks, Mike
It looks to me, from the fact that you're able to call
np.asarrayon these arrays without error, that they are one-dimensional arrays of numbers. If so, then Awkward Array isn't doing anything for you here; you should be able to find the one-dimensional NumPy arrays insideas groups (note that this is
f, notafile) and leave Awkward Array entirely out of it.The reason I say that is because you can use np.isfinite on these NumPy arrays. (There's an equivalent in Awkward v1, v2, but you're talking about Awkward v0 and I don't remember.) That will give you an array of booleans for you to slice these arrays.
I don't have the HDF5 file for testing, but I think it would go like this:
If you also need to cut extreme finite values, you can include
etc. in the
goodselection criteria.(The way you'd do this in Awkward Array is not any different, since Awkward is just generalizing what NumPy does here, but if you don't need it, you might as well leave it out.)