I currently have seismic data with 175 events and 3 traces for each event (traces are numpy arrays of seismic data). I have classification labels for whether the seismic data is an earthquake or not for each of those 175 samples. I'm looking to format my data into numpy arrays for modelling. I've tried placing it into a dataframe of numpy arrays with each column being a different trace, so the columns would be 'Trace one', 'Trace two', 'Trace three'. This did not work. I have tried lots of different methods of arranging the data to use with Keras.
I'm now looking to create a numpy matrix for the data to go into and to then use for modelling.
I had thought that the shape should be (175, 3, 7510), i.e. (number of events, number of traces, number of samples per trace). However, when I iterate through and try to add the three traces to the numpy matrix, it fails. I'm used to using dataframes rather than numpy for inputting to Keras.
```python
newrow = np.array([[trace_copy_1], [trace_copy_2], [trace_copy_3]])
data = np.vstack([data, newrow])
```
The `data` shape is (175, 3, 7510). The `newrow` shape is (3, 1, 7510), which does not allow me to add `newrow` to `data`.
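A minimal reproduction (using placeholder zero arrays standing in for the real traces) shows where the stray middle axis comes from:

```python
import numpy as np

# Placeholders for the real trace arrays, each of shape (7510,)
trace_copy_1 = np.zeros(7510)
trace_copy_2 = np.zeros(7510)
trace_copy_3 = np.zeros(7510)

# Wrapping each trace in its own brackets adds an extra axis:
newrow = np.array([[trace_copy_1], [trace_copy_2], [trace_copy_3]])
print(newrow.shape)  # (3, 1, 7510) -- the middle axis of size 1 blocks vstack
```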
The form in which I receive the data is obspy Streams, and each Stream holds 3 Trace objects. Each Trace object stores its trace data in a numpy array, so I have to access those arrays and append them to a structure for modelling, since obviously I can't feed a Stream or Trace object to a Keras model.
If I understand your data correctly, you can try one of the following methods:
- If the `data` shape is (175, 3, 7510), define `newrow` as follows: `newrow = np.array([trace_copy_1, trace_copy_2, trace_copy_3])`, with each `trace_copy_x` being a numpy array of shape (7510,).
- `numpy.reshape(new_row, (3, 7510))` or `new_row.reshape((3, 7510))`
- `pandas.DataFrame(data.reshape((175, 3*7510)))`
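A short sketch of the first two options, with placeholder zero arrays in place of the real traces (the names `trace_copy_1..3` follow the question):

```python
import numpy as np

# Placeholders standing in for the real trace arrays
trace_copy_1 = np.zeros(7510)
trace_copy_2 = np.zeros(7510)
trace_copy_3 = np.zeros(7510)

# One level of brackets gives the desired (3, 7510) shape directly
newrow = np.array([trace_copy_1, trace_copy_2, trace_copy_3])
print(newrow.shape)  # (3, 7510)

# An already-misshaped (3, 1, 7510) array can be squeezed with reshape
bad = np.array([[trace_copy_1], [trace_copy_2], [trace_copy_3]])
fixed = bad.reshape((3, 7510))
print(fixed.shape)  # (3, 7510)
```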
In addition to that, I recommend using `numpy.concatenate` instead of `numpy.vstack` (it is more general). I hope it works.
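For example, assuming `data` accumulates events along axis 0, each (3, 7510) `newrow` can be appended with `numpy.concatenate` after adding a leading axis so the shapes line up:

```python
import numpy as np

data = np.zeros((0, 3, 7510))   # start with an empty stack of events
newrow = np.zeros((3, 7510))    # one event: 3 traces x 7510 samples

# newrow[np.newaxis] has shape (1, 3, 7510), matching data along axes 1 and 2
data = np.concatenate([data, newrow[np.newaxis]], axis=0)
print(data.shape)  # (1, 3, 7510)
```

Building a Python list of (3, 7510) arrays and calling `np.stack` once at the end is usually faster than repeated concatenation, since each concatenate copies the whole array.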
Cheers