I have a DataFrame with four columns: X, Y, Z, and t. The values in the first three columns are discrete and represent a 3D index. The fourth column holds floating-point numbers. For example:
import numpy as np
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 2, 3, 1],
                   'Y': [1, 1, 2, 2, 3, 3],
                   'Z': [1, 2, 1, 2, 1, 2],
                   't': np.random.rand(6)})
# X Y Z t
#0 1 1 1 0.410462
#1 2 1 2 0.385973
#2 3 2 1 0.434947
#3 2 2 2 0.880702
#4 3 3 1 0.297190
#5 1 3 2 0.750949
How can I efficiently extend df into a 3D datacube (with 18 vertices in this case)? The values of t in the new rows should be np.nan. In other words, I want to add all the "missing" rows, such as:
...
#6 1 1 2 nan
#7 1 2 1 nan
#8 1 2 2 nan
...
The extents of X, Y, and Z are large but not huge (say, 10, 200, and 1000 unique values, respectively). NumPy-based solutions are welcome, too!
Here is one way to do it with product from the Python standard library's itertools module:
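A minimal sketch of this step, assuming the goal is a DataFrame holding every (X, Y, Z) combination of the unique values found in df. The names `axes` and `grid` are hypothetical, and the seeded generator is used only to make the example reproducible:

```python
from itertools import product

import numpy as np
import pandas as pd

# Sample data from the question (seeded generator, for reproducibility only).
rng = np.random.default_rng(0)
df = pd.DataFrame({'X': [1, 2, 3, 2, 3, 1],
                   'Y': [1, 1, 2, 2, 3, 3],
                   'Z': [1, 2, 1, 2, 1, 2],
                   't': rng.random(6)})

# Sorted unique values along each axis of the cube, as plain Python lists.
axes = [np.sort(df[col].unique()).tolist() for col in ['X', 'Y', 'Z']]

# Cartesian product of the axes: one row per vertex of the cube.
# `grid` is a hypothetical name for the full set of vertices.
grid = pd.DataFrame(list(product(*axes)), columns=['X', 'Y', 'Z'])

print(len(grid))  # 3 * 3 * 2 = 18 vertices
```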
Then:
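One way this second step might look is a left merge of the full grid with df, so that vertices absent from df get NaN in t. This is a sketch under that assumption rather than a definitive implementation; `keys`, `grid`, and `cube` are hypothetical names, and the grid is rebuilt here so that the block runs on its own:

```python
from itertools import product

import numpy as np
import pandas as pd

# Sample data from the question (seeded generator, for reproducibility only).
rng = np.random.default_rng(0)
df = pd.DataFrame({'X': [1, 2, 3, 2, 3, 1],
                   'Y': [1, 1, 2, 2, 3, 3],
                   'Z': [1, 2, 1, 2, 1, 2],
                   't': rng.random(6)})

keys = ['X', 'Y', 'Z']

# Every combination of the unique X, Y, and Z values.
grid = pd.DataFrame(
    list(product(*(np.sort(df[k].unique()).tolist() for k in keys))),
    columns=keys,
)

# A left merge keeps every vertex of the grid; vertices that have
# no matching row in df end up with NaN in the t column.
cube = grid.merge(df, on=keys, how='left')

print(cube.shape)              # (18, 4)
print(cube['t'].isna().sum())  # 12 rows were missing from df
```

For the sizes mentioned in the question (about 10 * 200 * 1000 = 2 million vertices), building the grid as a Python list of tuples should still be feasible, though it may be slow. If the (X, Y, Z) triples in df are unique, an alternative worth trying is `df.set_index(keys).reindex(pd.MultiIndex.from_product(axes, names=keys)).reset_index()`, where `axes` is a list of the unique values per column; this avoids the Python-level loop.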