I have a numpy array ids (unique) and a pandas series dates.
I want to create a pandas dataframe that is a cartesian product of ids and dates with columns id and date grouped by date.
e.g., ids = [1, 2] and dates = [10032023, 10042023], with resulting dataframe:
id date
1 10032023
2 10032023
1 10042023
2 10042023
I can't seem to figure out how to do this using the existing vectorized operations in pandas. My current approach is just to iterate over both and assign the rows individually.
You can use the `product` function from the built-in `itertools` module to achieve this. The cartesian product is done like this:
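The original snippet did not survive here, so the following is only a minimal sketch of what it may have looked like, using the example values from the question. The variable names `rows` and `df` are my own, and passing `dates` before `ids` is an assumption made so that the rows come out grouped by date.

```python
from itertools import product

import numpy as np
import pandas as pd

# Example inputs taken from the question.
ids = np.array([1, 2])
dates = pd.Series([10032023, 10042023])

# product(dates, ids) varies the last argument (ids) fastest,
# so the resulting rows are grouped by date.
rows = list(product(dates, ids))

# Build the frame from (date, id) tuples, then reorder the columns
# to match the layout requested in the question.
df = pd.DataFrame(rows, columns=["date", "id"])[["id", "date"]]
print(df)
```

Swapping the argument order to `product(ids, dates)` would group the rows by id instead.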
which results in this:
The docs are here: https://docs.python.org/3/library/itertools.html#itertools.product
The docs point out that this is not vectorised, but in fact this: