I would like to randomly select a few columns of a two-dimensional dataframe and shuffle the values within those columns. I can easily shuffle all values (column-wise) of the dataframe, but I want to do so only for a randomly selected few.
For instance, take the 6x6 dataframe below:
   0  1  2  3  4  5
0  5  3  7  1  2  9
1  1  7  5  3  0  8
2  0  2  7  1  6  5
3  8  4  2  1  9  7
4  2  9  5  6  3  4
5  7  5  8  2  1  0
After randomly selecting a few of the 6 columns and shuffling them, the output might look like this:
   0  1  2  3  4  5
0  2  9  7  1  2  4
1  5  7  5  3  0  0
2  7  2  7  1  6  5
3  8  3  2  1  9  7
4  1  5  5  6  3  9
5  0  4  8  2  1  8
Here the 1st, 2nd and last columns have been shuffled, and all others remain as they were.
The following code allows me to shuffle all columns:
import numpy as np
df = np.random.random((6,6))
np.random.shuffle(df)  # shuffles in place along the first axis (rows)
Yet after many attempts, I have been unable to modify this to shuffle only a few randomly selected columns. Any advice would be greatly appreciated. Thank you.
Assuming this input example:
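A minimal sketch of such an input, assuming a pandas DataFrame of random single-digit integers. The seed and the names `rng` and `df` are illustrative choices, not taken from the question:

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)

# 6x6 frame of integers in [0, 10) with default 0..5 row and column labels.
df = pd.DataFrame(rng.integers(0, 10, size=(6, 6)))
print(df)
```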
I would use:
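One possible approach, sketched for a 6x6 pandas DataFrame of random integers (rebuilt here so the snippet runs on its own): pick the columns with `DataFrame.sample(axis=1)`, then shuffle each picked column independently. The number of columns (`n=3`), the seeds, and the variable names are hypothetical choices:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(6, 6)))
original = df.copy()  # kept only to compare before and after

# Randomly pick 3 of the 6 column labels.
cols = df.sample(n=3, axis=1, random_state=1).columns

# Shuffle each selected column on its own; rng.permutation returns
# a shuffled copy, which is assigned back by position.
for col in cols:
    df[col] = rng.permutation(df[col].to_numpy())

print(df)
```

Because the shuffled values are assigned back as plain arrays, the row index is left untouched and only the order of values within the chosen columns changes.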
You can even vectorize the last step with numpy's Generator.permuted if efficiency is important:
Example output:
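A minimal sketch of the vectorized variant, whose printed frame gives one possible output. It assumes NumPy 1.20 or later (where `Generator.permuted` is available) and a 6x6 integer DataFrame; the seed, the choice of 3 columns, and the variable names are illustrative:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(6, 6)))
original = df.copy()  # kept only to compare before and after

# Choose 3 distinct column labels at random.
cols = rng.choice(df.columns, size=3, replace=False).tolist()

# permuted(..., axis=0) shuffles down the rows independently for each
# column, so all selected columns are handled in a single call.
df[cols] = rng.permuted(df[cols].to_numpy(), axis=0)

print(df)
```

Unlike `np.random.shuffle`, which reorders whole rows together, `permuted` draws a separate permutation for every column of the array it is given.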