How to create a factorial data frame in pandas?


How can I create a pandas data frame using all possible combinations of factors?

factor1 = ['a','b']
factor2 = ['x','y','z']
factor3 = [1, 2]
val = 0

This is what I'm aiming for:

   factor1 factor2  factor3  val
      a       x        1      0
      a       y        1      0
      a       z        1      0
      a       x        2      0
      a       y        2      0
      a       z        2      0   
      b       x        1      0
      b       y        1      0
      b       z        1      0
      b       x        2      0
      b       y        2      0
      b       z        2      0

With such a small number of factors this could be done manually, but as the number increases it would be practical to use a slightly more automated way to construct this.


There are 2 best solutions below

0
On BEST ANSWER

This is what list comprehensions are for.

factor1 = ['a','b']
factor2 = ['x','y','z']
factor3 = [1, 2]
val = 0

# The last loop (factor3) varies fastest
combs = [ (f1, f2, f3, val)
    for f1 in factor1
    for f2 in factor2
    for f3 in factor3 ]
# [ ('a', 'x', 1, 0),
#   ('a', 'x', 2, 0),
#   ('a', 'y', 1, 0),
#   ('a', 'y', 2, 0),
#   ... etc

Replace (f1, f2, f3, val) with whatever you want to use to print the table, or print it directly from the list of tuples.

Mathematically, this is known as the Cartesian product.
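
Since this is a Cartesian product, the standard library's itertools.product can build the same list without writing one loop per factor. A minimal sketch, reusing the factor lists from the question:

```python
from itertools import product

factor1 = ['a', 'b']
factor2 = ['x', 'y', 'z']
factor3 = [1, 2]
val = 0

# product() yields one tuple per combination; the last iterable varies fastest,
# which matches the order of the nested comprehension
combs = [(f1, f2, f3, val) for f1, f2, f3 in product(factor1, factor2, factor3)]
```

This should scale more comfortably as the number of factors grows, because adding a factor means passing one more argument rather than adding another loop.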

0
On

Since I want a pandas data frame I actually created a list of dictionaries (in order to have column names):

import pandas as pd

combs = [ {'factor1':f1, 'factor2':f2, 'factor3':f3, 'val':val}
    for f1 in factor1
    for f2 in factor2
    for f3 in factor3 ]
df = pd.DataFrame(combs)
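
For an arbitrary number of factors, one possible variation is to keep them in a dict and let itertools.product do the looping. This is only a sketch: the dict name factors is an assumption introduced here for illustration, and it relies on dicts preserving insertion order (Python 3.7+).

```python
from itertools import product
import pandas as pd

# Hypothetical container: column name -> list of levels
factors = {
    'factor1': ['a', 'b'],
    'factor2': ['x', 'y', 'z'],
    'factor3': [1, 2],
}
val = 0

# One row per combination; column names are taken from the dict keys
df = pd.DataFrame(list(product(*factors.values())), columns=list(factors))
df['val'] = val  # constant column, broadcast to every row
```

The rows come out with the last factor varying fastest. If the exact row order from the question matters, sorting with df.sort_values(['factor1', 'factor3', 'factor2']) should reproduce it.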