Make combinations using Excel queries

I have six numbers in a column, in A1:A6. I need to generate all possible combinations (not permutations) of three numbers using Excel queries. Please help.


Best answer

If by "excel queries" you mean Power Query, here is a script that will do that:

Algorithm

  • Read in the original table
  • Add two custom columns, each cell of which contains the full original table
  • Expand the two custom columns
  • Sort each row of the three column table
  • Remove the duplicates

=> 56 combinations (allowing repeats of numbers in the same combination)
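
As a cross-check outside Power Query, the same steps can be sketched in Python. This is only an illustration: it assumes the column holds the values 1 to 6 (the actual sample data was in a screenshot), and the function name is made up for the example.

```python
from itertools import product

def combos_by_sort_and_dedupe(values, k):
    """Build every ordered k-tuple, sort each one, then drop duplicates."""
    seen = set()
    result = []
    for row in product(values, repeat=k):  # the "expand" steps: all ordered rows
        key = tuple(sorted(row))           # sort each row
        if key not in seen:                # remove the duplicates
            seen.add(key)
            result.append(key)
    return result

values = [1, 2, 3, 4, 5, 6]  # assumed sample data
rows = combos_by_sort_and_dedupe(values, 3)
print(len(rows))
```

With six values and three picks, 216 ordered rows are built and 56 remain after deduplication.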

Note that the code below could be generated from the Power Query user interface.

let

//Read in table and set data type
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"List", Int64.Type}}),

//Add two more columns where the rows of each column = the full original table
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each #"Changed Type"),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each #"Changed Type"),

//Expand each of the two custom columns
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom1", "Custom", {"List"}, {"Custom.List"}),
    #"Expanded Custom.1" = Table.ExpandTableColumn(#"Expanded Custom", "Custom.1", {"List"}, {"Custom.1.List"}),

//Add a column which contains a Sorted List of the three data columns
    #"Added Custom2" = Table.AddColumn(#"Expanded Custom.1", "Sorted Row", each List.Sort({[List],[Custom.List],[Custom.1.List]})),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"List", "Custom.List", "Custom.1.List"}),

//Remove the duplicates in the list => 56 items
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns"),

//expand the list into a delimited list
    #"Added Custom3" = Table.AddColumn(#"Removed Duplicates", "Custom", 
        each Text.Combine(List.Transform([Sorted Row],each Text.From(_)),"~"), type text),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom3",{"Sorted Row"}),

//split the column by the delimiter
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns1", "Custom", 
        Splitter.SplitTextByDelimiter("~", QuoteStyle.Csv), {"Custom.1", "Custom.2", "Custom.3"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{
        {"Custom.1", Int64.Type}, {"Custom.2", Int64.Type}, {"Custom.3", Int64.Type}})
in
    #"Changed Type1"

Original Data
[image not available]

Combinations with Repeated Values
[image not available]

Edit to generalize the code
Please note that this code is inefficient for large sets: it first generates every ordered selection (permutations with repetition), then eliminates the redundant rows by sorting each row and removing the duplicates.
For selections of 7 items out of a group of 8, that is 8^7 = 2,097,152 rows, which Power Query in Excel does not handle efficiently, so it takes a while to run.
The same code running in Power BI Desktop finishes in one or two seconds.
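
To put numbers on that, here is a rough Python sketch comparing how many rows the sort-and-dedupe approach builds with how many it keeps. It also shows one way to skip the intermediate rows entirely, using itertools.combinations_with_replacement. That is a different technique from the M code, not a translation of it, and the helper name and the sample values 1 to 8 are assumptions.

```python
from itertools import combinations_with_replacement
from math import comb

def direct_combos(values, k):
    """Generate combinations with repetition directly, with no dedupe step."""
    return list(combinations_with_replacement(sorted(values), k))

n, k = 8, 7
tuples_generated = n ** k        # rows built before sorting and deduplication
rows_kept = comb(n + k - 1, k)   # rows that survive deduplication

direct = direct_combos(range(1, n + 1), k)
print(tuples_generated, rows_kept, len(direct))
```

For 7 picks out of 8 values, over two million rows are built to keep only 3,432 of them.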

I have written it as a custom function named fnCombos.

Custom Function
Rename the query to fnCombos

(l as list, num as number)=>

let
  t = Table.FromColumns({l}),

  #"Permutations" = List.Last(
    List.Generate(
      ()=>[T=t, idx=0],
      each [idx] < num,
      each [T = Table.AddColumn([T],Text.From([idx]), each l), idx=[idx]+1],
      each [T])),

  #"Expand Columns" = List.Accumulate({"0"..Text.From(num-2)},#"Permutations", (state, current)=>
      Table.ExpandListColumn(state,current)),

  //Sort each row
  //  Then remove duplicates
  #"Sort Rows and DeDupe" = List.Distinct(Table.TransformRows(#"Expand Columns", (r) => 
      Record.FromList(List.Sort(Record.FieldValues(r)),Record.FieldNames(r)))),

    #"Converted to Table" = Table.FromList(#"Sort Rows and DeDupe", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", Table.ColumnNames(#"Expand Columns"))
  
in
    #"Expanded Column1"

Main Query

let

//Read in table and set data type
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"List", Int64.Type}}),

//Call custom function to create the combinations
    #"Combos" = fnCombos(#"Changed Type"[List],7)

in #"Combos"

Answer

To get the 56 rows:

=LET(list,A2:A7,n,ROWS(list),draw,B2,nTot,n^draw,all,SEQUENCE(nTot,1,0),pick,BASE(all,n,draw),allComb,MAKEARRAY(nTot,draw,LAMBDA(r,c,INDEX(list,MID(INDEX(pick,r),c,1)+1))),UNIQUE(DROP(REDUCE("",all,LAMBDA(a,c,VSTACK(a,SORT(INDEX(allComb,c+1,0),,,TRUE)))),1)))

Improved formula:

=LET(list,DROP(TOCOL(A:A,1),1),
n,ROWS(list),
draw,B2,
nTot,n^draw,
all,SEQUENCE(nTot,1,0),
allComb,MAKEARRAY(nTot,draw,LAMBDA(r,c,INDEX(list,MOD(QUOTIENT(INDEX(all,r),n^(draw-c)),n)+1))),
UNIQUE(DROP(REDUCE("",all,LAMBDA(a,c,VSTACK(a,SORT(INDEX(allComb,c+1,0),,,TRUE)))),1)))
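
The MOD/QUOTIENT part of the improved formula extracts the base-n digits of each row number arithmetically, where the first formula read them from the text returned by BASE. A minimal Python sketch of that digit extraction follows. The function name is hypothetical, and the digits are 0-based, so the +1 the formula needs for INDEX is left out.

```python
def digits_base_n(value, n, width):
    """Return the `width` base-n digits of `value`, most significant first."""
    # Mirrors MOD(QUOTIENT(value, n^(draw-c)), n) for c = 1..draw
    return [(value // n ** (width - c)) % n for c in range(1, width + 1)]

print(digits_base_n(7, 6, 3))    # row number 7 with a 6-item list, 3 draws
print(digits_base_n(11, 12, 2))  # a 12-item list: the digit 11 stays numeric
```

Because the digits stay numeric, a list with more than 10 items is not a problem here, whereas BASE writes digits above 9 as letters, which the MID(...)+1 step cannot convert. That is presumably the motivation for the improved version.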

Note

OP is correct that there should be 56 rows. You can check this with the formula for combinations with repetition, C(n+r-1, r) = (n+r-1)! / (r! (n-1)!), which gives C(8,3) = 56 for n=6 and r=3,

or in Excel using

=COMBINA(6,3)
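
The count can also be cross-checked outside Excel. The sketch below is a Python illustration, not part of the original answer: it compares the closed-form count with a brute-force enumeration, and the name combina is chosen only to echo the Excel function.

```python
from itertools import combinations_with_replacement
from math import comb

def combina(n, k):
    """Number of combinations with repetition of k items chosen from n."""
    return comb(n + k - 1, k)

# Brute force: enumerate every combination with repetition and count them
brute_force = sum(1 for _ in combinations_with_replacement(range(6), 3))

print(combina(6, 3), brute_force)
```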

Answer

I reworked my answer to produce combinations instead of permutations. It is more verbose as a result, but I'll leave it up.

To write this out with a formula, one could use:

Formula in C2:

=LET(x,A2:A7,y,DROP(REDUCE(0,x,LAMBDA(a,b,VSTACK(a,IFERROR(HSTACK(b,DROP(REDUCE(0,x,LAMBDA(c,d,VSTACK(c,IFERROR(HSTACK(d,x),d)))),1)),b)))),1),UNIQUE(DROP(REDUCE(0,SEQUENCE(ROWS(y)),LAMBDA(a,b,VSTACK(a,SORT(INDEX(y,b),,,1)))),1)))

Answer

Alternative:

=LET(
    ζ,A2:A7,
    ξ,REPT(" ",9),
    κ,0+TRIM(MID(TOCOL(TOROW(ζ&ξ&TOROW(ζ))&ξ&ζ),9*{0,1,2}+1,9)),
    UNIQUE(MAKEARRAY(ROWS(κ),3,LAMBDA(α,β,SMALL(INDEX(κ,α,),β))))
)

This is currently hard-coded for combinations of 3 numbers, though it can be made dynamic if desired.

Large numbers in the range might require a larger second parameter for REPT.

Answer

Here is another alternative using Excel functions. It uses a recursive function to generate the index positions of the input data. This approach produces exactly the set of combinations we need, without generating extra rows that then have to be removed as duplicates or invalid sets, so fewer calculation steps are required.

Enter the following user-defined LAMBDA function in the Name Manager under the name NEXT_ROW (see the formula at the end if you don't want to use the Name Manager):

=LAMBDA(x,m,n,i, IF(i=0, x, LET(s, SEQUENCE(,n), idx, XMATCH(m, IF(i<n, 
 IF(s>i, m+1, x+1), x+1),-1,-1), if(idx=i, x + N(s=i), 
 NEXT_ROW(IF(s=i, INDEX(x+1,idx), x),m,n,i-1)))))

UPDATE: The function can be simplified as follows:

=LAMBDA(x,m,n,i, IF(i=0, x, LET(s, SEQUENCE(,n), y, x+1, idx, XMATCH(m, IF(i<n, 
 IF(s>i,m+1,y),y),-1,-1),NEXT_ROW(IF(s=i,INDEX(y,idx),x),m,n,IF(idx=i,0,i-1)))))

Where:

  • x, the array of index positions, with shape 1 x n.
  • m, the total number of input values to distribute over the n positions.
  • n, the total number of index positions to fill.
  • i, the first index position to evaluate. On every iteration we reduce the index position by one. When i=0 the recursion ends and returns the array for the next row. We work from right to left, so we start with i=n.

NEXT_ROW returns the index positions of the next row, based on the input array x. For example, with the first row in the range A1:C1 and the sample data from the question, where m=6 and n=3:

1   1   1
1   1   2 <- NEXT_ROW(A1:C1,6,3,3)
1   1   3 <- NEXT_ROW(A2:C2,6,3,3)
1   1   4 <- NEXT_ROW(A3:C3,6,3,3)
1   1   5 <- NEXT_ROW(A4:C4,6,3,3)
1   1   6 <- NEXT_ROW(A5:C5,6,3,3)
1   2   2 <- NEXT_ROW(A6:C6,6,3,3)
...
6   6   6 <- NEXT_ROW(A55:C55,6,3,3)

which corresponds to the index positions we need.
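
The successor rule behind that table can be sketched in Python. This is not a translation of the LAMBDA above, only a loop-based illustration that reproduces the same rows: find the rightmost position that is still below m, increase it by one, and copy the new value into every position to its right. The function name is borrowed from the formula for readability.

```python
def next_row(x, m):
    """Return the next non-decreasing index row after x, or None if x is the last one."""
    x = list(x)
    i = len(x) - 1
    while i >= 0 and x[i] == m:  # skip positions already at the maximum
        i -= 1
    if i < 0:
        return None              # x was m, m, ..., m: nothing comes after it
    new_value = x[i] + 1
    # Keep the prefix, then fill position i and everything to its right
    return x[:i] + [new_value] * (len(x) - i)

print(next_row([1, 1, 1], 6))
print(next_row([1, 1, 6], 6))
```

The second call shows the carry step from the table, where 1 1 6 is followed by 1 2 2.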

Note: If you are using Excel for the web, which doesn't provide access to the Name Manager, you can install the Advanced Formula Environment add-in and define NEXT_ROW there.

Once we have the sequence of all index positions, generating the final result is straightforward:

=LET(in, A2:A7, m, ROWS(in), n,3, cnts, SEQUENCE(COMBINA(m,n)-1),
 idx, REDUCE(SEQUENCE(,n,1,0), cnts, LAMBDA(ac,i,
 VSTACK(ac, NEXT_ROW(IF(i=1, ac, TAKE(ac,-1)),m,n,n)))),INDEX(in, idx))

We use the REDUCE/VSTACK pattern to generate the entire set of index positions. Check my answer to the question: how to transform a table in Excel from vertical to horizontal but with different length.

We initialize the accumulator of REDUCE with the first set of index positions, a constant array of ones: SEQUENCE(,n,1,0). That is why we need one iteration fewer than the total number of combinations with replacement, COMBINA(m,n).
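
The accumulate-and-stack idea can be illustrated with a short Python sketch: start from a row of ones, apply the successor step COMBINA(m,n)-1 times, then map the 1-based indexes back to the input values. The function names mirror the formulas but are otherwise assumptions, and the sample values are made up.

```python
from math import comb

def next_row(x, m):
    """Successor of a non-decreasing index row (1-based indexes, maximum m)."""
    i = len(x) - 1
    while x[i] == m:
        i -= 1
    return x[:i] + [x[i] + 1] * (len(x) - i)

def combina_set(values, n):
    """All combinations with repetition of n items from values, in index order."""
    m = len(values)
    rows = [[1] * n]                            # like SEQUENCE(,n,1,0)
    for _ in range(comb(m + n - 1, n) - 1):     # one iteration fewer than the total
        rows.append(next_row(rows[-1], m))      # like VSTACK(ac, NEXT_ROW(TAKE(ac,-1),...))
    return [[values[i - 1] for i in row] for row in rows]  # like INDEX(in, idx)

result = combina_set([10, 20, 30, 40, 50, 60], 3)  # hypothetical input values
print(len(result), result[0], result[-1])
```

Every row is produced exactly once, so no sorting or deduplication pass is needed afterwards.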

You can encapsulate the entire process in a new user LAMBDA function, COMBINA_SET, and add it to the Name Manager for future reuse:

=LAMBDA(x, n, LET(y, TOCOL(x), m, ROWS(y), IF(AND(n=1,m=1), x,
 LET(cnts, SEQUENCE(COMBINA(m,n)-1), idx, REDUCE(SEQUENCE(,n,1,0), cnts,
 LAMBDA(ac,i, VSTACK(ac, NEXT_ROW(IF(i=1, ac, TAKE(ac,-1)),m,n,n)))),
 INDEX(y, idx)))))

Then you can invoke it as follows:

COMBINA_SET(A2:A7,3)

This version handles two additional scenarios for a more general case:

  1. The special case n=1 and m=1 is treated separately: no recursion is needed for it, and the previous formula produced an error in this case.
  2. The input argument x can be a more general array, because TOCOL(x) converts it to a single column.

Note: You still need to create NEXT_ROW, because you cannot define a recursive function inside a LET statement. You can work around this by following the suggestion from this post: LAMBDA Formulaic Recursion: It’s All About ME! (credit to @JosWoolley for sharing this link):

=LAMBDA(x, n, LET(NEXT_SET, LAMBDA(ME,x,m,n,i, IF(i=0, x, LET(s, SEQUENCE(,n), 
 y, x+1, idx, XMATCH(m, IF(i<n, IF(s>i,m+1,y),y),-1,-1),ME(ME, IF(s=i,
  INDEX(y,idx),x),m,n,IF(idx=i,0,i-1))))), y, TOCOL(x), m, ROWS(y), 
 IF(AND(n=1,m=1), x, LET(cnts, SEQUENCE(COMBINA(m,n)-1), 
 idx, REDUCE(SEQUENCE(,n,1,0), cnts, LAMBDA(ac,i, VSTACK(ac, 
 NEXT_SET(NEXT_SET, IF(i=1, ac, TAKE(ac,-1)),m,n,n)))),INDEX(y, idx)))))
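
The ME trick used in the formula above is a general pattern: an anonymous function cannot refer to itself by name, so it receives itself as its first argument and calls that argument to recurse. Here is a minimal Python sketch of the same pattern, using a factorial instead of NEXT_SET to keep it short. All names are illustrative.

```python
def make_factorial():
    # `step` never refers to its own name; it recurses through `me`,
    # just as NEXT_SET recurses through ME(ME, ...)
    step = lambda me, k: 1 if k == 0 else k * me(me, k - 1)
    # The outer call passes the function to itself, like NEXT_SET(NEXT_SET, ...)
    return lambda k: step(step, k)

fact = make_factorial()
print(fact(5))
```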

If you don't want to use the Name Manager at all, the ME workaround above lets you put everything in a single formula:

=LET(A, A2:A7,n,3, COMBINA_SET, LAMBDA(x, n, LET(NEXT_SET, LAMBDA(ME,x,m,n,i, 
 IF(i=0, x, LET(s, SEQUENCE(,n), y, x+1, 
 idx, XMATCH(m, IF(i<n, IF(s>i,m+1,y),y),-1,-1),ME(ME, IF(s=i,
   INDEX(y,idx),x),m,n,IF(idx=i,0,i-1))))), y, TOCOL(x), m, ROWS(y), 
 IF(AND(n=1,m=1), x, LET(cnts, SEQUENCE(COMBINA(m,n)-1), 
  idx, REDUCE(SEQUENCE(,n,1,0), cnts, LAMBDA(ac,i, VSTACK(ac, 
  NEXT_SET(NEXT_SET, IF(i=1, ac, TAKE(ac,-1)),m,n,n)))),INDEX(y, idx))))),
 COMBINA_SET(A,n))