How do you deal with multiple values in a single cell separated by comma?

83 Views Asked by At

Suppose you have a table that is setup like:

Student Fruits
Foo Apple,Banana
Bar Orange
Baz Lemon,Orange

What is your usual approach if you need to quantify/do analysis on Fruits?

One approach I've used is to create multiple helper columns, then split the values like so:

Student Fruits fruit1 fruit2
Foo Apple,Banana Apple Banana
Bar Orange Orange
Baz Lemon,Orange Lemon Orange

But I'm not sure if this is the best approach, as this leads to having both horizontally and vertically entered data. So if I need to, say find out the fruits for 'Foo' in a conventional way, I'd have to again restructure this so that each fruit is in its own row to make it "spreadsheet readable". It also doesn't translate too well if I need to create pivot tables.

Is there a better approach for this kind of task?

2

There are 2 best solutions below

0
TheMaster On BEST ANSWER

The data should preferably be normalized to the first-normal-form:

Student Fruits
Foo Apple
Foo Banana
Bar Orange
Baz Lemon
Baz Orange

You can try formulas like the one below to make the transformation.

=QUERY(
  WRAPROWS(
   FLATTEN(
    MAP(
     A1:A4,
     B1:B4,
     LAMBDA(a,b,
      TOROW(
       REDUCE(,SPLIT(b,","),
        LAMBDA(ac,cur,
          VSTACK(ac,{a,cur}))))))),2),
 "where Col1<>''")
0
z.. On

In these cases I usually restructure the data in a more spreadsheet-friendly format, using a formula along the lines of:

=ARRAYFORMULA(
   QUERY(
     SPLIT(TOCOL(A2:A4&"|"&SPLIT(B2:B4,",")),"|"),
     "where Col2 is not null"))

enter image description here

Depending on what exactly you're trying to do this may not always be needed as there are cases in which it's not difficult to work with the unstructured data.