I have a google sheet which takes form responses. The bulk of these responses are "which project is it" and "which email addresses are involved ?" The responses for email address will be in a comma separated string. The responses for project could be random
[Projects and email addresses]

|Who are the users you are sending it to ? | Which Project is this for ? |
| [email protected], [email protected], [email protected] |Project 1
| [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected] | Project 2 |
[query shows what is wrong]

I would like to be able to use a query to sort all projects and users so that all project numbers are combined, but all users are unique (with duplicates removed)
I have tried many queries and looked at multiple suggestions from StackOverflow, the best of which only gives me the first email address from the selection, and also doesn't combine the projects.
The formula for this is
=TRANSPOSE(QUERY({A1:A50, ARRAYFORMULA(SPLIT(B1:B50, ",", true, true)) },"Select Col1, Count(Col2), Col2 WHERE Col1 Is Not Null GROUP BY Col1, Col2 LABEL Count(Col2) ''"))
From my understanding - it must obviously be wrong
The first part in curly braces {A1:A50, ARRAYFORMULA(SPLIT(B1:B50, ",", true, true)) } is the array to look at (which splits the comma separated values in column B and removes white space).
The second part is the selection where I look at column1 (project) count column 2 (email addresses)
The expected outcome would be
Where am I going wrong ?? Please help

Try this fomula; you need one range reference!.
Now, let's break it down:
LAMBDA Functions:
LAMBDAfunctions. These are user-defined functions that make your formula more modular and easier to understand.Outermost
LAMBDAFunction:LAMBDAfunction takes one argumentr. It's essentially a function wrapper that you can call with some data later.Innermost
LAMBDAFunction:LAMBDAfunction takes one argumentd.QUERY Function:
QUERYfunction that extracts non-empty rows from columns A and B using the condition "Where Col1 <> ''".SPLIT and TOCOL Functions:
QUERYis then split and combined into a single column usingSPLITandTOCOLfunctions, creating a list of strings in the format "ProjectName^Email1, Email2, ...".QUERY Function (Again):
QUERYfunction is applied to this split data, filtering out rows where the second column (Col2) is not empty.Intermediate Result:
Outer LAMBDA Function (Continued):
LAMBDAfunction with the filtered data, which is passed asd.UNIQUE, SORT, and TRANSPOSE:
LAMBDA, the formula performs several operations.INDEX(r,,1)extracts the first column from the inputr.UNIQUE,SORT, andTRANSPOSEare used to create a sorted list of unique project names from the first column.BYCOL Function:
BYCOLfunction is applied to the sorted unique project names. It groups the project names and performs an operation on each group.BYCOLfunction, an innerLAMBDAfunction is used. It takes one argumentf, representing a unique project name.FILTERfunction is applied to the original data (r), extracting email addresses from the second column where the project name in the first column matches the current unique project namef.So, this formula essentially takes a dataset containing project names and associated emails, filters it, processes it, and returns a sorted list of unique project names along with their associated email addresses.
I hope this has been helpful :)