I have a large dataset (>79000 rows). My dataset looks like this:
| Col1 | Sampl1 | Sampl2 | Sampl3 |
|---|---|---|---|
| sp1 | 0.1 | 0.5 | 1.2 |
| sp2 | 1 | 2.1 | 4 |
| sp3 | 2 | 3 | 0.75 |
| sp1 | 1 | 4 | 1 |
I want to combine the duplicate columns in the species column while also summing the values in the samples' columns so my dataset will look like this:
| Col1 | Sampl1 | Sampl2 | Sampl3 |
|---|---|---|---|
| sp1 | 1.1 | 4.5 | 2.2 |
| sp2 | 1 | 2.1 | 4 |
| sp3 | 2 | 3 | 0.75 |
I have tried using the consolidate function in excel but it did not produce anything after loading (I would select my references and the "sum" function, I waited, and after it loaded nothing appeared in the new sheet).


As commented above, this can be performed more elegantly using
Power Querywithout any issue. May be there can be more eloquent solutions to this, but i will personally use Power Query here when dealing with larger datasets where i need to Group by duplicates and sum the respective values. Note that this is a one time approach, next time when you add new data to your source data, you just need to refresh the table imported using Power Query.So that said, you can follow the steps to perform the task using Power Query
• Select some cell in your Data Table,
• Data Tab => Get&Transform => From Table/Range,
• When the PQ Editor opens: Home => Advanced Editor,
• Make note of all the Tables Names,
• Paste the M Code below in place of what you see.
Approach One:
Or Approach Two:
You can use either of the one, but using the Second One will be more succinct.
• Change the Table name if required before importing it back into Excel.
• From Home Tab => Close & Load => Close & Load To
• When importing, you can either select Existing Sheet with the cell reference you want to place the table or you can simply click on NewSheet
Approach One --> Screenshots
Approach Two --> Screenshots
An Excel Formula alternative approach using New MS365 functions. -->
REDUCE UNIQUE BYCOL LAMBDA VSTACK HSTACK
• Formula used in cell
F1The formula approach also works fine for me, Tested with 86017 rows of data and here is a sample workbook.