Array formula with if conditions and substraction

142 Views Asked by At

I am searching for a formula solution that would summarize hours based on two conditions (date - column a, activity - column b). More precise, I want to summarize hours of sleep each day with array formula that would include whole column range.

The data looks like this:

enter image description here

When I define exact range the formula works.

{=IF(A2:A10=I$6;IF(B2:B10="Sleep";(D2:D10)-(C2:C10);0);0)}

But when I try to include whole column it returns 0.

{=IF(A:A=I$6;IF(B:B="Sleep";(D:D)-(C:C);0);0)}

Thank you!

3

There are 3 best solutions below

8
On BEST ANSWER

You can use the entire column using an IF statement inside SUM or SUMPRODUCT. It ensures the time differences is only carried out on valid rows of the input data:

=SUM(IF(($A:$A=F1) * ($B:$B="Sleep"), ($D:$D + ($C:$C > $D:$D) - $C:$C),0))

Then just extend the formula to the right (notice the $-notation).

Using SUM or SUMPRODUCT directly produces an error (#VALUE!), for example:

=SUM(($A:$A=F1) * ($B:$B="Sleep") * ($D:$D + ($C:$C > $D:$D) - $C:$C))

because it doesn't filter first for valid rows where the subtract doesn't produce an error.

Here is the output:

excel output

Note: You need to try for your excel version, for O365 it works, but it has to be tested for an older version.

The calculation for time differences (the parenthesis is required):

$D:$D + ($C:$C > $D:$D) - $C:$C

Ensures that when the end date represents a time from the next day as it is in row 5, it adds 1, to consider the next day. You can achieve the same using another IF (longer but may be easier to understand):

IF($D:$D > $C:$C, $D:$D - $C:$C, (1+$D:$D) - $C:$C)

Performance: Please keep in mind @JosWoolley comments below, about performance. Usually, indicating the entire column in the formula instead of a specific range, forces Excel to check all rows (current maximum: 1,048,576). No significant impact for invoking the formula just one time, but for multiple invocations, it will be a significant impact on performance or Excel can even not respond.

0
On

Take a look at this link for some little-known insight from Microsoft. Microsoft deliberately prevents the use of entire columns in some formulas that internally use arrays:

Excel Limitations

In particular, see this paragraph.

1
On

If you designate your data as an Excel table then you could use structured referencing, as in the example below screenshot illustrating SUMPRODUCT solution with structured references

=SUMPRODUCT(((Schedule[End]+(Schedule[Start]>Schedule[End]))-Schedule[Start])*(Schedule[Activity]="Sleep")*(Schedule[Date]=I6))

but you should recognise that the organisation of your data does not permit the (by-date) analysis you require, e.g. most of the sleep attributable to Jan 22nd is properly attributable to Jan 23rd but that can't be reflected in the summary, as you have treated the end time on Jan 22nd as being 16 hours before the start time...

(cells I7:K7 have the custom number format [hh]:mm)