Sunday, 4 February 2018

Arrays Part 1 - the SUMPODUCT function

Introduction

In the last post we used the COUNTIFS function to dynamically calculate the data for Table 1, and a simple SUM function in Table 2 for the weekly total of each species. In this post, we are going to use an unusual approach to achieve the same result, and look at how the SUMPRODUCT function can be adapted to a variety of unintended uses, based on array calculations.

The SUMPRODUCT function


The SUMPRODUCT function multiplies the value of each line of the first column by the value in the corresponding line in the second column, and then sums the result:



In this example, the calculation is the PRICE column multiplied by the QUANTITY column; SUMPRODUCT makes this calculation for each row in the selected range, and creates a virtual array in RAM of the results which is then summed as the final result.  

The formulas used are to the right of the results in the example above. The first example is the basic PRODUCT and SUM result, and the second introduces a further multiplication to include 15% tax. Ranges D4:D7 and E4:E7 are the arrays that these examples are working with (in the second example, the tax calculation is a static value, not an array). Any number of arrays can be included, but will be limited by the amount of RAM that is available. In the example below, I've introduced a criteria - only calculate the result if the value in array D4:D7 = Robin (referenced to cell G13):


Notice that after the parenthesis for the first array, there is a multiplication operator, whereas in the first example, the arrays were separated by a comma. The first part of this formula is a logical test on each row of the column D array: Does the value equal Robin? This will generate a Boolean value of either TRUE or FALSE, which we are unable to use. The multiplication operation against the PRICE array forces Excel to convert the values in the SPECIES array to either 1 (TRUE) or 0 (FALSE), which can then be used in the subsequent calculations. This is widely referred to as a "coerced" value.

Although the function naturally works with arrays of data, it is not and array formula. The array formula equivalent can be seen below:



Coercing Boolean values 

Where the result of a logical test of TRUE or FALSE is going to be used in a calculation, whether alone or within a wider formula, the value needs to be coerced to produce 1 or 0. In the examples below, we can see two methods of forcing Excel to provide a numerical rather than Boolean value:


In the LOGICAL TEST column is a simple comparison between each row of REFERENCE 1 and REFERENCE 2, with the Boolean result in the next column. In the COERCED TEST column are two methods of coercing the Boolean value: the first method precedes the logical test with two minus operators "--"; the second simply multiplies the result by 1. The results are in the COERCED OUTPUT column. 

In the example below, the formula applied to the LOGICAL TEST column can be seen below the tables (note the absolute and mixed cell references). Because there is a multiplication operator following the first array, the values are immediately coerced to 1 or 0 for all three arrays, and the results are then multiplied by the AWARD value of  £250. In the right table, you can see the way in which these arrays are calculated. There is only one row where all of the results of the logical test are TRUE in the left table; in the right table, this is the only row where the multiplication evaluates to 1. The virtual array that is created by the multiplication sequence can be seen in the RAM ARRAY column. The final calculation is a multiplication of the the virtual array by 250 which results in one row evaluating to 250: (1*1*1*250=250). All the other multiplications in the virtual array result in 0. For example, the calculation in the first row is (1*0*0*250=0):





We can use logical testing and coerced values to adapt the SUMPRODUCT function to perform as a COUNTIFS function in Table 1 and a SUM function in Table 2.

Table1

In Table 1, we previously used the COUNTIFS function to dynamically calculate the values for each day of week and each species. With SUMPRODUCT, we can use either version of these formulas which are also dynamic:




The difference between the two versions is to show the methods used to coerce Boolean values ( "--" and/ or "*"). It isn't actually necessary to use the "--" in the second formula - I've included it simply to show where it would fit into a formula that does not include a subsequent mathematical operator, as in the example we saw earlier:



The references are highlighted by border colour for each version, and show where the references are absolute and mixed. In this example, I've added a row that contains the dates for the current week, which forms part of the references used. Note that the far right cell is equal to the far left cell, then working from the penultimate right cell, a simple subtraction reduces that date by one day as it's dragged back to the left.

Table 2

With table 2, the SUMPRODUCT function has been stripped down to a single array, which will simply produce a sum of that array:



Note that the array is horizontal - it's the entire row of Table 1 per species - and because there is no logical test, there is no value to coerce. 


Side by side the two tables look like this:



The SUMPRODUCT function can by used in a variety ways that it was never intended for due to the way it is able to work with arrays of data, without being and array formula. Understanding the way in which the arrays of data are handled, and how to coerce logical test results, opens the door to creating array formulas that can perform multiple criteria look-ups across two dimensional arrays, find multiple instances of the same criteria in a range, and much more.

I hope you enjoyed this post - if you have any questions or need any help, please get in touch.

In the next two posts, we're going to look at the INDEX and MATCH functions, how we can use them as a highly flexible look-up function and how to use the coerced values of logical tests to specify more that one look-up criteria! 





No comments:

Post a Comment