Summing a binary truth list

D

Del Cotter

For some reason I can't figure this one out. What's the most elegant way
to sum up a list of numbers according to a binary truth table? To start
with, here's an example of the raw data:

C B A Combo Numbers
N N N None 42
N N Y A only 23
N Y N B only 16
N Y Y A & B 26
Y N N C only 20
Y N Y A & C 11
Y Y N B & C 10
Y Y Y All 51

But now I want to add up all the rows in which, e.g. B has any part,
like so:

C B A Combo Numbers
N N Y A any 23+26+11+51=111
N Y N B any 16+26+10+51=103
N Y Y A & B 26+51=77
Y N N C any 20+11+10+51=92
Y N Y A & C 11+51=62
Y Y N B & C 10+51=61
Y Y Y All 51

I thought this would be easy, but I can't get my head around it. I'm
willing to change Y/N to binary TRUE/FALSE or 1/0 if it will make an
elegant function, but what function will best read the list above to
make the list below?
 
E

Elkar

The SUMPRODUCT function should work well here:

For just A, use:

=SUMPRODUCT(--(A1:A7="Y"),E1:E7)

For A and B, use:

=SUMPRODUCT(--(A1:A7="Y"),--(B1:B7="Y"),E1:E7)

For all 3, use:

=SUMPRODUCT(--(A1:A7="Y"),--(B1:B7="Y"),--(C1:C7="Y"),E1:E7)

Adjust these accordingly for whatever combination you need.

HTH,
Elkar
 
B

Bob Phillips

B only

=SUMIF(B:B,"B",D:D)

which assumes that B's Ys are in column B, the values in column D

A&B

=SUMPRODUCT(--(B1:B100="Y"),--(C1:C100="Y"),D1:D100)
--
HTH

Bob

(there's no email, no snail mail, but somewhere should be gmail in my addy)
 
D

Del Cotter

B only

=SUMIF(B:B,"B",D:D)

which assumes that B's Ys are in column B, the values in column D

A&B

=SUMPRODUCT(--(B1:B100="Y"),--(C1:C100="Y"),D1:D100)

Yes, I'm quite capable of applying a custom solution by hand for each
row, using my eye and human judgment to decide which function applies to
which row. I think you and Elkar missed the point of "elegant".

SUMPRODUCT or some array function with curly brackets feels like it
should be the way to go, but I was surprised I wasn't able to see my way
toward such a function. The ideal winner would be trivially simple to
modify for a table of four columns of Yes/No, having sixteen rows, and
so forth.
 
B

Bob Phillips

Then I think you really need to explain what you mean by elegant as we are
not mind readers.

--
HTH

Bob

(there's no email, no snail mail, but somewhere should be gmail in my addy)
 
D

Del Cotter

Then I think you really need to explain what you mean by elegant as we are
not mind readers.

1) Having written the function for the first row, you should be able to
copy it down to the remaining six or seven, and have it work for those
rows too.

2) For bonus elegance points, a simple hand modification should make it
work for 2 columns * 4 rows, 4 columns * 16 rows, or 5 columns * 32
rows. But 1) is the feature I'm really looking for. If the solution
only works for 3 columns * 8 rows, then so be it.
 
P

Pete_UK

It is confusing that in your second table you are using N to indicate
"don't care" - I have assumed that you would leave it blank for this
and use N and Y for exact matches. I put your table including headings
in A1:E9, and used 1's and 0s instead of "Y" and "N". Then, allowing
space for you to make it 4 variables, I put C, B and A as headings in
A19:C19.

I made use of 3 helper columns as follows:

F20: =IF(A20="","(1)",IF(A20=0,"(A2:A9=0)","(A2:A9=1)"))
G20: =IF(B20="","(1)",IF(B20=0,"(B2:B9=0)","(B2:B9=1)"))
H20: =IF(C20="","(1)",IF(C20=0,"(C2:C9=0)","(C2:C9=1)"))

These make up the constituent parts of an SP formula, but before
setting that up I added this User-defined function:

Function eval(func As String)
Application.Volatile
eval = Evaluate(func)
End Function

Then in K20 I added this:

=eval("sumproduct("&F20&"*"&G20&"*"&H20&"*(E2:E9))")

I copied F20:K20 down a few rows and put some values in A20:C20
onwards - I got the following results:

0 <blank> 1 49
<blank> 1 <blank> 103
<blank> 1 1 77

so it seems to do its job. If you only wanted one formula then you can
substitute the formulae from the helper columns into the formula in
K20, but it would become difficult to maintain. If you have more
variables (and thus more rows), you would have to change the ranges in
F20:H20 (using Find & Replace would be easiest - change 9 to 17 etc)
as well as having a new formula in I20 and incorporating this into
K20.

I'm not sure if this is "elegant", but hope it helps.

Pete
 
D

Del Cotter

It is confusing that in your second table you are using N to indicate
"don't care" - I have assumed that you would leave it blank for this
and use N and Y for exact matches.

If you think that works, I'm happy to do it. One of the hardest parts of
asking this type of question is persuading people about the parts you
*aren't* inflexible about.
I made use of 3 helper columns as follows:

F20: =IF(A20="","(1)",IF(A20=0,"(A2:A9=0)","(A2:A9=1)"))
G20: =IF(B20="","(1)",IF(B20=0,"(B2:B9=0)","(B2:B9=1)"))
H20: =IF(C20="","(1)",IF(C20=0,"(C2:C9=0)","(C2:C9=1)"))

I think I can improve that a little if we allow the value 2 to be used
as the wild card instead of blank:

F20: =CHOOSE(A20+1,"(A2:A9=0)","(A2:A9=1)","(1)")
These make up the constituent parts of an SP formula, but before
setting that up I added this User-defined function:

Function eval(func As String)
Application.Volatile
eval = Evaluate(func)
End Function

I'm not wild about introducing VBA, but if it has to be done...
What's an SP formula? SUMPRODUCT?
Then in K20 I added this:

=eval("sumproduct("&F20&"*"&G20&"*"&H20&"*(E2:E9))")

I copied F20:K20 down a few rows and put some values in A20:C20
onwards - I got the following results:
I'm not sure if this is "elegant", but hope it helps.

The VBA and the large number of helper columns are troubling, but I was
wondering along the lines of whether a text function might help, so this
is useful. Thanks for the suggestion.

Actually, your point about blanks above has given me an idea... the
function =OR(ISBLANK(A2),A2=A1) is TRUE when A2 is blank, whether A1 is
TRUE or FALSE, which is just what I was looking for. Now, can I make use
of this?

I'll post a follow-up in a little bit. Thanks again for your help.
 
D

Del Cotter

It is confusing that in your second table you are using N to indicate
"don't care" - I have assumed that you would leave it blank for this
and use N and Y for exact matches. I put your table including headings
in A1:E9, and used 1's and 0s instead of "Y" and "N". Then, allowing
space for you to make it 4 variables, I put C, B and A as headings in
A19:C19.

Your point about blanks jogged me to change how I present the problem,
and inspired me to make the following function the judge of whether to
count a row:

=OR(ISBLANK(A$2),A$2=A11)

This returns TRUE if cell A2 matches A11, or if A2 is blank whatever the
value of A11, which is what I wanted. Three of those together gives me:

=AND(OR(ISBLANK(A$2),A$2=A11),
OR(ISBLANK(B$2),B$2=B11),
OR(ISBLANK(C$2),C$2=C11))

And I can multiply that by the data in that row via a helper column like
so:

1: A B C Sum
2: TRUE TRUE =SUM(E11:E18)

10: A B C Data Helper
11: FALSE FALSE FALSE 42 =D11*--AND(OR(ISBLANK(A$2),A$...
12: TRUE FALSE FALSE 23 =D12*--AND(OR(ISBLANK(A$2),A$...
13: FALSE TRUE FALSE 16 =D13*--AND(OR(ISBLANK(A$2),A$...
14: TRUE TRUE FALSE 26 =D14*--AND(OR(ISBLANK(A$2),A$...
15: FALSE FALSE TRUE 20 =D15*--AND(OR(ISBLANK(A$2),A$...
16: TRUE FALSE TRUE 11 =D16*--AND(OR(ISBLANK(A$2),A$...
17: FALSE TRUE TRUE 10 =D17*--AND(OR(ISBLANK(A$2),A$...
18: TRUE TRUE TRUE 51 =D18*--AND(OR(ISBLANK(A$2),A$...

Resulting in the following:

1: A B C Sum
2: TRUE TRUE 77

10: A B C Data Helper
11: FALSE FALSE FALSE 42 0
12: TRUE FALSE FALSE 23 0
13: FALSE TRUE FALSE 16 0
14: TRUE TRUE FALSE 26 26
15: FALSE FALSE TRUE 20 0
16: TRUE FALSE TRUE 11 0
17: FALSE TRUE TRUE 10 0
18: TRUE TRUE TRUE 51 51

This gives me the result for *one* row, yay! Now why can't I make an
array function that cycles through eight rows (and does not need eight
helper columns) to give me the following?

10: A B C Data
11: FALSE FALSE FALSE 42
12: TRUE FALSE FALSE 23
13: FALSE TRUE FALSE 16
14: TRUE TRUE FALSE 26
15: FALSE FALSE TRUE 20
16: TRUE FALSE TRUE 11
17: FALSE TRUE TRUE 10
18: TRUE TRUE TRUE 51

20: A B C Sum
21: 199
22: TRUE 111
23: TRUE 103
24: TRUE TRUE 77
25: TRUE 92
26: TRUE TRUE 62
27: TRUE TRUE 61
28: TRUE TRUE TRUE 51
 
P

Pete_UK

Well, it's a different approach. Seems more complex than my solution
to you, though, and as I pointed out you could combine the helper
columns into one composite formula if you wish.

One thing - I don't think you need the double unary minus in:

=D11*--AND(OR(ISBLANK(A$2),A$...

as the multiplication should coerce the TRUE/FALSE to 1 or 0.

Pete
 
D

Del Cotter

Well, it's a different approach. Seems more complex than my solution
to you, though, and as I pointed out you could combine the helper
columns into one composite formula if you wish.

In that case, you're really not going to like the solution I finally
came up with :)

I finally realised I was never going to get an array formula to work
properly with Excel Boolean functions, and that I would have to fake
them. So I changed the binary format again, to 1, 0, and blank, and
where I had had the OR() function, I used:

=SIGN( (A$11:A$18=A21)+ISBLANK(A21) )

instead. The sum acts as an OR, and the SIGN() function keeps the sum
from becoming more than 1.

Then, to mimic an AND(), I multiplied three such expressions together,

=SIGN( (A$11:A$18=A21)+ISBLANK(A21) ) *
SIGN( (B$11:B$18=B21)+ISBLANK(B21) ) *
SIGN( (C$11:C$18=C21)+ISBLANK(C21) )

Then-- and here at last comes the array bit-- I used TRANSPOSE, and
MMULT to multiply the binaries and the data:

=MMULT(TRANSPOSE(D$11:D$18),
SIGN( (A$11:A$18=A21)+ISBLANK(A21) ) *
SIGN( (B$11:B$18=B21)+ISBLANK(B21) ) *
SIGN( (C$11:C$18=C21)+ISBLANK(C21) ) )

The final spreadsheet looks like this:

10: A B C Data
11: 0 0 0 42
12: 1 0 0 23
13: 0 1 0 16
14: 1 1 0 26
15: 0 0 1 20
16: 1 0 1 11
17: 0 1 1 10
18: 1 1 1 51

20: A B C Sum
21: {=MMULT(TRANSPOSE(D$11:D$18)
, SIGN( (A$11:A$18=A21)+ISBLANK(A21) )
* SIGN( (B$11:B$18=B21)+ISBLANK(B21) )
* SIGN( (C$11:C$18=C21)+ISBLANK(C21) ) )
22: 1 {=MMULT(TRANSPOSE(D$11:D$18),
23: 1 {=MMULT(TRANSPOSE(D$11:D$18),
24: 1 1 {=MMULT(TRANSPOSE(D$11:D$18),
25: 1 {=MMULT(TRANSPOSE(D$11:D$18),
26: 1 1 {=MMULT(TRANSPOSE(D$11:D$18),
27: 1 1 {=MMULT(TRANSPOSE(D$11:D$18),
28: 1 1 1 {=MMULT(TRANSPOSE(D$11:D$18),

(the long expressions, except for cell E21, are truncated to save space
here, obviously)

and the calculated figures look like this:

A B C Data
0 0 0 42
1 0 0 23
0 1 0 16
1 1 0 26
0 0 1 20
1 0 1 11
0 1 1 10
1 1 1 51

A B C Sum
199
1 111
1 103
1 1 77
1 92
1 1 62
1 1 61
1 1 1 51

A B C Sum
0 0 0 42
0 0 65
0 0 58
0 107
0 0 62
0 96
0 88
199
One thing - I don't think you need the double unary minus in:

=D11*--AND(OR(ISBLANK(A$2),A$...

Thanks, and although it doesn't look as though I've used any of your
advice, I couldn't have got here without this discussion. I would
welcome any further advice on getting a more compact (but clear and
manageable) function than this one.
 
D

Del Cotter

Then-- and here at last comes the array bit-- I used TRANSPOSE, and
MMULT to multiply the binaries and the data:

=MMULT(TRANSPOSE(D$11:D$18),
SIGN( (A$11:A$18=A21)+ISBLANK(A21) ) *
SIGN( (B$11:B$18=B21)+ISBLANK(B21) ) *
SIGN( (C$11:C$18=C21)+ISBLANK(C21) ) )

Update: I used MMULT and TRANSPOSE because SUMPRODUCT wasn't working for
me before, but I just tried it now and it works!

=SUMPRODUCT((D$11:D$18),
SIGN( (A$11:A$18=A21)+ISBLANK(A21) ) *
SIGN( (B$11:B$18=B21)+ISBLANK(B21) ) *
SIGN( (C$11:C$18=C21)+ISBLANK(C21) ) )

I think it's because I'd cleaned out some redundant column arrays in the
meantime.
 
P

Pete_UK

Well, you've certainly taken it a bit further, Del !! (Not sure about
elegance, though <bg>)

I can't help wondering where the intial data comes from and if there
is some relationship between A B and C, along the lines of:

= constant + a*A + b*B + c*C

I can't spot any on a quick investigation, but if there is such a
relationship and you could unearth what it is, then this could give
rise to a different approach to the solution.

I imagine that this could be some flow-rate experiment, where A B and
C represent valves to change the flow, or something to do with heat
transfer affected by A B or C. Do you mind revealing what it is all
about?

Pete
 
D

Del Cotter

Well, you've certainly taken it a bit further, Del !! (Not sure about
elegance, though <bg>)

It's readable, maintainable and scalable, and it's got no helper cells
and no VBA. It'll do for my tastes. Latest improvement is removing the
asterisks in the SUMPRODUCT and replacing them with commas:

=SUMPRODUCT(D$11:D$18,
SIGN((A$11:A$18=A21)+ISBLANK(A21)),
SIGN((B$11:B$18=B21)+ISBLANK(B21)),
SIGN((C$11:C$18=C21)+ISBLANK(C21)))

The only thing I think I could do to improve it at this point is if I
could work out how to turn those three separate factors into one array
expression. Then, adding a fourth, fifth, etc. would be as trivial as
changing a range reference. It's a rotten shame that proper Boolean
functions break when you try to use them in Excel array functions, so
that you have to fake it with silly sums.
I can't help wondering where the intial data comes from and if there
is some relationship between A B and C, along the lines of:

= constant + a*A + b*B + c*C

It's just a Venn diagram of three intersecting circles: you're given the
eight combinations of A only; A and B but not C; A, B and C etc. and the
challenge is how you answer questions like "How many in circle A
total?", "How many in neither A nor B?" and so on. "total in A" is
A+A&BnotC+A&CnotB+AB&C; "neither in A nor B" is CnotBorA+none, etc. The
key was some sort of matrix but I always had trouble with matrices at
school, and I also have trouble with arrays in Excel, so this was a
struggle to work through.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top