Tushar Mehta wrote...
Oh, really?
HLOOKUP, VLOOKUP, LOOKUP return incorrect values in Excel
http://support.microsoft.com/default.aspx?scid=kb;en-us;181201
Which says:
"CAUSE
This behavior occurs when either of the following conditions is true:
· The range specified for the "table_array" argument (LOOKUP) or the
range specified for the "lookup_vector" argument (VLOOKUP and HLOOKUP)
is not sorted in ascending order.
-or-
· Number formatting is applied to the range that is hiding the
underlying values."
With respect to the first point:
The semantic issue here is that the table range isn't sorted. This is a
problem in all languages that provide binary search against arbitrary
arrays. For binary search to work, the arrays must be sorted, usually
in ascending order. If you feed binary search an unsorted array, you do
have GIGO.
One very slow way to fix this is to have the search procedure sort its
input array. I suppose there's a case to be made for that, but it'd
really slow things down, and it'd more than eliminate the benefits of
binary search vs linear search.
With respect to the second point:
At what point do users become responsible for what they feed to
functions?
This particular KB article seems clearly intended for fairly naive
users who wouldn't consider reading online help, to wit for VLOOKUP:
"Table_array is the table of information in which data is looked up.
Use a reference to a range or a range name, such as Database or List.
If range_lookup is TRUE, the values in the first column of table_array
must be placed in ascending order: ..., -2, -1, 0, 1, 2, ..., A-Z,
FALSE, TRUE; otherwise VLOOKUP may not give the correct value. If
range_lookup is FALSE, table_array does not need to be sorted. [...]"
MS has also documented a list of XL functions that accept arrays as
arguments. Of course, given how incompatible MS's search algorithms
and my way of thinking are I cannot find it after 30 minutes of
searching. But, I believe N() is on that list.
Unlike you, I believe it should be sufficient to refer to online help.
That is, 'documented' functionality should be what online help says. So
for the N function:
"N
Returns a value converted to a number.
Syntax
N(value)
Value is the value you want converted. N converts values listed in
the following table.
If value is or refers to N returns
A number That number
A date, in one of the built-in date formats available in Microsoft
Excel The serial number of that date
TRUE 1
FALSE 0
An error value, such as #DIV/0! The error value
Anything else 0"
It does mention TRUE and FALSE, but not arrays. N (and T) work VERY
DIFFERENTLY when fed range vs array arguments. N({1,2,3,4}<2.5) returns
{1,1,0,0}, but with C1:C4 containing {TRUE;TRUE;FALSE;FALSE}, N(C1:C4)
returns 1. Self-evident to anyone who used to use 123, but likely
surprising to everyone else.
The bottom line is this. All of you who enjoy exploiting *documented*
bugs in XL are welcome to do so. If -- and yes I know all about pigs
not yet becoming airbore -- MS fixes those holes in its software you
will have no one to blame but yourself. However, hoisting those
"solutions" onto others *without* warning about what you are doing is
what I object to.
First find that url to a Microsoft source that explicitly confirms that
N and T functions are *intended* to handle arrays, then I'll consider
that you're not abusing the N function. In other words, no I won't just
take your word for it.
Second, Microsoft may change worksheet function behavior, but would it
abandon compatibility with other spreadsheets entirely? Excel's lookup
and match functions behave the *same* as Lotus 123 when the lookup
value is larger than any value in the lookup range and the lookup or
match mode is nonexact. I suppose I should overlook your ignorance of
compatability issues.
There may be a chance that Microsoft would deprecate this particular
bit of functionality. I wonder whether I'd be surprised or not if they
didn't provide prior warning. I know I wouldn't be surprised if many IT
departments upgraded without checking for semantic changes like this.
Anyway, if Microsoft is going to go to the uncharacteristic effort of
changing core functionality, maybe they'll add some as well as break
other. If Excel only supported regular expressions, as OpenOffice Calc
and Gnumeric do, one could just have used a formula like the following
OpenOffice Calc one to find the substring following the final comma.
=MID(A1;SEARCH("[^,]+$";A1);256)
But my money would be on no changes even 'real soon now'.