Magic Excel function or UDF?

H

Henrietta Horne

I need check a list of words against an alphabet (a string of letters)
and return the index of the letter in the word that is the highest in
the alphabet.

For example, for the alphabet is "etaoin", these words would return
these indices:

Word Index
to 4
ate 3
ten 6
neat 6
tee 2

Is there some magic Excel function that will do this?

If not, can someone post the guts of a search loop to select each
letter in sWord and find the index in sAlphabet?

The actual alphabet will have all 26 letters and might look like this:

etaoinsrhldcumgfpwybvkjxzq
 
C

Clif McIrvin

Henrietta Horne said:
I need check a list of words against an alphabet (a string of letters)
and return the index of the letter in the word that is the highest in
the alphabet.

For example, for the alphabet is "etaoin", these words would return
these indices:

Word Index
to 4
ate 3
ten 6
neat 6
tee 2

Is there some magic Excel function that will do this?

If not, can someone post the guts of a search loop to select each
letter in sWord and find the index in sAlphabet?

The actual alphabet will have all 26 letters and might look like this:

etaoinsrhldcumgfpwybvkjxzq


Untested "air code"

Function GetIndex(sWord As String, sAlphabet As String) _
As Integer

Dim sChar As String
Dim iIndex As Integer
Dim iLoop As Integer
Dim iTemp As Integer

For iLoop = 1 To Len(sWord)
sChar = Mid(sWord, iLoop)
iTemp = InStr(sChar, sAlphabet, vbTextCompare)
If iTemp > iIndex Then
iIndex = iTemp
End If
Next iLoop

GetIndex = iIndex

End Function

HTH!
 
G

GS

Henrietta Horne explained on 1/11/2011 :
I need check a list of words against an alphabet (a string of letters)
and return the index of the letter in the word that is the highest in
the alphabet.

For example, for the alphabet is "etaoin", these words would return
these indices:

Word Index
to 4
ate 3
ten 6
neat 6
tee 2

Is there some magic Excel function that will do this?

If not, can someone post the guts of a search loop to select each
letter in sWord and find the index in sAlphabet?

The actual alphabet will have all 26 letters and might look like this:

etaoinsrhldcumgfpwybvkjxzq

Try this...

Function GetHighIndex(sWord As String, sAlphabet As String) As Integer
Dim i As Integer, iPos As Integer
For i = 1 To Len(sWord)
iPos = InStr(1, sAlphabet, Mid$(sWord, i, 1), vbTextCompare)
If iPos > GetHighIndex Then GetHighIndex = iPos
Next
End Function
 
H

Henrietta Horne

Untested "air code"

Function GetIndex(sWord As String, sAlphabet As String) _
As Integer

Dim sChar As String
Dim iIndex As Integer
Dim iLoop As Integer
Dim iTemp As Integer

For iLoop = 1 To Len(sWord)
sChar = Mid(sWord, iLoop)
iTemp = InStr(sChar, sAlphabet, vbTextCompare)
If iTemp > iIndex Then
iIndex = iTemp
End If
Next iLoop

GetIndex = iIndex

End Function

HTH!

Thank you so much. You and Garry had essentially the same solution.
Hugs.
 
H

Henrietta Horne

Henrietta Horne explained on 1/11/2011 :

Try this...

Function GetHighIndex(sWord As String, sAlphabet As String) As Integer
Dim i As Integer, iPos As Integer
For i = 1 To Len(sWord)
iPos = InStr(1, sAlphabet, Mid$(sWord, i, 1), vbTextCompare)
If iPos > GetHighIndex Then GetHighIndex = iPos
Next
End Function

Thank you so much. You and Clif had essentially the same solution.
Hugs.
 
R

Rick Rothstein

Here is a shorter function that approaches the problem "backwards"...

Function GetIndex(Word As String, Alphabet As String) As Long
For GetIndex = Len(Alphabet) To 1 Step -1
If InStr(1, Word, Mid(Alphabet, GetIndex, _
1), vbTextCompare) Then Exit Function
Next
End Function

Rick Rothstein (MVP - Excel)




"Henrietta Horne" wrote in message

I need check a list of words against an alphabet (a string of letters)
and return the index of the letter in the word that is the highest in
the alphabet.

For example, for the alphabet is "etaoin", these words would return
these indices:

Word Index
to 4
ate 3
ten 6
neat 6
tee 2

Is there some magic Excel function that will do this?

If not, can someone post the guts of a search loop to select each
letter in sWord and find the index in sAlphabet?

The actual alphabet will have all 26 letters and might look like this:

etaoinsrhldcumgfpwybvkjxzq
 
H

Henrietta Horne

I need check a list of words against an alphabet (a string of letters)
and return the index of the letter in the word that is the highest in
the alphabet.

For example, for the alphabet is "etaoin", these words would return
these indices:

Word Index
to 4
ate 3
ten 6
neat 6
tee 2

Is there some magic Excel function that will do this?

If not, can someone post the guts of a search loop to select each
letter in sWord and find the index in sAlphabet?

The actual alphabet will have all 26 letters and might look like this:

etaoinsrhldcumgfpwybvkjxzq

When I ran the function against the actual data, I quickly discovered
that some words have characters (like "-" and "'"), that are not in
the alphabet. I decided I want those letters treated as if they were
at the end of the alphabet.

The modified function is now:


Function GetHighIndex(sWord As String, sAlphabet As String) As Integer

Dim i As Integer 'Loop index
Dim iPos As Integer 'Index variable
Dim iLenA As Integer 'Length of alphabet
iLenA = Len(sAlphabet) '.For index for missing characters

For i = 1 To Len(sWord)
iPos = InStr(1, sAlphabet, Mid$(sWord, i, 1), vbTextCompare)
If iPos = 0 Then iPos = iLenA + 1
If iPos > GetHighIndex Then GetHighIndex = iPos
Next i

End Function

Comments?
 
H

Henrietta Horne

Here is a shorter function that approaches the problem "backwards"...

Function GetIndex(Word As String, Alphabet As String) As Long
For GetIndex = Len(Alphabet) To 1 Step -1
If InStr(1, Word, Mid(Alphabet, GetIndex, _
1), vbTextCompare) Then Exit Function
Next
End Function

Rick Rothstein (MVP - Excel)

Clever solution. Instead of comparing each letter in the word to the
alphabet, you are comparing the alphabet to the work backwards and
stopping as soon as you get one match.

I wonder which is actually faster (for my data)? My alphabet is 26
letters. Most of my words are less than 5-6 letters.
 
R

Rick Rothstein

The actual alphabet will have all 26 letters and might look like this:
etaoinsrhldcumgfpwybvkjxzq

Will it **always** have the 26 letters of the alphabet in it? Also,
important for speeding things up a little, will the letters **always** be
lower case?
When I ran the function against the actual data, I quickly
discovered that some words have characters (like "-" and "'"),
that are not in the alphabet. I decided I want those letters
treated as if they were at the end of the alphabet.

Are those two characters the **only** non-letter characters that can be in
your word? If not, are you willing to have **any** non-letter return 27
(assuming you "alphabet" always contains 26 letters)?

Rick Rothstein (MVP - Excel)
 
H

Henrietta Horne

Will it **always** have the 26 letters of the alphabet in it?

For this application, yes.
Also,
important for speeding things up a little, will the letters **always** be
lower case?

In the application that will use this data, the words will be in
sentences, so the first word and proper names will be uppercase. But
for this UDF, all lower case.
Are those two characters the **only** non-letter characters that can be in
your word? If not, are you willing to have **any** non-letter return 27
(assuming you "alphabet" always contains 26 letters)?

There can be others, but for now, I'm happy to have all of them return
a "27".

Larger alphabets and/or shorter words benefit the forward algorithm,
as far as compute time, no?
 
G

GS

Henrietta Horne submitted this idea :
When I ran the function against the actual data, I quickly discovered
that some words have characters (like "-" and "'"), that are not in
the alphabet. I decided I want those letters treated as if they were
at the end of the alphabet.

The modified function is now:


Function GetHighIndex(sWord As String, sAlphabet As String) As Integer

Dim i As Integer 'Loop index
Dim iPos As Integer 'Index variable
Dim iLenA As Integer 'Length of alphabet
iLenA = Len(sAlphabet) '.For index for missing characters

For i = 1 To Len(sWord)
iPos = InStr(1, sAlphabet, Mid$(sWord, i, 1), vbTextCompare)
If iPos = 0 Then iPos = iLenA + 1
If iPos > GetHighIndex Then GetHighIndex = iPos
Next i

End Function

Comments?

So then, you might want to exit the function if the character is not in
sAlphabet to skip further looping:

Function GetHighIndex(sWord As String, sAlphabet As String) As Integer
Dim i As Integer, iPos As Integer
For i = 1 To Len(sWord)
iPos = InStr(1, LCase$(sAlphabet), Mid$(LCase$(sWord), i, 1), 1)
If iPos = 0 Then GetHighIndex = Len(sAlphabet) + 1: Exit Function
If iPos > GetHighIndex Then GetHighIndex = iPos
Next
End Function

I also had the same concern as Rick regarding case and so note that I
forced lowercase.
 
G

GS

It happens that Henrietta Horne formulated :
For this application, yes.


In the application that will use this data, the words will be in
sentences, so the first word and proper names will be uppercase. But
for this UDF, all lower case.


There can be others, but for now, I'm happy to have all of them return
a "27".

Larger alphabets and/or shorter words benefit the forward algorithm,
as far as compute time, no?

So does using fewer variables!<g>
 
C

Clif McIrvin

Henrietta Horne said:
When I ran the function against the actual data, I quickly discovered
that some words have characters (like "-" and "'"), that are not in
the alphabet. I decided I want those letters treated as if they were
at the end of the alphabet.

The modified function is now:


Function GetHighIndex(sWord As String, sAlphabet As String) As Integer

Dim i As Integer 'Loop index
Dim iPos As Integer 'Index variable
Dim iLenA As Integer 'Length of alphabet
iLenA = Len(sAlphabet) '.For index for missing characters

For i = 1 To Len(sWord)
iPos = InStr(1, sAlphabet, Mid$(sWord, i, 1), vbTextCompare)
If iPos = 0 Then iPos = iLenA + 1
If iPos > GetHighIndex Then GetHighIndex = iPos
Next i

End Function

Comments?


I'm *guessing* that the time difference between Rick's function and ours
would be measured in micro-seconds ... if you're calling it millions of
times it "might" be worth investigating <grin>.

(BTW, IMO the main difference between Garry's offering and mine is "self
documentation" and ease of debugging.)

As long as your rule is valid, it seems that your solution covers all
"extraneous" characters. To cover that issue with Rick's approach would
require including all possible characters in sAlphabet ... which would
wipe out the elegance (and hurt execution time, as well), but it would
allow a variety of rules.
 
G

GS

Clif McIrvin submitted this idea :
(BTW, IMO the main difference between Garry's offering and mine is "self
documentation" and ease of debugging.)

Hi Clif,
I don't see where my offering's brevity poses any challenges for
debugging. I'll give you that yours is more "self documented", as you
put it, but at what cost? IMO, the less var tracking I have to do the
easier my code is to understand AND debug.

Not saying that there's anything wrong with your offering 'as is'. I
just like brevity<g>, especially Rick's one-liners<bg>!
 
C

Clif McIrvin

GS said:
Henrietta Horne submitted this idea :
So then, you might want to exit the function if the character is not
in sAlphabet to skip further looping:

Excellent point!
Function GetHighIndex(sWord As String, sAlphabet As String) As Integer
Dim i As Integer, iPos As Integer
For i = 1 To Len(sWord)
iPos = InStr(1, LCase$(sAlphabet), Mid$(LCase$(sWord), i, 1), 1)
If iPos = 0 Then GetHighIndex = Len(sAlphabet) + 1: Exit Function
If iPos > GetHighIndex Then GetHighIndex = iPos
Next
End Function

I also had the same concern as Rick regarding case and so note that I
forced lowercase.

I was under the impression that is what using vbTextCompare
accomplished. Am I mistaken?

I note that you are using the "$" variation of the function names
(LCase$(), Mid$()) ... is there a speed difference vs. using LCase() and
Mid()?

I understand that using variants is slower than typed variables; I typed
the function parameters as variant in my original suggestion to allow
the possibility of testing for Empty, though I didn't include that in my
code.
 
R

Rick Rothstein

See inline comments...
In the application that will use this data, the words will
be in sentences, so the first word and proper names will
be uppercase. But for this UDF, all lower case.

No sense giving you different code for this "test" case when your "real"
case will be different.
There can be others, but for now, I'm happy to have all
of them return "27".

Again, it make no sense not designing for you final application.
Larger alphabets and/or shorter words benefit the forward
algorithm, as far as compute time, no?

For such short text strings, for a single call of the function, you would be
hard pressed to measure the difference. If you were examining hundreds of
thousands of words in a loop, then maybe the difference would start to
become noticeable enough to choose one method over the other.

Okay, with all that said, here is my code, modified to report one more than
the length of your alphabet for non-letters...

Function GetIndex(Word As String, Alphabet As String) As Long
If Word Like "*[!" & Alphabet & "]*" Then
GetIndex = Len(Alphabet) + 1
Else
For GetIndex = Len(Alphabet) To 1 Step -1
If InStr(Word, Mid(Alphabet, GetIndex, 1)) Then Exit For
Next
End If
End Function

Rick Rothstein (MVP - Excel)
 
R

Rick Rothstein

As long as your rule is valid, it seems that your solution
covers all "extraneous" characters. To cover that issue
with Rick's approach would require including all possible
characters in sAlphabet ... which would wipe out the
elegance (and hurt execution time, as well), but it would allow a variety
of rules.

See my latest response to the OP for a still quick, and what I consider a
still "elegant" solution, to covering the non-letter characters.

Rick Rothstein (MVP - Excel)
 
C

Clif McIrvin

GS said:
Clif McIrvin submitted this idea :

Hi Clif,
I don't see where my offering's brevity poses any challenges for
debugging. I'll give you that yours is more "self documented", as you
put it, but at what cost? IMO, the less var tracking I have to do the
easier my code is to understand AND debug.

Not saying that there's anything wrong with your offering 'as is'. I
just like brevity<g>, especially Rick's one-liners<bg>!

--
Garry

Free usenet access at http://www.eternal-september.org
ClassicVB Users Regroup! comp.lang.basic.visual.misc


I'm a part-time developer, and I've learned through harsh experience
that the more documentation in the code the better! If I was coding
every day, then yes, I'd agree with your observation.

I agree that there is some difference in execution time -- and sometimes
that would even be relevant! -- but in my single user environment that
has not (so far) been an issue.

I learn from both of you guys (and others!) and I agree ... Rick's
solutions often make me back up and say something like, "I sure wish I'd
thought of that!"
 
G

GS

Clif McIrvin used his keyboard to write :
Excellent point!


I was under the impression that is what using vbTextCompare accomplished. Am
I mistaken?

No, you're not mistaken. If I read Rick's context correctly, I believe
he was refering to the typical issue of users having the Caps Lock on
inadvertently OR without giving it any thought. IMO, I'd use a filter
function to determine any chars not in my expected string, and strip
them out. But this is not what fits the OP's task.
I note that you are using the "$" variation of the function names (LCase$(),
Mid$()) ... is there a speed difference vs. using LCase() and Mid()?

I read somewhere that it has an effect. Not sure of the details
exactly, but I started using the $ after reading the article.
I understand that using variants is slower than typed variables; I typed the
function parameters as variant in my original suggestion to allow the
possibility of testing for Empty, though I didn't include that in my code.

And this is why my loop UBounds to Len(sWord). If it's an empty string
there's nothing to do, and the function immediately exits and returns
'0'. I think that clearly indicates a zero length string.
CBool(Len("")=0) returns TRUE. (This makes a good arg for
documentation!<g>)
 
C

Clif McIrvin

Rick Rothstein said:
See my latest response to the OP for a still quick, and what I
consider a still "elegant" solution, to covering the non-letter
characters.

Rick Rothstein (MVP - Excel)


Elegant, for sure!

Reading your post after replying to Garry regarding brevity, I recall a
tale I heard many years ago about assembler code that had been running
flawlessly for years which suddenly died. The programmer no longer
worked there, and the team was stumped when they came to a dead-end in
the source code-- there was no instruction to execute! How could the
code have ever worked in the first place?!

Eventually someone realized that with the combination of processor speed
and rotation of the "swap drum" the code was running on the missing
instruction was just coming up under the read head at exactly the right
time, and the program broke when the swap drum was replaced with a head
per track disk.

The development environment has certainly changed since those days!

[BTW: I don't believe I've ever used LIKE in VBA ... I'll have to
remember that one!]
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top