USCASCII characters

  • Thread starter The PhoenixTN ICT Team
  • Start date
T

The PhoenixTN ICT Team

Hi all out there,

I just was wondering if there is some Excel function that could help us
with this issue. We frequently use lists of usernames including
non-USCASCII characters (for instance, María José Rodríguez Malmierca,
Øivind Larsen, Ólöf Gardarsdóttir, and so on, so forth.

Please, is there any Excel function that could automatically (or almost
automatically by looking at a substitution table) the non-USCASCII
characters with the best USCASCII alternative?

Thanks in advance,

Ricardo
 
J

Jon Quixley

Here's a couple of ideas:

1. You could "teach" your spellchecker all the non ASCII words and
re-spell them the way you want them

2. You could load up a LOOKUP table with the original non-ASCII words
and have your anglisised names next to them. Then it would be a matter
of getting the lookup table to substiute the former for the latter

3. You could leave them alone and be less xenophobic
 
H

Harlan Grove

The PhoenixTN ICT Team wrote...
I just was wondering if there is some Excel function that could help us
with this issue. We frequently use lists of usernames including
non-USCASCII characters (for instance, María José Rodríguez Malmierca,
Øivind Larsen, Ólöf Gardarsdóttir, and so on, so forth.

Please, is there any Excel function that could automatically (or almost
automatically by looking at a substitution table) the non-USCASCII
characters with the best USCASCII alternative?

With VBA, it's relatively easy. Without VBA it's cumbersome. Either
way, you start off with a 2-column by multiple row range of non-ASCII
chars in the left column and their closest ASCII translation in the
right column.

Perhaps the following which adopts the typical American English rules
of simply dispensing with grave, acute, circumflex and tilda accents,
and uses the German rule for umlauts on a, o and u but discards them
for e, i and y. As for the Scandinavian letters Å and å, they're
pronounced like English long O (as in boat), so O and o would be
closest phonetic, and Ø and ø are pronounced the same as German Ö
and ö, so transliterate them the same.

Š Sz
Œ oe
Ž Zs
š sz
œ oe
ž zs
Ÿ y
¡ j
À A
Á A
 A
à A
Ä Ae
Å O
Æ Ae
Ç C
È E
É E
Ê E
Ë E
Ì I
Í I
Î I
Ï I
Ñ N
Ò O
Ó O
Ô O
Õ O
Ö Oe
Ø Oe
Ù U
Ú U
Û U
Ü Ue
Ý Y
Þ Th
ß ss
à a
á a
â a
ã a
ä ae
å o
æ ae
ç c
è e
é e
ê e
ë ee
ì i
í i
î i
ï i
ð o
ñ n
ò o
ó o
ô o
õ o
ö oe
ø oe
ù u
ú u
û u
ü ue
ý y
þ th
ÿ y

Since VBA would be much better suited to this, here's a function that
would transliterate the characters.


Function tr(s As String, tt As Variant) As String
Dim i As Long, k As Long, n As Long

If TypeOf tt Is Range Then tt = tt.Value

If Not IsArray(tt) Then
'tt a single string, so deletes it
i = Asc(CStr(tt))
ReDim tt(1 To 1, 1 To 2)
tt(1, 1) = Chr(i)
tt(1, 2) = ""
End If

On Error Resume Next

With Application.WorksheetFunction

n = UBound(tt, 2)
If Err.Number <> 0 Then
'tt a 1D array, so deletes all strings in it
tt = .Transpose(tt)
ReDim Preserve tt(1 To UBound(tt, 1), 1 To 2)
End If

n = UBound(tt, 1)
tr = s

For i = 1 To n
If InStr(0, tr, tt(i, 1)) > 0 Then
tr = .Substitute(tr, tt(i, 1), tt(i, 2))
End If
Next i

Err.Clear

End With

End Function


If you had a name with non-ASCII characters in cell B5 and the
transliteration table were in a range named TLT, you could use this
function in formula calls like

=tr(B5,TLT)

That said, Excel *IS* *NOT* *A* *TEXT* *PROCESSING* *TOOL*! You'd be
much better off doing this in a scripting language, such as VBScript
which has come as part of Windows since Windows 2000.
 
H

Harlan Grove

Jon Quixley wrote...
....
3. You could leave them alone and be less xenophobic

Unlikely it's the words that cause the problems. Much more likely it's
the characters. If Excel were only a stopping point on the data's path
to a system that could only handle 7-bit ASCII. How would that be
xenophobic?
 
T

The PhoenixTN ICT Team

Jon said:
Here's a couple of ideas:

1. You could "teach" your spellchecker all the non ASCII words and
re-spell them the way you want them

2. You could load up a LOOKUP table with the original non-ASCII words
and have your anglisised names next to them. Then it would be a matter
of getting the lookup table to substiute the former for the latter

3. You could leave them alone and be less xenophobic

Thanks, Jon,

Options 1 is pretty straightforward, but of course option 2 mostly
after Harlan's message is the most elegant alternative! I will try to
implement it here. It won't be difficult following Harlan's message.

Greetings,

Ricardo
 
T

The PhoenixTN ICT Team

Harlan said:
Jon Quixley wrote...
...

Unlikely it's the words that cause the problems. Much more likely it's
the characters. If Excel were only a stopping point on the data's path
to a system that could only handle 7-bit ASCII. How would that be
xenophobic?

Thanks, Harlan.

I've being struggling for years to use the correct spelling vernacular
name of at least people and institutions we work with. You can believe
me: it is by no means an easy matter working in an enlarged European
Union and having contacts with Asian countries!

UTF is doing its business, but there are still a lot of applications
causing problems over here.

Greetings,

Ricardo
 
Top