Formula to extract digits from a text string?

H

Hash

Gang -

Dealing with a dataset of phone numbers. Engineers have figured out
about 18 different ways, using any combinations of spaces, parens,
dashes, periods and digits to enter their phone numbers.

I know I can write a UDF that will substitute out all the alpha
characters, one by tedious one. I was wondering if there was a clever
formula that will pull the numerics out, in the sequence entered. In
other words, for (123) 456-7890, 1234567890 as text results. Similarly
for 123.456.7890 and 123-456-7890 and 123 456 7890, and especially
123456789 or 123.4567890, which Excel insists are numbers etc.

Thanks in advance.
....best, Hash
 
R

Ron Rosenfeld

Gang -

Dealing with a dataset of phone numbers. Engineers have figured out
about 18 different ways, using any combinations of spaces, parens,
dashes, periods and digits to enter their phone numbers.

I know I can write a UDF that will substitute out all the alpha
characters, one by tedious one. I was wondering if there was a clever
formula that will pull the numerics out, in the sequence entered. In
other words, for (123) 456-7890, 1234567890 as text results. Similarly
for 123.456.7890 and 123-456-7890 and 123 456 7890, and especially
123456789 or 123.4567890, which Excel insists are numbers etc.

Thanks in advance.
...best, Hash

I would use a UDF. You do have to go through each character, but the code is
pretty short.

=============================
Function PN(PhonNum) As Double
Dim i As Long
Dim a
Dim temp As String

For i = 1 To Len(PhonNum)
a = Mid(PhonNum, i, 1)
If a Like "[0-9]" Then
temp = temp & a
End If
Next i

If Len(temp) > 1 Then
PN = Left(temp & "00000000", 10)
Else
PN = "" 'Gives #VALUE error if no PhonNum
End If

End Function
==========================


--ron
 
H

Harlan Grove

Ron Rosenfeld said:
I would use a UDF. You do have to go through each character, but the code
is
pretty short.

=============================
Function PN(PhonNum) As Double
Dim i As Long
Dim a
Dim temp As String

For i = 1 To Len(PhonNum)
a = Mid(PhonNum, i, 1)
If a Like "[0-9]" Then
temp = temp & a
End If
Next i

If Len(temp) > 1 Then
PN = Left(temp & "00000000", 10)
Else
PN = "" 'Gives #VALUE error if no PhonNum
End If

End Function
==========================

I thought you had become a regular expression advocate.

An alternative would be to use the REGEX.SUBSTITUTE function in Laurent
Longre's MOREFUNC.XLL add-in, available at

http://xcell05.free.fr/english

and use it as

=REGEX.SUBSTITUTE(A1,"\D+")

With regard to your udf, arguable whether you should pad the result with
zeros. Better to just return the digits found, especially if some of the
clever users entering phone numbers enter something like 1-234-555-1212.
Also, my reading of the OP was that the results should be text. If so, bad
to set the return type to double rather than string. And, FTHOI, I'd write
it as


Function foo(p As Variant) As String
Dim t As String, n As Long

t = CStr(p)
n = Len(t)

Do While n > 0
If Not Mid$(t, n, 1) Like "#" Then Mid$(t, n, 1) = " "
n = n - 1
Loop

'assumes VBA6; use Application.WorksheetFunction.Substitute w/VBA5
foo = Replace(t, " ", "")
End Function
 
R

Ron Rosenfeld

Ron Rosenfeld said:
I would use a UDF. You do have to go through each character, but the code
is
pretty short.

=============================
Function PN(PhonNum) As Double
Dim i As Long
Dim a
Dim temp As String

For i = 1 To Len(PhonNum)
a = Mid(PhonNum, i, 1)
If a Like "[0-9]" Then
temp = temp & a
End If
Next i

If Len(temp) > 1 Then
PN = Left(temp & "00000000", 10)
Else
PN = "" 'Gives #VALUE error if no PhonNum
End If

End Function
==========================

I thought you had become a regular expression advocate.

An alternative would be to use the REGEX.SUBSTITUTE function in Laurent
Longre's MOREFUNC.XLL add-in, available at

http://xcell05.free.fr/english

and use it as

=REGEX.SUBSTITUTE(A1,"\D+")

That was, indeed, my first thought, and would be what I would use here. But
I've been recommending it so often, I thought I'd try something else. I've
been feeling like the man with a hammer!
With regard to your udf, arguable whether you should pad the result with
zeros. Better to just return the digits found, especially if some of the
clever users entering phone numbers enter something like 1-234-555-1212.
Also, my reading of the OP was that the results should be text. If so, bad
to set the return type to double rather than string. And, FTHOI, I'd write
it as


Function foo(p As Variant) As String
Dim t As String, n As Long

t = CStr(p)
n = Len(t)

Do While n > 0
If Not Mid$(t, n, 1) Like "#" Then Mid$(t, n, 1) = " "
n = n - 1
Loop

'assumes VBA6; use Application.WorksheetFunction.Substitute w/VBA5
foo = Replace(t, " ", "")
End Function

Points well taken.

My thoughts were that there would also need to be some code to assure it was a
valid phone number and, since some of the phone numbers were being entered as
numbers, the "short ones" would be missing 0's at the end.

And yes, the output should have been TEXT.

I note that with Longre's REGEX.SUBSTITUTE function "\D" is sufficient. To do
a "one-to-many" substitution, no quantifiers seem to be required.

By the way, I've started Friedl's book that you recommended. Seems very
understandable, even to a novice like myself. Thanks.


--ron
 
H

Hash

Ron -

Thank you. That's much simpler than the UDF I would have written. I
think PN is text at the bottom and a double at the top?

Never-the-less, much simpler. Thanks.

....best, Hash

Ron Rosenfeld said:
Gang -

Dealing with a dataset of phone numbers. Engineers have figured out
about 18 different ways, using any combinations of spaces, parens,
dashes, periods and digits to enter their phone numbers.

I know I can write a UDF that will substitute out all the alpha
characters, one by tedious one. I was wondering if there was a clever
formula that will pull the numerics out, in the sequence entered. In
other words, for (123) 456-7890, 1234567890 as text results. Similarly
for 123.456.7890 and 123-456-7890 and 123 456 7890, and especially
123456789 or 123.4567890, which Excel insists are numbers etc.

Thanks in advance.
...best, Hash

I would use a UDF. You do have to go through each character, but the code is
pretty short.

=============================
Function PN(PhonNum) As Double
Dim i As Long
Dim a
Dim temp As String

For i = 1 To Len(PhonNum)
a = Mid(PhonNum, i, 1)
If a Like "[0-9]" Then
temp = temp & a
End If
Next i

If Len(temp) > 1 Then
PN = Left(temp & "00000000", 10)
Else
PN = "" 'Gives #VALUE error if no PhonNum
End If

End Function
==========================


--ron
 
H

Hash

Harlan -

As always, Thanks. I'll explore both ideas.

....best, Hash


Harlan Grove said:
Ron Rosenfeld said:
I would use a UDF. You do have to go through each character, but the code
is
pretty short.

=============================
Function PN(PhonNum) As Double
Dim i As Long
Dim a
Dim temp As String

For i = 1 To Len(PhonNum)
a = Mid(PhonNum, i, 1)
If a Like "[0-9]" Then
temp = temp & a
End If
Next i

If Len(temp) > 1 Then
PN = Left(temp & "00000000", 10)
Else
PN = "" 'Gives #VALUE error if no PhonNum
End If

End Function
==========================

I thought you had become a regular expression advocate.

An alternative would be to use the REGEX.SUBSTITUTE function in Laurent
Longre's MOREFUNC.XLL add-in, available at

http://xcell05.free.fr/english

and use it as

=REGEX.SUBSTITUTE(A1,"\D+")

With regard to your udf, arguable whether you should pad the result with
zeros. Better to just return the digits found, especially if some of the
clever users entering phone numbers enter something like 1-234-555-1212.
Also, my reading of the OP was that the results should be text. If so, bad
to set the return type to double rather than string. And, FTHOI, I'd write
it as


Function foo(p As Variant) As String
Dim t As String, n As Long

t = CStr(p)
n = Len(t)

Do While n > 0
If Not Mid$(t, n, 1) Like "#" Then Mid$(t, n, 1) = " "
n = n - 1
Loop

'assumes VBA6; use Application.WorksheetFunction.Substitute w/VBA5
foo = Replace(t, " ", "")
End Function
 
R

Ron Rosenfeld

Ron -

Thank you. That's much simpler than the UDF I would have written. I
think PN is text at the bottom and a double at the top?

Yeah, but it gets outputted as a Double -- as Harlan points out, and I agree,
it should probably be a String. I had left it as a number so you could use a
custom format for display, but you could just as easily format it within the
UDF.

Also, you might want to add some logic to test for legitimate data. Obviously
10 digits should be legit; 9 might be missing a trailing zero, if the data is
numeric; 11 might have a leading '1'; etc. Or perhaps you could use data
validation to force correct entry.

Glad to help.

Never-the-less, much simpler. Thanks.

...best, Hash

Ron Rosenfeld said:
Gang -

Dealing with a dataset of phone numbers. Engineers have figured out
about 18 different ways, using any combinations of spaces, parens,
dashes, periods and digits to enter their phone numbers.

I know I can write a UDF that will substitute out all the alpha
characters, one by tedious one. I was wondering if there was a clever
formula that will pull the numerics out, in the sequence entered. In
other words, for (123) 456-7890, 1234567890 as text results. Similarly
for 123.456.7890 and 123-456-7890 and 123 456 7890, and especially
123456789 or 123.4567890, which Excel insists are numbers etc.

Thanks in advance.
...best, Hash

I would use a UDF. You do have to go through each character, but the code is
pretty short.

=============================
Function PN(PhonNum) As Double
Dim i As Long
Dim a
Dim temp As String

For i = 1 To Len(PhonNum)
a = Mid(PhonNum, i, 1)
If a Like "[0-9]" Then
temp = temp & a
End If
Next i

If Len(temp) > 1 Then
PN = Left(temp & "00000000", 10)
Else
PN = "" 'Gives #VALUE error if no PhonNum
End If

End Function
==========================


--ron

--ron
 
W

wjohnson

Copy the column - Paste it into WORD - then In Word select REPLACE. Whe
the REPLACE Menu pops up Click on the button that SAYS MORE - the
Select "SPECIAL" then Select any LETTER and it should enter ^$ in th
FIND Box - Leave the Replace Box empty and then hit Replace ALL
Then copy the column and insert it back into your EXCEL document
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top