Need to extract text between numbers in a string

D

dsimanton

I have several hundred rows of text which contains numbers. I need t
separate the segments from each other. For example: "1twinkle twinkl
little star 2how I wonder what you are 3up above the world 4so brigh
5like a diamond 6in the sky 7shine so high 8twinkle twinkle 9little sta
10how I wonder what you are"
The text between the numbers will vary in length and numbers may doubl
digit but will always be in numerical order but not necessarily startin
with 1.
I can do this with multiple steps: remove first number so as to be abl
to find the second number and then trim the left portion up to the 2n
number. Repeating this multiple times works. But there must be a simple
way. I get bogged down though when the numbers become double digit. Hel
please
 
R

Ron Rosenfeld

I have several hundred rows of text which contains numbers. I need to
separate the segments from each other. For example: "1twinkle twinkle
little star 2how I wonder what you are 3up above the world 4so bright
5like a diamond 6in the sky 7shine so high 8twinkle twinkle 9little star
10how I wonder what you are"
The text between the numbers will vary in length and numbers may double
digit but will always be in numerical order but not necessarily starting
with 1.
I can do this with multiple steps: remove first number so as to be able
to find the second number and then trim the left portion up to the 2nd
number. Repeating this multiple times works. But there must be a simpler
way. I get bogged down though when the numbers become double digit. Help
please.

You can do this fairly easily with a VBA Macro, but your request does not indicate where these strings are located, and what you want to do with the result of splitting them. Also, it is not clear whether you want to retain the numbers of not.

The macro below will split the sentences at any whole number, and place it into a VBA array. For the sake of this exercise, I assumed your rows were in column A, and that you wanted the results displayed in the adjacent columns. I also assumed you wanted to retain any carriage returns in the original text.

To enter this Macro (Sub), <alt-F11> opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this Macro (Sub), <alt-F8> opens the macro dialog box. Select the macro by name, and <RUN>.

===================================
Option Explicit
Sub SplitOnNumber()
Dim re As Object, mc As Object, m As Object
Dim rSrc As Range, c As Range
Dim rDest As Range
Dim SplitArray()
Dim P As Long, I As Long
Set rSrc = Range("A1", Cells(Rows.Count, "A").End(xlUp))
Set re = CreateObject("vbscript.regexp")
With re
.Global = True
.Pattern = "\d+"
End With

For Each c In rSrc
Set mc = re.Execute(c.Text)
ReDim SplitArray(1 To mc.Count + 1)
P = 0
I = 1
For Each m In mc
SplitArray(I) = Mid(c.Text, P + 1, m.firstindex - P)
P = m.firstindex + m.Length
I = I + 1
Next m
SplitArray(I) = Mid(c.Text, P + 1)
Set rDest = c.Offset(columnoffset:=1)
Range(rDest, rDest(1, Columns.Count - 1)).ClearContents
Set rDest = rDest.Resize(columnsize:=I)
rDest = SplitArray
rDest.EntireColumn.AutoFit
Next c

End Sub
=======================================
 
D

dsimanton

'Ron Rosenfeld[_2_ said:
;1604149']On Sun, 29 Jul 2012 18:48:00 +0000, dsimanto
I have several hundred rows of text which contains numbers. I need to
separate the segments from each other. For example: "1twinkle twinkle
little star 2how I wonder what you are 3up above the world 4so bright
5like a diamond 6in the sky 7shine so high 8twinkle twinkle 9littl star
10how I wonder what you are"
The text between the numbers will vary in length and numbers ma double
digit but will always be in numerical order but not necessaril starting
with 1.
I can do this with multiple steps: remove first number so as to b able
to find the second number and then trim the left portion up to the 2nd
number. Repeating this multiple times works. But there must be simpler
way. I get bogged down though when the numbers become double digit Help
please.-

You can do this fairly easily with a VBA Macro, but your request doe
not indicate where these strings are located, and what you want to d
with the result of splitting them. Also, it is not clear whether yo
want to retain the numbers of not.

The macro below will split the sentences at any whole number, and plac
it into a VBA array. For the sake of this exercise, I assumed your row
were in column A, and that you wanted the results displayed in th
adjacent columns. I also assumed you wanted to retain any carriag
returns in the original text.

To enter this Macro (Sub), <alt-F11> opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this Macro (Sub), <alt-F8> opens the macro dialog box. Select th
macro by name, and <RUN>.

===================================
Option Explicit
Sub SplitOnNumber()
Dim re As Object, mc As Object, m As Object
Dim rSrc As Range, c As Range
Dim rDest As Range
Dim SplitArray()
Dim P As Long, I As Long
Set rSrc = Range("A1", Cells(Rows.Count, "A").End(xlUp))
Set re = CreateObject("vbscript.regexp")
With re
.Global = True
.Pattern = "\d+"
End With

For Each c In rSrc
Set mc = re.Execute(c.Text)
ReDim SplitArray(1 To mc.Count + 1)
P = 0
I = 1
For Each m In mc
SplitArray(I) = Mid(c.Text, P + 1, m.firstindex - P)
P = m.firstindex + m.Length
I = I + 1
Next m
SplitArray(I) = Mid(c.Text, P + 1)
Set rDest = c.Offset(columnoffset:=1)
Range(rDest, rDest(1, Columns.Count - 1)).ClearContents
Set rDest = rDest.Resize(columnsize:=I)
rDest = SplitArray
rDest.EntireColumn.AutoFit
Next c

End Sub
=======================================

I'm new to this and am sorry if I didn't provide enough information.
Initially I was looking for a solution outside of VBA directly in a cel
or series of cells. The VBA code you provided works great though too, i
not better than I had envisioned.
Ideally when finished I want to have series of text in separate cell
with each number such as: A5 - "1 twinkle twinkle little star" A6 - "
how I wonder what you are" A7 - "3 up above the world" etc.
This would be with the number included as part of the text. I'm not sur
how to modify the VBA code provided to include the numbers in the text.
thank
 
R

Ron Rosenfeld

I'm new to this and am sorry if I didn't provide enough information.
Initially I was looking for a solution outside of VBA directly in a cell
or series of cells. The VBA code you provided works great though too, if
not better than I had envisioned.
Ideally when finished I want to have series of text in separate cells
with each number such as: A5 - "1 twinkle twinkle little star" A6 - "2
how I wonder what you are" A7 - "3 up above the world" etc.
This would be with the number included as part of the text. I'm not sure
how to modify the VBA code provided to include the numbers in the text.
thanks

This is code modified to include the leading numbers. Of course, since we are splitting on numbers, any numbers within the text string will also cause a split. If that is a problem in order to avoid that, we need an unambiguous method of differentiating the numbers which represent a split from those that do not.
==========================
Option Explicit
Sub SplitOnNumber()
Dim re As Object, mc As Object, m As Object
Dim rSrc As Range, c As Range
Dim rDest As Range
Dim SplitArray()
Dim I As Long
Set rSrc = Range("A1", Cells(Rows.Count, "A").End(xlUp))
Set re = CreateObject("vbscript.regexp")
With re
.Global = True
.Pattern = "\d+\D+"
End With

For Each c In rSrc
Set mc = re.Execute(c.Text)
ReDim SplitArray(1 To mc.Count)
I = 1
For Each m In mc
SplitArray(I) = m
I = I + 1
Next m
Set rDest = c.Offset(columnoffset:=1)
Range(rDest, rDest(1, Columns.Count - 1)).ClearContents
Set rDest = rDest.Resize(columnsize:=I)
rDest = SplitArray
rDest.EntireColumn.AutoFit
Next c

End Sub
=====================================

In your last post, you indicated you wanted the results in the same column, rather than the same row as I had programmed. That's easily done. But I wonder where your "hundreds of rows" to be processed are located.
 
S

Stunn

This gets close:

in A5:

=MID($A$1,1,MATCH(TRUE,INDEX(ISNUMBER(-MID($A$1,ROW(OFFSET($1:$1,1,,LEN($A$1))),1)),),0)-1)

in A6:

=MID($A$1,SUM(INDEX(LEN(A$5:A5),))+1,IFERROR(MATCH(TRUE,INDEX(ISNUMBER(-MID($A$1,ROW(OFFSET($1:$1,SUM(INDEX(LEN(A$5:A5),))+2,,LEN($A$1))),1)),),0),999))

Copied downward through to A15. It leaves a blank cell before every 2 digit number, but correcting this could require a much longer formula.

Steve D.
 
S

Stunn

Try this:
In A4:

=LEFT($A$1,MAX(INDEX(MATCH(FALSE,INDEX(ISNUMBER(-MID($A$1,ROW(OFFSET($A$1,,,LEN($A$1))),1)),),0),))-1)

retrieves the first number regardless of how many digits it has.

In A5:

=MID($A$1,FIND(A$4+ROWS(A$4:A4)-1,$A$1),FIND(A$4+ROWS(A$4:A4),$A$1)-FIND(A$4+ROWS(A$4:A4)-1,$A$1))

Copied downward as far as required. As long as the numbers are always consecutive there shouldn't be a problem.

Steve D.
 
S

Stunn

Try this:

In A4:



=LEFT($A$1,MAX(INDEX(MATCH(FALSE,INDEX(ISNUMBER(-MID($A$1,ROW(OFFSET($A$1,,,LEN($A$1))),1)),),0),))-1)



retrieves the first number regardless of how many digits it has.



In A5:



=MID($A$1,FIND(A$4+ROWS(A$4:A4)-1,$A$1),FIND(A$4+ROWS(A$4:A4),$A$1)-FIND(A$4+ROWS(A$4:A4)-1,$A$1))



Copied downward as far as required. As long as the numbers are always consecutive there shouldn't be a problem.



Steve D.

Amendment to second formula:

=MID($A$1,FIND(A$4+ROWS(A$4:A4)-1,$A$1),IFERROR(FIND(A$4+ROWS(A$4:A4),$A$1)-FIND(A$4+ROWS(A$4:A4)-1,$A$1),999))
 
S

Stunn

Try this:

In A4:



=LEFT($A$1,MAX(INDEX(MATCH(FALSE,INDEX(ISNUMBER(-MID($A$1,ROW(OFFSET($A$1,,,LEN($A$1))),1)),),0),))-1)



retrieves the first number regardless of how many digits it has.



In A5:



=MID($A$1,FIND(A$4+ROWS(A$4:A4)-1,$A$1),FIND(A$4+ROWS(A$4:A4),$A$1)-FIND(A$4+ROWS(A$4:A4)-1,$A$1))



Copied downward as far as required. As long as the numbers are always consecutive there shouldn't be a problem.



Steve D.

Amendment to first formula:

=LEFT($A$1,MATCH(FALSE,INDEX(ISNUMBER(-MID($A$1,ROW(OFFSET($1:$1,,,LEN($A$1))),1)),),0)-1)

(I got a little bit carried away!)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top