Extracting dates from a String


R

Raj

Hi,

Cells in Column A have strings of the following type:
eg A1:
Billing for the period March 14 2009 to March 13 2010 - Order dated
April 01 2009
In B1 I want the date : 14-Mar-2009
In C1 I want the date: 13-Mar-2010
(to elaborate, only the first two dates in the string need to be
extracted, third and subsequent dates in the string to be ignored)

Thanks in advance for the help.

Regards,
Raj
 
Ad

Advertisements

A

Al

Try for start date:
=MID(A1,FIND("period",A1)+7,(FIND("to",A1))-(FIND("period",A1)+8))

Stop date:
=MID(A1,FIND("to",A1)+3,(FIND("-",A1))-(FIND("to",A1)+4))


HTH
 
R

Rajendra

Try for start date:
=MID(A1,FIND("period",A1)+7,(FIND("to",A1))-(FIND("period",A1)+8))

Stop date:
=MID(A1,FIND("to",A1)+3,(FIND("-",A1))-(FIND("to",A1)+4))

HTH








- Show quoted text -

Thanks. While that worked right for the given string, it did not work
for all the rows as the strings are composed differently. The word
"period" does not precede the first date at all times nor does the "-"
suceed the second date. Is there a way to extract the first two dates
by looking for the first and second occurrences of the standard "mmmm"
months in the String and then compute the dates based on the position
of those strings?

Regards
Raj
 
R

Rick Rothstein

I think you need to give us a little more information about the structure of
these text strings of yours. Is the month name **always** spelled out? Is
the year **always** a 4-digit number? Is the only character between date
parts the space character (that is, a comma is **never** used between the
day and the year)? Is the word "to" **always** used to separate the two
dates and, if not, can the text separating them be more than one word long?
Could there other numbers before the first date?
--
Rick (MVP - Excel)


Try for start date:
=MID(A1,FIND("period",A1)+7,(FIND("to",A1))-(FIND("period",A1)+8))

Stop date:
=MID(A1,FIND("to",A1)+3,(FIND("-",A1))-(FIND("to",A1)+4))

HTH








- Show quoted text -

Thanks. While that worked right for the given string, it did not work
for all the rows as the strings are composed differently. The word
"period" does not precede the first date at all times nor does the "-"
suceed the second date. Is there a way to extract the first two dates
by looking for the first and second occurrences of the standard "mmmm"
months in the String and then compute the dates based on the position
of those strings?

Regards
Raj
 
R

Rajendra

I think you need to give us a little more information about the structure of
these text strings of yours. Is the month name **always** spelled out? Is
the year **always** a 4-digit number? Is the only character between date
parts the space character (that is, a comma is **never** used between the
day and the year)? Is the word "to" **always** used to separate the two
dates and, if not, can the text separating them be more than one word long?
Could there other numbers before the first date?
--
Rick (MVP - Excel)






Thanks. While that worked right for the given string, it did not work
for all the rows as the strings are composed differently. The word
"period" does not precede the first date at all times nor does the "-"
suceed the second date. Is there a way to extract the first two dates
by looking for the first and second occurrences of the standard "mmmm"
months in the String and then compute the dates based on the position
of those strings?

Regards
Raj

The month name is almost always spelled out in full : January,
February and so on. There are are a few instances of Jan or Feb in the
data.
Year is always a 4 digit number
The separator between dates is always either "to" or "To" or
"tp" (misspelling)
There are no numbers before the first date.
The dates are of the type: (actually more than I initially thought):
November 24th 2008 To May 23rd 2009
12th November 2008 To 11th November 2009
June 20 2009 to June 19 2010
1st Dec 08 tp 31st Dec 08

Thanks and Regards,
Raj
 
R

Rick Rothstein

The month name is almost always spelled out in full : January,
February and so on. There are are a few instances of Jan or Feb in the
data.
Year is always a 4 digit number
The separator between dates is always either "to" or "To" or
"tp" (misspelling)
There are no numbers before the first date.
The dates are of the type: (actually more than I initially thought):
November 24th 2008 To May 23rd 2009
12th November 2008 To 11th November 2009
June 20 2009 to June 19 2010
1st Dec 08 tp 31st Dec 08

Your say in your 2nd answer that the year is always a 4-digit number, but
then in your last line you show dates with 2-digit years... do we have to
account for 2-digit years or not?
 
Ad

Advertisements

R

Rick Rothstein

See if the following macro does what you want. I assumed the starting row
for your data was Row 2 and that the column containing your text data is
Column A (you can change each of these as necessary in the obvious Const
statements); also, the split out dates are placed in the next two columns to
the right of the specified column containing your text data.

Sub SplitOutDates()
Dim X As Long, Z As Long, LastRow As Long
Dim FirstDate As Date, SecondDate As Date
Dim Txt As String, Parts() As String
Const StartRow = 2
Const DataCol = "A"
LastRow = Cells(Rows.Count, DataCol).End(xlUp).Row
For X = StartRow To LastRow
Txt = Cells(X, DataCol)
Txt = Replace(Txt, "tp", "to", 1, -1, Compare:=vbTextCompare)
Txt = Replace(Txt, "st", "", 1, -1, vbTextCompare)
Txt = Replace(Txt, "nd", "", 1, -1, vbTextCompare)
Txt = Replace(Txt, "rd", "", 1, -1, vbTextCompare)
Txt = Replace(Txt, "th", "", 1, -1, vbTextCompare)
Parts = Split(" " & Txt & " ", " to ", -1, vbTextCompare)
On Error GoTo Continue
For Z = 0 To UBound(Parts) - 1
If UBound(Split(Parts(Z))) > 2 Then
FirstDate = Split(WorksheetFunction.Substitute(Parts(Z), " ", _
Chr$(1), UBound(Split(Parts(Z))) - 2), Chr$(1))(1)
SecondDate = Split(WorksheetFunction.Substitute(Parts(Z + 1), _
" ", Chr$(1), 3), Chr$(1))(0)
Exit For
End If
Continue:
Next
Cells(X, DataCol).Offset(0, 1).Value = FirstDate
Cells(X, DataCol).Offset(0, 2).Value = SecondDate
Next
End Sub
 
R

Rajendra

See if the following macro does what you want. I assumed the starting row
for your data was Row 2 and that the column containing your text data is
Column A (you can change each of these as necessary in the obvious Const
statements); also, the split out dates are placed in the next two columns to
the right of the specified column containing your text data.

Sub SplitOutDates()
Dim X As Long, Z As Long, LastRow As Long
Dim FirstDate As Date, SecondDate As Date
Dim Txt As String, Parts() As String
Const StartRow = 2
Const DataCol = "A"
LastRow = Cells(Rows.Count, DataCol).End(xlUp).Row
For X = StartRow To LastRow
Txt = Cells(X, DataCol)
Txt = Replace(Txt, "tp", "to", 1, -1, Compare:=vbTextCompare)
Txt = Replace(Txt, "st", "", 1, -1, vbTextCompare)
Txt = Replace(Txt, "nd", "", 1, -1, vbTextCompare)
Txt = Replace(Txt, "rd", "", 1, -1, vbTextCompare)
Txt = Replace(Txt, "th", "", 1, -1, vbTextCompare)
Parts = Split(" " & Txt & " ", " to ", -1, vbTextCompare)
On Error GoTo Continue
For Z = 0 To UBound(Parts) - 1
If UBound(Split(Parts(Z))) > 2 Then
FirstDate = Split(WorksheetFunction.Substitute(Parts(Z), " ", _
Chr$(1), UBound(Split(Parts(Z))) - 2), Chr$(1))(1)
SecondDate = Split(WorksheetFunction.Substitute(Parts(Z + 1), _
" ", Chr$(1), 3), Chr$(1))(0)
Exit For
End If
Continue:
Next
Cells(X, DataCol).Offset(0, 1).Value = FirstDate
Cells(X, DataCol).Offset(0, 2).Value = SecondDate
Next
End Sub

Rick,

It worked like magic. Thanks a Ton

Regards,
Raj
 
Ad

Advertisements

R

Ron Rosenfeld

Hi,

Cells in Column A have strings of the following type:
eg A1:
Billing for the period March 14 2009 to March 13 2010 - Order dated
April 01 2009
In B1 I want the date : 14-Mar-2009
In C1 I want the date: 13-Mar-2010
(to elaborate, only the first two dates in the string need to be
extracted, third and subsequent dates in the string to be ignored)

Thanks in advance for the help.

Regards,
Raj

The possible variety of formats makes this a bit tricky, but if in your last
post you've included all the possible formats, the following UDF should work.

To enter this User Defined Function (UDF), <alt-F11> opens the Visual Basic
Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter a formula like

=GetDate(A1,Index)

in some cell.

A1 is any cell reference containing a string which has multiple included dates.
Index is the date instance you wish to extract. (1 for first instance; 2 for
second instance, etc). If Index is greater than the number of dates in the
string, the function will return a #VALUE! error.

Other limitations and assumptions are noted in the code below.

Be sure to format the result as a Date, otherwise you will just see a 5 digit
number.

======================================================
Option Explicit
Function GetDate(s As String, Optional Index As Long = 1) As Date
'returns date entered in a variety of formats as a real Excel date
'Index = the instance of date within a string
'if Index is greater than the number of dates, #VALUE! error
Dim m As String, dy As String, yr As String
m = GetMonth(s, Index)
dy = GetDayYr(s, Index)(0)
yr = GetDayYr(s, Index)(1)
GetDate = DateValue(m & " " & dy & ", " & yr)
End Function
'-------------------------------------------------------------
Private Function GetMonth(s As String, Optional i As Long = 1)
'Assumes at least first three letters of month present
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.Global = True
re.Pattern = "\b(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)"
If re.test(s) = True Then
Set mc = re.Execute(s)
GetMonth = mc(i - 1).Value
End If
Set re = Nothing
End Function
'-------------------------------------------------------------------
Private Function GetDayYr(s As String, Optional i As Long = 1) As Variant
'assumes that only digits represent day(date) and year; and that day(date)
' always precedes year
Dim re As Object, mc As Object
Dim aTemp(0 To 1)
Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = "\b(\d{1,2}(?=\D))\D+\b(\d{4}|\d{2})\b"
If re.test(s) = True Then
Set mc = re.Execute(s)
aTemp(0) = mc(i - 1).submatches(0)
aTemp(1) = mc(i - 1).submatches(1)
GetDayYr = aTemp
End If
Set re = Nothing
End Function
=========================================
--ron
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top