Find date in excel string previous to last occurence of specific text

O

osuejm1026

I have been combing through multiple forums and excel formula code bu
cannot find anything to help me out.

I have thousands of strings I need to parse through to extract out
specific date. The date location has no specific parameters around i
to easily extract it out. The location from the beginning or the en
varies with every string. The only thing that is consistent is
message that reads "Changed Status to Subtasks Created" which can b
displayed anywhere from 20 characters to 660 characters after the dat
we need to extract.

Here is a snippet:
Changed Item Details to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_1 to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_2 to
Changed Details_3 to
Added the following Subtasks to this Request: 146448.
Escalated: Rename Master Ticket - NH
Deleted Assignee: UAM
Changed Status to Subtasks Created from Generating Subtasks
Changed Request Title
Changed Initial Approval to None
Changed Additional Approval to No
Changed Ticket Item to
02/15/2013 12:14 PM Escalated Changed Status to Closed from Subtask
Created

So I need to extract then 02/15/2013 9:49 AM text out of this string
The above snippet has 4200 characters in front of what I showed above.
Can anyone help
 
C

Claus Busch

Hi,

Am Tue, 23 Jul 2013 20:28:41 +0100 schrieb osuejm1026:
Changed Item Details to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_1 to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_2 to
Changed Details_3 to
Added the following Subtasks to this Request: 146448.
Escalated: Rename Master Ticket - NH
Deleted Assignee: UAM
Changed Status to Subtasks Created from Generating Subtasks
Changed Request Title
Changed Initial Approval to None
Changed Additional Approval to No
Changed Ticket Item to
02/15/2013 12:14 PM Escalated Changed Status to Closed from Subtasks
Created

try the function:

Function myDate(myCell As Range) As Date
Dim i As Integer
Dim j As Integer
Dim k As Integer

i = InStr(1, myCell, "Changed Status to Subtasks Created") - 5
j = InStrRev(myCell, "AM", i)
k = InStrRev(myCell, "PM", i)
myDate = Format(Mid(myCell, WorksheetFunction.Max(j, k) _
- 17, 19), "mm/dd/yyyy h:mm AM/PM")
End Function

Your string in A1 then into the sheet:
=myDate(A1)


Regards
Claus B.
 
O

osuejm1026

I enter in your function and when i type in the function in excel, i ge
a #VALUE! returned. I stepped through the code and it seems to b
working, but nothing gets returned. Thoughts?
 
R

Ron Rosenfeld

I have thousands of strings I need to parse through to extract out a
specific date. The date location has no specific parameters around it
to easily extract it out. The location from the beginning or the end
varies with every string. The only thing that is consistent is a
message that reads "Changed Status to Subtasks Created" which can be
displayed anywhere from 20 characters to 660 characters after the date
we need to extract.

Based on your snippet, there could be more than one date preceding the particular message. In your snippet, there are two that happen to be identical. Do you want to return the date closest to the "message", or the first date in the string that meets the "distance (20-660)" requirement?

Also, is your entire string in a single cell? Or are they in multiple cells? If the latter, how are they split up?

If the entire string is in a single cell, then this User Defined Function will extract the date. If there are multiple dates (as in your snippet), I have included lines to extract either the first date encountered, or the date closest to the message. Just comment out the one you don't want.

To enter this User Defined Function (UDF), <alt-F11> opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter a formula like

=GetDate(A1) in some cell. Change A1 to reflect the location of the string.

in some cell.

==============================
Option Explicit
Function GetDate(s As String) As Date
Dim re As Object, mc As Object, m As Object
Const sPat As String = "(?:0[1-9]|1[012])[- /.](?:0[1-9]|[12][0-9]|3[01])[- /.](?:19|20)[0-9]{2}" & _
"\s+(?:0?[1-9]|1[12]):[0-5][0-9]\s+[AP]M(?=[\s\S+]{20,660}Changed Status to Subtasks Created)"
Set re = CreateObject("vbscript.regexp")
With re
.Pattern = sPat
.Global = True
.ignorecase = True
End With

Set mc = re.Execute(s)

'Comment out or delete what you don't want here:
'to return 1st date-time followed by string
GetDate = mc(0)
'to return last date-time followed by string
GetDate = mc(mc.Count - 1)

End Function
=================================
 
O

osuejm1026

thanks for the code. Yes there are multiple dates preceding th
message, its essentially an audit trail. So there is exactly one dat
prior to the message that i need. So I need the closest one to m
message not after the message but before the message occurs.

The data is all in a single cell (essentially a database dump for th
record into one cell). So your UDT looks cool but I actually need th
date preceding not following.



'Ron Rosenfeld[_2_ said:
;1612872']On Tue, 23 Jul 2013 20:28:41 +0100, osuejm102
I have thousands of strings I need to parse through to extract out a
specific date. The date location has no specific parameters around it
to easily extract it out. The location from the beginning or the end
varies with every string. The only thing that is consistent is a
message that reads "Changed Status to Subtasks Created" which can be
displayed anywhere from 20 characters to 660 characters after the date
we need to extract. -

Based on your snippet, there could be more than one date preceding th
particular message. In your snippet, there are two that happen to b
identical. Do you want to return the date closest to the "message", o
the first date in the string that meets the "distance (20-660)
requirement?

Also, is your entire string in a single cell? Or are they in multipl
cells? If the latter, how are they split up?

If the entire string is in a single cell, then this User Define
Function will extract the date. If there are multiple dates (as in you
snippet), I have included lines to extract either the first dat
encountered, or the date closest to the message. Just comment out th
one you don't want.

To enter this User Defined Function (UDF), <alt-F11> opens the Visua
Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter a formula like

=GetDate(A1) in some cell. Change A1 to reflect the location of th
string.

in some cell.

==============================
Option Explicit
Function GetDate(s As String) As Date
Dim re As Object, mc As Object, m As Object
Const sPat As String = "(?:0[1-9]|1[012])[
/.](?:0[1-9]|[12][0-9]|3[01])[- /.](?:19|20)[0-9]{2}" & _

"\s+(?:0?[1-9]|1[12]):[0-5][0-9]\s+[AP]M(?=[\s\S+]{20,660}Changed Statu
to Subtasks Created)"
Set re = CreateObject("vbscript.regexp")
With re
.Pattern = sPat
.Global = True
.ignorecase = True
End With

Set mc = re.Execute(s)

'Comment out or delete what you don't want he
'to return 1st date-time followed by string
GetDate = mc(0)
'to return last date-time followed by string
GetDate = mc(mc.Count - 1)

End Function
================================
 
R

Ron Rosenfeld

thanks for the code. Yes there are multiple dates preceding the
message, its essentially an audit trail. So there is exactly one date
prior to the message that i need. So I need the closest one to my
message not after the message but before the message occurs.

The data is all in a single cell (essentially a database dump for the
record into one cell). So your UDT looks cool but I actually need the
date preceding not following.

I think I'm not understanding you correctly.

How does the date preceding a string, differ from the date that is followed by the string? (where string=message)

date ..... string Note that the date precedes the string

date .... string Note that the date is followed by the string

I don't see a difference in the sequence.

But tell me what went wrong when you tried my UDF on one of your samples by showing me the relevant data input, what you expected to be returned, and what was actually returned by the UDF.
As written, it should have returned the date closest to the "message" that occurred before the message.
 
R

Ron Rosenfeld

thanks for the code. Yes there are multiple dates preceding the
message, its essentially an audit trail. So there is exactly one date
prior to the message that i need. So I need the closest one to my
message not after the message but before the message occurs.

Although you haven't reported back as to what the problem was with my UDF, I'm wondering if there might be more than one date to extract from a single cell.
In other words, can there be, in a single cell a sequence like:

Date1
Date2
....
....
....
Message
....
....
Date3
....
....
Message

Where "Message" = "Changed Status to Subtasks Created"
and, in the above example, you would want to extract both Date2 and Date3?

It is doable, but with a more complicated Regular Expression (regex).
As written, the regex looks for a date that is followed by Message in the range of 20 to 660 characters. It returns all the dates but we only look at the last one returned.
However, if there might be more than one date that needs to be returned, we need to change the regex logic, and also the UDF output logic so it returns an array of dates.
It's easily doable, but I would not add the complexity unless it is required.
 
O

osuejm1026

I have entered your UDF and when I type it into a cell I see GetDat
appear. I enter in the appropriate cell but all i get returned i
#NAME?. I put a watch stamp on the code and the code is not eve
triggered like it is not even catching.

To answer your later post, I only need to return the 1 date for righ
now. I have other logic to pull in other dates that i require that i
much easier to extract.

So I am not sure what is going on, but I made sure macro security i
disabled so all macros work, and i have closed and reopened, but stil
no dice.

Thoughts?



'Ron Rosenfeld[_2_ said:
;1612879']On Wed, 24 Jul 2013 01:27:58 +0100, osuejm102
thanks for the code. Yes there are multiple dates preceding the
message, its essentially an audit trail. So there is exactly one date
prior to the message that i need. So I need the closest one to my
message not after the message but before the message occurs. -

Although you haven't reported back as to what the problem was with m
UDF, I'm wondering if there might be more than one date to extract fro
a single cell.
In other words, can there be, in a single cell a sequence like:

Date1
Date2
....
....
....
Message
....
....
Date3
....
....
Message

Where "Message" = "Changed Status to Subtasks Created"
and, in the above example, you would want to extract both Date2 an
Date3?

It is doable, but with a more complicated Regular Expression (regex).
As written, the regex looks for a date that is followed by Message i
the range of 20 to 660 characters. It returns all the dates but we onl
look at the last one returned.
However, if there might be more than one date that needs to be returned
we need to change the regex logic, and also the UDF output logic so i
returns an array of dates.
It's easily doable, but I would not add the complexity unless it i
required
 
O

osuejm1026

Nevermind, figured out the error. However, for some reason every 100t
row or so displays a #VALUE! So after a couple thousands records it
only a handful that I could look up manually but just curious if yo
knew why it might throw that. But your code is working wonders of th
ones I have checked so far.
 
R

Ron Rosenfeld

I have entered your UDF and when I type it into a cell I see GetDate
appear. I enter in the appropriate cell but all i get returned is
#NAME?. I put a watch stamp on the code and the code is not even
triggered like it is not even catching.

To answer your later post, I only need to return the 1 date for right
now. I have other logic to pull in other dates that i require that is
much easier to extract.

So I am not sure what is going on, but I made sure macro security is
disabled so all macros work, and i have closed and reopened, but still
no dice.

Thoughts?

Did you enter the UDF code into a regular module using the "exact" method I laid out?
Is the UDF in an open workbook?

The #NAME? error is usually because your workbook is not recognizing the macro. But when you write "I see GetDate appear", it seems it should be -- if what you mean by that is that
when you start typing =get a dropdown appears and one of the items on the dropdown is GetDate.

If all else fails, post a copy of your workbook (sanitized as necessary) with the UDF included, on some public sharing website (e.g. SkyDrive) and post a link here.
 
R

Ron Rosenfeld

To answer your later post, I only need to return the 1 date for right
now. I have other logic to pull in other dates that i require that is
much easier to extract.

My question wasn't so much as to whether to extract just one date, but rather if there could be more than one

<Date Time>
....
....
<Message>

grouping in a single cell.

Also, although the regular expression looks complicated, it is fairly simple and could be adapted to other dates.

(?:0[1-9]|1[012])[- /.](?:0[1-9]|[12][0-9]|3[01])[- /.](?:19|20)[0-9]{2}\s+(?:0?[1-9]|1[12]):[0-5][0-9]\s+[AP]M

is merely a method for recognizing VALID date strings in the format of dd/mm/yyyy (h)h:mm AM or PM. As written it will recognize dates from 1/1/1900 - 12/31/2000 but can be extended if necessary.

The rest of the regex: (?=[\s\S+]{20,660}Changed Status to Subtasks Created)
says "look ahead beetween 20 - 660 characters until you get that message "Changed Status to Subtasks Created"

So if similar logic applies to your other dates, it would be trivial to change the regex to accomodate.
 
R

Ron Rosenfeld

Nevermind, figured out the error. However, for some reason every 100th
row or so displays a #VALUE! So after a couple thousands records its
only a handful that I could look up manually but just curious if you
knew why it might throw that. But your code is working wonders of the
ones I have checked so far.

A #VALUE! error most likely means that the sequence being tested for

Date (in the format specified)
....
....
<message>

does not exist in that cell.

If you could post an exact copy of a cell that returns #VALUE!, I should be able to locate the issue.
 
C

Claus Busch

Hi,

Am Tue, 23 Jul 2013 22:52:16 +0100 schrieb osuejm1026:
I enter in your function and when i type in the function in excel, i get
a #VALUE! returned. I stepped through the code and it seems to be
working, but nothing gets returned. Thoughts?

have a look:
https://skydrive.live.com/#cid=9378AAB6121822A3&id=9378AAB6121822A3!326
for the workbook "Date"
So in SkyDrive macros are disabled you have to rightclick and download
the workbook.


Regards
Claus B.
 
O

osuejm1026

'Ron Rosenfeld[_2_ said:
;1612887']On Wed, 24 Jul 2013 14:36:16 +0100, osuejm102
I have entered your UDF and when I type it into a cell I see GetDate
appear. I enter in the appropriate cell but all i get returned is
#NAME?. I put a watch stamp on the code and the code is not even
triggered like it is not even catching.

To answer your later post, I only need to return the 1 date for right
now. I have other logic to pull in other dates that i require that is
much easier to extract.

So I am not sure what is going on, but I made sure macro security is
disabled so all macros work, and i have closed and reopened, but still
no dice.

Thoughts?-

Did you enter the UDF code into a regular module using the "exact
method I laid out?
Is the UDF in an open workbook?

The #NAME? error is usually because your workbook is not recognizing th
macro. But when you write "I see GetDate appear", it seems it should b
-- if what you mean by that is that
when you start typing =get a dropdown appears and one of the item
on the dropdown is GetDate.

If all else fails, post a copy of your workbook (sanitized as necessary
with the UDF included, on some public sharing website (e.g. SkyDrive
and post a link here.

Not sure what I did, but all of a sudden it just started working.

Ok so for 5,000 lines the code worked except for 167 of them. Now som
are legit as the audit trail never captured the specific string i wa
looking for. But for 145 of them, the string exists, just returned
#VALUE! instead. Not sure why, looking to see if there is any extr
spaces or anything different, I am just not seeing it.

Here is a snippet of an audit trail that failed with your code:
06/01/2013 10:32 AM Escalated Changed UAM Email to Employe
Information:
06/01/2013 10:32 AM Escalated Changed AutoClose to No
Added the following Subtasks to this Request: 765640.
06/01/2013 10:45 AM Escalated Escalated: Rename Master Ticket
CU-IU/SLM
Changed Status to Subtasks Created from Checking Approvals
Changed Request Title
Changed Employee Last Name: to Srew
Changed Request data or Contact data
Changed Initial Approval to None
Changed Additional Approval to No
Changed Action Needed to
Changed ProfileID to 765634
06/06/2013 10:38 AM Escalated Changed Status to Closed from Subtask
Created
Master automatically set to Closed as a result of change i
subtask .


Here is a snippet of one where the code did work:
05/01/2013 10:54 AM Escalated Escalated: Awaiting Subtask Processing
Changed Status to Generating Subtasks from Request Verified
Escalated: Security Groups Subtask
Changed Status to Awaiting Incident Creation from Generatin
Subtasks
Changed Request Title
Changed Employee Last Name: to
Changed Security Groups to Currently Active
Changed Request data or Contact data
Changed Initial Approval to None
Changed Additional Approval to No
Changed Action Needed to Remove
Changed ProfileID to
Changed Ticket Item to Security Groups
Changed Item Details to Security Group Detail:
05/01/2013 10:54 AM Escalated Changed UAM Email to Employe
Information:
05/01/2013 10:54 AM Escalated Added the following Subtasks to thi
Request: 761510.
05/01/2013 11:05 AM Escalated Escalated: Rename Master Ticket - CU / DU
Changed Status to Subtasks Created from Awaiting Inciden
Creation
Changed Request Title
Changed Employee Last Name: to colsky cotolsky
Changed Request data or Contact data
Changed Initial Approval to None
Changed Additional Approval to No
Changed Action Needed to
Changed ProfileID to 761458
05/01/2013 1:38 PM Escalated Changed Status to Closed from Subtask
Created
Master automatically set to Closed as a result of change i
subtask
 
R

Ron Rosenfeld

Not sure what I did, but all of a sudden it just started working.

Ok so for 5,000 lines the code worked except for 167 of them. Now some
are legit as the audit trail never captured the specific string i was
looking for. But for 145 of them, the string exists, just returned a
#VALUE! instead. Not sure why, looking to see if there is any extra
spaces or anything different, I am just not seeing it.

Got it. There was an error in the regex in that it would not detect times from 10:00 to 10:59
Easy fix:

==========================
Option Explicit
Function GetDate(s As String) As Date
Dim re As Object, mc As Object, m As Object
Const sPat As String = "(?:0[1-9]|1[012])[- /.](?:0[1-9]|[12][0-9]|3[01])[- /.](?:19|20)[0-9]{2}" & _
"\s+(?:0?[1-9]|1[012]):[0-5][0-9]\s+[AP]M(?=[\s\S+]{20,660}Changed Status to Subtasks Created)"
Set re = CreateObject("vbscript.regexp")
With re
.Pattern = sPat
.Global = True
.ignorecase = True
End With

Set mc = re.Execute(s)
'to return last date-time followed by string
GetDate = mc(mc.Count - 1)

End Function
========================================================
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top