Isolate Patterns

E

EllenM

Hello,
I'd like to isolate patterns such as 2A3672, 3B3696 and 6T3930. I can find
records with this pattern with the following query:

SELECT InventoryOct2006.Originator
FROM InventoryOct2006
WHERE (((InventoryOct2006.Originator) Like "*#?####*"));

I'd like to write a query to give me only the data that fits the pattern,
for instance 2A3672, and omit all the other stuff in that cell. Placing
"#?####" in the criteria doesn't seem to do it.

Thanks in advance for your help.

Ellen
 
M

Michel Walsh

Have you tried:


LIKE "*[0-9]?[0-9][0-9][0-9][0-9]*"


Hoping it may help,
Vanderghast, Access MVP.
 
E

EllenM

Thanks, Michel. That finds the whole string in the cell. I want to be able
to parse out patterns such as 3B1234 or 4C2345.

Michel Walsh said:
Have you tried:


LIKE "*[0-9]?[0-9][0-9][0-9][0-9]*"


Hoping it may help,
Vanderghast, Access MVP.



EllenM said:
Hello,
I'd like to isolate patterns such as 2A3672, 3B3696 and 6T3930. I can
find
records with this pattern with the following query:

SELECT InventoryOct2006.Originator
FROM InventoryOct2006
WHERE (((InventoryOct2006.Originator) Like "*#?####*"));

I'd like to write a query to give me only the data that fits the pattern,
for instance 2A3672, and omit all the other stuff in that cell. Placing
"#?####" in the criteria doesn't seem to do it.

Thanks in advance for your help.

Ellen
 
J

John Spencer

So you are saying that you want to see only the data that matches within the
field?

Example
Field Value: "This string is 2A3672 and xxxxx"
The returned value should only show "2A3672" and discard all the verbiage
around it.

IF that is the case, I would think that you would need some fairly complex
vba code to handle and/or the use of John Nurick's reg expression code.

http://www.mvps.org/access/modules/mdl0063.htm

Or check out
http://www.j.nurick.dial.pipex.com/Code/vbRegex/rgxExtract.htm
--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 
E

EllenM

Hi John,
My coworker and I have poured over the second link. We've had success with
doing immediate pane exercise. The code that worked is:

? rgxExtract("blah blah FAP 3b3333 blah blah", "\w+\s\d\w\d{4}", 0)

The only glitch is that my string contains a carriage return, so that it
looks like:'\

? rgxExtract("Abbott Laboratories;
FAP 2A3672", "\n*\w+\s\d\w\d{4}", 0)

The vb editor interpreted that script as two lines and therefore it didn't
work.

Is there a way for me to strip out the carriage returns from my string prior
to passing it into the function?

How do I apply this function to my table outside the vb editor? BTW, the
pattern only exists once in a given string.

Thanks so much for your help.

Ellen
 
J

John Spencer

You could use the VBA replace function (Access 2000 or later)

rgxExtract(Replace([The field],Chr(13) & Chr(10)," ") ,"\w+\s\d\w\d{4}",0)

In a query, you could do something like

Field: Stripped: rgxExtract(Replace([TheField],Chr(13) & Chr(10)," ")
,"\w+\s\d\w\d{4}",0)
Criteria: Not null or <> ""

I would probably use an additional column and apply criteria to it on the
theory that SQL engine is going to weed things down for me before I do the
rgxExtract. You might try it both ways and see if there is any benefit to
using the like criteria versus using criteria on just the stripped result.

Field: TheField
Criteria: Like "*#[a-z]####*"

Of course, one benefit of using criteria on only the "Stripped" result is
that you don't have to figure out a like criteria that matches up with what
you want to extract.
--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 
R

RoyVidar

EllenM said:
Hi John,
My coworker and I have poured over the second link. We've had
success with doing immediate pane exercise. The code that worked
is:

? rgxExtract("blah blah FAP 3b3333 blah blah", "\w+\s\d\w\d{4}", 0)

The only glitch is that my string contains a carriage return, so that
it looks like:'\

? rgxExtract("Abbott Laboratories;
FAP 2A3672", "\n*\w+\s\d\w\d{4}", 0)

The vb editor interpreted that script as two lines and therefore it
didn't work.

Is there a way for me to strip out the carriage returns from my
string prior to passing it into the function?

How do I apply this function to my table outside the vb editor? BTW,
the pattern only exists once in a given string.

Thanks so much for your help.

Ellen

I don't understand your pattern. You say you want only the "3b3333",
but the pattern "\w+\s\d\w\d{4}" should give you a bit more than that.

n* - zero or more new line
w+ - one or more occurences of [a-zA-Z_0-9]
s - whitespace

which occurs before "your pattern".

d - is numeric
w - again, is [a-zA-Z_0-9] - are you sure about that, or should you
rather use for instance [a-zA-Z]?

I would guess perhaps "\d[a-zA-Z]\d{4}", perhaps, to ensure it doesn't
exist within other words, you could test for word boundaries,
"\b\d[a-zA-Z]\d{4}\b"

Here's an attempt at a function. I'd suggest a module level variable
for the regexp object, placed in the declaration section

Private mre As Object

' then the function

Public Function rvsGetText(ByVal TheString As Variant) As String

Dim mc As Object
Const MyPattern As String = "\d[a-zA-Z]\d{4}"

If Not IsNull(TheString) Then
If mre Is Nothing Then
Set mre = CreateObject("vbscript.regexp")
End If
With mre
.MultiLine = True
.Pattern = MyPattern
Set mc = .Execute(TheString)
If mc.Count > 0 Then
rvsGetText = mc(0)
End If
Set mc = Nothing
End With

End If

End Function

In the query, call it like this

NewField: rvsGetText([YourField])

or in SQL

SELECT rvsGetText([YourField]) as NewField, ...

There's no errorhandling, though, and it will return ZLS if no match.
 
E

EllenM

Thanks for your response, Roy. Actually you're right. We're looking for a 3
or 4 string of capital letters followed by a space, then a digit,a capital
letter followed by 4 digits. For examle, FAP 3B3333. I started out asking
for just the last 6 characters as that happens to be unique.

And thanks, to you, John. I'll read both replies in detail tomorrow
morning. My coworker (who is quite versed in VB) will be able to help me
then.

Bye!
Ellen

RoyVidar said:
EllenM said:
Hi John,
My coworker and I have poured over the second link. We've had
success with doing immediate pane exercise. The code that worked
is:

? rgxExtract("blah blah FAP 3b3333 blah blah", "\w+\s\d\w\d{4}", 0)

The only glitch is that my string contains a carriage return, so that
it looks like:'\

? rgxExtract("Abbott Laboratories;
FAP 2A3672", "\n*\w+\s\d\w\d{4}", 0)

The vb editor interpreted that script as two lines and therefore it
didn't work.

Is there a way for me to strip out the carriage returns from my
string prior to passing it into the function?

How do I apply this function to my table outside the vb editor? BTW,
the pattern only exists once in a given string.

Thanks so much for your help.

Ellen

I don't understand your pattern. You say you want only the "3b3333",
but the pattern "\w+\s\d\w\d{4}" should give you a bit more than that.

n* - zero or more new line
w+ - one or more occurences of [a-zA-Z_0-9]
s - whitespace

which occurs before "your pattern".

d - is numeric
w - again, is [a-zA-Z_0-9] - are you sure about that, or should you
rather use for instance [a-zA-Z]?

I would guess perhaps "\d[a-zA-Z]\d{4}", perhaps, to ensure it doesn't
exist within other words, you could test for word boundaries,
"\b\d[a-zA-Z]\d{4}\b"

Here's an attempt at a function. I'd suggest a module level variable
for the regexp object, placed in the declaration section

Private mre As Object

' then the function

Public Function rvsGetText(ByVal TheString As Variant) As String

Dim mc As Object
Const MyPattern As String = "\d[a-zA-Z]\d{4}"

If Not IsNull(TheString) Then
If mre Is Nothing Then
Set mre = CreateObject("vbscript.regexp")
End If
With mre
.MultiLine = True
.Pattern = MyPattern
Set mc = .Execute(TheString)
If mc.Count > 0 Then
rvsGetText = mc(0)
End If
Set mc = Nothing
End With

End If

End Function

In the query, call it like this

NewField: rvsGetText([YourField])

or in SQL

SELECT rvsGetText([YourField]) as NewField, ...

There's no errorhandling, though, and it will return ZLS if no match.
 
R

RoyVidar

EllenM said:
Thanks for your response, Roy. Actually you're right. We're looking
for a 3 or 4 string of capital letters followed by a space, then a
digit,a capital letter followed by 4 digits. For examle, FAP
3B3333. I started out asking for just the last 6 characters as that
happens to be unique.

And thanks, to you, John. I'll read both replies in detail tomorrow
morning. My coworker (who is quite versed in VB) will be able to
help me then.

Bye!
Ellen

RoyVidar said:
EllenM said:
Hi John,
My coworker and I have poured over the second link. We've had
success with doing immediate pane exercise. The code that worked
is:

? rgxExtract("blah blah FAP 3b3333 blah blah", "\w+\s\d\w\d{4}", 0)

The only glitch is that my string contains a carriage return, so
that it looks like:'\

? rgxExtract("Abbott Laboratories;
FAP 2A3672", "\n*\w+\s\d\w\d{4}", 0)

The vb editor interpreted that script as two lines and therefore it
didn't work.

Is there a way for me to strip out the carriage returns from my
string prior to passing it into the function?

How do I apply this function to my table outside the vb editor?
BTW, the pattern only exists once in a given string.

Thanks so much for your help.

Ellen

:

So you are saying that you want to see only the data that matches
within the field?

Example
Field Value: "This string is 2A3672 and xxxxx"
The returned value should only show "2A3672" and discard all the
verbiage around it.

IF that is the case, I would think that you would need some fairly
complex vba code to handle and/or the use of John Nurick's reg
expression code.

http://www.mvps.org/access/modules/mdl0063.htm

Or check out
http://www.j.nurick.dial.pipex.com/Code/vbRegex/rgxExtract.htm
--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

Hello,
I'd like to isolate patterns such as 2A3672, 3B3696 and 6T3930.
I can find
records with this pattern with the following query:

SELECT InventoryOct2006.Originator
FROM InventoryOct2006
WHERE (((InventoryOct2006.Originator) Like "*#?####*"));

I'd like to write a query to give me only the data that fits the
pattern, for instance 2A3672, and omit all the other stuff in
that cell. Placing "#?####" in the criteria doesn't seem to do
it.

Thanks in advance for your help.

Ellen

I don't understand your pattern. You say you want only the "3b3333",
but the pattern "\w+\s\d\w\d{4}" should give you a bit more than
that.

n* - zero or more new line
w+ - one or more occurences of [a-zA-Z_0-9]
s - whitespace

which occurs before "your pattern".

d - is numeric
w - again, is [a-zA-Z_0-9] - are you sure about that, or should you
rather use for instance [a-zA-Z]?

I would guess perhaps "\d[a-zA-Z]\d{4}", perhaps, to ensure it
doesn't exist within other words, you could test for word
boundaries, "\b\d[a-zA-Z]\d{4}\b"

Here's an attempt at a function. I'd suggest a module level variable
for the regexp object, placed in the declaration section

Private mre As Object

' then the function

Public Function rvsGetText(ByVal TheString As Variant) As String

Dim mc As Object
Const MyPattern As String = "\d[a-zA-Z]\d{4}"

If Not IsNull(TheString) Then
If mre Is Nothing Then
Set mre = CreateObject("vbscript.regexp")
End If
With mre
.MultiLine = True
.Pattern = MyPattern
Set mc = .Execute(TheString)
If mc.Count > 0 Then
rvsGetText = mc(0)
End If
Set mc = Nothing
End With

End If

End Function

In the query, call it like this

NewField: rvsGetText([YourField])

or in SQL

SELECT rvsGetText([YourField]) as NewField, ...

There's no errorhandling, though, and it will return ZLS if no
match.

Try this pattern

"[A-Z]{3,4}\s\d[A-Z]\d{4}"

This might give wrong results when/if the initial string of three or
four uppercase letters consists of more than four letters, so also
testing for word boundary might be necessary

"\b[A-Z]{3,4}\s\d[A-Z]\d{4}"

This should match your latest requirements.

BTW - if you want to pass a new line from the immediate pane for
testing, try something like this

?MyFunction("This is the first part " & vbcrlf & " the next part")
 
E

EllenM

Thanks so much for your help, particularly the link given by John.
My coworker wrote a script in Excel that does what I want. It brings the
data from column F into empty column G.

Sub Regex()
Dim Submission_title As Variant
Range("F4").Select



For Counter = 1 To 170
Submission_title = rgxExtract(ActiveCell.Value, "\w+\s\d\w\d{4}", 0)

If IsNull(Submission_title) = True Then
ActiveCell.Offset(rowOffset:=1, columnOffset:=0).Activate
Else
'MsgBox (Submission_title)
ActiveCell.Offset(rowOffset:=0, columnOffset:=1).Activate
ActiveCell.Value = Submission_title
ActiveCell.Offset(rowOffset:=1, columnOffset:=-1).Activate
End If

Next Counter

End Sub



I'm still going to read the posts to write a query to do the same thing.
Thank you both John and Roy.

Ellen
 
E

EllenM

Hi John,
Could you write the script in SQL?

Thanks so much for everything. You've been most helpful.

Ellen
 
J

John Spencer

I'm sorry but I am not sure I understand your request.

Do you want someone to write the SQL for the query?

Did you copy John Nurick's code and put it into module?

Perhaps something like the following will work.

SELECT InventoryOct2006.Originator
, rgxExtract(Replace([Originator],Chr(13) & Chr(10)," "),
"\b[A-Z]{3,4}\s\d[A-Z]\d{4}",0) as FoundValue
FROM InventoryOct2006
WHERE InventoryOct2006.Originator Like "*#?####*"


I am not that familar with using regular expressions and I have to play
around quite a bit to get the correct results. You can try the above
and see if it works.
'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================

Hi John,
Could you write the script in SQL?

Thanks so much for everything. You've been most helpful.

Ellen

John Spencer said:
You could use the VBA replace function (Access 2000 or later)

rgxExtract(Replace([The field],Chr(13) & Chr(10)," ") ,"\w+\s\d\w\d{4}",0)

In a query, you could do something like

Field: Stripped: rgxExtract(Replace([TheField],Chr(13) & Chr(10)," ")
,"\w+\s\d\w\d{4}",0)
Criteria: Not null or <> ""

I would probably use an additional column and apply criteria to it on the
theory that SQL engine is going to weed things down for me before I do the
rgxExtract. You might try it both ways and see if there is any benefit to
using the like criteria versus using criteria on just the stripped result.

Field: TheField
Criteria: Like "*#[a-z]####*"

Of course, one benefit of using criteria on only the "Stripped" result is
that you don't have to figure out a like criteria that matches up with what
you want to extract.
--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 
E

EllenM

I'm speechless... I was kind of expecting VBA something or other that I
couldn't understand. Your script was exactly was I've been looking for... a
simple query that even I could do.

Many, many thanks!!!

Ellen

John Spencer said:
I'm sorry but I am not sure I understand your request.

Do you want someone to write the SQL for the query?

Did you copy John Nurick's code and put it into module?

Perhaps something like the following will work.

SELECT InventoryOct2006.Originator
, rgxExtract(Replace([Originator],Chr(13) & Chr(10)," "),
"\b[A-Z]{3,4}\s\d[A-Z]\d{4}",0) as FoundValue
FROM InventoryOct2006
WHERE InventoryOct2006.Originator Like "*#?####*"


I am not that familar with using regular expressions and I have to play
around quite a bit to get the correct results. You can try the above
and see if it works.
'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================

Hi John,
Could you write the script in SQL?

Thanks so much for everything. You've been most helpful.

Ellen

John Spencer said:
You could use the VBA replace function (Access 2000 or later)

rgxExtract(Replace([The field],Chr(13) & Chr(10)," ") ,"\w+\s\d\w\d{4}",0)

In a query, you could do something like

Field: Stripped: rgxExtract(Replace([TheField],Chr(13) & Chr(10)," ")
,"\w+\s\d\w\d{4}",0)
Criteria: Not null or <> ""

I would probably use an additional column and apply criteria to it on the
theory that SQL engine is going to weed things down for me before I do the
rgxExtract. You might try it both ways and see if there is any benefit to
using the like criteria versus using criteria on just the stripped result.

Field: TheField
Criteria: Like "*#[a-z]####*"

Of course, one benefit of using criteria on only the "Stripped" result is
that you don't have to figure out a like criteria that matches up with what
you want to extract.
--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top