Regex & Wildcards

Vince · Jan 6, 2005

Hey,

I need to find all of the following using a wildcard search.

1.1 mol/L
1 mol/L
1mol/L
1.1 mol /L
1 mol /L
1mol /L
1.1 mol / L
1 mol / L
1mol / L
1.1 mol/ L
1 mol/ L
1mol/ L

A sentence could contain any one of these. For instance, "James drank a solution
of Nitrogen Peroxide with a concentration of 5.15 mol/L".

This is what I could come up with:

([0-9.]@)( @)(mol/L)

This takes care of any numerals / decimals but does not account for:
a) The space between the number and mol/L. (It looks for one or more spaces,
but there might be no space at all, as in 1.1mol/L.)
b) Spacing around the slash. It strictly looks for mol/L and can't account for
mol / L, mol/ L or mol /L. To use this, I would have to repeat each instance
with the appropriate spaces!

Questions:
1) How do I write a single Wildcard match for all the possibilities listed
above?
2) How can I say "optional" in regex? E.g., Di[peg] could be any one of "Dig",
"Dip" or "Die". But I need to say that "Di" may or may not be followed by
"p", "e" or "g". In Perl, I would say "(Di)([epg])*". How do I say that in
VBA?

Thanks a lot for your time / any response.

Vince

Helmut Weber · Jan 6, 2005

Hi Vince,
before putting much effort into something
that is hardly possible (wildcard search
cannot match zero or more occurrences),
why not adjust the text beforehand, like this:

Sub Test777()
    ResetSearch
    Dim rDcm As Range
    Set rDcm = ActiveDocument.Range
    With rDcm.Find
        ' Plain-text pass: put a space in front of every "mol"
        .MatchWildcards = False
        .Text = "mol"
        .Replacement.Text = " mol"
        .Execute Replace:=wdReplaceAll
        ' Wildcard passes: [ ]{1,} means "one or more spaces"
        ' Remove the spaces between "mol" and "/"
        .Text = "mol[ ]{1,}/"
        .Replacement.Text = "mol/"
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
        ' Remove the spaces between "/" and "L"
        .Text = "mol/[ ]{1,}L"
        .Replacement.Text = "mol/L"
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
        ' Collapse the spaces in front of "mol/L" to a single space
        .Text = "[ ]{1,}mol/L"
        .Replacement.Text = " mol/L"
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
    End With
End Sub
'---
Public Sub ResetSearch()
    ' Put the Find dialog settings back to neutral defaults
    With Selection.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = ""
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        .Execute
    End With
End Sub

HTH

Greetings from Bavaria, Germany
Helmut Weber, MVP
"red.sys" & chr(64) & "t-online.de"
Word XP, Win 98
http://word.mvps.org/
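
On Vince's question 2: Word's wildcard syntax has no "optional" or "zero or more" operator, but on Windows VBA can also drive a true regular-expression engine through the late-bound VBScript.RegExp object, where ? means optional and * means zero or more. The thread itself never mentions this object, so treat the following as a minimal sketch under that assumption. NormalizeMolPerL is a hypothetical name, and the function works on a plain string, not on a Word range, so it would not preserve formatting if applied to document text.

```vba
' Sketch only: relies on VBScript.RegExp, which the thread does not mention.
' NormalizeMolPerL is a hypothetical helper name.
Function NormalizeMolPerL(ByVal sText As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' replace every match, not only the first
    re.IgnoreCase = False   ' keep "mol/L" case sensitive
    ' [0-9.]+  one or more digits or dots (same class as in Vince's pattern)
    ' \s*      zero or more whitespace characters
    re.Pattern = "([0-9.]+)\s*mol\s*/\s*L"
    ' $1 is the captured number; the unit is rewritten in one fixed form
    NormalizeMolPerL = re.Replace(sText, "$1 mol/L")
End Function

Sub DemoNormalizeMolPerL()
    ' Prints: a concentration of 5.15 mol/L
    Debug.Print NormalizeMolPerL("a concentration of 5.15mol / L")
End Sub
```

In the same engine, the "optional" case from question 2 would be written "Di[epg]?", which matches "Di", "Die", "Dip" and "Dig".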

Vince · Jan 7, 2005

Hey Helmut,

Thanks for your response.

I wanted to save effort by coming up with a text file that contains all the
find and replace conditions. At the risk of boring you, please allow me to
explain.

Problem: I am trying to copy-edit Word files, and one of the long list of
copy-editing rules involves separating numerals and units in the format
"numeral thin space unit". So, I copied a huge list of units from the
internet and wrote a function that reads from a text file and does the find
and replace automatically. For instance, the text file could be:

([0-9.]@)( @)(mol/L)SPLIT\1^s\3SPLITTRUESPLITTRUE
([0-9.]@)( @)(m/s)SPLIT\1^s\3SPLITTRUESPLITTRUE

Each line tells the program to find the part before the first SPLIT, replace
it with the part before the second SPLIT, match wildcards and be case
sensitive.

Basically, I wanted this text file to be editable by the users so that they can
add their own units that I missed. But the problem, or rather the
inconvenience, is that they need to type all the possibilities into the file.
For instance, the above would become:

([0-9.]@)( @)(mol/L)SPLIT\1^s\3SPLITTRUESPLITTRUE
([0-9.]@)( @)(m/s)SPLIT\1^s\3SPLITTRUESPLITTRUE
([0-9.]@)( @)(mol / L)SPLIT\1^smol/LSPLITTRUESPLITTRUE
([0-9.]@)( @)(m / s)SPLIT\1^sm/sSPLITTRUESPLITTRUE
([0-9.]@)( @)(mol /L)SPLIT\1^smol/LSPLITTRUESPLITTRUE
([0-9.]@)( @)(m /s)SPLIT\1^sm/sSPLITTRUESPLITTRUE
([0-9.]@)( @)(mol/ L)SPLIT\1^smol/LSPLITTRUESPLITTRUE
([0-9.]@)( @)(m/ s)SPLIT\1^sm/sSPLITTRUESPLITTRUE

These two units multiply to over 8 lines! This could slow down the program
(I don't really mind that...) but the main problem is that the text file could
become a little too big in the long run. This is why I was wondering if I
could somehow accommodate the possibilities in the text file to begin with
(using some wildcard search).

What I could do, however, is use your method so that the program (when
reading from the file) also makes room for the possibilities listed above.
If you have a better idea, please let me know.

Thank you for your time.

Vince
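
The rule file Vince describes could be read and applied roughly as follows. This is a sketch under assumptions, not his actual function: ApplyRuleFile and ApplyRuleLine are hypothetical names, the field order (find text, replacement text, wildcard flag, case flag) is inferred from his description, and anything after the fourth flag on a line is ignored.

```vba
' Hypothetical helpers; field layout assumed to be
'   find SPLIT replace SPLIT matchWildcards SPLIT matchCase
Sub ApplyRuleFile(ByVal sPath As String)
    Dim iFile As Integer
    Dim sLine As String
    iFile = FreeFile
    Open sPath For Input As #iFile
    Do Until EOF(iFile)
        Line Input #iFile, sLine
        ' Skip blank lines, apply everything else as a rule
        If Len(Trim$(sLine)) > 0 Then ApplyRuleLine sLine
    Loop
    Close #iFile
End Sub

Sub ApplyRuleLine(ByVal sLine As String)
    Dim parts() As String
    Dim rDcm As Range
    parts = Split(sLine, "SPLIT")
    If UBound(parts) < 3 Then Exit Sub   ' malformed rule, ignore it

    Set rDcm = ActiveDocument.Range
    With rDcm.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Forward = True
        .Wrap = wdFindStop
        .Text = parts(0)
        .Replacement.Text = parts(1)
        ' Only the first four characters are compared, so trailing
        ' remarks after the last flag do not matter
        .MatchWildcards = (UCase$(Left$(Trim$(parts(2)), 4)) = "TRUE")
        .MatchCase = (UCase$(Left$(Trim$(parts(3)), 4)) = "TRUE")
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

If the text is normalized first (spaces around "/" removed), one rule line per unit would be enough and the spacing variants would not need to be listed in the file.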

Helmut Weber · Jan 7, 2005

Hi Vince,
not that I understand everything, but for things like:

"mol /L", "mol/ L", "mol / L", "mol / L"
"m /s", "m / s", "m/ s", "m /s", "m / s"

a possible workaround would be
to first replace "/" with " / ", in order to
overcome the limitation that there is no search for
zero or more occurrences of a character.
So we add extra characters first!
After that, each "/" is surrounded by spaces,
and a wildcard search for [ ]{1,}/[ ]{1,}
finds all occurrences, which can then be
replaced by "/", resulting in "mol/L", "m/s".

And there may be more such simple tricks.

HTH
Greetings from Bavaria, Germany
Helmut Weber, MVP
"red.sys" & chr(64) & "t-online.de"
Word 2002, Windows 2000
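
The two-pass trick Helmut describes could be sketched as the macro below. NormalizeSlashSpacing is a hypothetical name, and the macro touches every "/" in the document, not only those inside units, so it is a starting point under that assumption rather than a finished tool.

```vba
' Hypothetical macro name; sketch of the pad-then-collapse trick.
Sub NormalizeSlashSpacing()
    Dim rDcm As Range

    ' Pass 1 (plain text): pad every "/" with a space on each side,
    ' so that each slash is guaranteed to have at least one space around it
    Set rDcm = ActiveDocument.Range
    With rDcm.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Forward = True
        .Wrap = wdFindStop
        .MatchWildcards = False
        .Text = "/"
        .Replacement.Text = " / "
        .Execute Replace:=wdReplaceAll
    End With

    ' Pass 2 (wildcards): collapse "one or more spaces, slash,
    ' one or more spaces" back to a bare "/"
    Set rDcm = ActiveDocument.Range
    With rDcm.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Forward = True
        .Wrap = wdFindStop
        .MatchWildcards = True
        .Text = "[ ]{1,}/[ ]{1,}"
        .Replacement.Text = "/"
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

After both passes, "mol / L", "mol /L" and "mol/ L" all read "mol/L", so a single pattern such as ([0-9.]@)( @)(mol/L) covers them.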

Vince · Jan 7, 2005

Hey Helmut,

Thanks, that's a great idea! I just have to find out if adding a space
before and after every slash in the document is acceptable (what if there's
some text that has a '/' and is not a unit). But, I don't think it should be
a problem.....

Thanks, again!

Vince

Helmut Weber · Jan 7, 2005

Hi Vince,

just one more word:
depending on how big and how complex your docs
are, and on how much effort is justified,
one could even create a macro that, after
removing all spaces around slashes, highlights all
units as they are defined in a list and locates
any "/" that is not highlighted. And many more
variations.

Cheers

Helmut Weber
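
A rough sketch of the highlighting idea follows. FlagUnlistedSlashes is a hypothetical name, the two-entry unit list is only a stand-in for the real list, and the macro assumes spaces around slashes have already been removed. It highlights every listed unit and then reports the position of each "/" that was left without a highlight.

```vba
' Hypothetical macro; vUnits is a sample list, not the real one.
Sub FlagUnlistedSlashes()
    Dim vUnits As Variant
    Dim i As Long
    Dim r As Range

    vUnits = Array("mol/L", "m/s")
    Options.DefaultHighlightColorIndex = wdBrightGreen

    ' Step 1: highlight every occurrence of each listed unit
    For i = LBound(vUnits) To UBound(vUnits)
        Set r = ActiveDocument.Range
        With r.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Forward = True
            .Wrap = wdFindStop
            .Format = True
            .MatchCase = True
            .MatchWildcards = False
            .Text = vUnits(i)
            .Replacement.Text = "^&"        ' keep the found text as it is
            .Replacement.Highlight = True   ' only add the highlight
            .Execute Replace:=wdReplaceAll
        End With
    Next i

    ' Step 2: walk through every "/" that carries no highlight
    Set r = ActiveDocument.Range
    With r.Find
        .ClearFormatting
        .Forward = True
        .Wrap = wdFindStop
        .Format = True
        .MatchWildcards = False
        .Highlight = False
        .Text = "/"
        Do While .Execute
            Debug.Print "Unlisted slash at character position " & r.Start
            r.Collapse wdCollapseEnd   ' continue after this match
        Loop
    End With
End Sub
```

The positions appear in the Immediate window; selecting or recolouring each hit instead would be an equally reasonable variation.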

Vince · Jan 7, 2005

Thanks, Helmut!

Excellent idea. I am changing everything that comes from the text file to
green. That makes it easy to detect the odd ones out, like you mentioned.

Thanks, again!

Vince
