Macro to insert hyper links

J

Hi,

Not sure if it's possible or overly complicated to write a macro for
this.

Below is a sample of an OCRed document sent to Word as .RTF, or HTML,
or txt.
I need all letters followed by numbers to be links.
For instance: B26 should become <link>B26</a>. Basically the macro would
have to find the combination of Letter/Number, then insert link. I've
a little experience with VB but not enough to figure this out.

Any help/pointers would be appreciated.


Wire: B26, 12.14 [Bamboo NA: G3.5] [Beach: Gl4.52] [Bombardment vs:
C1.822] [Bypass: A4.3] [Caves: Gl1.931] [Clear
ance: B24.73] [Column: E11.534] [in PB, Crash drm for Glider
Landing: SSR PB20] [Double Time NA: B26.46] [Dozer vs:
G15.23] [in RB, NA for Russian HIP: SSR RB7] [Manhandling:
ClO.3] [OCEAN with Tetrahedrons: G14.52]
Withdrawal: A11.2 [AFV: A11.7] [Ambush: AllAl] [ENEMY:
 
Mark Tangard

Hi J,

No need for a macro; you can do this with a single Find/Replace
(just record the Find/Replace if you need a macro regardless):

FIND: <[A-Z][0-9]{1,2}>
REPLACE WITH: <link>^&</a>

Click MORE in the Find/Replace dialog and check Use Wildcards,
then click Replace All.

Note that this searches for a "word" containing 1 capital letter
(that's the [A-Z]) followed by 1 or 2 digits. (The {1,2} means
1 to 2 occurrences of the preceding character.) The <>'s in the
Find box do not mean the < and > characters; they signify the
beginning and end of a word -- so this search would *not* change
occurrences like, say, HOOP33 or A1234. (In the Replace box the
angle brackets have their usual meaning.) The ^& in the Replace
box stands for whatever text the Find pattern matched.
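
If you'd like to check what that pattern catches before hitting Replace All, here's a rough Python sketch. Word's wildcard syntax is not regular-expression syntax, so the translation is an assumption: \b stands in for < and >, and \g<0> for ^&. Word's own idea of where a word starts and ends may differ from Python's in edge cases, and the name link_refs is made up for illustration.

```python
import re

# Approximate Python translation of the Word wildcard search
#   FIND: <[A-Z][0-9]{1,2}>
# \b stands in for Word's < and > word-boundary markers (an assumption;
# Word and Python may disagree about word boundaries in edge cases).
PATTERN = re.compile(r"\b[A-Z][0-9]{1,2}\b")


def link_refs(text):
    """Wrap each capital-letter-plus-1-or-2-digits token in <link>...</a>."""
    # \g<0> is the whole match, playing the role of ^& in Word's Replace box.
    return PATTERN.sub(r"<link>\g<0></a>", text)


sample = "Wire: B26, 12.14 [Bypass: A4.3] [Landing: SSR PB20] HOOP33 A1234"
print(link_refs(sample))
```

In this Python version, B26 gets wrapped, A4.3 gets the link around the A4 part only, and PB20, HOOP33 and A1234 are left alone.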

The Beach and Caves notations in your sample contain a lowercase
L next to the G. If that's actually supposed to be a one, then
you'll either need to retrain whoever typed it, or slap your OCR
program around a bit to make it stop misrecognizing that. (And
good luck! Mine insists on spitting out lowercase L's just about
anywhere it sees something sorta vertical....)
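
One way to hunt for those suspect spots is to flag any capital letter followed by a lowercase L and then a digit. A rough Python sketch; the heuristic and the name suspect_ls are my own assumptions, not a feature of any OCR program, and a hit only means "worth a look", not "definitely wrong".

```python
import re

# Heuristic (an assumption): a capital letter, a lowercase "l", then a digit
# may be a misread "1", as in "Gl4.52". The tail [0-9.]* just collects the
# rest of the reference so the whole token is reported.
SUSPECT = re.compile(r"\b[A-Z]l[0-9][0-9.]*")


def suspect_ls(text):
    """Return tokens where a lowercase L sits between a capital and a digit."""
    return SUSPECT.findall(text)


sample = "[Beach: Gl4.52] [Bypass: A4.3] [Caves: Gl1.931] [Clear] [Dozer vs: G15.23]"
print(suspect_ls(sample))
```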

Hope this helps.
 
J

THANKS MARK!!

I find also that if I change the 'replace with' to: <a
href="^&">^&</a> it will put the string in the link. That saved me days of
work.

How does one learn about things like this? Are there good references
for things like this?

Mark Tangard said:
No need for a macro; you can do this with a single Find/Replace
(just record the Find/Replace if you need a macro regardless):

FIND: <[A-Z][0-9]{1,2}>
REPLACE WITH: <link>^&</a>
The Beach and Caves notations in your sample contain a lowercase
L next to the G.


That's real subtle. I never could tell the difference between the 1
and l - but now after looking more closely at it, it's pretty clear.
I'm just so excited about the other find and replace thing you could
tell me that my HD crashed and I wouldn't care.
 
Mark Tangard

J,
J said:
I find also that if I change the 'replace with' to: <a
href="^&">^&</a> it will put the string in the link. That saved me days of
work.

Rule of thumb: Whenever you face a task that looks like it might
take you "days", come here first! There's almost always an easier
way, especially when it's something that's in any way repetitious. ;)

J said:
How does one learn about things like this? Are there good references
for things like this?

The vast majority of everything I've ever learned about Word and VBA
came from reading these newsgroups. You can try buying a book, and
pray you luck into the 2% that are worthwhile (I haven't), and even
then the learning process is slow compared to hands-on. One thing
to remember is that 'references' (in the strictest sense of the word)
aren't really what help you learn, because for stuff like this, you
usually don't know what you need. References are good for looking
up details whose existence you know about but haven't memorized.
(The Word and VBA helpfiles are great 'references' but awful for
learning concepts; if you ever forgot the specifics of the ^& trick,
you'd know where to look for it now, but only because you've seen it.)

If you're intent on learning this stuff thoroughly, it helps just
to set aside an hour a day to comb through maybe 3 or 4 groups and
test-drive stuff that looks interesting.

J said:
That's real subtle. I never could tell the difference between the 1
and l - but now after looking more closely at it, it's pretty clear.

It's a font thing. If you read the message in a font like Letter
Gothic, it jumps out at you plain as day.
 
J

Thanks again Mark,

Here's another one. I've tried changing the search parameters a little
but can't seem to get it right.

Here's an example:
Part of the document I am searching -

10.3 PUSHING: A limbered or
Its M# is 12 with +7

Find : <[10.][1-9]{1-9}> - or <[10.][.1-.9]{1-9}> Not sure if the
periods can be placed better
Replace <p><b><a name ="^&"></a>^&</b>

Will get me to 10.3 but it will also find ANY number. Even the 12 in
the above example of the text. If I have numbers from 8-10 in the
format "8.12" or "9.4", what is the best find string?

I'd like to be able to format the find so that I can hit replace all
or at least return fewer search results. I realize that replace all
might not be an option here.

One other thing, consider the 10.3 Pushing: example. Ideally the HTML
will read
<p><b><a name ="10.3"></a>10.3 PUSHING:</b>
Is there an option in the find string to get this in one pass? I've
been doing <[10.][1-9]{1,2}>
replace: <p><b><a name ="^&"></a>^& Then doing a find ":" replace with
":</b>. Works ok but is subject to human error and multiple passes
through.

Thanks again
 
Mark Tangard

J,

The Find/Replace feature will do almost anything if you know where
to stroke it. From your post it's clear you haven't had the
wildcard feature explained very well, esp. the square brackets.
Brackets match any of the characters inside them - singly. So
[10.] would match a one, a zero, or a period (just one of them,
one time). That's not what you want.

The curly braces denote the minimum and maximum number of times
the immediately preceding character occurs. So {1-9} is
meaningless (the separator between min and max is a comma,
not a hyphen). And even if that were done right, as {1,9}, it
would still mean it matches as many as NINE digits after the
decimal point. (That's not your aim, is it?)
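
If it helps to see those two rules in action, here's a small Python sketch. Python's regex syntax happens to treat square brackets and {min,max} the way described above, and that parallel is all that's being borrowed; it isn't Word itself, and the variable names are just for illustration.

```python
import re

# Square brackets match ONE character out of the set inside them.
# [10.] is therefore "a 1, a 0, or a period", not the string "10.".
single_chars = re.findall(r"[10.]", "10.3")
print(single_chars)  # ['1', '0', '.']

# That is why a pattern beginning [10.][1-9] also hits the plain number 12:
# the "1" comes from the first set, the "2" from the second.
hits_12 = re.fullmatch(r"[10.][1-9]", "12") is not None
print(hits_12)  # True

# {min,max} counts repeats of the item just before it; the separator is a comma.
two_digits_ok = re.fullmatch(r"[0-9]{1,2}", "12") is not None
three_digits_ok = re.fullmatch(r"[0-9]{1,2}", "123") is not None
print(two_digits_ok, three_digits_ok)  # True False
```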

The key to doing this all in one swoop is using parentheses,
which in a wildcard replace can be used to establish 'groups'
that can be reused in the Replace box (or even in the Find box,
but let's not go there now). Try this, with Use Wildcards
checked:

FIND: (<[0-9]{1,2}.[1-9]>) ([A-Z]@:)

REPLACE: <p><b><a name ="\1"></a>\1 \2</b>

This assumes the numbered items never exceed 99, the numbers
to the right of the period never exceed 9, and the word or
title (like PUSHING here) always ends with a colon and is
always all-capped. The parentheses establish 2 'groups'
which are indicated by \1 and \2 in the Replace string.
(The \1 group is used twice.) The [A-Z] matches any
capital letter and the @ means one or more of the preceding
character.
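
To try the grouping idea outside Word, here's a rough Python sketch of the same search. The translation is an assumption, not Word's own engine: \b stands in for < and >, the period is escaped because an unescaped "." in Python matches any character, and Word's @ becomes "+". The name anchor_headings is made up for illustration.

```python
import re

# Approximate Python translation of the Word wildcard search
#   FIND: (<[0-9]{1,2}.[1-9]>) ([A-Z]@:)
# Group 1 is the section number, group 2 the all-caps title with its colon.
HEADING = re.compile(r"(\b[0-9]{1,2}\.[1-9]\b) ([A-Z]+:)")


def anchor_headings(text):
    """Wrap 'N.N TITLE:' headings in a bold named anchor."""
    # \1 appears twice (anchor name and visible text); the bold tag is
    # closed with </b>, as in the target HTML from the question.
    return HEADING.sub(r'<p><b><a name ="\1"></a>\1 \2</b>', text)


sample = "10.3 PUSHING: A limbered or\nIts M# is 12 with +7"
print(anchor_headings(sample))
```

On the sample, only the "10.3 PUSHING:" heading is rewritten; the 12 on the second line is left alone because nothing all-capped with a colon follows it.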

Needless to say this is not one of the more widely recognized
features of Word, but knowing how to use it can save you lots
of time.

--
Mark Tangard <[email protected]>, Microsoft Word MVP
Please reply only to the newsgroup, not by private email.
Note well: MVPs do not work for Microsoft.
"Life is nothing if you're not obsessed." --John Waters


 
Mark Tangard

J said:
You're right, that's pretty clear :) There's nothing in the help and
this is the first time I've ever even heard of the wildcard.

Actually it is in the help, it's just squirreled away under some
unlikely heading like most other useful stuff. (And this would
be in the Word help, not the VBA help.)

J said:
Thanks again Mark - YOU RULE! I only wish I had posted this question
last week before doing forty HTML pages with copy and paste.

Rule of thumb: If you ever find yourself doing *anything* in Word
40 times (or even preparing to), run-don't-walk to the newsgroups
before you get to #10, because it's almost guaranteed there's an
easier way.


Mark Tangard said:
The Find/Replace feature will do almost anything if you know where
to stroke it. From your post it's clear you haven't had the
wildcard feature explained very well, esp. the square brackets.

You're right, that's pretty clear :) There's nothing in the help and
this is the first time I've ever even heard of the wildcard.

Mark Tangard said:
Try this, with Use Wildcards
checked:

FIND: (<[0-9]{1,2}.[1-9]>) ([A-Z]@:)

REPLACE: <p><b><a name ="\1"></a>\1 \2</b>

Thanks again Mark - YOU RULE! I only wish I had posted this question
last week before doing forty HTML pages with copy and paste.
 