Macro to Remove text between two characters.

Barry Goodthrall

I am sure this is probably easy, but I need a macro that removes all
text between < and >, brackets included.

For example, if I have the following:

<a href="http://www.thisis what I want to delete">Don't want to delete
this.</a>

I want just this to remain:

Don't want to delete this.



Of course, I can remove all the </a> tags with a simple search and replace,
replacing </a> with nothing.

But it's the other part I am having a problem with.

I need all of them deleted, so the macro will need to start
from the top of the document and end at the bottom.

I am sure this is probably a very simple macro but I don't know VBA
that well.
 
Greg Maxey

Barry,

Your post is a bit confusing. What do you start with, and what do you want
left?
 
Bear

Barry:

You could probably do the first part using a wildcard find and replace. In
the Find and Replace dialog box, click More, then check the Use Wildcards
check box. Use these values:

Find: \<a*\>
Replace: <^047a>

This should convert all the opening anchor tags to </a>. Then you can do the regular
Find and Replace you suggested to delete all of them.

In the Find argument:
\< = escaped left angle bracket, i.e. use the "<" as an actual character
a = the actual letter "a"
* = any number of any character
\> = escaped right angle bracket, i.e. ">"

In the Replace argument:
< = the left angle bracket character
^047 = the "/" character (to avoid interpreting "/" as an expression)
a = "a"
> = the right angle bracket character
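
For anyone who wants the same wildcard idea as a macro, here is a minimal sketch. It is not the two-step approach above: it assumes it is acceptable to delete every <...> run (opening and closing tags alike) in one pass, and it only looks at the main body of the active document. The macro name is made up for illustration.

```vba
Sub DeleteAngleBracketTags()
    ' Sketch: delete every "<...>" run in the main text of the active document.
    ' Assumption: all tags should go, not just the anchor tags.
    Dim rng As Range
    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "\<*\>"          ' wildcard pattern: "<", any characters, ">"
        .Replacement.Text = ""   ' replace each match with nothing
        .MatchWildcards = True
        .Format = False
        .Forward = True
        .Wrap = wdFindStop       ' stop at the end instead of wrapping around
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Try it on a copy of the document first, since the "*" can run across paragraph marks when a closing ">" is missing.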

Bear
 
yves

Hi Bear,

I'm also generally in favour of using a global Find-Replace method
whenever possible. But as Barry asked for a macro, you'd help him more
by explaining the Selection.Find.Execute (or Range.Find.Execute) method
rather than a dialog box - but Barry will figure it out, I'm sure.

Now a document may contain malformed text (a tag with no closing >).
Most HTML processors have a policy of defaulting on the side of
safety. It is safe to test whether the text to delete contains, for
example, a paragraph mark, or a second opening < before the > is found
(observing possible nested levels of < and >, of course), because in
many cases that's an indicator that *perhaps* we hit malformed HTML,
and that it's worth asking the user before deleting 10 pages. Only a
macro can test that, IMO.
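
A rough sketch of that kind of safety check, assuming a simple rule: if a match contains a paragraph mark or a second "<", show it to the user and ask before deleting. The macro name and the prompt wording are invented for the example, and it makes no attempt to handle nested brackets.

```vba
Sub DeleteTagsWithCheck()
    ' Sketch: delete "<...>" runs one at a time, asking first when a match
    ' looks suspicious (contains a paragraph mark or a second "<").
    Dim rng As Range
    Dim inner As String
    Dim answer As VbMsgBoxResult

    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Text = "\<*\>"
        .MatchWildcards = True
        .Format = False
        .Forward = True
        .Wrap = wdFindStop
    End With

    Do While rng.Find.Execute
        ' Text between the brackets, without the leading "<" and trailing ">".
        inner = Mid$(rng.Text, 2, Len(rng.Text) - 2)
        answer = vbYes
        If InStr(inner, vbCr) > 0 Or InStr(inner, "<") > 0 Then
            rng.Select   ' show the user what is about to go
            answer = MsgBox("This match spans a paragraph mark or contains " & _
                "a second ""<"". Delete it anyway?", vbYesNoCancel)
        End If
        If answer = vbCancel Then Exit Do
        If answer = vbYes Then
            rng.Delete
        Else
            ' Skip only the first "<" so a well-formed tag later in the
            ' rejected match can still be found.
            rng.SetRange rng.Start + 1, rng.Start + 1
        End If
    Loop
End Sub
```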

Next is the question of whether the window is in a view mode that
displays field codes. The problem is that if the document contains fields, a
Find-Replace will act either on the field's "visible" result or on the
field's code, depending on which view mode you're in. A macro can do
the necessary verification and, if necessary, change the view
mode.
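
As a small sketch of that verification, the macro below remembers the current setting, switches the window to show field results, and restores the setting afterwards. It assumes the find-and-replace should act on field results rather than field codes; the macro name is hypothetical and the actual find-and-replace is left as a placeholder comment.

```vba
Sub RunWithFieldResultsShown()
    ' Sketch: make sure field results (not field codes) are displayed
    ' while the find-and-replace runs, then restore the user's setting.
    Dim hadFieldCodes As Boolean

    hadFieldCodes = ActiveWindow.View.ShowFieldCodes
    ActiveWindow.View.ShowFieldCodes = False

    ' ... run the find-and-replace here ...

    ActiveWindow.View.ShowFieldCodes = hadFieldCodes
End Sub
```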

Cheers,
Yves Champollion
 
Barry Goodthrall

> Why not convert or "File, Save As" the .htm file as a .txt file?
>
> Richard

I think I tried that.

I would also like to know how to do it when I have used
other characters to surround the text, such as *Text to delete* or (text to
delete).

I guess my question is more general: how does one search for a
character in a macro, select everything (including that
character) up to another character, and do that throughout
the document?
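
One possible answer to the general question, as a sketch only: build a wildcard pattern from the two delimiter characters and let Find delete every match. It assumes each delimiter is a single character and that deleting the delimiters along with the text is what is wanted. DeleteBetween, EscapeWildcard and the two wrapper macros are names made up for this example.

```vba
Function EscapeWildcard(ch As String) As String
    ' Characters with a special meaning in a wildcard search need a
    ' leading backslash to be treated literally.
    ' Assumption: ch is exactly one character.
    If InStr("\?*[]!{}()@<>", ch) > 0 Then
        EscapeWildcard = "\" & ch
    Else
        EscapeWildcard = ch
    End If
End Function

Sub DeleteBetween(openChar As String, closeChar As String)
    ' Delete openChar, closeChar and everything between them,
    ' from the top of the main text to the bottom.
    Dim rng As Range
    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = EscapeWildcard(openChar) & "*" & EscapeWildcard(closeChar)
        .Replacement.Text = ""
        .MatchWildcards = True
        .Format = False
        .Forward = True
        .Wrap = wdFindStop
        .Execute Replace:=wdReplaceAll
    End With
End Sub

' Wrappers without parameters, so they can be run as ordinary macros.
Sub DeleteStarredText()
    DeleteBetween "*", "*"
End Sub

Sub DeleteParenthesizedText()
    DeleteBetween "(", ")"
End Sub
```

Nested delimiters, such as (one (two) three), are not handled: the match ends at the first closing character it meets.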
 
Richard Relpht

Quick... dirty...
This works in VBScript.
VBA will be something very similar.



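Dim fso, InFile, OutFile1, OutFile2, s, JustHtm, JustText
Const ForReading = 1

Set fso = CreateObject("Scripting.FileSystemObject")

' Open the HTML source; False = do not create the file if it is missing
Set InFile = fso.OpenTextFile("C:\Test\Infile.htm", ForReading, False)
' True = overwrite an existing output file
Set OutFile1 = fso.CreateTextFile("C:\Test\Outfile1.txt", True)
Set OutFile2 = fso.CreateTextFile("C:\Test\Outfile2.txt", True)

JustHtm = ""
JustText = ""

' Read one character at a time and route it to the tag or the text buffer
Do While Not InFile.AtEndOfStream
    s = InFile.Read(1)
    If s = "<" Then
        ' Inside a tag: collect up to the closing ">" (or the end of the file)
        Do While s <> ">" And Not InFile.AtEndOfStream
            JustHtm = JustHtm & s
            s = InFile.Read(1)
        Loop
        JustHtm = JustHtm & s
    Else
        JustText = JustText & s
    End If
Loop

OutFile1.Write JustText   ' the text with the tags removed
OutFile2.Write JustHtm    ' the tags only

InFile.Close
OutFile1.Close
OutFile2.Close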
 
Russ

Barry,
Why are you marking for deletion and not deleting immediately?
Is the bigger question really, "What is the best way to mark text for
suggested deletion and then after approval, delete everything marked for
deletion?" Word's built-in track changes system allows for different editors
(or delayed decisions) before final acceptance of changes, if that is what
you want.

But there are other ways to mark text that may be less confusing to search
for, too. You could use a highlight color to mark text and then delete all
text highlighted with that color, for example. Or create a "DeleteMe" style
to mark text and later find text with that style and delete that styled
text.

Or play around with formatting the marked text as hidden. The
marked text can then be toggled to show or not show, and toggled to print or
not print.
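
To illustrate the highlight idea, here is a minimal sketch that deletes all highlighted text in the main body of the active document. Note the assumption: it removes text highlighted in any colour, not just one particular colour, so it only fits if highlighting is used for nothing else. The macro name is made up.

```vba
Sub DeleteHighlightedText()
    ' Sketch: find every run of highlighted text and replace it with nothing.
    Dim rng As Range
    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = ""               ' no text criterion, formatting only
        .Replacement.Text = ""
        .Highlight = True        ' match highlighted text
        .Format = True           ' tell Find to use the formatting criterion
        .MatchWildcards = False
        .Forward = True
        .Wrap = wdFindStop
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

The "DeleteMe" style variant would look much the same, with the .Highlight line replaced by a .Style criterion, provided a style of that name exists in the document.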
 
