deleting all format commands

LindaG.

I'm working with a long (150-page), gnarly document that was scanned in as a
Word document. I'm using Word 2003. How do I delete all formatting in this
document, specifically all column commands and all page breaks?
 
Jay Freedman

In the Replace dialog, click the More button. Click in the Find What
box and then click the Special button. Select "Manual Page Break" and
click OK. This puts the code ^m in the Find What box. Leave the
Replace With box blank, and click Replace All. That will remove all
manual page breaks.

Select the entire document (Ctrl+A), open the Columns dialog
(Format > Columns), and choose "One".

To remove all direct (non-style) paragraph formatting, you can select
everything and press Ctrl+A.

To remove all direct font formatting, select everything and press
Ctrl+spacebar.

If you just want to flatten the whole document to Normal style, select
all and press Ctrl+Shift+N.
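
For anyone who would rather script the page-break and column steps outside Word, here is a rough sketch. It rests on assumptions that are not part of the original question: the document has been re-saved as .docx (Word 2003 does not do this natively), and its main part, word/document.xml, has been extracted as a string. The function name strip_breaks_and_columns is made up for illustration. The sketch drops explicit page breaks (w:br elements with w:type="page") and column settings (w:cols), so each section falls back to a single column.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx files (the "w:" prefix).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def strip_breaks_and_columns(document_xml: str) -> str:
    """Remove manual page breaks and column settings (hypothetical helper)."""
    ET.register_namespace("w", W)
    root = ET.fromstring(document_xml)
    # ElementTree elements have no parent pointer, so visit every element
    # as a potential parent and remove the matching children from it.
    for parent in list(root.iter()):
        for child in list(parent):
            is_page_break = (
                child.tag == f"{{{W}}}br"
                and child.get(f"{{{W}}}type") == "page"
            )
            is_columns = child.tag == f"{{{W}}}cols"
            if is_page_break or is_columns:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Tiny made-up document: one manual page break, one two-column section.
sample = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:p><w:r><w:t>Page one</w:t><w:br w:type="page"/></w:r></w:p>'
    '<w:p><w:r><w:t>Page two</w:t></w:r></w:p>'
    '<w:sectPr><w:cols w:num="2"/></w:sectPr>'
    '</w:body></w:document>'
)
cleaned = strip_breaks_and_columns(sample)
```

Ordinary line breaks (w:br with no type attribute) are left alone, which matches what the ^m search does. Inside Word itself, the Replace dialog is still the simpler route.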

There's probably another problem that's not so easily dealt with.
Scanning/OCR software often puts blocks of text into text boxes to try
to maintain absolute position. Although it's possible to write a macro
to get the text out of the boxes, it often isn't possible for the
macro to know where to put the text, so it winds up as a worse jumble
than the original.
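
To illustrate why the text comes out jumbled, here is a minimal sketch under the same kind of assumptions as any script run outside Word: the document has been re-saved as .docx and word/document.xml is available as a string. The helper name text_box_contents is hypothetical. It pulls the text out of each text box (w:txbxContent) in the order the boxes are stored in the file, which follows the paragraphs they are anchored to and not their position on the page.

```python
import xml.etree.ElementTree as ET

# Namespaces for WordprocessingML ("w:") and VML shapes ("v:").
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
V = "urn:schemas-microsoft-com:vml"


def text_box_contents(document_xml: str) -> list[str]:
    """Return the text of each text box, in stored (anchor) order."""
    root = ET.fromstring(document_xml)
    boxes = []
    for box in root.iter(f"{{{W}}}txbxContent"):
        paragraphs = []
        for para in box.iter(f"{{{W}}}p"):
            # Join the text runs that make up one paragraph.
            runs = (t.text or "" for t in para.iter(f"{{{W}}}t"))
            paragraphs.append("".join(runs))
        boxes.append("\n".join(paragraphs))
    return boxes


def _box(text: str) -> str:
    # Made-up paragraph holding one anchored VML text box.
    return (
        "<w:p><w:r><w:pict><v:shape><v:textbox><w:txbxContent>"
        f"<w:p><w:r><w:t>{text}</w:t></w:r></w:p>"
        "</w:txbxContent></v:textbox></v:shape></w:pict></w:r></w:p>"
    )


# The box for the right-hand column happens to be anchored first.
sample = (
    f'<w:document xmlns:w="{W}" xmlns:v="{V}"><w:body>'
    + _box("Right-hand column")
    + _box("Left-hand column")
    + "</w:body></w:document>"
)
print(text_box_contents(sample))
```

Nothing in the stored order tells the script that the second box should be read first, which is the jumble described above.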
 
Suzanne S. Barnhill

Jay Freedman said:
To remove all direct (non-style) paragraph formatting, you can select
everything and press Ctrl+A.

That should be Ctrl+Q (after pressing Ctrl+A to select everything).

Jay Freedman said:
Scanning/OCR software often puts blocks of text into text boxes to try
to maintain absolute position.

If you're lucky, the software will use frames instead of text boxes, and
Ctrl+Q will remove those. The main problem then is that, once the frames
are removed, the text appears in the order of the paragraphs to which the
frames were anchored, which may not be the order in which it reads on the
page.
