Word style cleaner

adMjb

We receive Word documents from our clients and want to clean them up so
that we end up with a consistent, structured but basic Word document. Our
clients are not too good with Word, so what we get is a real mess. My
basic requirements are below:


What is needed:

Take an unstructured document and create structure: if a top-level
heading is not using the Heading 1 style (H1), recognise what should be
H1 and convert it to H1.

So if the largest font is found on one or two lines (maybe more), it
must be H1; then work down from there to H2, H3, H4 and so on.

Examples (there could be many unstructured styles; these are only
examples):

Unstructured document = Structured document

Heading 1
Heading 1 + Bold = Heading 1
Normal + 20 pt, Bold = Heading 1
Heading 1 + 22 pt, Bold = Heading 1

Heading 2
Heading 2 + Bold = Heading 2
Normal + 16 pt, Bold = Heading 2
Heading 2 + 18 pt, Bold = Heading 2

And likewise for Heading 3, 4, 5 and 6.
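
One way to read the heading rules above is sketched here in Python. It is
only an illustration under assumptions: the Para record, the function
names, the sample paragraphs and the 11 pt body size are all hypothetical,
and a real tool would first have to read each paragraph's style name and
font size out of the document. Paragraphs already based on a Heading style
keep their level and simply lose the direct formatting; other paragraphs
are ranked by font size, largest first.

```python
import re
from dataclasses import dataclass


@dataclass
class Para:
    """Hypothetical, simplified view of one paragraph."""
    text: str
    style: str      # base style name, e.g. "Normal" or "Heading 1"
    size_pt: float  # effective font size in points


def size_to_level(paras, body_size_pt=11.0, max_level=6):
    """Rank distinct oversized fonts of non-heading paragraphs: largest -> level 1."""
    sizes = sorted(
        {p.size_pt for p in paras
         if p.size_pt > body_size_pt and not p.style.startswith("Heading ")},
        reverse=True,
    )
    return {size: level for level, size in enumerate(sizes[:max_level], start=1)}


def heading_style(para, levels):
    """Return the clean heading style, or None if the paragraph is not a heading."""
    if re.fullmatch(r"Heading [1-6]", para.style):
        # Already a heading: keep the level, drop "+ Bold", "+ 22 pt" and so on.
        return para.style
    level = levels.get(para.size_pt)
    return f"Heading {level}" if level is not None else None


# Made-up sample data mirroring the examples in the post.
paras = [
    Para("Annual report", "Heading 1", 22),
    Para("Introduction", "Normal", 20),
    Para("Background", "Normal", 16),
    Para("Some body text.", "Normal", 11),
]
levels = size_to_level(paras)
for p in paras:
    print(p.text, "->", heading_style(p, levels))
```

Ranking by distinct size is a simplification: if a client uses both 20 pt
and 21 pt for the same kind of heading on Normal paragraphs, the two sizes
would land on different levels, so size ranges or a manual review step
would probably be needed.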

Paragraph text:

This is where the fun starts. We need to recognise what is a paragraph,
both numbered and normal body text. I think it will be easier to do this
by eliminating the headings and the other styles (see below).

So if it is NOT a heading, quote text or bulleted text, and is not in a
table, it could be a paragraph, and a basic BodyText style should be
applied. If it has a number at the front, it should be a List Paragraph:
the correct numbering should be followed, but the style changed to a
numbered List Paragraph.
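
The elimination rule above could be sketched roughly as follows in Python.
The function name, the boolean flags and the pattern for "a number at the
front" are assumptions made for illustration; the flags would have to come
from whatever heading, quote, bullet and table checks are run first.

```python
import re

# Assumption: a typed number is digits (optionally dotted, e.g. "1.2")
# followed by "." or ")" and a space. Word's automatic numbering is not
# part of the paragraph text and would need a separate check.
NUMBER_PREFIX = re.compile(r"^\s*\d+(\.\d+)*[.)]\s+")


def paragraph_style(text, is_heading=False, is_quote=False,
                    is_bulleted=False, in_table=False):
    """Return "List Paragraph", "BodyText", or None if another rule owns it."""
    if is_heading or is_quote or is_bulleted or in_table:
        return None
    if NUMBER_PREFIX.match(text):
        return "List Paragraph"
    return "BodyText"


print(paragraph_style("1. Open the document"))
print(paragraph_style("2019 was a busy year"))
print(paragraph_style("Chapter title", is_heading=True))
```

Requiring a "." or ")" after the number keeps a sentence that merely
starts with a year from being treated as a list item, at the cost of
missing numbers typed without punctuation.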

Quote text

If a paragraph is entirely italic, we will assume it is a quote, so the
basic "quote" style should be applied.
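
As a rough sketch of the "all italic" test in Python: the paragraph is
assumed to be available as a list of (text, italic) pairs, one per
formatting run. That shape and the function name are hypothetical.
Whitespace-only runs are ignored, since a stray unformatted space should
probably not stop a paragraph from counting as a quote.

```python
def is_quote(runs):
    """runs: list of (text, italic) pairs for one paragraph (hypothetical shape)."""
    visible = [italic for text, italic in runs if text.strip()]
    # An empty paragraph is not a quote; otherwise every visible run must be italic.
    return bool(visible) and all(visible)


print(is_quote([("To be or not to be,", True), (" ", False), ("that is the question.", True)]))
print(is_quote([("Partly ", True), ("italic", False)]))
```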

Character Styles

There are four basic character styles we want to use: Bold, Italic,
BoldItalic and Underline. If a single word, or part of a paragraph, has
been styled with the bold button, the italic button, the underline button
or a combination of bold and italic, the corresponding style in the
attached document should be applied.
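
The mapping from button formatting to the four character styles might look
like this in Python. The function name is hypothetical, and one rule here
is purely an assumption: the post does not say what to do when underline
is combined with bold or italic, so this sketch lets bold and italic win.

```python
def character_style(bold=False, italic=False, underline=False):
    """Map direct formatting on a run of text to one of the four character styles."""
    if bold and italic:
        return "BoldItalic"
    if bold:
        return "Bold"
    if italic:
        return "Italic"
    if underline:
        # Assumption: only reached when neither bold nor italic is set.
        return "Underline"
    return None  # no direct formatting, so no character style is needed


print(character_style(bold=True, italic=True))
print(character_style(underline=True))
```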


Can anyone help me with this, or point me in the right direction?


Many thanks,


Adam
 
Stefan Blom

In recent versions of Word, you can use the Select All X Instances command
to quickly select similar formatting and then apply a style to that
selection. In any version of Word, you can use Find and Replace to replace
direct formatting with a particular style.

However, if a document is in very bad shape, formatting-wise, the best
approach is to select all text (Ctrl+A), apply Normal style (Ctrl+Shift+N),
and then simply start from the beginning, carefully reformatting every
paragraph. This will be a time-consuming task, but it ensures that you get
the end result that you want.

Of course, you should preserve (possibly in printout) a copy of the original
document for reference.

-- 
Stefan Blom
Microsoft Word MVP




Stefan Blom

To clean out redundant styles, you can copy the content, minus the final
paragraph mark, into a new, blank document.

Alternatively, use the "Style Report" macro created by Klaus Linke and Greg
Maxey; see http://gregmaxey.mvps.org/Style_Report.htm.

-- 
Stefan Blom
Microsoft Word MVP




 
