Batch conversion of Publisher files to Word

technogenii

I went through the half dozen or so posts on converting from Publisher to
Word, but they didn't address my question.

I have 240 MS Publisher files that were created by someone else. I need to
extract the content and import it into MS Word. The original files were used
for the publication of a manual... yes, you guessed it, a 240-page manual, and
why they decided to make 240 separate files, one for each page, is way beyond
me, but perhaps you MS Publisher experts would have a rationale for this (I've
used Quark in the past and just made one big file... but hey! to each their
own).

Now, as I mentioned, I need to get the content, which is text and images, into
a Word document. I'm dreading the idea of copying and pasting from 240 files,
and as mentioned, the save-to-doc feature doesn't export images (otherwise I'd
have created a macro to do this for me).

Any tips? Or should I just bite the bullet and copy/paste?

Many thanks,
K.
 
Ed Bennett

technogenii said:
I need to extract the content and import it into MS Word.

Why? I'm sure you have a perfectly sensible reason (you seem like a
relatively un-clueless person), but perhaps multiple heads can be better
than one and think of a viable alternative to the dreaded Word document.

technogenii said:
The original files were used for the publication of a manual... yes, you
guessed it, a 240-page manual, and why they decided to make 240 separate
files, one for each page, is way beyond me, but perhaps you MS Publisher
experts would have a rationale for this

I certainly can't.

technogenii said:
(otherwise I'd have created a macro to do this for me).

You could automate copy and paste by using COM (or COM Interop) in an
application. If you don't have an application that will develop against
COM, you can get Visual Basic 2005 Express Edition for free from
http://msdn.microsoft.com/vstudio/express/vb/. The language and syntax are
very similar to macro development. If you fancy learning a more widely-used
professional language, you might want to get Visual C# or C++ 2005 Express
Edition instead. http://msdn.microsoft.com/vstudio/express/default.aspx has
all the available languages.
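
For a rough idea of what that automation might look like, here is a minimal sketch. It swaps in Python with the pywin32 package (an assumption; the suggestion above was Visual Basic or C#) and drives Publisher and Word over COM. The names `natural_key`, `list_publications` and `convert_folder` are made up for illustration, the Publisher and Word calls are used as I understand the object models, and the COM part has not been tested against these 240 files, so treat it as a starting point rather than a finished tool.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that puts 'page2.pub' before 'page10.pub'."""
    parts = re.split(r"(\d+)", Path(path).name.lower())
    return [int(p) if p.isdigit() else p for p in parts]


def list_publications(folder):
    """Return the .pub files in `folder`, sorted in natural (page) order."""
    return sorted(Path(folder).glob("*.pub"), key=natural_key)


def convert_folder(folder, output_doc):
    """Append the text and pictures of every publication to one Word document.

    Assumes Windows with Publisher, Word and pywin32 installed.
    """
    # Imported here so the helpers above still work without pywin32.
    import win32com.client

    publisher = win32com.client.Dispatch("Publisher.Application")
    word = win32com.client.Dispatch("Word.Application")
    doc = word.Documents.Add()
    try:
        for pub_path in list_publications(folder):
            pub = publisher.Open(str(pub_path.resolve()))
            for page in pub.Pages:
                for shape in page.Shapes:
                    if shape.HasTextFrame:
                        # Type the text as plain paragraphs, so no text
                        # boxes end up in the Word document.
                        word.Selection.TypeText(shape.TextFrame.TextRange.Text)
                    else:
                        # Pictures and other shapes go via the clipboard.
                        shape.Copy()
                        word.Selection.Paste()
                    word.Selection.TypeParagraph()
            pub.Close()
        doc.SaveAs(str(Path(output_doc).resolve()))
    finally:
        word.Quit(0)  # 0 = quit without prompting to save
        publisher.Quit()
```

Something like `convert_folder(r"C:\manual\pages", r"C:\manual\manual.doc")` would be the call (both paths are placeholders). Typing the text in rather than pasting the text boxes is a deliberate choice here, since it avoids carrying the boxes over into Word; the order of shapes on a page may still need fixing by hand.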
 
technogenii

Ed Bennett said:
Why? I'm sure you have a perfectly sensible reason (you seem like a
relatively un-clueless person), but perhaps multiple heads can be better
than one and think of a viable alternative to the dreaded Word document.

Yeah, I figured someone would ask why :)
Well, the deal is that the content is going through a total revamp, which
requires massive editing (spelling, grammar, sections being rewritten),
something that Word does much better. It will then be put into an online
content management system (database fields) for online distribution. So this
is step one of many.

So perhaps this sheds a little light on "why".

Ed Bennett said:
I certainly can't.

Yes. Well, I'm assuming they were not experts, because every single bit of
text is in its own separate box. Aren't nightmares lovely?

Ed Bennett said:
You could automate copy and paste by using COM (or COM Interop) in an
application. If you don't have an application that will develop against
COM, you can get Visual Basic 2005 Express Edition for free from
http://msdn.microsoft.com/vstudio/express/vb/. The language and syntax are
very similar to macro development. If you fancy learning a more widely-used
professional language, you might want to get Visual C# or C++ 2005 Express
Edition instead. http://msdn.microsoft.com/vstudio/express/default.aspx has
all the available languages.

Thanks for this. I'll get one of my programmers to look into the macro
solution. But I'm thinking that for this project, it might actually be quicker
(everything being relative, of course) to copy/paste. Blech!

Thanks for your advice!
K.
 
Ed Bennett

technogenii said:
Yeah I figured someone would ask why :)
Well the deal is that the content is going through a total revamp
which requires massive editing (spelling, grammar, sections being
re-written), something that Word does much better, and then will be
put into an online content management system (database fields) for
online distribution. So this is step one of many.

Spelling is as easy in Publisher as in Word; grammar is more Word's
department I'll admit. Sections being rewritten is just as easy in either.

Note that a copy/pasted publication will be harder to edit in Word than to
edit the source in Publisher, as it will be in a mass of text boxes which
will jump if you move them the slightest amount. They also may not import
into your content management system correctly.

technogenii said:
Yes. Well, I'm assuming they were not experts, because every single bit
of text is in its own separate box. Aren't nightmares lovely?

Often, many different bits of text belong in separate boxes. I can't see
the publication itself, so I couldn't comment.
 
technogenii

Ed Bennett said:
Spelling is as easy in Publisher as in Word; grammar is more Word's
department I'll admit. Sections being rewritten is just as easy in either.

Well, for someone who is much more used to Word, I'll say that the
rewriting will be easier in Word.

Ed Bennett said:
Note that a copy/pasted publication will be harder to edit in Word than to
edit the source in Publisher, as it will be in a mass of text boxes which
will jump if you move them the slightest amount. They also may not import
into your content management system correctly.

Well, I actually have to get rid of the text boxes... I know this sounds
complicated, but the whole structure, the breakdown of headings and
everything else is wrong.

Ed Bennett said:
Often, many different bits of text belong in separate boxes. I can't see
the publication itself, so I couldn't comment.

I understand when this is useful... as I mentioned, I have worked with
Quark, but in this case, it's absolutely useless. A new text box is created
for each paragraph.

And new information will need to be added... And since every page is a new
document.... need I say more?

Again, thanks so much for your help.

K.
 
Mike Koewler

Just a thought - it may sound like a lot of work but might be much
quicker in the long run. I figure the amount of time it would take to
copy/paste a text box full of text from Pub to Word would be, at most,
10 seconds per text box (Click on box, CTRL/A, CTRL/C, click on the Word
Doc, place the cursor, CTRL/V). That would be 30 seconds per page. Add
in 20 more seconds to open each Pub file or about two hours for all of
them. A really adroit person should be able to easily accomplish this in
six hours or less. I suspect that writing a macro, testing it, applying
it and then tweaking everything would take longer. I'm not sure how Pub
works, but if the text frames in each file are linked, it may be
possible to Copy/Paste all the text in that file in one swoop.
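
As a rough check of the arithmetic above, here is a small sketch in Python. The three boxes per page is an assumption inferred from 30 seconds per page at 10 seconds per box; the other numbers are the ones quoted in this post.

```python
PAGES = 240                # one Publisher file per page, per the thread
SECONDS_PER_BOX = 10       # upper bound for one copy/paste
BOXES_PER_PAGE = 3         # assumption: implied by "30 seconds per page"
SECONDS_TO_OPEN_FILE = 20  # allowance for opening each Pub file


def estimate_hours(pages=PAGES, boxes_per_page=BOXES_PER_PAGE,
                   seconds_per_box=SECONDS_PER_BOX,
                   seconds_to_open=SECONDS_TO_OPEN_FILE):
    """Return (copy_hours, open_hours, total_hours) for the manual approach."""
    copy_hours = pages * boxes_per_page * seconds_per_box / 3600
    open_hours = pages * seconds_to_open / 3600
    return copy_hours, open_hours, copy_hours + open_hours


copy_h, open_h, total_h = estimate_hours()
print(f"copy/paste: {copy_h:.2f} h, opening files: {open_h:.2f} h, "
      f"total: {total_h:.2f} h")
# prints: copy/paste: 2.00 h, opening files: 1.33 h, total: 3.33 h
```

On these figures the copying alone is about two hours and the whole job about three hours twenty minutes, well inside six hours. If every paragraph really is in its own text box, the box count per page will be higher; `estimate_hours(boxes_per_page=10)` gives a total of 8 hours.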

I always look for the simplest way to do something first and if I find
it isn't feasible, move on to something more complicated.

Mike
 
technogenii

Mike Koewler said:
I always look for the simplest way to do something first and if I find
it isn't feasible, move on to something more complicated.

Mike

We think alike. It was a very time-consuming task, but I got through a quarter
of the files in a few hours. I'm going to continue this way, as I'm also
getting the advantage of acquainting myself with the content.

Thanks for your input.
K.
 
