formatting text

dk · Nov 10, 2004

We are scanning text & drawing text frames and saving it as RTF. When we open
it in Microsoft Word everything looks OK, but we want to have all the text as
plain text. It saves the text of frame #1 to be last in the row. Ex: the page
comes in as 4 text frames; when it saves to ASCII, frame #1 is going to be
page 4, frame #2 is page 1, etc.

Jezebel · Nov 10, 2004

Is there a question in there? Or even a main verb?

dk · Nov 11, 2004

It's a plain list of numbers (not in ascending order)

Daiya Mitchell · Nov 11, 2004

It's still not clear what you're talking about, but I'm going to make a wild
guess and say that frames are a Word feature that RTF or ASCII can't handle.
Maybe you could explain more clearly, with one important fact per sentence.

DM

dk · Nov 11, 2004

We are scanning text & drawing text frames and saving it as RTF. When we open
it in Microsoft Word everything looks OK, but we want to have the text saved
as plain text. It saves the text of frame #1 to be last in the row. Ex: the
page comes in as 4 text frames; when it saves to ASCII, frame #1 is going to
be page 4, frame #2 is page 1, etc. The wording is only numeric. This is an
RTF file that has frames & they look good as is, but when we convert to ASCII
the far left frame turns over to last in the row, the last page. if we can
email a file

1 23 12 11
2 55 13 19
21 34 17 24
4 5 18 25

Jezebel · Nov 11, 2004

I tried. I really tried to make sense of this. And I truly have not the
faintest idea what you're trying to say. Let alone what you're asking.

Is this perhaps a Babelfish translation from Korean?

Jay Freedman · Nov 11, 2004

The James Joyce idea isn't that far off... you have to divine the gestalt
and feel dk's pain...

It sounds like the OCR program that's being used to convert the scanner
output to editable text is making a mess of the result. Many OCR programs
try to match the positioning of text on the original page by putting chunks
of text into frames, which then float in the resulting Word document. The
problem with this is that all of the frames on a page are usually anchored
to the first paragraph on the page, so there's no permanent record of the
order in which the frames should appear. Any sort of processing other than
very simple editing or printing causes the frames to move around, and often
go out of order. Saving to RTF and reopening the document is certainly going
to do that.

One possibility -- not a good one -- is to insert a separate empty text
paragraph for each frame and move the anchors to those paragraphs, in the
order they should appear. Since the frames are still floating, though, they
can still move out of position. A better solution is to cut the text from
the frames and paste it into plain text paragraphs, then delete the frames.
Unfortunately, there is no quick or automatic way to do this tedious job.
(Yes, there's a Remove Frame button in the Format Frame dialog, but it will
dump the frame's contents wherever the frame's anchor happens to be. This is
not a solution.)

Possibly there's some control in the OCR program that lets you turn off the
"match original page position" effect and get unframed plain text to start
with. If so, that would be the best solution.
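
If the frame text can be pulled out along with each frame's position on the page, the reordering described above could be sketched roughly like this in Python. This is only an illustration under assumptions: the `Frame` class, its `left`/`top` fields, and the sample positions are hypothetical, it guesses that each column of dk's sample was one frame, and it is not anything Word or an OCR package provides.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical record for one OCR text frame (not a Word API object)."""
    left: float  # assumed horizontal position of the frame on the page
    top: float   # assumed vertical position of the frame on the page
    text: str    # text contained in the frame


def frames_to_plain_text(frames):
    """Join frame texts in reading order: top to bottom, then left to right."""
    ordered = sorted(frames, key=lambda f: (f.top, f.left))
    return "\n\n".join(f.text.strip() for f in ordered)


# Frames in a scrambled order, with the far-left frame arriving last.
# The numbers are dk's sample, assuming one frame per column.
frames = [
    Frame(left=2.0, top=1.0, text="23\n55\n34\n5"),
    Frame(left=3.0, top=1.0, text="12\n13\n17\n18"),
    Frame(left=4.0, top=1.0, text="11\n19\n24\n25"),
    Frame(left=1.0, top=1.0, text="1\n2\n21\n4"),
]

plain_text = frames_to_plain_text(frames)
print(plain_text)
```

Sorting by position rather than by the order the frames appear in the file is the point here, since the file order is exactly what gets lost when the anchors all sit on the same paragraph.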

Joseph N. · Nov 11, 2004

I tried. I really tried to make sense of this.

At first I was intrigued by your subject line, then disappointed
that it really just concerned Word, until I read the underlying
message. Your subject line is exactly dead on! Thanks for the
chuckle.
