RTF document from WORD PAD versus RTF document from MSWORD

Diane

Group,
I am doing some programming in VB that reads an RTF document and appends
text to it. For testing, I created RTF documents in WordPad and ran my
VB program against them, and all processing works great. But when I test my
code on existing RTF documents that were originally created in MS Word and
"saved as" RTF, I read line after line of garbage, such as the sample
listed below.

I am trying to understand whether I can clean up these MS Word RTF documents,
and what the difference is between an RTF document created in WordPad and
one created in MS Word and saved as RTF.

These MS Word RTF documents are also being "corrupted" after I use
System.IO functions on them in my VB program.

My MSWORD/RTF garbage.....
{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff0\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose
02020603050405020304}Times New Roman;}{\f36\froman\fcharset0\fprq2{\*\panose
02040602050305030304}Book Antiqua;}{\f246\froman\fcharset238\fprq2 Times New
Roman CE;}
{\f247\froman\fcharset204\fprq2 Times New Roman
Cyr;}{\f249\froman\fcharset161\fprq2 Times New Roman
Greek;}{\f250\froman\fcharset162\fprq2 Times New Roman Tur;}{\f251\fbidi
\froman\fcharset177\fprq2 Times New Roman (Hebrew);}

Thanks,
 
Peter Jamieson

In effect, each version of Word has introduced a new version of RTF that
is capable of encoding the features supported by that version. There
is a load of stuff to support Styles and various internationalisation
features, which is the "garbage" you are seeing. In contrast, WordPad is
"stuck" at an earlier version of RTF (around Word 6, I think).
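
To make that concrete, here is a rough sketch (in Python rather than VB, purely for brevity) that pulls one header group, such as the font table, out of an RTF string by matching braces. The function name `find_group` and the sample string are made up for illustration, and this is not a full RTF parser: it only understands nesting and backslash escapes.

```python
from typing import Optional


def find_group(rtf: str, control_word: str) -> Optional[str]:
    """Return the whole {...} group that opens with \\control_word, or None."""
    start = rtf.find("{\\" + control_word)
    if start == -1:
        return None
    depth = 0
    i = start
    while i < len(rtf):
        ch = rtf[i]
        if ch == "\\" and i + 1 < len(rtf):
            # Skip the backslash and the next character, so that escaped
            # braces (\{ and \}) are never counted as group delimiters.
            i += 2
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return rtf[start:i + 1]
        i += 1
    return None  # unbalanced braces


# Hypothetical, shortened sample in the style of the Word output above.
sample = (
    r"{\rtf1\ansi\deff0{\fonttbl{\f0\froman\fcharset0 Times New Roman;}"
    r"{\f36\froman\fcharset0 Book Antiqua;}}\pard Hello\par}"
)
print(find_group(sample, "fonttbl"))
```

The point is only that the "garbage" is a well-formed, nested group sitting ahead of the document text, so code that treats the file as plain lines of text will see it first.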

If you are using VB.NET then I'd guess you could be using its richtext
classes, which purport to support RTF. However, the online documentation
is not specific about which version of RTF is supported, which suggests
to me that no-one on the dev. side actually knows or no-one on the
documentation side realised that it might be an issue.

So it is possible that you are encountering problems because these
classes do not recognise the RTF standard that Word is currently using.

But what you need to do depends in any case on what you are trying to
achieve -
a. if you need to preserve all the RTF that your application is
reading, but merely change some text content (for example) then it's
really a question of identifying which bits of RTF you need to change
and leaving everything else as is. Not that that is likely to be
straightforward. Or
b. maybe you need your output RTF to conform to an earlier RTF
standard (e.g. WordPad/Word 6). For that, I think you'd either
- have to read and understand the RTF standard (not my idea of fun)
and implement your own software to strip out the stuff that you do not
need, or
- automate something like WordPad to open Word RTF, then save it
immediately. WordPad saves "WordPad RTF", i.e. slimmed down, but with
some WordPad additions. But AFAIK you cannot automate WordPad using COM
- the best you could do is to run it from a shell and try to use
SendKeys or some such to control it. Not good. Or perhaps you could
install a much earlier version of Word and control that via automation.
(Not much good, either), or
- find a third party library, object, or class that can extract the
"simple" RTF you need from any standard or widely used version of RTF,
and/or find one library that can convert to a rich format and another
that can convert from a rich format to the level of RTF you need.
Perhaps one of those .NET Framework classes can help you there -
however, you might be better off looking for a suitable .NET group for
advice on that.
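
As an illustration of option (a), here is a minimal sketch (Python again, not VB) of appending a paragraph while leaving the rest of the RTF untouched. It relies on the fact that an RTF file is a single group, `{\rtf1 ... }`, so new text has to go before the final closing brace; text written after that brace falls outside the document. Whether that explains the "corruption" in the original question is only a guess. The names `escape_rtf` and `append_paragraph` are invented here, and the sketch assumes plain ASCII text.

```python
def escape_rtf(text: str) -> str:
    """Escape plain ASCII text for use inside an RTF document."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)      # RTF special characters
        elif ch == "\n":
            out.append("\\line ")      # line break within the paragraph
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII needs \'hh or \uN escapes; out of scope here.
            raise ValueError("non-ASCII text is not handled in this sketch")
    return "".join(out)


def append_paragraph(rtf: str, text: str) -> str:
    """Insert text as a new paragraph before the document's last brace."""
    end = rtf.rfind("}")
    if end == -1:
        raise ValueError("no closing brace found; is this RTF?")
    # The leading newline terminates any control word that precedes the
    # insertion point; RTF readers ignore bare CR/LF characters.
    return rtf[:end] + "\n" + escape_rtf(text) + "\\par\n" + rtf[end:]


doc = r"{\rtf1\ansi Hello\par}"
print(append_paragraph(doc, "Added {by} script"))
```

The appended text picks up whatever formatting is in force at the end of the root group, which may or may not be what you want in a Word-generated file.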

Peter Jamieson

http://tips.pjmsn.me.uk
Visit Londinium at http://www.ralphwatson.tv
 
Diane

Peter,
I am using VS 2005. My goal is to keep the existing text and, in the end, I
want to keep the formatting too, although for now I was just focusing on
getting the text properly placed (i.e. inserting text on line 3 and moving
the existing text down the page). Although I'm writing this in Visual
Studio, I was hoping this forum could help me understand what actually
happens in MS Word with the RTF format. Conforming to an earlier standard
isn't anything that I need to do, and automating WordPad doesn't seem an
appealing choice either. Following your post, I will try my luck in VS 2005
with the rich text classes, as you suggest (this is new to me); they are
probably my better choice for what I am trying to accomplish, which is:

Read a folder (all RTF documents),
create a new RTF document if it doesn't exist, or position on line 3 of an
existing document, add text and move the existing contents down the page.

My problem is that if I have an existing document, I have text on lines 1,
2, & 3, I need to "insert" my new text on line 3, pushing all other text down
the page.

Now that you know what I am trying to accomplish in Visual Studio, if you
have any other suggestions for me, I'm interested.

As always, I appreciate your post!
 
Peter Jamieson

Well, I know little about the .NET rich text classes/controls other than
the stuff I mentioned.

If I were attempting this I would
a. work out exactly what it is I needed to do (e.g. are we talking
about line 3 or paragraph 3? Could it be either? Could the lines be in a
frame or text box? Are the documents I'm changing consistently laid out
so that I always need to do the same thing? What about formatting?)
b. start by looking for \par }, perhaps \pard }, and/or \line (not
"\line }") and counting. There may well be a couple of other things you
would need to look for
c. insert either "my text \line" after a \line or { my text \par }
after a \par
d. make the simplifying assumption that whatever opens/prints these
.rtf files next knows how to paginate.
e. see how that goes.
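
Steps b and c could be sketched roughly like this (in Python rather than VB, with invented names `par_positions` and `insert_after_paragraph`). It walks the RTF, counts `\par` control words without being fooled by `\pard` or by an escaped backslash followed by the letters "par", and inserts `{my text\par }` after the Nth one. It assumes the text to insert contains no backslashes or braces, and it does not try to skip `\par` inside headers, footers or text boxes, which is one of the "other things" a real implementation would have to look for.

```python
import re
from typing import Iterator

# A control word: letters, an optional numeric parameter, and an optional
# single space that acts as its delimiter.
WORD = re.compile(r"([a-zA-Z]+)(-?\d+)? ?")


def par_positions(rtf: str) -> Iterator[int]:
    """Yield the index just past each \\par control word (and delimiter)."""
    i = 0
    while i < len(rtf):
        if rtf[i] == "\\":
            m = WORD.match(rtf, i + 1)
            if m:
                if m.group(1) == "par":   # exact match, so not \pard
                    yield m.end()
                i = m.end()
            else:
                i += 2                    # control symbol such as \\ \{ \}
        else:
            i += 1


def insert_after_paragraph(rtf: str, n: int, text: str) -> str:
    """Insert text as its own paragraph after the n-th \\par."""
    for count, pos in enumerate(par_positions(rtf), start=1):
        if count == n:
            return rtf[:pos] + "{" + text + "\\par }" + rtf[pos:]
    raise ValueError("document has fewer than %d paragraphs" % n)


doc = r"{\rtf1\ansi\pard one\par two\par three\par}"
# Put new text on "line 3" by inserting after the second paragraph.
print(insert_after_paragraph(doc, 2, "NEW"))
```

Because the new text goes in right after the `\par`, it can end up inside whatever formatting group surrounded the previous paragraph, so it will tend to inherit that paragraph's character formatting.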

Peter Jamieson

http://tips.pjmsn.me.uk
Visit Londinium at http://www.ralphwatson.tv
 
