The Mysterious ^p and CR/LF

JimEagleOne

I'm trying to unravel the mysteries of the ^p "character" and trying to use
search and replace.

I have numerous ASCII text files. They unfortunately have solitary CR
characters (Hex 0D) buried in the text, often in series, as well as the
standard CR/LF. So, I thought I would use Word to delete them. It's turned
out to be a bigger task than I thought.

These Hex 0D characters are shown in Word two ways using the Courier New font:
1. In the middle of a line, it appears as a box.
2. At the end of a line, it disappears next to the end-of-line marker even
when Show Formatting Marks is turned on, so you can never really see
it. A hex dump shows the characters to be Hex 0D0D0A.

I have huge files I have to process, so I've made myself a macro to try to
delete these evasive little characters. But it's been rather difficult.

But here's one process I created that sort of works. It's a three pass
routine:

1. Replace all LFs with a character string of "~~~%". This method exposes
all the solitary CRs and Word internally converts their display to paragraph
marks.
2. Next step is to delete all the solitary CRs.
3. Finally, revert the true end-of-line markers "~~~%" back to "^p" and we
are done.
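
For comparison, the same three passes can be sketched on the raw bytes outside Word, where nothing gets reinterpreted as ^p. This is only a rough Python sketch, not the Word macro: the function name three_pass_clean is made up for illustration, and it assumes the marker "~~~%" never occurs in the text.

```python
PLACEHOLDER = b"~~~%"  # same marker as in the steps above; assumed absent from the text


def three_pass_clean(data: bytes) -> bytes:
    """Hypothetical helper: apply the three passes to raw file bytes."""
    if PLACEHOLDER in data:
        raise ValueError("placeholder already occurs in the data")
    # Pass 1: replace every LF with the marker, leaving all CRs exposed.
    data = data.replace(b"\n", PLACEHOLDER)
    # Pass 2: delete every CR that is left (solitary ones and the CR of CR/LF).
    data = data.replace(b"\r", b"")
    # Pass 3: turn the marker back into a full CR/LF pair.
    return data.replace(PLACEHOLDER, b"\r\n")


if __name__ == "__main__":
    sample = b"one\rtwo\r\r\nthree\r\n"
    print(three_pass_clean(sample))
```

Because pass 3 writes CR and LF explicitly, the line ends come out as real CR/LF pairs. One side effect to be aware of: any bare LF in the input also becomes CR/LF.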

After that, the appearance of the text is correct and all the solitary CRs
are gone. But there's a catch. All the CRs are now solitary CRs in the code
and are no longer CR/LF even though it displays correctly. And I don't see a
way to convert them back using a macro. I've tried replacing ^p with ^13^10,
but that just adds an extra box character. Maybe it doesn't matter after the
text file is saved.

I'm stumped. Is there a better way?
 
Helmut Weber

Hi Jim,

I think the best way would be not to use Word at all,
at least not Word's surface or a Word document.
Too many interferences.
If you want to clean a file from chr(13),
except chr(13) followed by chr(10),
then I would read the file byte by byte,
check the next byte, and write a new file byte by byte.

But I don't feel qualified to say more on this.
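
Still, the byte-by-byte idea might look roughly like the sketch below. It is written in Python rather than VBA, purely as an assumption about what is convenient outside Word; the function name strip_lone_cr and the file paths are invented for illustration. A CR is held back until the next byte is known, and it is written only if that byte is an LF.

```python
CR, LF = b"\r", b"\n"


def strip_lone_cr(src_path, dst_path):
    """Hypothetical helper: copy src to dst, dropping every CR not followed by LF."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        pending_cr = False  # True while a CR is held back, waiting for the next byte
        while True:
            byte = src.read(1)
            if not byte:
                break  # end of file; a CR still held back here is solitary and is dropped
            if byte == CR:
                # Hold this CR. A CR that was already held was solitary, so it is dropped.
                pending_cr = True
                continue
            if byte == LF and pending_cr:
                dst.write(CR)  # the held CR belongs to a genuine CR/LF pair
            pending_cr = False
            dst.write(byte)


if __name__ == "__main__":
    import sys
    strip_lone_cr(sys.argv[1], sys.argv[2])
```

Run on a file containing Hex 0D0D0A at a line end, this should leave just 0D0A there. Bare LFs are passed through unchanged.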

Anyway, if I run this code on an ordinary Word file:

Sub test9001()
    Dim rDcm As Range
    Set rDcm = ActiveDocument.Range
    With rDcm.Find
        .Text = "^13"
        While .Execute
            ' Overwrite the found chr(13) with CR + LF
            rDcm.Text = vbCrLf
            ' Reset the range for the next search
            rDcm.Start = rDcm.End + 2
            rDcm.End = ActiveDocument.Range.End - 2
        Wend
    End With
End Sub

and open it again, then chr(13)
has been replaced with chr(13) chr(10), i.e. vbCrLf.

Except for the last character in the doc.
There is no way, it seems, to do it for the end of the doc.

Sorry.

--
Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Win XP, Office 2003
"red.sys" & Chr$(64) & "t-online.de"
 
