Range.Text method and problem with special chars

Neo

Hello All,

I have asked this before, but I would like to put the question again
from a new perspective.

I want to insert text from a given Word doc into a database. I am using
the Range.Text method, hoping to get a plain-text version of the
formatted content, but I was wrong. Range.Text gives absurd values for
special chars, as shown in the following doc. Possibly it returns just
the first byte.

Have a look at following doc.

http://www.ccs.neu.edu/home/pravin/Doc1.doc

If you run the macro inside, which prints the only special symbol in
the doc, "->", Range.Text gives "(".

Now, if you save this document as text, you will see that the text
file contains "-->". This is precisely what I want. Is there any way I
can achieve this with the Range object, i.e. get the equivalent text
expression for a given special char?

Best Regards,
Pravin A. Sable
 
Klaus Linke

Hi Neo,

"Save as text" will do a number of conversions from symbols to text, such as copyright symbol © to (C) ...

They don't seem to be documented anywhere, and there's probably no way to achieve the same result without actually saving as text.

Wouldn't that be an option? You could create a dummy temp file, and delete it after you are done.

Alternatively, in the document itself you could replace the Wingdings character with -> ... then get the text, then undo the replacement (ActiveDocument.Undo).
You can find it by looking for ^u224 in "Find what" (which is pretty stupid, because the character definitely isn't Unicode character 224 -- but there are a lot of stupid things and problems associated with characters from decorative fonts).

You'll get the same problem for many other symbols though, and would have to write special code for each of them.
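
The "special code for each of them" could be sketched as a small replacement table applied to the string after it has been read. This is only an illustration: the function name symbolsToAscii is made up, and apart from © to (C) the entries are assumptions, not Word's actual "Save as text" list. It also only helps for characters that Range.Text returns as real Unicode; a decorative-font symbol like the Wingdings arrow would still have to be replaced in the document first.

```vba
' Hypothetical helper: map a few Unicode symbols to ASCII stand-ins.
' The table is illustrative only, not Word's own conversion list.
Function symbolsToAscii(ByVal s As String) As String
    s = Replace(s, ChrW(169), "(C)")     ' copyright sign
    s = Replace(s, ChrW(174), "(R)")     ' registered sign
    s = Replace(s, ChrW(8482), "(TM)")   ' trade mark sign
    s = Replace(s, ChrW(8594), "->")     ' rightwards arrow
    s = Replace(s, ChrW(8230), "...")    ' horizontal ellipsis
    symbolsToAscii = s
End Function
```

It could be called as symbolsToAscii(ActiveDocument.Range.Text), with more entries added as new symbols turn up.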

Regards,
Klaus
 
Neo

Hello Klaus,

I don't like the idea of saving the file as text first, because I want
to parse the file using Word's Range.Find etc. I also realize that I
could open the saved text file in Word and then parse it, but there are
a lot of characters in the saved text document that are not US-ASCII :)
so I would have to replace special chars anyway. So why not do it in
Word itself?
Another reason I don't want a text file is that I lose all formatting
in it. I am searching the file and I don't want to lose formatted-text
search. Although in the end I insert only US-ASCII text into the
database, I search the Word doc with the necessary formatting.

Currently I am doing what is shown below, and then I replace any
special chars in the returned string (which are pretty strange chars
though! like ·ð Ò à ). The range is a very small part of the doc, so
this seems to be pretty fast.

It looks like the clipboard retains more bytes per char than Range.Text
does.

Any thoughts?

Function getRangeNumbedText(r As Range) As String
    ' DataObject needs a reference to the Microsoft Forms 2.0 Object Library.
    On Error GoTo error_handler

    Dim dataobj As New DataObject
    Dim sOldClipBoardData As String

    If r.Text = "" Then
        GoTo exit_handler
    End If

    ' Get the old data from the clipboard
    dataobj.GetFromClipboard
    sOldClipBoardData = dataobj.GetText

    ' Copy the range and read it back as text
    r.Copy
    dataobj.GetFromClipboard
    getRangeNumbedText = dataobj.GetText

    ' Put the old data back on the clipboard
    dataobj.SetText sOldClipBoardData
    dataobj.PutInClipboard

    Set dataobj = Nothing
exit_handler:
    On Error Resume Next
    Exit Function
error_handler:
    ' informError is my own helper (not shown in this post)
    informError "Error Desc : " & Err.Description & vbCrLf & "Error Num : " & Err.Number, vbInformation, "Error"
    Resume exit_handler
    Resume

End Function
 
Neo

Hello Klaus,

Well, in the end it wasn't that great :-( It broke when the clipboard
held data in a format other than text, such as a picture. And that
happened while giving a demo :-( If I am supposed to use this, then I
should just give up on restoring the old clipboard data. Which is not a
bad idea, but I don't want to do that.
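
One way around that failure, as a rough sketch only: save and restore the old clipboard contents just when they are plain text. This assumes the MSForms DataObject (reference to the Microsoft Forms 2.0 Object Library) and its GetFormat method with format 1 meaning text; the function name getRangeTextViaClipboard is made up for illustration. A picture or other non-text data on the clipboard is still lost, it just no longer raises an error.

```vba
' Hypothetical variant of the clipboard approach.
' Requires a reference to the Microsoft Forms 2.0 Object Library.
Function getRangeTextViaClipboard(r As Range) As String
    Const TEXT_FORMAT As Long = 1   ' DataObject format code for text
    Dim dataobj As New DataObject
    Dim sOld As String
    Dim bHadText As Boolean

    If Len(r.Text) = 0 Then Exit Function

    ' Remember the old clipboard contents only if they are text
    dataobj.GetFromClipboard
    bHadText = dataobj.GetFormat(TEXT_FORMAT)
    If bHadText Then sOld = dataobj.GetText(TEXT_FORMAT)

    ' Copy the range and read it back as text
    r.Copy
    dataobj.GetFromClipboard
    If dataobj.GetFormat(TEXT_FORMAT) Then
        getRangeTextViaClipboard = dataobj.GetText(TEXT_FORMAT)
    End If

    ' Restore the old text; non-text data cannot be restored this way
    If bHadText Then
        dataobj.SetText sOld
        dataobj.PutInClipboard
    End If
End Function
```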

But I found another solution, and this is a cool one. It creates a new
temp document and saves it in text format, but only the given range,
not the whole document. There are various reasons for this; one of them
is that I am not interested in most of the doc, just a small range in
it, and I want to use Word's searching with formatting etc.


Public tempDoc As Document

Sub main()

    createTempDoc

    MsgBox getRangeNumbedText(ThisDocument.Range)

    deleteTempDoc

End Sub

Function getRangeNumbedText(r As Range) As String

    On Error GoTo error_handler

    ' Copy the formatted range into the temp document
    tempDoc.Content.FormattedText = r.FormattedText
    ' Save under the current file name, in text format
    tempDoc.SaveAs , wdFormatText, , , False
    ' Reload the saved file and pick up the document reference again
    tempDoc.Reload
    Set tempDoc = Documents("temp.doc")
    ThisDocument.Activate
    DoEvents
    getRangeNumbedText = tempDoc.Content.Text

exit_handler:
    On Error Resume Next
    Exit Function
error_handler:
    ' informError is my own helper (not shown in this post)
    informError "Error Desc : " & Err.Description & vbCrLf & "Error Num : " & Err.Number, vbInformation, "Error"
    Resume exit_handler
    Resume

End Function


Sub createTempDoc()
    Dim doc As Variant
    Dim Found As Boolean

    ' Reuse temp.doc if it is already open
    Found = False
    For Each doc In Documents
        If doc.Name = "temp.doc" Then
            Set tempDoc = doc
            Found = True
            Exit For
        End If
    Next doc

    ' Otherwise create it next to this document
    If Not Found Then
        Set tempDoc = Word.Documents.Add
        tempDoc.SaveAs addSlash(ThisDocument.Path) & "temp.doc", , , , False
        ThisDocument.Activate
        DoEvents
    End If
End Sub

Sub deleteTempDoc()
    On Error Resume Next
    tempDoc.Close wdDoNotSaveChanges
    Set tempDoc = Nothing
    Kill addSlash(ThisDocument.Path) & "temp.doc"
End Sub
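
addSlash is called above but is not shown in the thread. A minimal guess at what such a helper does, assuming it only appends a trailing backslash to a folder path when one is missing:

```vba
' Assumed behaviour only; the original addSlash was not posted.
Function addSlash(ByVal sPath As String) As String
    ' Append a backslash unless the path already ends with one
    If Right$(sPath, 1) <> "\" Then
        sPath = sPath & "\"
    End If
    addSlash = sPath
End Function
```

With this version, addSlash("C:\Docs") and addSlash("C:\Docs\") both give "C:\Docs\".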
 
