Word 2007 byte value representation of Unicode characters

wjm

I've inserted the character U+0419 (hex 0419) into my Word 2007 document using
Insert Symbol with the Arial Unicode MS font.

When I append ".zip" to the document's file name, extract it into a separate
folder, and open the "document.xml" file, I see the following markup:

<w:r>
  <w:rPr>
    <w:rFonts w:ascii="Arial Unicode MS" w:eastAsia="Arial Unicode MS"
              w:hAnsi="Arial Unicode MS" w:cs="Arial Unicode MS" w:hint="eastAsia"/>
  </w:rPr>
  <w:t>Й</w:t>
</w:r>

When I put my cursor on the Unicode character and view its hexadecimal byte
representation, it appears as "D0 99".

What is the logic used to translate "0419" into "D0 99"?
 
Pesach Shelnitz

Hi,

You didn't say what software you used to reveal the hex codes of the
characters, so I couldn't reproduce exactly what you observed. Instead, I
copied and pasted the markup in your posting into a blank Word document,
placed the cursor after the Cyrillic capital letter short I (Й), and pressed
Alt+X to reveal the hexadecimal value of this character. The result is 0419.
Thus, the character that you inserted (Unicode hex 0419) is the same
character that appears in the markup in your posting, and no translation has
been performed. Am I missing something in your question?

Thanks,
Pesach Shelnitz
 
Tony Jollans

I can't reproduce this either. Where are you seeing the hex code D0 99, and
how are you displaying it?
 
Tony Jollans

OK, I understand now. What you are seeing is the data as it is actually
stored. The file is encoded in UTF-8 (as declared at the beginning of the
file), and U+0419 becomes the two bytes 0xD0 0x99 when encoded as UTF-8.
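
For anyone who wants to see the bit arithmetic behind that conversion, here is a rough Python sketch. It is only an illustration and has nothing to do with how Word itself writes the file. The function name utf8_encode_two_byte is made up for this example, and it deliberately handles only code points from U+0080 to U+07FF, which UTF-8 stores as two bytes of the form 110xxxxx 10xxxxxx.

```python
def utf8_encode_two_byte(code_point: int) -> bytes:
    """Hand-encode a code point in U+0080..U+07FF as two UTF-8 bytes.

    Hypothetical helper for illustration only; real code should just
    call str.encode("utf-8").
    """
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only the two-byte UTF-8 range is handled here")
    # Lead byte: marker bits 110 followed by the top 5 bits of the code point.
    lead = 0b11000000 | (code_point >> 6)
    # Trail byte: marker bits 10 followed by the low 6 bits of the code point.
    trail = 0b10000000 | (code_point & 0b00111111)
    return bytes([lead, trail])


encoded = utf8_encode_two_byte(0x0419)
print(encoded.hex(" ").upper())              # D0 99
print(encoded == "\u0419".encode("utf-8"))   # True
```

In binary, 0x0419 is 100 0001 1001. The top five bits (10000) go into the lead byte, giving 1101 0000 = D0, and the low six bits (011001) go into the trail byte, giving 1001 1001 = 99. Python's built-in encoder produces the same two bytes.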
 
wjm

Great. Thanks.



 
