Non-breaking hyphen disappears when using Paste Special

Mary

I frequently use Paste Special (as unformatted text) to copy text between
documents. Today I noticed that non-breaking hyphens disappear if they have
been created in the source file by using CTRL+Shift+Hyphen or by using
Insert>Symbol>Non-breaking hyphen. They do come over intact if they have
been inserted using ALT+0173. Otherwise they are replaced by a regular
space. Why does this happen? I'm using Word 2000 on Windows XP. Does this
happen in other versions of Word also? You would think that Microsoft's
suggested shortcut would not cause this problem.

Has anyone experienced similar problems with other symbols such as
em-dashes, en-dashes, non-breaking spaces?

Thanks.
 
Klaus Linke

Mary said:
I frequently use Paste Special (as unformatted text) to copy text
between documents. Today I noticed that non-breaking hyphens
disappear if they have been created in the source file by using
CTRL+Shift+Hyphen or by using Insert>Symbol>Non-breaking
hyphen. They do come over intact if they have been inserted using
ALT+0173. Otherwise they are replaced by a regular space.
Why does this happen? I'm using Word 2000 on Windows XP.
Does this happen in other versions of Word also? You would think
that Microsoft's suggested shortcut would not cause this problem.

Has anyone experienced similar problems with other symbols
such as em-dashes, en-dashes, non-breaking spaces?

Thanks.


Hi Mary,

Yes, it's a bug. In fact it's a real bummer, because it is made up of several bugs.

The non-breaking hyphen is coded as ^30 (U+001E) in Word.
It does paste as a space when you use "Paste Special > Unformatted text".
The proper code would be U+2011, as far as I can see.

Word should at the very least paste a hyphen instead of a space.

Alt+0173 (U+00AD) should really be a soft hyphen, according to the Unicode
Standard.
Word seems to use Alt+31 (U+001F) for the soft hyphen, while Alt+0173 acts
much like a non-breaking hyphen.

¬ = Alt+0172 = U+00AC = "not sign" and
¶ = Alt+0182 = U+00B6 = "paragraph sign" are disappearing completely when
you copy them and paste as unformatted text.
Alt+0160 = U+00A0 = "no-break space" turns into a regular space.
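
When it is unclear which character actually arrived after a paste, it can help to list the code points one by one. The following is only a minimal sketch in Python, not anything Word provides; the helper name `describe_chars` is made up for illustration, and it assumes you already have the pasted text as a string.

```python
import unicodedata

def describe_chars(text):
    """Return (char, 'U+XXXX', name) for every character in text."""
    rows = []
    for ch in text:
        # Control characters such as U+001E have no Unicode name,
        # so a fallback label is used instead of raising ValueError.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((ch, f"U+{ord(ch):04X}", name))
    return rows

# Not sign, paragraph sign, no-break space, Alt+0173, Word's ^30
sample = "\u00ac\u00b6\u00a0\u00ad\u001e"
for _, code, name in describe_chars(sample):
    print(code, name)
```

Running the same helper on the text before and after pasting shows which characters were dropped or replaced.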

Manual line breaks in Word use the code ^11 (U+000B) instead of the Unicode
character U+2028 ("line separator"). They turn into paragraph marks.
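
If you can get at the raw text with Word's internal codes still in it (an assumption; plain "Paste Special" will not give you that), the codes mentioned above could be mapped to their Unicode counterparts outside Word. This is just a sketch: the table and the function name `normalize_word_text` are hypothetical, and the mapping only covers the three codes discussed in this post.

```python
# Word's internal codes and the Unicode characters discussed above.
WORD_TO_UNICODE = {
    0x001E: "\u2011",  # Word non-breaking hyphen (^30) -> NON-BREAKING HYPHEN
    0x001F: "\u00ad",  # Word optional hyphen (^31)     -> SOFT HYPHEN
    0x000B: "\u2028",  # Word manual line break (^11)   -> LINE SEPARATOR
}

def normalize_word_text(text):
    """Replace Word's internal control codes with Unicode characters."""
    # str.translate accepts a dict keyed by code point.
    return text.translate(WORD_TO_UNICODE)

raw = "non\x1ebreaking\x0bnext line"
print(repr(normalize_word_text(raw)))
```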

If you save text from Word as Unicode text files, the replacements are even
worse: smart quotes turn into straight quotes, the en-dash into a
regular dash.
If you save as plain text, all characters with codes between 129 and 160
are botched up: the copyright symbol turns into (c), ...
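
To see where a given character sits in the Windows code page, a quick check like the one below may help. It is a sketch only, assuming that "plain text" here means Windows-1252 (the usual Western "ANSI" code page); it shows what Python's codecs do and does not reproduce Word's own export logic. The names `cp1252_byte` and `to_ascii_lossy` are invented for this example.

```python
samples = {
    "left double quote": "\u201c",
    "right double quote": "\u201d",
    "en dash": "\u2013",
    "em dash": "\u2014",
    "copyright sign": "\u00a9",
}

def cp1252_byte(ch):
    """Return the single byte value of ch in Windows-1252."""
    return ch.encode("cp1252")[0]

def to_ascii_lossy(text):
    """Encode to ASCII, replacing anything unrepresentable with '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")

for label, ch in samples.items():
    print(f"{label}: U+{ord(ch):04X} -> cp1252 byte {cp1252_byte(ch)}")

print(to_ascii_lossy("\u201cquoted\u201d \u2013 \u00a9"))
```

The smart quotes and dashes land in the upper 128 to 159 block of the code page, which is the range that plain 7-bit or Latin-1 oriented tools tend to mishandle.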

I doubt that any of this is fixed in Word2002 or 2003: Even obvious bugs
like these (some of which may be considered strange design decisions)
usually aren't fixed because of "compatibility reasons".

:-( Klaus
 
Mary

Thanks Klaus,

In view of what you've said, would it be better then to always insert
non-breaking hyphens as ALT+0173? (I've forgotten -- is this an ANSI code?)
I'm not in the habit of saving Word files in Unicode format, so that's not an
issue.

A kind of related question -- do symbols such as Wingdings appear
differently in Word XP vs. Word 97/2000? Are there problems with symbols
such as copyright, trademark, em-dashes or others depending on how they've
been inserted in the source document? If so, what's the best way to avoid
conflict? Although I work on Word 2000, most of my colleagues are still on
Word 97, while many of our clients are on Word XP. They say they are
experiencing problems with symbols changing. I'm wondering if it's just due
to choosing symbols from font sets that are not available on client systems.
 
Klaus Linke

Hi Mary,

Mary said:
In view of what you've said, would it be better then to always insert
non-breaking hyphens as ALT+0173 (I've forgotten -- is this ANSI code?).

As far as I can see, Alt+0173 is supposed to be a soft hyphen (both in ANSI
and in Unicode), and MS messed it up.

I personally would be reluctant to use it:

-- The bug may be fixed (perhaps it already is fixed in Word2002/2003; I
can't check that right now).
In this case, Alt+0173 would vanish.

-- When you paste Alt+0173, the character may get lost, too.
I posted a list of characters in a newsgroup message a while ago, and the
character Alt+0173 disappeared:
http://www.google.com/groups?threadm=#[email protected]

Unfortunately this means that there isn't a good work-around for the "Paste
Special > Unformatted text" bug.

:-(

Mary said:
I'm not in the habit of saving Word files in unicode format, so that's not
an issue.

Word uses Unicode in its documents. You don't have to specify anything.

Mary said:
A kind of related question -- do symbols such as Wingdings appear
differently in Word XP vs. Word 97/2000? [...]

As you say, the main thing to watch out for is that the necessary fonts are
installed.

The copyright and trademark signs, em-dashes, and so on are available in just
about every font, so there isn't much that can go wrong: even if another font
is substituted, they should display fine.

For more exotic symbols (Greek letters, bullets, arrows, ...) I prefer to
use Unicode characters rather than symbols from Symbol fonts (=
"decorative" fonts) like "Symbol", "Wingdings", "Zapf Dingbats" ...

If you exchange documents with others, and they don't have your fonts
installed, another font will be substituted.

If that substituted font doesn't contain a symbol you used, they will see a
square box instead.

But since the code behind the character is still for the proper Unicode
character, they just have to apply a font that does contain the character.
If in doubt, applying "Arial Unicode MS" will surely work, since it contains
all characters in the Unicode Standard 2.

Finding/replacing symbols is also much easier with Unicode symbols than
with symbols from symbol fonts: In most cases, you can just copy the
character into the dialog.
Finding/replacing symbols from symbol fonts can be a real pain.
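
One way to spot symbol-font characters in text that has been extracted from a document is to look for Private Use Area code points. This rests on an assumption that is not stated in this thread: that characters from fonts like Symbol or Wingdings come out as code points in the U+F000 to U+F0FF range rather than as real Unicode characters. The helper name `find_private_use` is hypothetical; treat this as a rough sketch.

```python
import unicodedata

def find_private_use(text):
    """Return (index, 'U+XXXX') for each Private Use Area character."""
    hits = []
    for i, ch in enumerate(text):
        # Category "Co" marks private-use code points, which have no
        # meaning without the font that was applied to them.
        if unicodedata.category(ch) == "Co":
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

# U+F0E0 stands in for a symbol-font character (assumed);
# U+2192 is the real Unicode rightwards arrow.
print(find_private_use("a\uf0e0b\u2192"))
```

Anything this reports depends on the original font being installed, whereas a real Unicode character such as U+2192 only needs some font that contains it.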

The only reason I can see to still use Symbol fonts would be compatibility
considerations with old software that doesn't support Unicode.

Regards,
Klaus
 
