Opening UTF8-encoded CSV file (Excel 2004 for Mac)

A

Antonio

Is there a workaround to this issue yet? I see no option to specify
the character encoding of a CSV file when it is being opened. I've
seen several posts in this group about this issue over the last few
years, with no resolution. There seems to be no useful knowledge base
information either.

Using the "Data/Get External Data/Import Text File..." menu does not
work either, as the list of encodings it gives is limited to a sorry
list of Macintosh, Windows, or DOS.

Help?

[Ironically, I've been using Open Office to do this with no trouble. I
was going to switch to Excel because it's faster, but this is a deal
breaker.]

Thanks.
 
J

Joel

One possible reason for your problem could be the way the different operating
system handles the carriage returns and line feed. The macro below fixes the
carriage returns and linefeed so the file so it is compatable with Windows.
Modify the Path and input and output file names as needed.

The macro runs on an input and output file and does not bring any data into
excel. After the macro is run take the output file and import the file into
Excel.

On UNIX systems there are two programs that automatically perform the
conversion which are unix2dos and dos2unix. Another way of correting the
problem is if you use FTP to transfer files between systems (not in binary
mode). FTP should correct the differences in formats when the files are
transfered.

I alswo thought that Macintosh computers had utilites to do the conversion
to windows. Try converting the files on the Mac before bringing them to
windows.


Sub fixcrlf()

Const ForReading = 1, ForWriting = 2, ForAppending = 3
Const MyPath = "C:\temp\"
Const TristateUseDefault = -2, TristateTrue = -1, TristateFalse = 0

Dim CR As String
Dim LF As String

CR = Chr(13)
LF = Chr(10)


Set fsread = CreateObject("Scripting.FileSystemObject")
Set fswrite = CreateObject("Scripting.FileSystemObject")

ReadFileName = "intext.txt"
WriteFileName = "outtext.txt"


'open files
ReadPathName = MyPath + ReadFileName
Set fread = fsread.GetFile(ReadPathName)
Set tsread = fread.OpenAsTextStream(ForReading, TristateUseDefault)

WritePathName = MyPath + WriteFileName
fswrite.CreateTextFile WritePathName
Set fwrite = fswrite.GetFile(WritePathName)
Set tswrite = fwrite.OpenAsTextStream(ForWriting, TristateUseDefault)

OutputLine = ""
FoundCR = False
Do While tsread.atendofstream = False

MyChar = tsread.Read(1)

Select Case MyChar

Case LF
If FoundCR = False Then
tswrite.write CR
Else
FoundCR = False
End If
Case CR
If FoundCR = True Then
tswrite.write LF
Else
FoundCR = True
End If
Case Else
If FoundCR = True Then
tswrite.write LF
FoundCR = False
End If
End Select
tswrite.write MyChar
Loop

tswrite.Close
tsread.Close

End Sub

Antonio said:
Is there a workaround to this issue yet? I see no option to specify
the character encoding of a CSV file when it is being opened. I've
seen several posts in this group about this issue over the last few
years, with no resolution. There seems to be no useful knowledge base
information either.

Using the "Data/Get External Data/Import Text File..." menu does not
work either, as the list of encodings it gives is limited to a sorry
list of Macintosh, Windows, or DOS.

Help?

[Ironically, I've been using Open Office to do this with no trouble. I
was going to switch to Excel because it's faster, but this is a deal
breaker.]

Thanks.
 
A

Antonio

One possible reason for your problem could be the way the different operating
system handles the carriage returns and line feed. The macro below fixes the
carriage returns and linefeed so the file so it is compatable with Windows.
Modify the Path and input and output file names as needed.

This is not a CR-LF vs. LF issue. It is an encoding issue. Excel does
not (as far as I can see) give the option of opening a CSV file in the
standard UTF-8 encoding for Unicode. It has no trouble finding the
ends of lines and separating the lines into columns according to the
commas. It simply misinterprets all non-ASCII UTF-8 characters. The
result is a spreadsheet with all the numerical data correct, but all
the textual data corrupted.

But thanks.

(In any case, OS X *is* a Unix operating system, and my source files
come from another flavor of Unix. So this wouldn't be an issue.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top