Gary Smith wrote...
So far as I know, there is no standard. Every software producer seems to
have come up with its own variant, and some use more than one. The best
are configurable so you can produce whatever you like.
Further, CSV files are text files. There's no more standardization in
text files than there is in CSV files. On IBM mainframes, text files
use EBCDIC character encoding, while on most other systems they use
ASCII or Unicode. On Unix systems, lines end with a single linefeed
character, on Macs with a single carriage return, and on PCs/Windows
systems with a carriage return-linefeed combination. Programmers call
each of these a newline.
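To make the text-file differences concrete, here is a minimal Python sketch. The sample data and the helper name split_records are made up for illustration, and cp037 is just one of several EBCDIC code pages, chosen as an assumption.

```python
# The same two records under the three line-ending conventions.
unix_text = "a,b\nc,d\n"      # linefeed only
mac_text = "a,b\rc,d\r"       # carriage return only
dos_text = "a,b\r\nc,d\r\n"   # carriage return + linefeed

def split_records(text):
    # Hypothetical helper: str.splitlines() treats \n, \r and \r\n each
    # as a single line break (it also breaks on a few rarer characters).
    return text.splitlines()

unix_rows = split_records(unix_text)
mac_rows = split_records(mac_text)
dos_rows = split_records(dos_text)

# The same text has different bytes in EBCDIC (code page 037 assumed)
# and in ASCII, so the encoding must be known before parsing.
ebcdic_bytes = "a,b".encode("cp037")
ascii_bytes = "a,b".encode("ascii")
round_trip = ebcdic_bytes.decode("cp037")
```

All three inputs split into the same two records, while the EBCDIC and ASCII byte strings differ even though they decode to the same text.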
Some software uses Unix escape character conventions for embedding
double quotes, e.g., "The next character \" is a double quote.", while
Excel uses doubled double quotes, e.g., "The next character "" is a
double quote." Most software doesn't accept newlines in fields, but
Excel does as long as they're embedded in double quote delimited
fields.
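As a rough illustration of the two quoting conventions, the following sketch uses Python's standard csv module. The variable names and sample strings are assumptions for the example; other CSV readers may behave differently.

```python
import csv
import io

# One field written with each convention for embedding a double quote.
backslash_style = '"The next character \\" is a double quote."\n'
doubled_style = '"The next character "" is a double quote."\n'

# Backslash convention: turn off quote doubling, declare the escape character.
backslash_rows = list(csv.reader(io.StringIO(backslash_style, newline=""),
                                 doublequote=False, escapechar="\\"))

# Doubled-quote convention (what Excel writes); this is the csv default.
doubled_rows = list(csv.reader(io.StringIO(doubled_style, newline="")))

# A newline embedded in a double-quoted field stays inside that field.
multiline_text = '"line one\nline two",x\n'
multiline_rows = list(csv.reader(io.StringIO(multiline_text, newline="")))
```

Both readers recover the same field, but only because each was told which convention to expect. Reading the backslash-style line with the default settings would not give the intended text.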
Finally, the comma isn't always the field separator. Most continental
European countries use the comma as the decimal separator in numbers
with fractional parts, so a comma-delimited file would be ambiguous
there. Excel uses the Windows list separator character as the field
separator in CSV files, and in those locales it is typically a
semicolon.
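Here is a small sketch of what this means for a reader, assuming a file written with a semicolon as the list separator and a comma as the decimal separator (a common combination, but still an assumption). The sample data is invented.

```python
import csv
import io

# Hypothetical export from a locale that uses decimal commas.
semicolon_text = "name;price\nwidget;3,14\n"

# Reading with the right delimiter keeps "3,14" together as one field.
rows = list(csv.reader(io.StringIO(semicolon_text, newline=""), delimiter=";"))

# The number still has to be converted by hand (or with locale-aware code).
price = float(rows[1][1].replace(",", "."))

# Reading the same text as comma-separated splits the number in two.
wrong_rows = list(csv.reader(io.StringIO(semicolon_text, newline="")))
```

The delimiter has to be known or guessed in advance. The csv module also offers csv.Sniffer for guessing, though its results are heuristic.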
Unfortunately, there is no binding specification for CSV files. RFC
4180 describes a common format, but it is only informational and much
software deviates from it. As general rules, it's safest to put double
quotes around all strings and never embed newlines. As for embedding
double quotes in text fields, you'll need to experiment.
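One way to follow these rules when writing a file is sketched below with Python's csv module. The helper name flatten and the sample row are made up, and replacing newlines with spaces is just one possible policy.

```python
import csv
import io

def flatten(value):
    # Hypothetical helper: replace any embedded line break in a string
    # with a single space; leave non-string values untouched.
    if isinstance(value, str):
        return " ".join(value.splitlines())
    return value

buf = io.StringIO(newline="")
# QUOTE_NONNUMERIC puts double quotes around every non-numeric field.
writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
writer.writerow([flatten(v) for v in ["widget", 3, 2.5, "multi\nline"]])

output = buf.getvalue()
```

With these settings the row is written as "widget",3,2.5,"multi line" on a single line. Embedded double quotes would still be doubled by default, so check that the receiving program accepts that.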