Removing an EOF marker in a text file for importation

B

berybery

Hello,
I'm using ACCESS 2003.

Whenever I try to import a text file into ACCESS, the sample display in the
Import/Export wizard shows me garbage. If I remove the EOF marker in the text
file it comes into ACCESS nicely.

Is there a way for ACCESS to remove the EOF marker for me, without me having
to do it manually?
Thank you,
Karlene
 
J

John Nurick

Hi Karlene,

Windows text files don't normally have or need special terminating
symbols (their size is stored in the directory entry). Anyway, using
Access 2003 I have no problems importing text files terminated with or
including the CP/M-DOS EOF character (\x1A), and the Unix equivalent
(\x04) seems to be treated as an ordinary character.

So I'm not sure what you mean by "the EOF marker", or what sort of text
files you are trying to import. Can you give some more details?
 
B

berybery

Hello John,
The file is coming from a VSE\ESA main frame environment, using the file
transfer feature, from an EBCIDEC to ASCII format with this EOF marker, that
looks like a little square.

Believe it or not, Access 2.0 takes this file with the EOF marker with no
worries. I'm actually trying to upgrade an old Access 2.0 application to
Access 2003. The EOF marker is giving me problems. I do not want to manually
remove it. Any help will be appreciated. Thank you
Karlene
 
J

John Nurick

Karlene,

Recent versions of Access are fussier about the text files they'll
import, though the problems I've seen to date involve end-of-line
markers rather than end-of-file ones.

But I can't imagine any single character at the *end* of a file mucking
up the preview in the text import wizard, and I've tested the EOF
characters I know of without provoking the trouble. So unless you tell
me which actual character is used as EOF in your system, I can't
reproduce the problem. (One way of finding out is to open the text file
with a hex editor such as Hexedit.)

I don't know a neat way of removing the last character from a file using
VBA. Usually it's necessary to write all but the last character to a new
file, and I'd probably do this by filtering it through a Perl one-liner,
e.g.
perl -0 -pe"chop" old.txt > new.txt
if the file is small enough to fit in memory. Maybe you could get the
mainframe people to do something like this as part of generating the
ASCII file for you.
 
B

berybery

Hello John,
Opened the file in Hexedit and it is a "Control-Z" character with a "/A" hex
value and an ASCII value of 26.

I am going to explore the option of the mainframe throwing out the file
minus the EOF marker.

What I discovered though, is that if I open and save the file as a notepad
file, then the file goes into ACCESS nicely with the end of file marker. This
is one way of working around the problem, but I'm still not too happy about
having to open the file with "live" data to do that.

Karlene
 
J

John Nurick

Karlene,

Now I'm mystified. Ctrl-Z (hex 1A, ASCII 26) is Windows's ancestral EOF
marker and, as I said in my first message, it doesn't normally cause any
problems and certainly doesn't screw up the preview in the text import
wizard. So I'm now fairly confident that your problem has nothing to do
with the EOF marker.

Instead, I think it's the record separators (line breaks). As the file
has come from a non-Windows system they may well be something other than
the Windows-standard hex 0D 0A (CR LF). Access versions up to 97
accepted plain CR or plain LF, but more recent versions require the two
characters CR LF.

What's more, when you open such a file in Notepad, edit it and save it,
Notepad writes it back to disk with CR LF pairs, after which Access can
handle it OK.

Please check with Hexedit what record separator the text file uses. If
it's just CR or just LF, you need to get it changed to CR LF. The best
way to do this (if possible) would be to get the mainframe people to
modify the script that creates the file.

If that's not possible, post back here: you'll need either to run the
file through a filter before importing it, or to write custom VBA code
to read and import the non-standard file.
 
B

berybery

Good morning John,
The record separator is identified by both the "OD and OA" characters. At
the end of the file we have the OD, OA and 1A characters.

I am interested though, in your last suggestion, running the file through a
filter or using some VB code.
Thank you,
Karlene
 
J

John Nurick

Karlene,

0D 0A at the end of each line is the Windows standard, and Access 2003
isn't bothered by a 1A at the end of the file, so something else must be
going on.

In your place, the next thing I'd do is run some tests on the text file,
checking for things such as the wrong number of fields in one or more
records, missing delimiters/qualifiers, line breaks within the data,
wrongly escapedd quote marks, etc. (Though if as you say the file can be
imported after you have opened it and saved it in Notepad, that doesn't
sound plausible either.)

If the file isn't confidential and isn't too large, I'd be happy to run
some tests on it if you zip it up and email it (omitting the bogus
middle name from my email address).
 
B

berybery

Hello John,
Will send a sample file via e-mail and will continue to explore my options
of producing the main frame file and getting rid of the EOF marker.
Karlene
 
J

John Nurick

Hi Karlene,

I've received the sample file. As I expected, the problem has nothing to do
with the EOF marker; it is all the null characters (Ascii 0, \x0) that are
upsetting things. Replace them with spaces and things work as normal.
Opening and saving in Notepad does this replacement without telling you.

If you can, get the mainframe guys to alter their export routine. If they
(or you) have Perl installed, all it takes is a one-liner like this to do
the replacement, leaving the original file renamed with .bak:

perl -i.bak -pe"tr/\0/ /" "D:\folder\filename.dat"

Otherwise it's possible to write VBA code to replace the nulls. You have to
read the text file line by line, replacing the null characters and writing
the edited line to a new file.

berybery said:
Hello John,
Will send a sample file via e-mail and will continue to explore my options
of producing the main frame file and getting rid of the EOF marker.
Karlene

John Nurick said:
Karlene,

0D 0A at the end of each line is the Windows standard, and Access 2003
isn't bothered by a 1A at the end of the file, so something else must be
going on.

In your place, the next thing I'd do is run some tests on the text file,
checking for things such as the wrong number of fields in one or more
records, missing delimiters/qualifiers, line breaks within the data,
wrongly escapedd quote marks, etc. (Though if as you say the file can be
imported after you have opened it and saved it in Notepad, that doesn't
sound plausible either.)

If the file isn't confidential and isn't too large, I'd be happy to run
some tests on it if you zip it up and email it (omitting the bogus
middle name from my email address).


Good morning John,
The record separator is identified by both the "OD and OA" characters.
At
the end of the file we have the OD, OA and 1A characters.

I am interested though, in your last suggestion, running the file
through a
filter or using some VB code.
Thank you,
Karlene

:

Karlene,

Now I'm mystified. Ctrl-Z (hex 1A, ASCII 26) is Windows's ancestral
EOF
marker and, as I said in my first message, it doesn't normally cause
any
problems and certainly doesn't screw up the preview in the text import
wizard. So I'm now fairly confident that your problem has nothing to
do
with the EOF marker.

Instead, I think it's the record separators (line breaks). As the file
has come from a non-Windows system they may well be something other
than
the Windows-standard hex 0D 0A (CR LF). Access versions up to 97
accepted plain CR or plain LF, but more recent versions require the
two
characters CR LF.

What's more, when you open such a file in Notepad, edit it and save
it,
Notepad writes it back to disk with CR LF pairs, after which Access
can
handle it OK.

Please check with Hexedit what record separator the text file uses. If
it's just CR or just LF, you need to get it changed to CR LF. The best
way to do this (if possible) would be to get the mainframe people to
modify the script that creates the file.

If that's not possible, post back here: you'll need either to run the
file through a filter before importing it, or to write custom VBA code
to read and import the non-standard file.






On Wed, 11 May 2005 12:14:04 -0700, "berybery"

Hello John,
Opened the file in Hexedit and it is a "Control-Z" character with a
"/A" hex
value and an ASCII value of 26.

I am going to explore the option of the mainframe throwing out the
file
minus the EOF marker.

What I discovered though, is that if I open and save the file as a
notepad
file, then the file goes into ACCESS nicely with the end of file
marker. This
is one way of working around the problem, but I'm still not too happy
about
having to open the file with "live" data to do that.

Karlene

:

Karlene,

Recent versions of Access are fussier about the text files they'll
import, though the problems I've seen to date involve end-of-line
markers rather than end-of-file ones.

But I can't imagine any single character at the *end* of a file
mucking
up the preview in the text import wizard, and I've tested the EOF
characters I know of without provoking the trouble. So unless you
tell
me which actual character is used as EOF in your system, I can't
reproduce the problem. (One way of finding out is to open the text
file
with a hex editor such as Hexedit.)

I don't know a neat way of removing the last character from a file
using
VBA. Usually it's necessary to write all but the last character to
a new
file, and I'd probably do this by filtering it through a Perl
one-liner,
e.g.
perl -0 -pe"chop" old.txt > new.txt
if the file is small enough to fit in memory. Maybe you could get
the
mainframe people to do something like this as part of generating
the
ASCII file for you.




On Tue, 10 May 2005 06:28:02 -0700, "berybery"

Hello John,
The file is coming from a VSE\ESA main frame environment, using
the file
transfer feature, from an EBCIDEC to ASCII format with this EOF
marker, that
looks like a little square.

Believe it or not, Access 2.0 takes this file with the EOF marker
with no
worries. I'm actually trying to upgrade an old Access 2.0
application to
Access 2003. The EOF marker is giving me problems. I do not want
to manually
remove it. Any help will be appreciated. Thank you
Karlene

:

Hi Karlene,

Windows text files don't normally have or need special
terminating
symbols (their size is stored in the directory entry). Anyway,
using
Access 2003 I have no problems importing text files terminated
with or
including the CP/M-DOS EOF character (\x1A), and the Unix
equivalent
(\x04) seems to be treated as an ordinary character.

So I'm not sure what you mean by "the EOF marker", or what sort
of text
files you are trying to import. Can you give some more details?


On Mon, 9 May 2005 07:33:11 -0700, "berybery"

Hello,
I'm using ACCESS 2003.

Whenever I try to import a text file into ACCESS, the sample
display in the
Import/Export wizard shows me garbage. If I remove the EOF
marker in the text
file it comes into ACCESS nicely.

Is there a way for ACCESS to remove the EOF marker for me,
without me having
to do it manually?
Thank you,
Karlene
 
B

berybery

Good Morning John,
Thank you for all the advice given. Yes, it is possible for the people from
the main frame to output that file with spaces at the end. So I think that's
where I will go. Thank you again,
Have a good day,
Karlene

John Nurick said:
Hi Karlene,

I've received the sample file. As I expected, the problem has nothing to do
with the EOF marker; it is all the null characters (Ascii 0, \x0) that are
upsetting things. Replace them with spaces and things work as normal.
Opening and saving in Notepad does this replacement without telling you.

If you can, get the mainframe guys to alter their export routine. If they
(or you) have Perl installed, all it takes is a one-liner like this to do
the replacement, leaving the original file renamed with .bak:

perl -i.bak -pe"tr/\0/ /" "D:\folder\filename.dat"

Otherwise it's possible to write VBA code to replace the nulls. You have to
read the text file line by line, replacing the null characters and writing
the edited line to a new file.

berybery said:
Hello John,
Will send a sample file via e-mail and will continue to explore my options
of producing the main frame file and getting rid of the EOF marker.
Karlene

John Nurick said:
Karlene,

0D 0A at the end of each line is the Windows standard, and Access 2003
isn't bothered by a 1A at the end of the file, so something else must be
going on.

In your place, the next thing I'd do is run some tests on the text file,
checking for things such as the wrong number of fields in one or more
records, missing delimiters/qualifiers, line breaks within the data,
wrongly escapedd quote marks, etc. (Though if as you say the file can be
imported after you have opened it and saved it in Notepad, that doesn't
sound plausible either.)

If the file isn't confidential and isn't too large, I'd be happy to run
some tests on it if you zip it up and email it (omitting the bogus
middle name from my email address).


On Thu, 12 May 2005 04:55:21 -0700, "berybery"

Good morning John,
The record separator is identified by both the "OD and OA" characters.
At
the end of the file we have the OD, OA and 1A characters.

I am interested though, in your last suggestion, running the file
through a
filter or using some VB code.
Thank you,
Karlene

:

Karlene,

Now I'm mystified. Ctrl-Z (hex 1A, ASCII 26) is Windows's ancestral
EOF
marker and, as I said in my first message, it doesn't normally cause
any
problems and certainly doesn't screw up the preview in the text import
wizard. So I'm now fairly confident that your problem has nothing to
do
with the EOF marker.

Instead, I think it's the record separators (line breaks). As the file
has come from a non-Windows system they may well be something other
than
the Windows-standard hex 0D 0A (CR LF). Access versions up to 97
accepted plain CR or plain LF, but more recent versions require the
two
characters CR LF.

What's more, when you open such a file in Notepad, edit it and save
it,
Notepad writes it back to disk with CR LF pairs, after which Access
can
handle it OK.

Please check with Hexedit what record separator the text file uses. If
it's just CR or just LF, you need to get it changed to CR LF. The best
way to do this (if possible) would be to get the mainframe people to
modify the script that creates the file.

If that's not possible, post back here: you'll need either to run the
file through a filter before importing it, or to write custom VBA code
to read and import the non-standard file.






On Wed, 11 May 2005 12:14:04 -0700, "berybery"

Hello John,
Opened the file in Hexedit and it is a "Control-Z" character with a
"/A" hex
value and an ASCII value of 26.

I am going to explore the option of the mainframe throwing out the
file
minus the EOF marker.

What I discovered though, is that if I open and save the file as a
notepad
file, then the file goes into ACCESS nicely with the end of file
marker. This
is one way of working around the problem, but I'm still not too happy
about
having to open the file with "live" data to do that.

Karlene

:

Karlene,

Recent versions of Access are fussier about the text files they'll
import, though the problems I've seen to date involve end-of-line
markers rather than end-of-file ones.

But I can't imagine any single character at the *end* of a file
mucking
up the preview in the text import wizard, and I've tested the EOF
characters I know of without provoking the trouble. So unless you
tell
me which actual character is used as EOF in your system, I can't
reproduce the problem. (One way of finding out is to open the text
file
with a hex editor such as Hexedit.)

I don't know a neat way of removing the last character from a file
using
VBA. Usually it's necessary to write all but the last character to
a new
file, and I'd probably do this by filtering it through a Perl
one-liner,
e.g.
perl -0 -pe"chop" old.txt > new.txt
if the file is small enough to fit in memory. Maybe you could get
the
mainframe people to do something like this as part of generating
the
ASCII file for you.




On Tue, 10 May 2005 06:28:02 -0700, "berybery"

Hello John,
The file is coming from a VSE\ESA main frame environment, using
the file
transfer feature, from an EBCIDEC to ASCII format with this EOF
marker, that
looks like a little square.

Believe it or not, Access 2.0 takes this file with the EOF marker
with no
worries. I'm actually trying to upgrade an old Access 2.0
application to
Access 2003. The EOF marker is giving me problems. I do not want
to manually
remove it. Any help will be appreciated. Thank you
Karlene

:

Hi Karlene,

Windows text files don't normally have or need special
terminating
symbols (their size is stored in the directory entry). Anyway,
using
Access 2003 I have no problems importing text files terminated
with or
including the CP/M-DOS EOF character (\x1A), and the Unix
equivalent
(\x04) seems to be treated as an ordinary character.

So I'm not sure what you mean by "the EOF marker", or what sort
of text
files you are trying to import. Can you give some more details?


On Mon, 9 May 2005 07:33:11 -0700, "berybery"

Hello,
I'm using ACCESS 2003.

Whenever I try to import a text file into ACCESS, the sample
display in the
Import/Export wizard shows me garbage. If I remove the EOF
marker in the text
file it comes into ACCESS nicely.

Is there a way for ACCESS to remove the EOF marker for me,
without me having
to do it manually?
Thank you,
Karlene

--
John Nurick [Microsoft Access MVP]

Please respond in the newgroup and not by email.
 
J

John Nurick

It's not the end of the file that's the problem, Karlene. The sample you
sent me has lots of nulls within the data: it's these that need to be
replaced with spaces.


berybery said:
Good Morning John,
Thank you for all the advice given. Yes, it is possible for the people
from
the main frame to output that file with spaces at the end. So I think
that's
where I will go. Thank you again,
Have a good day,
Karlene

John Nurick said:
Hi Karlene,

I've received the sample file. As I expected, the problem has nothing to
do
with the EOF marker; it is all the null characters (Ascii 0, \x0) that
are
upsetting things. Replace them with spaces and things work as normal.
Opening and saving in Notepad does this replacement without telling you.

If you can, get the mainframe guys to alter their export routine. If they
(or you) have Perl installed, all it takes is a one-liner like this to do
the replacement, leaving the original file renamed with .bak:

perl -i.bak -pe"tr/\0/ /" "D:\folder\filename.dat"

Otherwise it's possible to write VBA code to replace the nulls. You have
to
read the text file line by line, replacing the null characters and
writing
the edited line to a new file.

berybery said:
Hello John,
Will send a sample file via e-mail and will continue to explore my
options
of producing the main frame file and getting rid of the EOF marker.
Karlene

:

Karlene,

0D 0A at the end of each line is the Windows standard, and Access 2003
isn't bothered by a 1A at the end of the file, so something else must
be
going on.

In your place, the next thing I'd do is run some tests on the text
file,
checking for things such as the wrong number of fields in one or more
records, missing delimiters/qualifiers, line breaks within the data,
wrongly escapedd quote marks, etc. (Though if as you say the file can
be
imported after you have opened it and saved it in Notepad, that
doesn't
sound plausible either.)

If the file isn't confidential and isn't too large, I'd be happy to
run
some tests on it if you zip it up and email it (omitting the bogus
middle name from my email address).


On Thu, 12 May 2005 04:55:21 -0700, "berybery"

Good morning John,
The record separator is identified by both the "OD and OA"
characters.
At
the end of the file we have the OD, OA and 1A characters.

I am interested though, in your last suggestion, running the file
through a
filter or using some VB code.
Thank you,
Karlene

:

Karlene,

Now I'm mystified. Ctrl-Z (hex 1A, ASCII 26) is Windows's ancestral
EOF
marker and, as I said in my first message, it doesn't normally
cause
any
problems and certainly doesn't screw up the preview in the text
import
wizard. So I'm now fairly confident that your problem has nothing
to
do
with the EOF marker.

Instead, I think it's the record separators (line breaks). As the
file
has come from a non-Windows system they may well be something other
than
the Windows-standard hex 0D 0A (CR LF). Access versions up to 97
accepted plain CR or plain LF, but more recent versions require the
two
characters CR LF.

What's more, when you open such a file in Notepad, edit it and save
it,
Notepad writes it back to disk with CR LF pairs, after which Access
can
handle it OK.

Please check with Hexedit what record separator the text file uses.
If
it's just CR or just LF, you need to get it changed to CR LF. The
best
way to do this (if possible) would be to get the mainframe people
to
modify the script that creates the file.

If that's not possible, post back here: you'll need either to run
the
file through a filter before importing it, or to write custom VBA
code
to read and import the non-standard file.






On Wed, 11 May 2005 12:14:04 -0700, "berybery"

Hello John,
Opened the file in Hexedit and it is a "Control-Z" character with
a
"/A" hex
value and an ASCII value of 26.

I am going to explore the option of the mainframe throwing out the
file
minus the EOF marker.

What I discovered though, is that if I open and save the file as a
notepad
file, then the file goes into ACCESS nicely with the end of file
marker. This
is one way of working around the problem, but I'm still not too
happy
about
having to open the file with "live" data to do that.

Karlene

:

Karlene,

Recent versions of Access are fussier about the text files
they'll
import, though the problems I've seen to date involve
end-of-line
markers rather than end-of-file ones.

But I can't imagine any single character at the *end* of a file
mucking
up the preview in the text import wizard, and I've tested the
EOF
characters I know of without provoking the trouble. So unless
you
tell
me which actual character is used as EOF in your system, I can't
reproduce the problem. (One way of finding out is to open the
text
file
with a hex editor such as Hexedit.)

I don't know a neat way of removing the last character from a
file
using
VBA. Usually it's necessary to write all but the last character
to
a new
file, and I'd probably do this by filtering it through a Perl
one-liner,
e.g.
perl -0 -pe"chop" old.txt > new.txt
if the file is small enough to fit in memory. Maybe you could
get
the
mainframe people to do something like this as part of generating
the
ASCII file for you.




On Tue, 10 May 2005 06:28:02 -0700, "berybery"

Hello John,
The file is coming from a VSE\ESA main frame environment, using
the file
transfer feature, from an EBCIDEC to ASCII format with this EOF
marker, that
looks like a little square.

Believe it or not, Access 2.0 takes this file with the EOF
marker
with no
worries. I'm actually trying to upgrade an old Access 2.0
application to
Access 2003. The EOF marker is giving me problems. I do not
want
to manually
remove it. Any help will be appreciated. Thank you
Karlene

:

Hi Karlene,

Windows text files don't normally have or need special
terminating
symbols (their size is stored in the directory entry).
Anyway,
using
Access 2003 I have no problems importing text files
terminated
with or
including the CP/M-DOS EOF character (\x1A), and the Unix
equivalent
(\x04) seems to be treated as an ordinary character.

So I'm not sure what you mean by "the EOF marker", or what
sort
of text
files you are trying to import. Can you give some more
details?


On Mon, 9 May 2005 07:33:11 -0700, "berybery"

Hello,
I'm using ACCESS 2003.

Whenever I try to import a text file into ACCESS, the sample
display in the
Import/Export wizard shows me garbage. If I remove the EOF
marker in the text
file it comes into ACCESS nicely.

Is there a way for ACCESS to remove the EOF marker for me,
without me having
to do it manually?
Thank you,
Karlene

--
John Nurick [Microsoft Access MVP]

Please respond in the newgroup and not by email.
 
B

berybery

Hello John,
OK, I will look at that as well. That should not be too much of a problem.
Karlene
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top