Can I import a PDF file into an MDB?

Mota

Hi,
Is it possible to make a table from an Adobe Acrobat file that contains just
a table with borders (no pictures, no additional text)? Are there any tricks?
I would be grateful if you could show me a solution. Thank you in advance.
 
Bruce Rusk

Two approaches to try:

1) Select the table and paste it into Excel. If the formatting is set
up nicely, it should fall into neat columns and the file should be readable
by Access.

2) If you need to automate the process a little more, you could extract the
text from the PDF and see whether it is delimited with tabs or something
equivalent so that Access can read it.

You can use the Extract text tool in the Multivalent toolkit:

http://multivalent.sourceforge.net/

The tool is Java-based, so you'd have to put the .jar in your classpath and
run

java tool.pdf.Extract [arguments]

from the command line (or batch file).

There may be other things out there that extract text, but this is a
reliable and free one.
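
Once you have the extracted text, checking whether it is delimited can be scripted. The following is only a minimal sketch, not part of Multivalent or Access: the function names `detect_delimiter` and `text_to_csv` and the list of candidate delimiters are made up for illustration. It assumes the extracted text has one table row per line, and it produces CSV text that Access's text import should be able to read.

```python
import csv
import io


def detect_delimiter(lines, candidates=("\t", ";", "|", ",")):
    """Return the first candidate that occurs the same number of times
    (at least once) on every non-empty line, or None if none does."""
    rows = [ln for ln in lines if ln.strip()]
    if not rows:
        return None
    for delim in candidates:
        counts = {ln.count(delim) for ln in rows}
        if len(counts) == 1 and counts.pop() > 0:
            return delim
    return None


def text_to_csv(text):
    """Convert consistently delimited text into CSV text."""
    lines = text.splitlines()
    delim = detect_delimiter(lines)
    if delim is None:
        # Hypothetical policy: refuse to guess when the text is not delimited.
        raise ValueError("no consistent delimiter found")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for ln in lines:
        if ln.strip():
            writer.writerow([field.strip() for field in ln.split(delim)])
    return out.getvalue()
```

If `text_to_csv` raises `ValueError`, the extracted text is probably not delimited at all, and the copy-and-paste route through Excel may be the easier one.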

HTH,

Bruce Rusk
 
Mota

Dear Bruce,
After I crop the table from the source PDF page, I have a table with borders
on the page, so there are no delimiters. But when I convert it to a text file
(RTF), some nonsense characters appear in the Word document instead of the
table. Acrobat doesn't have an "Import to a simple txt file" option for such
cropped tables.
The original source is a fax document opened in Microsoft (Office) Fax Viewer.
I converted it to a PDF file for easier handling so that I could make a table
from it. Do you know any usual way to make a table from a fax document or a
TIFF file?
Thank you for your help.

 
Bruce Rusk

It sounds like you're working with a graphics file rather than native text.

In that case you would need to use an OCR program (possibly the Document
Imaging tool supplied with Office 2003), but that's a whole other level of
complication, and this wouldn't be the newsgroup to resolve it in. If
it's a graphics file, putting it into a PDF will only make it harder, not
easier, to convert it to text. Start from your TIFF file and figure out
an OCR solution from there.
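
Whichever OCR program you end up with, its plain-text output will still need to be split into columns before Access can import it. Here is a rough sketch of that clean-up step only; it does not do the OCR itself. It rests on assumptions that may not hold for your fax: that the OCR output has one table row per line, that cell borders come through as "|" characters (or not at all), and that columns are separated by at least two spaces. The function names are hypothetical.

```python
import re


def split_ocr_line(line):
    """Split one line of OCR output into cell values."""
    # Assumption: ruled table borders show up as "|" characters.
    cleaned = line.replace("|", " ").strip()
    if not cleaned:
        return []
    # Assumption: two or more whitespace characters separate columns,
    # while a single space belongs inside a cell value.
    return re.split(r"\s{2,}", cleaned)


def ocr_text_to_rows(text):
    """Return the non-empty rows of an OCR'd table as lists of strings."""
    rows = [split_ocr_line(ln) for ln in text.splitlines()]
    return [row for row in rows if row]
```

Rows that come back with a different number of cells than the others are worth checking by hand, since OCR of fax-quality images tends to be unreliable.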

 
Mota

In fact, my first problem was working with OCR, and I thought converting the
file to a PDF would make it easier! :)
I couldn't find any solution with OCR or TIFF files.
Thank you for your help.

 