scrub file before submitting

G

goss

Hi ng
Using xl 2003

Need to scrub a text file before submitting

Desired output:

Inv # || Inv Amnt || Date || FAS? || FAS#

Problem:

Input file is invoice detail dump from Progress database
Having a hard time using FIND and MID to extract the Inv Amnt
Problem is truncation of decimal where .xx = .00

Sample lines from dump:
158005 1 0 28 "" "Contract # 1062--Event Date: 05/27/2004" ? "158005"
"" "" 70043000
[Note - Amount = 28.00]

158006 1 0 45.75 "Kim Gardner" "Contract # 1001--Event Date: 06/1/200
Fas# 264430" ? "158009" 0 "" "" 70043000

These are 2 small samples from the file. I have not been able to fin
repetitive source common to all records to use FIND/MID

Drop me an e-mail if you would like to see the full dump
[email protected]
 
B

Biff

Hi goss!

I see a common source in at least the samples you provided.
0space and space"

This formula returns text values.

=MID(A21,FIND("0 ",A21)+2,FIND(" """,A21)-FIND("0 ",A21)-1)

If you need to have the values returned as numeric
numbers, just add a double unary to the formula:

=--MID(A21,FIND("0 ",A21)+2,FIND(" """,A21)-FIND("0 ",A21)-
1)

Then you can just format the cell to display two decimal
places = 28.00 as a number.

Biff
 
H

Harlan Grove

goss > said:
Need to scrub a text file before submitting

Not a task Excel is particularly good at, but . . .
Desired output:

Inv # || Inv Amnt || Date || FAS? || FAS#

Problem:

Input file is invoice detail dump from Progress database
Having a hard time using FIND and MID to extract the Inv Amnt
Problem is truncation of decimal where .xx = .00

Sample lines from dump:
158005 1 0 28 "" "Contract # 1062--Event Date: 05/27/2004" ? "158005" 0
"" "" 70043000
[Note - Amount = 28.00]

158006 1 0 45.75 "Kim Gardner" "Contract # 1001--Event Date: 06/1/2004
Fas# 264430" ? "158009" 0 "" "" 70043000

These are 2 small samples from the file. I have not been able to find
repetitive source common to all records to use FIND/MID

Looks like both records have 11 fields, not counting the bare ?, separated
by whitespace, with text fields enclosed in double quotes. If the whitespace
between fields are tabs, then you could use Data > Text to Columns,
Delimited to parse into fields in separate columns. On the other hand, if
whitespace between fields are spaces, it's more difficult.

Generally, text processing is easiest to do with good text processing tools.
Excel lacks them. However, if you have a recent version of either Windows or
Internet Explorer, you almost certainly have VBScript installed. If so, you
could make use of its regular expression class to handle parsing these
records. See the following for one user-defined function (udf) that provides
access to it.

http://groups.google.com/[email protected]

You could then use this udf to add field delimiters to your records. For
example, the following call replaces spaces between fields with newlines,
but leaves spaces inside quoted fields as-is.

=subst(A1,"([^ ""]*(""[^""]*"")*[^ ""]*) +","$1"&CHAR(10))

You could them parse the records into fields at the newlines. Then you could
extract the fields you want to send downstream.
 
J

Jamie Collins

Not a task Excel is particularly good at, but . . .

I know I'm going to get a reputation for suggesting a SQL solution for
everything but...

.... you may be able to query the text file using ADO and SQL,
utilizing a schema.ini file to define the format, fields, data types,
etc. But don't just take my word for it: a convincing argument is put
by the following article (if you can excuse the pun in the title):

Much ADO About Text Files
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnclinic/html/scripting03092004.asp

I love it when an MS article has section captions such as 'Why Can't I
Just Use the FileSystemObject to Read Text Files?' and 'Can't I Just
Use Split to Get at Individual Values?'

Jamie.

--
 
G

goss

Thanks all for your replies.
I would like to continue to focus on an xl solution for the moment.
I am in the process of teaching myself vb script which I'm sure ca
eventually do the job I need.

However, I need a solution immed. while I figure vb script out.

Where I'm at:
Biff points out there is a repetitive pattern in each string of "0 "
(w/o "")

So I find the position of "0 " :
=FIND("0 ",A4)

Then I extract the amount of the invoice by starting with the positio
of the 0 and adding 2:
=VALUE(MID(A4,(B4+2),6))

Problem:
There may be more or less than 6 char's in the amount field
max amount may be 9999999.99, though very rare. Many of the amounts ar
coming up truncated (no decimals) Ex. 99
So my Mid formula above is yielding results such as these:

28 ""
6 "Ste
4 "Jam

I tried running the Trimall macro on these results, no impact

Any ideas on how I can alter my formula to handle amounts of varyin
lengths?

Drop me an e-mail if you would like to see the full dump file
Remove nospam

[email protected]
 
B

Biff

Hi Goss!

Did you try my formula? It worked for me based on your
sample data.

The problem your now having is that you need two points of
reference. That's what my formula does. It finds the first
point which is - 0space, and the second point which is -
space" and returns *everything* in between whether it's 2
characters or 100 characters.

Biff
 
Top