Converting .txt file to rows

J

JohnG1965

I have a text document that I'm trying to import into Excel and have it
formatted where the text stops at the "]" symbol and creates a new row for
the next segment.

Example: text doc starts like this
ISA*00**00**09*00507000*000043213*0*P*\]GS*AG*47*F5332B0*090220*0506*655*X*003010]ST*824*000001110]BGN*00*1*090220]N1*ST**92*47]N1*SF**92*F5332B0]OTI*IE*SI*P045219*******856]REF*PK*PS-42637]REF*PO*HES
C93364]

I want it to transform in to this
ISA*00**00**09*00507000*000043213*0*P*\]GS*AG*47*F5332B0*090220*0506*655*X*003010]
ST*824*000001110]
BGN*00*1*090220]
N1*ST**92*47]
N1*SF**92*F5332B0]
OTI*IE*SI*P045219*******856]
REF*PK*PS-42637]
REF*PO*HESC93364]

Is this possible?
 
R

Rick Rothstein

If your file is a Word document (you show a .doc extension of your
filename), then I'm not sure how you would do what you want. However, if
that is really a pure text document (normally having a .txt extension), then
you can use this macro to do what you want. Change the two Const statement
assignments to match your actual conditions (cell address for the start of
your list and the full path plus filename for the file you want to import).

Sub BreakTextUp()
Dim X As Long
Dim FileNum As Long
Dim TotalFile As String
Dim LinesOfData() As String

Const StartCellAddress As String = "B3"
Const PathAndFileName As String = "c:\temp\text.txt"

FileNum = FreeFile
Open PathAndFileName For Binary As #FileNum
TotalFile = Space(LOF(FileNum))
Get #FileNum, , TotalFile
Close #FileNum
TotalFile = Replace(TotalFile, "]", "]" & vbLf)
LinesOfData = Split(TotalFile, vbLf)
For X = 0 To UBound(LinesOfData)
ActiveSheet.Range(StartCellAddress).Offset(X).Value = LinesOfData(X)
Next
End Sub
 
J

Joel

the code below reads your file and writes the data to a new file with the new
line added. I could put the data into a worksheet if you like.

Sub AddNewLine()

Const ForReading = 1, ForWriting = -2, _
ForAppending = 3


ReadFile = Application _
.GetOpenFilename(FileFilter:="Text Files (*.txt), *.txt", _
Title:="Select Read File")
If ReadFile = False Then
MsgBox ("No file Selected - Exiting Macro")
End If

WriteFile = Application _
.GetSaveAsFilename(FileFilter:="Text Files (*.txt), *.txt", _
Title:="Select Write File")
If WriteFile = False Then
MsgBox ("No file Selected - Exiting Macro")
End If

Set fs = CreateObject("Scripting.FileSystemObject")
Set fin = fs.OpenTextFile(ReadFile, _
ForReading, TristateFalse)
Set fout = fs.CreateTextFile _
(Filename:=WriteFile, overwrite:=True)

FoundBackSlash = False
Do While fin.AtEndOfStream <> True
ReadData = fin.read(1)
Select Case ReadData

Case "]":
If FoundBackSlash = True Then
FoundBackSlash = False
fout.write "]"
Else
fout.writeline "]"
End If
Case "\":
If FoundBackSlash = True Then
'incase there are 2 backslashees in a row
FoundBackSlash = False
Else
FoundBackSlash = True
End If
fout.write "\"
Case Else
fout.write ReadData
End Select
Loop
fin.Close
fout.Close
End Sub
 
J

Joel

If yoou have other backshlash character in your data make this change

from
Case Else
fout.write ReadData

to
Case Else
FoundBackSlash = False
fout.write ReadData

Joel said:
the code below reads your file and writes the data to a new file with the new
line added. I could put the data into a worksheet if you like.

Sub AddNewLine()

Const ForReading = 1, ForWriting = -2, _
ForAppending = 3


ReadFile = Application _
.GetOpenFilename(FileFilter:="Text Files (*.txt), *.txt", _
Title:="Select Read File")
If ReadFile = False Then
MsgBox ("No file Selected - Exiting Macro")
End If

WriteFile = Application _
.GetSaveAsFilename(FileFilter:="Text Files (*.txt), *.txt", _
Title:="Select Write File")
If WriteFile = False Then
MsgBox ("No file Selected - Exiting Macro")
End If

Set fs = CreateObject("Scripting.FileSystemObject")
Set fin = fs.OpenTextFile(ReadFile, _
ForReading, TristateFalse)
Set fout = fs.CreateTextFile _
(Filename:=WriteFile, overwrite:=True)

FoundBackSlash = False
Do While fin.AtEndOfStream <> True
ReadData = fin.read(1)
Select Case ReadData

Case "]":
If FoundBackSlash = True Then
FoundBackSlash = False
fout.write "]"
Else
fout.writeline "]"
End If
Case "\":
If FoundBackSlash = True Then
'incase there are 2 backslashees in a row
FoundBackSlash = False
Else
FoundBackSlash = True
End If
fout.write "\"
Case Else
fout.write ReadData
End Select
Loop
fin.Close
fout.Close
End Sub




JohnG1965 said:
I have a text document that I'm trying to import into Excel and have it
formatted where the text stops at the "]" symbol and creates a new row for
the next segment.

Example: text doc starts like this:
ISA*00**00**09*00507000*000043213*0*P*\]GS*AG*47*F5332B0*090220*0506*655*X*003010]ST*824*000001110]BGN*00*1*090220]N1*ST**92*47]N1*SF**92*F5332B0]OTI*IE*SI*P045219*******856]REF*PK*PS-42637]REF*PO*HES
C93364]

I want it to transform in to this:
ISA*00**00**09*00507000*000043213*0*P*\]GS*AG*47*F5332B0*090220*0506*655*X*003010]
ST*824*000001110]
BGN*00*1*090220]
N1*ST**92*47]
N1*SF**92*F5332B0]
OTI*IE*SI*P045219*******856]
REF*PK*PS-42637]
REF*PO*HESC93364]

Is this possible?
 
J

JohnG1965

Rick,
Thanks for the response (you too Joel, but Rick's seemed easier). It worked
well except where my text document line ended at DEVELOP Excel imported it as
follows:

N1*PN*INBOUND LOGISTICS*92*92]
"N1*SU*SHAW DEVELOP"
MENT, LLC*92*F5332B0]
(there's actually a little question mark inside a square at the end of
DEVELOP on my worksheet)

This is a ton of help as is, but if you could make it do this it'd be perfect:

N1*PN*INBOUND LOGISTICS*92*92]
N1*SU*SHAW DEVELOPMENT, LLC*92*F5332B0]

Thanks again,
 
R

Rick Rothstein

I think the problem is that there is a "hard" return between the text
"DEVELOP" and "MENT". How to handle this problem depends on how your file is
composed. You showed us a sample text that looked like this...

ISA*00**00**09*00507000*000043213*0*P*\]GS*AG*47*F5332B0*090220*0506*655*X*003010]ST*824*000001110]BGN*00*1*090220]N1*ST**92*47]N1*SF**92*F5332B0]OTI*IE*SI*P045219*******856]REF*PK*PS-42637]REF*PO*HES
C93364]

Are there more lines of text that look like that in your file? If not,
solving the problem is easy; if so, then it may not be so easy to solve. If
there are multiple lines in the same file, are they separated from each
other by "hard" returns (or does one follow the other with some delimiter
between them)? If they are separated from each other by a "hard" return,
then is there anything about the construction of the individual lines that
would allow me to identify one line from another (for example, does each
line start with ISA like your example line did or is each line of a
guaranteed fixed length)? If you could post the text file on line somewhere
(so more than just I could see it), that would be helpful. Otherwise, you
can send the file to me if you want so that I can see exactly how it is
constructed.
 
J

JohnG1965

Hey Rick,
I tried e-mailing you but it did not go thru. Can you e-mail me and I'll
send you the file?

(e-mail address removed)
 
R

Rick Rothstein

The OP sent me his file and this was my response to him...

The text in the file you sent me had a "hard" return located after every 133
characters... this is why your line broke at the word you asked about (there
were several other locations where that happened as well). In addition to
that, it looks like you had multiple lines of data in the file also
separated by "hard" returns which might have complicated things except that
each separate lines worth of data appears to always start with the letters
"ISA"... that is good because it gave me something regular about your data
that I could build a solution around. Give this macro a try and see if it
does what you want (note... there is a blank row between every individual
line's worth of data; that is, a blank row immediately before each ISA piece
of text)...

Sub BreakTextUp()
Dim X As Long
Dim FileNum As Long
Dim TotalFile As String
Dim LinesOfData() As String

Const StartCellAddress As String = "A1"
Const PathAndFileName As String = "c:\temp\text.txt"

FileNum = FreeFile
Open PathAndFileName For Binary As #FileNum
TotalFile = Space(LOF(FileNum))
Get #FileNum, , TotalFile
Close #FileNum
TotalFile = Replace(TotalFile, vbCrLf & "ISA", Chr(1) & "ISA")
TotalFile = Replace(TotalFile, vbCrLf, "")
TotalFile = Replace(TotalFile, Chr(1), vbLf)
TotalFile = Replace(TotalFile, "]", "]" & vbLf)
LinesOfData = Split(TotalFile, vbLf)
For X = 0 To UBound(LinesOfData)
ActiveSheet.Range(StartCellAddress).Offset(X). _
Value = RTrim(LinesOfData(X))
Next
End Sub

Rick (MVP - Excel)
 
J

JohnG1965

This works great. You saved me over an hour per day, plus eliminated any
chance of error.

Thank you so much!!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top