Cleaning up a file

Steven Drenker

I've got to clean up a large file made up of returns from antispam DNSBL
servers. The data consists of records like those shown below (two full
records shown).

I want to remove all the lines EXCEPT the one starting ";; ANSWER SECTION:"
and the line(s) immediately that one. I want to resume deleting lines with
the next line after ANSWER SECTION starting with ";; res options:"

I was thinking of the following logic:
1. Position the cursor at the start of the file
2. Do the first 9 characters of the line match ";; ANSWER"?
3. If no, delete the line
4. If yes, move cursor down until a match is found with ";; res options"
5. Resume deleting lines until the next match of ";; ANSWER"

Although this sounds conceptually simple, I can't come up with the code to
execute it.

Any help would be greatly appreciated!

Steve
- - - - - - - - - - - - - - - - - - - - - -
Two DNS records shown below. Delete everything but the two lines marked
"KEEP"

;; res options: init recurs defnam dnsrch
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15173
;; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2
;; QUERY SECTION:
;; 218.0.249.210.bl.deadbeef.org, type = ANY, class = IN

;; ANSWER SECTION: <-- KEEP
218.0.249.210.bl.deadbeef.org. 1D IN A 69.6.215.63 <-- KEEP


;; res options: init recurs defnam dnsrch <-- RESTART DELETE
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 54857
;; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUERY SECTION:
;; 218.0.249.210.blackhole.securitysage.com, type = ANY, class = IN


;; res options: init recurs defnam dnsrch
 
Helmut Weber

Hi Steven,
hmm...,

if you are talking about one-line paragraphs,
and if I understand the structure correctly
despite what my newsreader shows me,
then check whether a paragraph starts with ";; ANSWER SECTION:"
or starts with a digit.
If not so, delete it.
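
For anyone who wants to try this rule outside Word, here is a minimal Python sketch of it. The function name and the sample lines are only illustrative, and it assumes every answer record starts with a digit, as in the sample data.

```python
def keep_answer_paragraphs(lines):
    """Keep lines that start with ';; ANSWER SECTION:' or with a digit."""
    kept = []
    for line in lines:
        text = line.strip()
        # text[:1] is '' for an empty line, and ''.isdigit() is False
        if text.startswith(";; ANSWER SECTION:") or text[:1].isdigit():
            kept.append(text)
    return kept


# Lines taken from the sample record in the first post
sample = [
    ";; QUERY SECTION:",
    ";; 218.0.249.210.bl.deadbeef.org, type = ANY, class = IN",
    "",
    ";; ANSWER SECTION:",
    "218.0.249.210.bl.deadbeef.org. 1D IN A 69.6.215.63",
    "",
    ";; res options: init recurs defnam dnsrch",
]

for line in keep_answer_paragraphs(sample):
    print(line)
```

Note that an answer line that does not begin with a digit would be dropped by this rule, so it is worth checking a few records by eye first.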

You may send me a sample file, if you like.

--
Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Win XP, Office 2003
"red.sys" & Chr$(64) & "t-online.de"
 
Jezebel

It's not quite clear what you're trying to do (please proof-read your
posts!) ... but you can probably use Find and Replace to do the whole thing.
As I understand it, you want to keep any sequence that begins ";; ANSWER" up
to, but not including, the next ";; res options". Make two passes, one to
identify all the text you want to keep, the second to delete everything
else --

1. With 'Use wildcards' checked --
Find: (;; ANSWER[!;]@);
Replace: \1 Format = Bold

This assumes that there are no other semi-colons between the answer and the
";; res" bit.

2. With 'Use wildcards' unchecked --
Find: ^? Format = Not Bold
Replace: (leave blank, no format)
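
The same mark-then-delete idea can be sketched in Python for testing on a plain-text copy of the file. This is only a rough counterpart, not what Word does internally: the regular expression stands in for the wildcard pattern, a list of booleans stands in for the bold formatting, and the function name is made up for the example.

```python
import re

# Rough counterpart of the Word wildcard pattern (;; ANSWER[!;]@);
# [^;]+ plays the role of [!;]@ (one or more characters that are not ';')
ANSWER_SPAN = re.compile(r"(;; ANSWER[^;]+);")


def two_pass_clean(text):
    # Pass 1: mark every character inside a matched answer span
    # (this stands in for "Format = Bold")
    keep = [False] * len(text)
    for match in ANSWER_SPAN.finditer(text):
        for i in range(match.start(1), match.end(1)):
            keep[i] = True
    # Pass 2: delete every character that was not marked
    # (this stands in for replacing all non-bold text with nothing)
    return "".join(ch for ch, marked in zip(text, keep) if marked)


sample = (
    ";; QUERY SECTION:\n"
    ";; 218.0.249.210.bl.deadbeef.org, type = ANY, class = IN\n"
    "\n"
    ";; ANSWER SECTION:\n"
    "218.0.249.210.bl.deadbeef.org. 1D IN A 69.6.215.63\n"
    "\n"
    ";; res options: init recurs defnam dnsrch\n"
    ";; got answer:\n"
)

print(two_pass_clean(sample))
```

As with the Word version, an answer section is only picked up if a ";" follows it somewhere, so a final record with nothing after it would be missed.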
 
Steven Drenker

Jezebel...excellent! That works perfectly. I hadn't thought about using
expressions with wildcards. I like the [!;]@ expression to find any number
of characters up to the next ";".

Sorry my description wasn't clear. It should have read " the line(s)
immediately FOLLOWING that one." You figured it out though -- I want to keep
the ANSWER section and delete everything else.
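
For comparison, the five-step logic from the first post can be sketched as a small line-by-line filter in Python. This is an illustration under a few assumptions: the function name is invented, the file is read as plain text, and blank lines inside an answer section are dropped.

```python
def extract_answer_sections(lines):
    """Keep lines from ';; ANSWER' up to, but not including, ';; res options'."""
    kept = []
    keeping = False
    for line in lines:
        if line.startswith(";; ANSWER"):
            keeping = True          # start of a section to keep
        elif line.startswith(";; res options"):
            keeping = False         # resume deleting from here
        # Assumption: blank lines inside the section are not wanted
        if keeping and line.strip():
            kept.append(line.rstrip("\n"))
    return kept


# The two records from the first post, without the KEEP markers
sample = [
    ";; res options: init recurs defnam dnsrch",
    ";; got answer:",
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15173",
    ";; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2",
    ";; QUERY SECTION:",
    ";; 218.0.249.210.bl.deadbeef.org, type = ANY, class = IN",
    "",
    ";; ANSWER SECTION:",
    "218.0.249.210.bl.deadbeef.org. 1D IN A 69.6.215.63",
    "",
    "",
    ";; res options: init recurs defnam dnsrch",
    ";; got answer:",
    ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 54857",
    ";; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0",
    ";; QUERY SECTION:",
    ";; 218.0.249.210.blackhole.securitysage.com, type = ANY, class = IN",
    "",
    "",
    ";; res options: init recurs defnam dnsrch",
]

for line in extract_answer_sections(sample):
    print(line)
```

Unlike the wildcard approach, this one does not care whether the kept lines contain a ";", because it only looks at how each line starts.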

I just need to make sure that there is no ";" in any of the information I
want to keep.
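
One way to make that check before running the replace is a quick scan for ";" inside the answer sections. The sketch below is in Python and works on a plain-text copy of the data; the function name is invented and the TXT record in the second sample is made-up test data, not something from the real files.

```python
def answer_lines_with_semicolons(lines):
    """Return lines inside an answer section (after its header) that contain ';'."""
    suspicious = []
    in_answer = False
    for line in lines:
        if line.startswith(";; ANSWER"):
            in_answer = True
            continue                # the header itself is expected to start with ';;'
        if line.startswith(";; res options"):
            in_answer = False
        if in_answer and ";" in line:
            suspicious.append(line.rstrip("\n"))
    return suspicious


clean = [
    ";; ANSWER SECTION:",
    "218.0.249.210.bl.deadbeef.org. 1D IN A 69.6.215.63",
    "",
    ";; res options: init recurs defnam dnsrch",
]

# Hypothetical record with a ';' in the answer, for illustration only
risky = [
    ";; ANSWER SECTION:",
    'example.invalid. 1D IN TXT "listed; see policy"',
    "",
    ";; res options: init recurs defnam dnsrch",
]

print(answer_lines_with_semicolons(clean))
print(answer_lines_with_semicolons(risky))
```

Any line this reports would cut the wildcard match short, so those records would need handling by hand or with a different end marker.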

I'll automate your solution to clean up the files.
Steve

