Batch search and replace

S

Stahlhut1956

I'm a biologist, and have been working with a set insect collection
data. Most of my requirements are simple -- enter the data into Word
or Excel, save it as a text file, and use the text as input into
several data-analysis freeware programs.

The problem: The data came from multiple sites and collectors, and
the original specimen IDs did not follow a standard format. I've
since assigned every specimen a new, more informative ID, but now have
to replace the IDs in multiple text files.

Made-up example: Replace "SQZRKL435-2" with "Texas-2006-Fly".

I've created an Excel file (let's call it "rosetta_stone.xls") with
one column containing the old, confusing IDs and a matching adjacent
column of new, informative IDs.

My question: Can anyone recommend a program with which I can make a
translation key by reading the file "rosetta_stone.xls" (or a text
version of it), and then input "confusing_file.txt" and convert all
the old IDs to new ones?

I've seen both free and commercial utilities around that can do this
under Windows. I have Office X on my iBook, and also have access to
Office 2004 on a Mac desktop in my lab.

Thanks,
-- Julie Stahlhut
 
J

John McGhie

Hi Julie:

You can do this in Word itself, but it takes a little setting up.

In Word 2004, if you open the Find dialog, you will see a little blue button
with a black arrow on the lower left. Click it to reveal the advanced
options. In the Search box that appears, you can find the "All open
documents" option.

That enables you to search/replace multiple files.

Now: How you proceed depends a lot on the scale of your problem.

* If you have less than about 25 files to search, you can open them
directly.

* If you have more files than that, create them as a master document (note:
create the master document from a blank document: and do not put any text in
it, only subdocuments. You want to throw the master away at the end of the
process.)

Now you need to fire search/replaces, one for each name.

* If you have less than about 200 names, you can copy them into the Find and
Replace dialogs by hand.

* If you have more than 200 names, you can make a VBA macro to read your
Excel spreadsheet and do it all for you. There's a couple of days' effort
in making the macro unless you are a good VBA coder. If you do not know VBA
at all, ask us. We'll help you to do this.

So get back to us with the number of documents and the number of names, and
we can talk you through this.

However, if you are comfortable with Unix, and you are going to convert the
data into Text files anyway, you would be much better to convert the files
first, then do this with the Unix command-line utility "sed" (which is
built-in to OS X).

See
http://www.osxfaq.com/Tutorials/LearningCenter/UnixTutorials/GrepSedRegexp/i
ndex.ws

http://www.osxfaq.com/Tutorials/LearningCenter/UnixTutorials/GrepSedRegexp/p
age2.ws

Grep and sed are a bit geeky, but I assume you work for a university or some
such: grab your nearest Comp Sci major and ask them to show you how. Beer
or food are usually effective in getting a comp sci person to do almost
anything you need...

Hope this helps

I'm a biologist, and have been working with a set insect collection
data. Most of my requirements are simple -- enter the data into Word
or Excel, save it as a text file, and use the text as input into
several data-analysis freeware programs.

The problem: The data came from multiple sites and collectors, and
the original specimen IDs did not follow a standard format. I've
since assigned every specimen a new, more informative ID, but now have
to replace the IDs in multiple text files.

Made-up example: Replace "SQZRKL435-2" with "Texas-2006-Fly".

I've created an Excel file (let's call it "rosetta_stone.xls") with
one column containing the old, confusing IDs and a matching adjacent
column of new, informative IDs.

My question: Can anyone recommend a program with which I can make a
translation key by reading the file "rosetta_stone.xls" (or a text
version of it), and then input "confusing_file.txt" and convert all
the old IDs to new ones?

I've seen both free and commercial utilities around that can do this
under Windows. I have Office X on my iBook, and also have access to
Office 2004 on a Mac desktop in my lab.

Thanks,
-- Julie Stahlhut

--
Don't wait for your answer, click here: http://www.word.mvps.org/

Please reply in the group. Please do NOT email me unless I ask you to.

John McGhie, Consultant Technical Writer
McGhie Information Engineering Pty Ltd
http://jgmcghie.fastmail.com.au/
Sydney, Australia. S33°53'34.20 E151°14'54.50
+61 4 1209 1410, mailto:[email protected]
 
S

Stahlhut1956

Thanks, John!

I've used the "all open documents" feature a few times, and find it a
bit clunky. (Is there a way to turn off the annoying warning that
you'll have to undo it in each document separately?)

I currently need to change only 64 names. I've done a bit of VBA
programming a long time ago (mostly by recording macros and tweaking
some of the commands) but am extremely rusty. There are four or five
files I need to fix soon, but there will almost certainly be more
lurking in various directories that I'll have to edit before my data
is publishable. I could conceivably do the whole thing by hand while
waiting in airports during some upcoming holiday travel. Then again,
given that I'll undoubtedly keep working with data collections of this
kind, it might be worth investing some time in making my own tool for
the job.

So, watch this space. I may be calling in the reinforcements
soon! :)

Regards,
-- Julie Stahlhut
 
C

Carl Witthoft

Grep and sed are a bit geeky, but I assume you work for a university or some
such: grab your nearest Comp Sci major and ask them to show you how. Beer
or food are usually effective in getting a comp sci person to do almost
anything you need...
If you want to go that way, I strongly recommend downloading the (free)
TextWrangler from Bare Bones Software. It will let you apply a global
search and replace operation across a group of files, uses full regexp,
and more. Great text tool in general.

Carl
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top