Extracting Text from Word document

wjgaddis

I am really new to the VBA (Word, Excel, etc.) world so any guidance and/or
help will be much appreciated. Here's my dilemma. I currently open up
several MS Word documents (around 150) and copy/paste text within those
documents into another MS Word document (a summary document) on a weekly
basis. This situation begs for automation! How can I use VB within Word to
access all the other MS Word documents, search for the text I need to
extract, and copy that data into the MS Word document that initiated the
program?

Help!
 
Ed

What you've proposed is very do-able. I use something similar to extract
22 data points from hundreds of docs to populate an Excel file.
BUT - this was possible for my limited VBA skills *only* because each
document is *exactly the same*. So what I search for to get my data points
is *exactly the same* from one doc to the next.

How are yours set up? Is there a consistent, constant "flag" you can search
for every time?

Ed
 
Doug Robbins

This is definitely do-able, but you would need to give us some idea of the
criteria for finding the text to be extracted to give you specific advice.

--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP
 
wjgaddis

Each document containing the information I need is the same (standard
template sort of thing). In fact, I need what the PM types into a table
starting with the 3rd row . . . here's an example:

The 1st row of the table is labeled "Issues/Risks To Be Escalated"
The 2nd row contains headers for 8 columns
The PM starts entering text into the 3rd row of the table . . . there can be
several rows of text depending on whether the project is going to heck . . .
:->
 
Doug Robbins

Something like this should do that:

Sub ExtractTableRows()
    Dim mydoc As Document
    Dim target As Document
    Dim myrange As Range
    Dim MyPath As String
    Dim MyName As String

    'let user select a path
    With Dialogs(wdDialogCopyFile)
        If .Display() <> -1 Then Exit Sub
        MyPath = .Directory
    End With

    'strip quotation marks from path
    If Len(MyPath) = 0 Then Exit Sub
    If Asc(MyPath) = 34 Then
        MyPath = Mid$(MyPath, 2, Len(MyPath) - 2)
    End If

    'create the summary document
    Set target = Documents.Add

    'open each file in the selected path and copy the text of
    'table rows 3 onward (columns 1 to 8) into the summary document
    MyName = Dir$(MyPath & "*.*")
    Do While MyName <> ""
        Set mydoc = Documents.Open(MyPath & MyName)
        With mydoc.Tables(1)
            Set myrange = .Cell(3, 1).Range
            myrange.End = .Cell(.Rows.Count, 8).Range.End
        End With
        target.Range.InsertAfter myrange & vbCr
        mydoc.Close wdDoNotSaveChanges
        MyName = Dir$
    Loop
End Sub



 
Greg

Doug,

I've been monkeying around with your macro. The information in the
table is brought into the summary document as ordinary text (i.e., each
cell range is a separate paragraph). How would you revise this code to
bring in the contents as a table?

I tried set myRange = Tables(1).Range and I get 10 separate paragraphs
in the summary document vice a 5X2 table.

I am still trying to solve, but thought I would ask.
 
Greg

Doug,

I dug around (no pun intended ;-)) in some old posts and now believe
that copying the table vice trying to use a range might work better. I
have adapted your code a bit to pull both a bookmarked text and a table
from multiple source documents into a summary document. Feedback
always appreciated:

Sub PullContent()

    'This macro is an adaptation of code posted in a public newsgroup by
    'Doug Robbins.
    'It provides a means to pull content from multiple files located in a
    'common folder into a summary document.

    'Source documents are similarly formatted with a text element
    'bookmarked "Input" and a table element "Table1"

    Dim sourceDoc As Document
    Dim summaryDoc As Document
    Dim oBkmRng As Range
    Dim sourcePath As String
    Dim sourceFile As String

    'Select the folder containing the source files
    With Dialogs(wdDialogCopyFile)
        If .Display() <> -1 Then Exit Sub
        sourcePath = .Directory
    End With
    'Strip extraneous quotation marks from the source path
    If Len(sourcePath) = 0 Then Exit Sub
    If Asc(sourcePath) = 34 Then
        sourcePath = Mid$(sourcePath, 2, Len(sourcePath) - 2)
    End If

    'Create a new blank summary document
    Set summaryDoc = Documents.Add
    'Open each source file and pull content into the summary document
    sourceFile = Dir$(sourcePath & "*.*")
    Do While sourceFile <> ""
        Set sourceDoc = Documents.Open(sourcePath & sourceFile)
        With sourceDoc
            'Set a range for the bookmarked text
            Set oBkmRng = .Bookmarks("Input").Range
            'Copy the table text and structure
            .Tables(1).Range.FormattedText.Copy
        End With
        With summaryDoc.Range
            'Insert the range text in the summary document
            .InsertAfter oBkmRng & vbCr & vbCr
            .Collapse wdCollapseEnd
            'Paste the table in the summary document
            .Paste
            .Collapse wdCollapseEnd
            .InsertAfter vbCr
        End With
        sourceDoc.Close wdDoNotSaveChanges
        sourceFile = Dir$
    Loop

End Sub
 
Doug Robbins

Hi Greg,

After declaring another Range object - trange, using the following for the
Do While Loop should do it

Do While MyName <> ""
    Set mydoc = Documents.Open(MyPath & MyName)
    With mydoc.Tables(1)
        Set myrange = .Cell(3, 1).Range
        myrange.End = .Cell(.Rows.Count, 8).Range.End
    End With
    myrange.Copy
    Set trange = target.Range
    trange.Collapse wdCollapseEnd
    trange.Paste
    mydoc.Close wdDoNotSaveChanges
    MyName = Dir$
Loop


 
Keri

I have a similar situation, and am using the "Record a Macro" to create the
VBA for me. But I am stuck.
Here is what I am trying to do:

1) Find all text that equals "T-" within a document
2) Select that text, along with the 25 characters that follow
3) Copy and paste the full 27 character string into a new document (so that
I can create a table from that extracted data)

I figured out Step 1 and Step 3, but can't figure out how to do Step 2. Help?
 
Tony Jollans

I don't see the context for this, but assuming you're using Find / Replace
for this - instead of looking for just "T-", select Use Wildcards and look
for "T-?{25}" (without the quotes: T, hyphen, question mark, left brace,
25, right brace)
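
A minimal VBA sketch of that wildcard search, for anyone driving it from a macro rather than the Find dialog. The macro name ListWildcardMatches is hypothetical, and printing to the Immediate window is only an assumption made for illustration; it works on a Range rather than the Selection, so the cursor does not move.

```vba
Sub ListWildcardMatches()
    ' Hypothetical helper: prints each "T-" plus the 25 characters
    ' that follow it to the Immediate window (Ctrl+G in the VBA editor).
    Dim rng As Range
    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Text = "T-?{25}"
        .MatchWildcards = True   ' must still be True when Execute runs
        .Forward = True
        .Wrap = wdFindStop       ' stop at the end of the document
        Do While .Execute
            ' After a successful Execute, rng covers the matched text
            Debug.Print rng.Text
            ' Move past this match before searching again
            rng.Collapse wdCollapseEnd
        Loop
    End With
End Sub
```

Note that the search only honours the pattern if MatchWildcards is left set to True; any later assignment of False before Execute turns the pattern back into literal text.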
 
Keri

This didn't work. When I changed the "T-" to "T-?{25}" the macro doesn't do
anything anymore. The context is that I want to extract requirements from a
document to be able to verify that all of the business requirements have been
satisfied. The 25 characters that follow the T- are the traced requirements.
 
Keri

I believe so. Here is the code:
Selection.Find.ClearFormatting
With Selection.Find
.MatchWildcards = True
.Text = "T-?{25}"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
End Sub
 
Doug Robbins - Word MVP

Use the following:

Dim Source As Document, Target As Document, ftext As String
Set Source = ActiveDocument
Set Target = Documents.Add
Source.Activate
Selection.HomeKey wdStory
Selection.Find.ClearFormatting
With Selection.Find
    Do While .Execute(FindText:="T-?{25} ", MatchWildcards:=True, _
            Wrap:=wdFindStop, Forward:=True) = True
        'Trim removes the trailing space matched by the pattern
        ftext = Trim(Selection.Range.Text)
        Target.Range.InsertAfter ftext & vbCr
    Loop
End With
Target.Range.ConvertToTable vbCr, , 1


 
