Copying TextBox text

jerem

I've literally spent hours (googling every which way of posing this question
-- select text in textbox, highlight text from textbox, copy text from
textbox, set focus to textbox, etc.) trying to figure out how to do this, with
no success:

Have a macro scan through an entire document looking for text boxes. At
every text box it finds, copy the text in the text box, delete the text box,
and paste the copied text right back into the position where the text box
was. (And yes, I've looked on the website for copying from the clipboard.)
With the code below I've been able to find the text boxes and delete them, but
I cannot for the life of me figure out how to actually get into those text
boxes to copy the text. H E L P! I think I'm going to have to drink a whole
bottle of wine now to console myself. If you're in a delightful mood,
documentation would be lovely, but not absolutely necessary. As always,
thanks in advance for your help.


Sub Macro1()
'
' Macro1 Macro

    Dim i As Long
    For i = ActiveDocument.Shapes.Count To 1 Step -1
        Selection.GoTo What:=wdGoToGraphic, Which:=wdGoToFirst, Count:=i, Name:=""

        'ActiveDocument.Shapes("Text Box", i).Select
        'Selection.WholeStory
        'Selection.Copy

        Selection.Delete

    Next i

End Sub
 
Greg Maxey

jerem,

It is the "right back into the position the textbox resided" piece that may
prove impossible. You see, a textbox is "anchored" to a paragraph, and that
paragraph may or may not be where the textbox is. I have read what you say
you want to do, but I don't really know why. If you want the text right where
the textbox was, then why not just remove the borders from the box?

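Sub Macro1()
    Dim oShp As Shape
    For Each oShp In ActiveDocument.Shapes
        If oShp.Type = msoTextBox Then
            ' Copy the text box contents (with formatting) to the clipboard
            oShp.TextFrame.TextRange.Copy
            ' Paste at the anchor, which may not be where the box is displayed
            oShp.Anchor.Paste
            oShp.Delete
        End If
    Next
End Sub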
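
For the "just remove the borders" alternative, something like the following might do. This is only a sketch: the macro name HideTextBoxBorders is made up for illustration, and it assumes that hiding the outline and fill is enough for your purposes. The text stays inside the text box, so it will not help if you need the text in the body of the document.

```vba
Sub HideTextBoxBorders()
    ' Sketch only: hides the outline and fill of every text box,
    ' leaving the text and the position of the box untouched.
    Dim oShp As Shape
    For Each oShp In ActiveDocument.Shapes
        If oShp.Type = msoTextBox Then
            oShp.Line.Visible = msoFalse   ' no border line
            oShp.Fill.Visible = msoFalse   ' no background fill
        End If
    Next oShp
End Sub
```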

--
Greg Maxey

See my web site http://gregmaxey.mvps.org
for an eclectic collection of Word Tips.

"It is not the critic who counts, not the man who points out how the
strong man stumbles, or where the doer of deeds could have done them
better. The credit belongs to the man in the arena, whose face is
marred by dust and sweat and blood, who strives valiantly...who knows
the great enthusiasms, the great devotions, who spends himself in a
worthy cause, who at the best knows in the end the triumph of high
achievement, and who at the worst, if he fails, at least fails while
daring greatly, so that his place shall never be with those cold and
timid souls who have never known neither victory nor defeat." - TR
 
jerem

Ah, life is good again thanks to Allen Wyatt. I'm posting this here in case
someone else is in need of this macro (and doesn't want to spend 3 days
looking for it).

Copies the Text From Each Textbox in document, deletes the text box and
pastes the text where each text box once was:

Sub RemoveTextBoxText()
    Dim shp As Shape
    Dim oRngAnchor As Range
    Dim sString As String
    For Each shp In ActiveDocument.Shapes
        If shp.Type = msoTextBox Then
            ' copy text to string, without last paragraph mark
            sString = Left(shp.TextFrame.TextRange.Text, _
                shp.TextFrame.TextRange.Characters.Count - 1)
            If Len(sString) > 0 Then
                ' set the range to insert the text
                Set oRngAnchor = shp.Anchor.Paragraphs(1).Range
                ' insert the textbox text before the range object
                oRngAnchor.InsertBefore _
                    "Textbox start << " & sString & " >> Textbox end"
            End If
            shp.Delete
        End If
    Next shp
End Sub

Deletes All Textboxes:

Sub RemoveTextBox1()
    Dim shp As Shape
    For Each shp In ActiveDocument.Shapes
        If shp.Type = msoTextBox Then shp.Delete
    Next shp
End Sub
 
jerem

Hey Greg,

Nice to hear from you. Here is why I want to strip the text out of the
text boxes: I predominantly work with large textual legal documents.
Sometimes these documents come in as PDFs that need to be converted into
Word documents. During conversion, the PDF conversion software sometimes
has trouble identifying some text properly and converts paragraphs into
text boxes. So I may end up with a 100 page document that has 20 or more
text boxes in it, which is a problem when I then need to style the document
with numbering schemes and other styles. It is a very big nuisance to have to
go into each text box manually, grab the text out and delete the text box.
So it is very helpful to have a macro strip all the text out of the text
boxes; I can then copy the entire conversion and do a paste special so that
I have nothing but unadulterated text.

I've tried your macro and the results I get are gray shaded areas of text
(the text that was in the text boxes) and, right below those paragraphs, empty
gray shaded text boxes. This is the result for all the text boxes.

The code below (I only added the last part, to take out the beginning and
ending markers of each textbox) is right on target for taking the text out
of the textbox, placing the text right where the text box was and, finally,
deleting the textbox. The only problem I had when using this macro was with
one really funky pdf conversion - it halted the macro at a table and I got a
message to the effect of "this is not a shape" - I'm making that up, but it
was something along the lines of "this doesn't fall into the Shape
category", which makes me wonder why it halted on it at all instead of
bypassing it. The only other problem with the macro below is that if a
drawing canvas is around a text box, the macro ignores that text box
entirely, which spawns another question -- how do you get rid of that
nuisance of a drawing canvas? I hit Escape, which makes it disappear, but
then when you click on the textbox, it pops right back up.

Anyway, try this macro out - it works quite nicely.

Sub RemoveTextBoxText()

' GrabTextFromTextBox Macro
' Copies the text from each text box in the document, deletes the text box
' and pastes the text where each text box once was.

    Dim shp As Shape
    Dim oRngAnchor As Range
    Dim sString As String
    For Each shp In ActiveDocument.Shapes
        If shp.Type = msoTextBox Then
            ' copy text to string, without last paragraph mark
            sString = Left(shp.TextFrame.TextRange.Text, _
                shp.TextFrame.TextRange.Characters.Count - 1)
            If Len(sString) > 0 Then
                ' set the range to insert the text
                Set oRngAnchor = shp.Anchor.Paragraphs(1).Range
                ' insert the textbox text before the range object
                oRngAnchor.InsertBefore _
                    "Textbox start << " & sString & " >> Textbox end"
            End If
            shp.Delete
        End If
    Next shp

    ' Strip out beginning and ending textbox markers
    Selection.HomeKey Unit:=wdStory
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "Textbox start << "
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = True
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        ' leading space included so no stray space is left behind
        .Text = " >> Textbox end"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = True
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub
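
A possible simplification, offered only as a sketch: the macro name MoveTextBoxTextToAnchor is made up for illustration. It walks the Shapes collection by index from last to first, on the assumption that deleting items inside a For Each loop can cause some shapes to be skipped, and it inserts the text directly so no start and end markers need to be stripped afterwards. Like the macro above, it inserts plain text at the start of the anchor paragraph, so any formatting inside the text box is lost, and it does not look inside drawing canvases.

```vba
Sub MoveTextBoxTextToAnchor()
    ' Hypothetical variant of RemoveTextBoxText (not from the thread).
    Dim i As Long
    Dim shp As Shape
    Dim sText As String

    ' Count down so that deleting a shape does not shift the
    ' index of any shape still to be visited.
    For i = ActiveDocument.Shapes.Count To 1 Step -1
        Set shp = ActiveDocument.Shapes(i)
        If shp.Type = msoTextBox Then
            sText = shp.TextFrame.TextRange.Text
            ' Drop the trailing paragraph mark, if there is one
            If Right(sText, 1) = vbCr Then
                sText = Left(sText, Len(sText) - 1)
            End If
            If Len(sText) > 0 Then
                ' Plain text goes in front of the anchor paragraph
                shp.Anchor.Paragraphs(1).Range.InsertBefore sText
            End If
            shp.Delete
        End If
    Next i
End Sub
```

Try it on a copy of the document first; whether the text lands where you expect still depends on where each text box is anchored.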
 
Graham Mayor

Are you sure they are text boxes and not frames?

PDF was always intended as a read-only graphical format. Conversion to a Word
document can, as you have found, be problematic. You need better OCR
software to avoid the issue in the first place. Try FineReader, which does
not use text boxes to format the document and can be fine-tuned to produce
better results - but conversion from PDF to Word is always going to be a
laborious process.

If you are entitled to edit these legal documents, couldn't you ask the
originator for the document from which the PDF was created?

--
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP


<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
 
jerem

Hey Graham,

Nice to hear from you too. In answer to your questions -- in terms of
getting a better pdf converter: the converter I work with works excellently
when it has a clean or even a decent copy to work with. Unfortunately, many
times the pdf's that we're asked to convert are grainy and sometimes even
have watermarks that one has to contend with (which probably causes this
misinterpretation in the conversion). Of course, we're not working with the
original pdf (so can't remove the watermark either). In answer to "If you
are entitled to edit these legal documents, couldn't you ask the originator
for the document from which the PDF was created?": we always ask, they never
comply - something to do with attorney-client privilege. Bunch of baloney
because they know we're going to reproduce the document anyway.

In terms of text boxes or frames: never thought of that one. What's the
distinguishing factor between the two? When I looked up on Word Help about
text boxes and frames, it talked about when it is best to use text boxes and
when to use frames [and when they talked about frames it was in the context
of web pages]. I guess it's very possible that a pdf was gotten right off
the web and that was indeed that funky pdf I was talking about earlier.

But since I have your attention, can you tell me how to get rid of that
pesky drawing canvas surrounding one of my text boxes without getting rid of
the textbox itself?

Thanks as always for your help.
 
Graham Mayor

The principal difference between a frame and a text box is that a frame is
in the text layer of the document (a paragraph formatting parameter) and a
text box is in the drawing layer.
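
One rough way to check which of the two a converted document actually contains is to count them. The macro below is only a sketch, and its name CountFramesAndTextBoxes is made up for illustration. It assumes the objects of interest are in the main document: it reports the number of frames, the number of shapes whose type is msoTextBox, and the total number of floating shapes.

```vba
Sub CountFramesAndTextBoxes()
    ' Diagnostic sketch: reports how many frames and how many
    ' text boxes the active document contains.
    Dim shp As Shape
    Dim nTextBoxes As Long

    For Each shp In ActiveDocument.Shapes
        If shp.Type = msoTextBox Then nTextBoxes = nTextBoxes + 1
    Next shp

    MsgBox "Frames: " & ActiveDocument.Frames.Count & vbCr & _
           "Text boxes: " & nTextBoxes & vbCr & _
           "All floating shapes: " & ActiveDocument.Shapes.Count
End Sub
```

If the frame count is high and the text box count is low, a macro that only tests for msoTextBox will leave most of the boxed text untouched.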

As for the drawing canvas, it might be simpler to configure Word not to
create the canvas, but if you select then cut the content to the clipboard,
delete the canvas then paste the content back again, then that should work.

--
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP


<>>< ><<> ><<> <>>< ><<> <>>< <>><<>

 
