Extracting Text From Visio Documents

Biggaford

How can I extract the text from shapes within a Visio document? After
getting the text, I write it out to a text file.

Using Visio Professional 2003.
 
Paul Herber

Biggaford said:
How can I extract the text from shapes within a Visio document? After
getting the text, I write it out to a text file.

Using Visio Professional 2003.

I have a utility that will do that for you, including all the text
within grouped shapes and text fields.
Details below.
 
Biggaford

Paul Herber said:
I have a utility that will do that for you, including all the text
within grouped shapes and text fields.
Details below.

Very nice tool; however, I am automating Visio and need to extract the text
from shapes programmatically. My option is to loop through the Shapes
collection and get the text, but a problem arises with protected and/or
grouped shapes.

Is there another method, such as using XML: converting the Visio document to
XML and then reading the appropriate tags?
Thank you for your response.
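
The XML route asked about here can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes the drawing has been saved in Visio 2003's XML drawing format (.vdx) and that shape text sits in Text elements beneath each Shape, with grouped shapes nested inside their parent. The function names and the SAMPLE document are hypothetical, made up for illustration. Tags are matched by local name so the sketch does not depend on the exact namespace URI.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def extract_shape_text(vdx_xml):
    """Return the text of every Text element under Pages, in document order."""
    root = ET.fromstring(vdx_xml)
    texts = []
    for pages in root.iter():
        if local_name(pages.tag) != "Pages":
            continue
        # iter() visits all descendants, so shapes nested in groups are included
        for elem in pages.iter():
            if local_name(elem.tag) == "Text":
                # itertext() also picks up text that follows formatting child elements
                text = "".join(elem.itertext()).strip()
                if text:
                    texts.append(text)
    return texts


# Hypothetical, heavily trimmed stand-in for a real .vdx file.
SAMPLE = """<VisioDocument xmlns="http://schemas.microsoft.com/visio/2003/core">
  <Pages><Page ID="0" NameU="Page-1"><Shapes>
    <Shape ID="1" Type="Shape"><Text>Start</Text></Shape>
    <Shape ID="2" Type="Group">
      <Text>Group label</Text>
      <Shapes>
        <Shape ID="3" Type="Shape"><Text><cp IX="0"/>Inner step</Text></Shape>
      </Shapes>
    </Shape>
  </Shapes></Page></Pages>
</VisioDocument>"""

if __name__ == "__main__":
    for line in extract_shape_text(SAMPLE):
        print(line)
```

Restricting the search to the Pages element is a design choice: it keeps text belonging to masters out of the result. Whether field values appear expanded in the saved XML is not something this sketch checks.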
 
John... Visio MVP

Biggaford said:
Very nice tool; however, I am automating Visio and need to extract the text
from shapes programmatically. My option is to loop through the Shapes
collection and get the text, but a problem arises with protected and/or
grouped shapes.

Is there another method, such as using XML: converting the Visio document to
XML and then reading the appropriate tags?
Thank you for your response.


If you are extracting through code, the protection should not be a problem.
The protection is there to prevent tampering with the shape through the UI.

As to the Group shapes, you need to use recursion to access all the
elements.

John... Visio MVP
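
The recursion suggested above can be sketched in Python. This is a sketch under stated assumptions: the function only relies on each shape exposing Characters.Text and an iterable Shapes collection, which is how Visio shapes are expected to look through COM automation (for example via pywin32, not shown or tested here). FakeShape and FakeCharacters are hypothetical stand-ins so the example runs without Visio.

```python
class FakeCharacters:
    """Hypothetical stand-in for a shape's Characters object."""

    def __init__(self, text):
        self.Text = text


class FakeShape:
    """Hypothetical stand-in for a Visio shape; a group has sub-shapes."""

    def __init__(self, text="", subshapes=()):
        self.Characters = FakeCharacters(text)
        self.Shapes = list(subshapes)


def collect_text(shape, out=None):
    """Depth-first walk: the shape's own text first, then its sub-shapes."""
    if out is None:
        out = []
    text = shape.Characters.Text
    if text:
        out.append(text)
    for sub in shape.Shapes:  # empty for a shape that is not a group
        collect_text(sub, out)
    return out


if __name__ == "__main__":
    inner_group = FakeShape("Inner group", [FakeShape("Leaf A"), FakeShape("")])
    top = FakeShape("Outer group", [inner_group, FakeShape("Leaf B")])
    for line in collect_text(top):
        print(line)
```

Passing the output list down the recursion avoids building and merging a new list at every level; shapes with empty text are skipped, and groups may nest to any depth.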
 
David Parker

Of course, the Reports add-in can output shape and sub-shape text (within a
group) to an XML file.

If you are writing your own code, then do not forget to extract
shape.Characters.Text rather than shape.Text.
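
As background to this advice: in the Visio automation model, shape.Text represents each text field by an escape character (character code 30), while shape.Characters.Text returns the text with fields expanded. A minimal Python sketch of the difference follows; StandInShape and the helper names are hypothetical and only mimic the two properties, so nothing here calls Visio itself.

```python
FIELD_PLACEHOLDER = "\x1e"  # character code 30, which shape.Text uses in place of a field


class StandInCharacters:
    """Hypothetical stand-in for a shape's Characters object."""

    def __init__(self, text):
        self.Text = text


class StandInShape:
    """Hypothetical stand-in exposing both text properties of a shape."""

    def __init__(self, raw_text, expanded_text):
        self.Text = raw_text
        self.Characters = StandInCharacters(expanded_text)


def has_unexpanded_fields(text):
    """True if the string still contains a field placeholder character."""
    return FIELD_PLACEHOLDER in text


def readable_text(shape):
    """Return the text with fields expanded, read through Characters."""
    return shape.Characters.Text


if __name__ == "__main__":
    shape = StandInShape("Page " + FIELD_PLACEHOLDER, "Page 3")
    print(has_unexpanded_fields(shape.Text))
    print(readable_text(shape))
```

A check like has_unexpanded_fields can serve as a sanity test on exported text: a placeholder character in the output suggests the text was read from the wrong property.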
 
John... Visio MVP

David Parker said:
Of course, the Reports add-in can output shape and sub-shape text (within a
group) to an XML file.

If you are writing your own code, then do not forget to extract
shape.Characters.Text rather than shape.Text.

Sounds like a good topic for a blog. Even I have trouble playing with text.

John... Visio MVP
 
Paul Herber

Biggaford said:
Very nice tool; however, I am automating Visio and need to extract the text
from shapes programmatically. My option is to loop through the Shapes
collection and get the text, but a problem arises with protected and/or
grouped shapes.

I'll donate some code for this. It's in Delphi, so if you want it in C#
or VBA it should be very easy to translate.

======

unit TextExport;

interface

procedure doDocTextExport();
procedure doPageTextExport();
procedure doSelectionTextExport();

implementation

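// DAVSL, DAVAO, SyntaxCheckOutput and Utils are not standard Delphi units
// and are not included in this post.
uses DAVSL, DAVAO, Classes, Forms, Controls, SysUtils,
  Visio_TLB, Math, SyntaxCheckOutput, Utils, Dialogs;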

const pageTextExport = 0;
const docTextExport = 1;
const selectionTextExport = 2;

//------------------------------------------------------
procedure doTextExport(mode: integer);
//------------------------------------------------------
var
  docObj: Visio_TLB.Document;
  pagObj: Visio_TLB.Page;
  shpsObj: Visio_TLB.Shapes;
  shpObj: Visio_TLB.Shape;
  SymbolListDlg: TOutputDlg;
  pageCounter: integer;
  shapeCounter: integer;

  procedure doShapeText(theShape: Visio_TLB.Shape);
  var
    theText: wideString;
    groupedShapeCounter: integer;
    groupShape: Visio_TLB.Shape;
  begin
    theText := theShape.Characters.TextAsString;
    if (theText <> '') then
    begin
      SymbolListDlg.OutputMemo.Lines.Add(theText);
    end;
    if (theShape.Shapes.Count > 0) then
    begin
      for groupedShapeCounter := 1 to theShape.Shapes.Count do
      begin
        groupShape := theShape.Shapes.Item[groupedShapeCounter];
        doShapeText(groupShape); // recurse down groups of shapes
      end;
    end;
  end;

begin
  // Create memo
  SymbolListDlg := TOutputDlg.Create(Forms.Application);
  SymbolListDlg.Show;
  SymbolListDlg.Caption := 'Text Export';
  SymbolListDlg.OutputMemo.Lines.Clear;
  SymbolListDlg.BringToFront;
  SymbolListDlg.OutputCloseBtn.Visible := false;
  SymbolListDlg.OutputSaveTextBtn.Visible := false;
  SymbolListDlg.StopButton.Visible := true;
  try
    if (mode = docTextExport) then
    begin
      // Every shape on every page of the active document
      docObj := VSL.visApp.ActiveDocument;
      for pageCounter := 1 to docObj.Pages.Count do
      begin
        pagObj := docObj.Pages.ItemU[pageCounter];
        shpsObj := pagObj.Shapes;
        if (shpsObj.Count > 0) then
        begin
          for shapeCounter := 1 to shpsObj.Count do
          begin
            try
              shpObj := shpsObj.ItemU[shapeCounter];
              doShapeText(shpObj);
            except
              // ignore the error and continue with the next shape
            end;
          end;
        end;
      end;
    end
    else if (mode = pageTextExport) then
    begin
      // Every shape on the active page only
      pagObj := VSL.visApp.ActivePage;
      shpsObj := pagObj.Shapes;
      if (shpsObj.Count > 0) then
      begin
        for shapeCounter := 1 to shpsObj.Count do
        begin
          try
            shpObj := shpsObj.ItemU[shapeCounter];
            doShapeText(shpObj);
          except
            // ignore the error and continue with the next shape
          end;
        end;
      end;
    end
    else if (mode = selectionTextExport) then
    begin
      // Only the shapes currently selected in the active window
      if (VSL.visApp.ActiveWindow.Selection.Count > 0) then
      begin
        for shapeCounter := 1 to
            VSL.visApp.ActiveWindow.Selection.Count do
        begin
          try
            shpObj :=
              VSL.visApp.ActiveWindow.Selection.Item[shapeCounter];
            doShapeText(shpObj);
          except
            // ignore the error and continue with the next shape
          end;
        end;
      end
      else
        SymbolListDlg.OutputMemo.Lines.Add('No shapes selected.');
    end;
  finally
    // Restore the dialog buttons whether or not the export succeeded
    SymbolListDlg.OutputCloseBtn.Visible := true;
    SymbolListDlg.OutputSaveTextBtn.Visible := true;
    SymbolListDlg.StopButton.Visible := false;
  end;
end;

procedure doDocTextExport();
begin
  doTextExport(docTextExport);
end;

procedure doPageTextExport();
begin
  doTextExport(pageTextExport);
end;

procedure doSelectionTextExport();
begin
  doTextExport(selectionTextExport);
end;

end.
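
The unit above collects the text in a memo, while the original question asked for a text file. If OutputMemo is a standard TMemo (an assumption, since its unit is not shown), calling Lines.SaveToFile on it would be the direct route in Delphi. For anyone translating to another language, the final step might look like this Python sketch; write_text_file and the output file name are hypothetical, and the sample strings stand in for text already extracted from the shapes.

```python
import os
import tempfile


def write_text_file(path, lines):
    """Write one extracted string per line as UTF-8; return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical output location in the system temp directory
    out_path = os.path.join(tempfile.gettempdir(), "visio_text_export.txt")
    written = write_text_file(out_path, ["Start", "Group label", "Inner step"])
    print(written, "lines written to", out_path)
```

UTF-8 is chosen here because shape text comes back from Visio as wide strings and may contain characters outside the local code page.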
 
