Read text from Word Documents, allowing for Field Codes, Tables, Shapes, etc.

jon.d.morrell

I am writing an in-house checking tool for Word documents. The tool is written in C#, and by the nature of the tests we need to read the text of the document into a C# string. The problem is that when we find something in our string, we need to show it in the document, so the IndexOf position of any text in the string must match the corresponding position in the document.

After a lot of time trying to work out adjustments for field codes and their different behaviours, as well as the many different shapes, tables, etc. in a document, I cannot find a fast, reliable way to do this.

I know a slow way to do it reliably: read in each word in the document together with its start position, then pad with blanks so that each word lands at its document position in the string. But this is far too slow.
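To make the slow approach concrete, here is a minimal sketch of the padding step only. It assumes the (start, text) pairs have already been read from the document, for example from Range.Start and Range.Text of each word, and that they arrive sorted by start position. The names AlignedText and Build are hypothetical, and cutting back on an overlap is just one possible choice, not something Word prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class AlignedText
{
    // Builds a string in which every piece of text starts at the same index
    // as its start position in the document. Gaps (field codes, shape anchors,
    // table markers and so on) are filled with a padding character.
    // Assumption: 'pieces' is sorted by Start.
    public static string Build(IEnumerable<(int Start, string Text)> pieces, char pad = ' ')
    {
        var sb = new StringBuilder();
        foreach (var (start, text) in pieces)
        {
            if (start > sb.Length)
            {
                // Pad up to the document position of this piece.
                sb.Append(pad, start - sb.Length);
            }
            else if (start < sb.Length)
            {
                // The previous text ran past this start position (assumed
                // possible when the text is longer than the range it came
                // from), so cut it back to keep the indexes aligned.
                sb.Length = start;
            }
            sb.Append(text);
        }
        return sb.ToString();
    }
}
```

With the pieces (0, "Hello "), (6, "world") and (20, "again"), the resulting string has "again" at index 20, so an IndexOf hit can be used directly as a document position. The expensive part is not this loop but collecting the pieces one word at a time from Word.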

Please don't comment if you are going to suggest doing my checking within the document itself rather than reading a string into code. I have been down this road, and there are things that cannot be done that way.

Thanks.
 
