How to parse document.xml of a DOCX file, and other things

miztaken

Hi there,
I know that a DOCX file is a zip archive. When I open it, there are lots of XML
files, and all the pictures and embedded objects are arranged in their
respective folders.

But how can we tell from document.xml, or from any of the other XML files,
whether the document contains images? If it does, how many are there and what
are their names?
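
For the image part of the question, here is a minimal sketch of one possible approach, not a definitive answer. It assumes the main document part is at the conventional path word/document.xml, so that its relationships sit in word/_rels/document.xml.rels, and it treats every relationship whose type ends in "/image" as an image. The function name list_images is made up for illustration. Images used only in headers or footers are listed in those parts' own .rels files and would not show up here.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the package relationships (.rels) files.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

def list_images(docx_file):
    """Return (relationship id, target path) for each image part
    referenced by the main document. Targets are relative to word/."""
    with zipfile.ZipFile(docx_file) as z:
        try:
            # Assumption: the main part is word/document.xml.
            rels_xml = z.read("word/_rels/document.xml.rels")
        except KeyError:
            return []  # no relationships part, so no images
    root = ET.fromstring(rels_xml)
    return [
        (rel.get("Id"), rel.get("Target"))
        for rel in root.iter(RELS_NS + "Relationship")
        if rel.get("Type", "").endswith("/image")
    ]
```

Calling list_images("report.docx") (the file name is only an example) would give the count via len() and the names via the target paths, such as media/image1.png. The relationship ids are what document.xml itself uses to point at each picture.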

Also, non-Office files are stored as oleobject1.bin, oleobject2.bin and
so on.
So how do I find out the format and name of these files embedded inside
the DOCX?
When I looked at those binary files, I think the names are in there,
but I need to know the structure of those binary files.
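
For the embedded objects, a rough sketch under some assumptions: each embedded object appears in document.xml as an o:OLEObject element whose ProgID attribute hints at the format (for example "Package" for an arbitrary file), and whose r:id points, via the relationships file, at the .bin part. Those .bin parts are normally OLE compound files, which start with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1. The function name list_ole_objects is invented here, and the namespaces are the ones used by ordinary (transitional) DOCX files.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
R_NS = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}"
O_NS = "{urn:schemas-microsoft-com:office:office}"
# Signature at the start of every OLE compound file.
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def list_ole_objects(docx_file):
    """Return (ProgID, part path, is_compound_file) for each embedded
    OLE object referenced from word/document.xml (assumed main part)."""
    found = []
    with zipfile.ZipFile(docx_file) as z:
        rels = ET.fromstring(z.read("word/_rels/document.xml.rels"))
        targets = {rel.get("Id"): rel.get("Target")
                   for rel in rels.iter(RELS_NS + "Relationship")}
        doc = ET.fromstring(z.read("word/document.xml"))
        for obj in doc.iter(O_NS + "OLEObject"):
            target = targets.get(obj.get(R_NS + "id"))
            if target is None:
                continue  # linked rather than embedded, or dangling id
            # Targets are relative to word/ unless they start with "/".
            part = posixpath.normpath(posixpath.join("word", target)).lstrip("/")
            try:
                with z.open(part) as f:
                    header = f.read(8)
            except KeyError:
                continue  # part missing from the archive
            found.append((obj.get("ProgID"), part, header == CFB_MAGIC))
    return found
```

This only identifies the objects and confirms the container type. Getting the original file name out of a "Package" object means reading the streams inside the compound file (the name is commonly reported to be in a stream called "\x01Ole10Native"), which needs a compound file parser such as the third-party olefile package; that step is not shown here.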

Please help me

Thank You
miztaken
 
