How to parse document.xml of a DOCX file and identify embedded images and objects

miztaken

Hi there,
I know that a DOCX is a zip file. When I open it, there are lots of XML
files, and all the pictures and embedded objects are arranged in their
respective folders.

But how can we tell from document.xml, or any other XML file in the
package, whether the document contains images? If it does, how many are
there and what are their names?
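
One possible way to check this, sketched in Python with only the standard
library. It assumes the usual package layout, where the relationships of
word/document.xml live in word/_rels/document.xml.rels and each image is
listed there with a relationship type ending in "/image". The function
name list_images is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the <Relationships> part inside an OPC package.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def list_images(docx):
    """Return (relationship id, target) pairs for images used by the main document.

    `docx` may be a file path or a binary file-like object.
    Assumes the main part is word/document.xml (the usual layout).
    """
    with zipfile.ZipFile(docx) as z:
        root = ET.fromstring(z.read("word/_rels/document.xml.rels"))

    images = []
    for rel in root.findall(f"{{{REL_NS}}}Relationship"):
        # Image relationships have a Type URI that ends in "/image".
        if rel.get("Type", "").endswith("/image"):
            images.append((rel.get("Id"), rel.get("Target")))
    return images
```

The length of the returned list is the number of image relationships, and
each target (for example media/image1.png) is relative to the word/ folder.
Headers and footers have their own .rels files, which this sketch does not
read.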

Also, non-Office files are stored as oleobject1.bin, oleobject2.bin and
so on.
So how do I find out the format and name of these files embedded inside
the DOCX?
When I looked at those binary files, I think the names are in there,
but I need to know the structure of those binary files.
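
A rough sketch of a first step, again in standard-library Python. It
assumes the embedded objects are referenced from document.xml by
o:OLEObject elements (whose ProgID attribute names the registered type)
and from word/_rels/document.xml.rels by relationships whose type ends in
"/oleObject". It also checks whether each .bin part starts with the
8-byte signature of an OLE2 compound file. The function name
list_ole_objects and the dictionary keys are made up for this example.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
O_NS = "urn:schemas-microsoft-com:office:office"

# Signature found at the start of every OLE2 compound file.
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def list_ole_objects(docx):
    """Describe the OLE objects embedded in the main document part.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as z:
        rels = ET.fromstring(z.read("word/_rels/document.xml.rels"))
        doc = ET.fromstring(z.read("word/document.xml"))

        # Map relationship id -> ProgID taken from <o:OLEObject r:id="..." ProgID="...">.
        prog_ids = {
            el.get(f"{{{R_NS}}}id"): el.get("ProgID")
            for el in doc.iter(f"{{{O_NS}}}OLEObject")
        }

        found = []
        for rel in rels.findall(f"{{{REL_NS}}}Relationship"):
            if not rel.get("Type", "").endswith("/oleObject"):
                continue
            # Targets are normally relative to the word/ folder.
            part = posixpath.normpath(
                posixpath.join("word", rel.get("Target"))
            ).lstrip("/")
            with z.open(part) as f:
                is_cfb = f.read(8) == CFB_MAGIC
            found.append(
                {
                    "id": rel.get("Id"),
                    "part": part,
                    "prog_id": prog_ids.get(rel.get("Id")),
                    "is_compound_file": is_cfb,
                }
            )
    return found
```

If is_compound_file is true, the .bin part is an OLE2 compound file, that
is, a small file system of named streams. The original file name, where it
is stored at all, sits in one of those streams, so reading it needs a
compound-file parser, which this sketch does not include.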

Please help me.

Thank you
miztaken
 