Xsltproc and Office web archive

I'm stuck on a problem with \[code\]xsltproc\[/code\]. I have a file saved from \[code\]Word 2010\[/code\] as \[code\]"Web page in a single file"\[/code\] (Click me for source) that contains only the string "TEST TEST". I would like to replace that string with something taken from an XML file, so I thought \[code\]xsltproc\[/code\] would be a good fit. Since the \[code\]mht\[/code\] file is not a well-formed XML file, I'm already stuck. My questions about \[code\]xsltproc\[/code\] are:
  • How can I make xsltproc accept markup like \[code\]<meta name=3DGenerator content=3D"Microsoft Word 14">\[/code\]? The \[code\]3D\[/code\] outside the quotes makes it angry.
  • Can I skip the parts that are not XML at all, like the very first lines:
\[code\]MIME-Version: 1.0
Content-Type: multipart/related; boundary="----=_NextPart_01CDFED3.EFE26A80"
..\[/code\]
Thanks in advance
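For context on both questions: the leading headers and the \[code\]=3D\[/code\] sequences suggest the file is a MIME multipart message whose HTML part is quoted-printable encoded, so one option is to unwrap it before any XML tooling sees it. Below is a minimal sketch in Python using only the standard \[code\]email\[/code\] module. The sample archive, the boundary name, the helper name \[code\]extract_html\[/code\] and the replacement value are all made up for illustration; the real Word output has more parts and headers.

```python
import email
from email import policy

# Hypothetical, heavily shortened stand-in for a Word "single file" archive.
SAMPLE = (
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/related; boundary="----=_NextPart_SAMPLE"\n'
    '\n'
    '------=_NextPart_SAMPLE\n'
    'Content-Type: text/html; charset="us-ascii"\n'
    'Content-Transfer-Encoding: quoted-printable\n'
    '\n'
    '<html><head><meta name=3DGenerator content=3D"Microsoft Word 14"></head>\n'
    '<body><p>TEST TEST</p></body></html>\n'
    '\n'
    '------=_NextPart_SAMPLE--\n'
)


def extract_html(mht_text: str) -> str:
    """Return the decoded body of the first text/html part (hypothetical helper)."""
    msg = email.message_from_string(mht_text, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # get_content() undoes the quoted-printable encoding and the charset.
            return part.get_content()
    raise ValueError("no text/html part found")


html = extract_html(SAMPLE)
# Assumed replacement value; in practice it would be read from the XML file.
patched = html.replace("TEST TEST", "value from XML")
print(patched)
```

After decoding, \[code\]=3D\[/code\] becomes a plain \[code\]=\[/code\] and the MIME headers are gone. The result is still Word HTML rather than well-formed XML (unquoted attribute values, unclosed \[code\]meta\[/code\] tags), so \[code\]xsltproc\[/code\] would probably still need its \[code\]--html\[/code\] input option, and any text written back into the \[code\]mht\[/code\] file would have to be re-encoded as quoted-printable.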
 