Paige --
I've been reverse engineering the HTML that Office 2000 generates, so that I can use it as a format for a database application.
The reason you get more "stuff" in Office 2000 is that HTML has been adopted as a "native file format" in this version. That means the Office HTML format has about the same richness as the original native formats of Word, Excel and PowerPoint, and as you can imagine, that means including lots of extra markup. The HTML itself is probably pretty similar to what you saw in v 7, except that style sheets are now used extensively. The XML output basically exposes the document properties that would be available through VBA. Actually, quite cool. However, an agent other than an Office application will not make use of most of that.
If you need leaner HTML, there's a utility at the MS Office Developer's site that strips the extra markup for you. That may give you what you need.
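If you'd rather roll your own, here's a rough Python sketch of the kind of stripping I mean. To be clear, this is not the Microsoft utility and I'm not claiming it does what that tool does. The function name strip_office_markup is made up, and the patterns are my assumptions about what typically shows up in the output (conditional comments, <xml> blocks, namespaced tags like <o:p>, and mso-* style declarations). It uses regular expressions rather than a real parser, so treat it as a starting point only.

```python
import re

def strip_office_markup(html: str) -> str:
    """Remove some Office-specific markup from an HTML string.

    Hypothetical helper, regex-based and deliberately crude: it can also
    touch ordinary text that happens to look like these patterns.
    """
    # Conditional comments, e.g. <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html,
                  flags=re.S | re.I)
    # <xml> ... </xml> blocks holding document properties
    html = re.sub(r"<xml\b[^>]*>.*?</xml>", "", html, flags=re.S | re.I)
    # Namespaced tags such as <o:p> and </o:p> (inner text is kept)
    html = re.sub(r"</?[a-z]+:[^>]*>", "", html, flags=re.I)
    # mso-* declarations inside style attributes
    html = re.sub(r"\s*mso-[^:;\"']+:[^;\"']*;?", "", html, flags=re.I)
    return html

sample = (
    '<p class=MsoNormal style="mso-pagination:none;color:red">Hi<o:p></o:p></p>'
    '<!--[if gte mso 9]><xml><o:DocumentProperties>'
    '</o:DocumentProperties></xml><![endif]-->'
)
print(strip_office_markup(sample))
```

Note that it leaves class names like MsoNormal alone, since the style sheet may still be needed to render the page.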
Jay