<?xml version='1.0' encoding='utf-8'?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV='http://schemas.xmlsoap.org/soap/envelope/' xmlns:xsd='http://www.w3.org/2001/XMLSchema' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:SOAP-ENC='http://schemas.xmlsoap.org/soap/encoding/'>
<SOAP-ENV:Body>
<ns1:downloadResponse SOAP-ENV:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/' xmlns:ns1='urn:TMSWebServices'>
<xtvdResponse xsi:type='ns1:xtvdResponse'>

Some of the sections look like this:
<stations>
  <station id='10021'>
    <callSign>AMC</callSign>
    <name>AMC</name>
    <affiliate>Satellite</affiliate>
  </station>
  <station id='16331'>
    <callSign>ANIMAL</callSign>
    <name>Animal Planet</name>
    <affiliate>Satellite</affiliate>
  </station>
</stations>

<schedules>
  <schedule program='EP1151270200' station='11867' time='2004-11-17T01:00:00Z' duration='PT00H30M' tvRating='TV-PG' stereo='true' closeCaptioned='true'/>
  <schedule program='EP1151270201' station='11867' time='2004-11-17T01:30:00Z' duration='PT00H30M' tvRating='TV-PG' stereo='true' closeCaptioned='true'/>
  <schedule program='EP2654380045' station='11867' time='2004-11-17T02:00:00Z' duration='PT00H30M' tvRating='TV-14' stereo='true' closeCaptioned='true'/>
  <schedule program='EP2654380046' station='11867' time='2004-11-17T02:30:00Z' duration='PT00H35M' tvRating='TV-14' stereo='true' closeCaptioned='true'/>
  <schedule program='EP6892960005' station='11867' time='2004-11-17T03:05:00Z' duration='PT01H00M'/>
  <schedule program='EP4638260091' station='12131' time='2004-11-17T01:30:00Z' duration='PT00H30M' tvRating='TV-Y7' stereo='true' closeCaptioned='true'>
    <part number='1' total='2'/>
  </schedule>
  <schedule program='MV1032330000' station='11867' time='2004-11-17T04:05:00Z' duration='PT01H45M' tvRating='TV-14' closeCaptioned='true'/>
  <schedule program='MV1032330000' station='11867' time='2004-11-17T05:50:00Z' duration='PT01H45M' tvRating='TV-14' closeCaptioned='true'/>
  <schedule program='MV0280340000' station='11867' time='2004-11-17T07:35:00Z' duration='PT02H00M' tvRating='TV-PG' closeCaptioned='true'/>
  <schedule program='SH2148780000' station='11867' time='2004-11-17T09:35:00Z' duration='PT00H25M'/>
  <schedule program='EP1282610060' station='11867' time='2004-11-17T10:00:00Z' duration='PT00H30M' tvRating='TV-PG' stereo='true' closeCaptioned='true'/>
  <schedule program='EP1900270045' station='11867' time='2004-11-17T10:30:00Z' duration='PT00H30M' tvRating='TV-PG' stereo='true' closeCaptioned='true'/>
</schedules>

<programs>
  <program id='MV0008290000'>
    <title>Rooster Cogburn</title>
    <description>One-eyed Marshal Cogburn (John Wayne) helps a Bible-toting spinster (Katharine Hepburn) find the men who killed her preacher father.</description>
    <mpaaRating>PG</mpaaRating>
    <starRating>**</starRating>
    <runTime>PT01H47M</runTime>
    <year>1975</year>
    <series></series>
    <advisories>
      <advisory>Adult Situations</advisory>
      <advisory>Violence</advisory>
    </advisories>
  </program>
</programs>

There are other sections, but these illustrate what I have pretty well. So the first question is: since this is UTF-8 and not a Windows encoding, can I write a schema that will work with it? What would be the proper header for the schema/DTD?

I tried the following:

mystr = filetostr('dddata.raw')
mystr = "<tv>" + strextract(mystr,"<schedules>","</schedules>") + "</tv>"
= xmltocursor(mystr,'tempxml')

but I get a parse error, presumably because of the bad format of the source. I think the inconsistent format is caused by line length: a line seems to split when it holds too many data fields.
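
As a cross-check outside FoxPro, here is a minimal Python sketch of the same extract-and-wrap step. It is only an illustration under assumptions: `extract_section` and `schedules_to_rows` are hypothetical helper names, the sample string is trimmed from the schedule data above, and the sketch does not try to reproduce how `xmltocursor` behaves. It does show that the fragment is well-formed once wrapped in a single root element, and that the rows are not uniform: some `schedule` elements omit `tvRating`, `stereo` and `closeCaptioned`, and one carries a nested `part` element.

```python
import xml.etree.ElementTree as ET


def extract_section(raw, tag):
    """Return the text between <tag> and </tag>, without the delimiters."""
    start_delim = "<" + tag + ">"
    end_delim = "</" + tag + ">"
    start = raw.find(start_delim)
    if start == -1:
        return ""
    start += len(start_delim)
    end = raw.find(end_delim, start)
    return raw[start:end] if end != -1 else ""


def schedules_to_rows(raw):
    """Wrap the schedules content in one root element and flatten it to dicts."""
    fragment = "<tv>" + extract_section(raw, "schedules") + "</tv>"
    root = ET.fromstring(fragment)  # raises ParseError if not well-formed
    rows = []
    for sched in root.iter("schedule"):
        row = dict(sched.attrib)  # only the attributes actually present
        part = sched.find("part")  # optional child element
        if part is not None:
            row["partNumber"] = part.get("number")
            row["partTotal"] = part.get("total")
        rows.append(row)
    return rows


# Sample trimmed from the data above: one sparse row, one row with <part>.
sample = (
    "<xtvdResponse><schedules> "
    "<schedule program='EP6892960005' station='11867' "
    "time='2004-11-17T03:05:00Z' duration='PT01H00M'/> "
    "<schedule program='EP4638260091' station='12131' "
    "time='2004-11-17T01:30:00Z' duration='PT00H30M' tvRating='TV-Y7' "
    "stereo='true' closeCaptioned='true'> <part number='1' total='2'/> "
    "</schedule> </schedules></xtvdResponse>"
)

rows = schedules_to_rows(sample)
for row in rows:
    print(row)
```

Rows that lack the optional attributes simply have no such keys, so any table built from them would need nullable columns for `tvRating`, `stereo`, `closeCaptioned` and the part fields.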