Parsing XML with DOM
Message
From
21/11/2004 00:02:30
To
20/11/2004 18:52:37
Cetin Basoz
Engineerica Inc.
Izmir, Turkey
General information
Forum:
Visual FoxPro
Category:
XML, XSD
Environment versions
Visual FoxPro:
VFP 7
Miscellaneous
Thread ID:
00962465
Message ID:
00963285
Views:
9
Hi Cetin -

Wow! It looks like you did a lot of work there for me. I'll give it a look and try to understand it. The overall problem is a personal learning exercise for me that starts with a project called 'xmltv'. This is a Linux-based open source project that lets the user download TV listings for just about anywhere in the world. There is a Windows port that I am using to capture the data stream. I was hoping to write a VFP utility that I could make available to others who might be interested.

The xmltv zip comes with a set of utilities to download from the web service and do some processing on the result file. There is a parser in the package that converts the raw data stream into a modified XML file. Thus I have the option of working with the raw data file or the parsed result. The file samples you have seen in my posts are from the raw (main) data download. I'm not sure which would be easier to work with; I have tried both and there are issues either way.

The built-in parser comes with a DTD that could probably be modified if I knew what I was doing, but I thought it would be easier to start with simpler subsections of the raw file.

There is also a utility to convert the parsed XML file into a text file, but the conversion loses some of the data, which is why I'm trying to write my own handler. I can process the text file, but some of the interesting descriptive data isn't there.

You can download the package at:
http://sourceforge.net/project/showfiles.php?group_id=39046

Be sure to get the Win32 version and not the Linux tar.

The setup is pretty simple. Just unzip the files into a directory created for the project. Then log on to www.labs.zap2it.com and set up a free account that includes the TV stations you want listings for. There is a code bundled in the docs for the project that you will need to set up the account.

The program runs in a DOS window. Your first command will be something like

xmltv tv_grab_na_dd --configure

followed by

xmltv tv_grab_na_dd --days 2 --dd-data dddata.RAW --output mylistings.xml

The call to "tv_grab_na_dd" is for North America, so if you want some other location like Turkey or France there will be a different syntax. Anyway, the above call creates two days' worth of listings in two files. The RAW file is the primary data stream and the XML file is the result of the built-in parser. Their content is the same but in different formats.

The xmltv.dtd file is fairly well documented, but I'm so new to this subject that I still couldn't make much sense of it. It's probably a good place to start, though.

Let me know if you decide to take a look at it. Since this is supposed to be a learning exercise for me, I suppose the first question would be: should I work with a DTD or a schema, and why?

My approach thus far has been to split the RAW file up into pieces and try to create individual cursors from them. Some of these have worked and some haven't. Based on your previous post I have tried:

schedules.xml
<?xml version='1.0' encoding='windows-1252' standalone='no'?>
<!DOCTYPE tv SYSTEM 'schedules.dtd'>
<tv>
<schedule program='MV0067420000' station='10021' time='2004-11-15T04:45:00Z' duration='PT03H30M' tvRating='TV-PG'/>
<schedule program='MV0055740000' station='10021' time='2004-11-15T08:15:00Z' duration='PT03H45M'/>
<schedule program='SH3489670000' station='10021' time='2004-11-15T12:00:00Z' duration='PT01H00M' tvRating='TV-PG' closeCaptioned='true'/>
<schedule program='MV0041190000' station='10021' time='2004-11-15T13:00:00Z' duration='PT02H00M' tvRating='TV-G'/>
<schedule program='MV0114980000' station='10021' time='2004-11-15T15:00:00Z' duration='PT01H45M' tvRating='TV-G'/>
<schedule program='MV0091490000' station='10021' time='2004-11-15T16:45:00Z' duration='PT02H15M' tvRating='TV-PG'/>
<schedule program='MV0200780000' station='10021' time='2004-11-15T19:00:00Z' duration='PT02H00M' tvRating='TV-PG'/>
<schedule program='MV0188590000' station='10021' time='2004-11-15T21:00:00Z' duration='PT02H00M'/>
<schedule program='MV0526170000' station='10021' time='2004-11-15T23:00:00Z' duration='PT02H00M' tvRating='TV-14' closeCaptioned='true'/>
<schedule program='MV0543400000' station='10021' time='2004-11-16T01:00:00Z' duration='PT02H35M' tvRating='TV-PG'/>
<schedule program='MV0320090000' station='10021' time='2004-11-16T03:35:00Z' duration='PT02H15M' tvRating='TV-14'/>
<schedule program='MV0543400000' station='10021' time='2004-11-16T05:50:00Z' duration='PT02H30M' tvRating='TV-PG'/>
<schedule program='MV0526170000' station='10021' time='2004-11-16T08:20:00Z' duration='PT02H00M'/>
</tv>
with a DTD (schedules.dtd) that looks like:
<?xml version="1.0" encoding="windows-1252"?>
<!ELEMENT tv (schedule)>
<!ELEMENT schedule (#PCDATA)>
<!ATTLIST schedule program CDATA #REQUIRED
	station CDATA #REQUIRED
	time CDATA #REQUIRED
	duration CDATA #REQUIRED
	tvRating CDATA #IMPLIED
	stereo CDATA #IMPLIED
	closeCaptioned CDATA #IMPLIED>
	
but this gives me a parse error that says
"element content is invalid according to the dtd/schema Line 5 position 99"

So far I haven't been able to figure out what that means.
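
A possible reading of that error, offered as an unverified guess: the content model `(schedule)` in the first ELEMENT declaration permits exactly one `schedule` child inside `tv`, and line 5 of schedules.xml is where the second `schedule` element appears. If that is the cause, a DTD along these lines might validate. The only change from the DTD above is the `+` occurrence indicator (one or more); the comments are added for illustration.

```xml
<?xml version="1.0" encoding="windows-1252"?>
<!-- Sketch only: assumes the error comes from the tv content model. -->
<!-- (schedule) means exactly one child; (schedule+) means one or more. -->
<!ELEMENT tv (schedule+)>
<!ELEMENT schedule (#PCDATA)>
<!-- Attribute list unchanged from the original DTD. -->
<!ATTLIST schedule program CDATA #REQUIRED
	station CDATA #REQUIRED
	time CDATA #REQUIRED
	duration CDATA #REQUIRED
	tvRating CDATA #IMPLIED
	stereo CDATA #IMPLIED
	closeCaptioned CDATA #IMPLIED>
```

Using `(schedule*)` instead would also accept a `tv` element with no schedules at all. Since every `schedule` in the sample is an empty tag, declaring it `EMPTY` rather than `(#PCDATA)` would be a stricter option. Neither variation has been tested here against the VFP parser.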

Thanks again