Coding question
19/10/2012 09:23:12
General information
Forum:
Visual FoxPro
Category:
Coding, syntax and commands
Title:
Miscellaneous
Thread ID:
01555334
Message ID:
01555341
Views:
48
>>>>>With the help of some skilled VFP coders on the UT, I am having some fun with a web bot.
>>>>>I had success by splitting it into two parts. The first, using a homepage URL, retrieved the sub-URLs from the href HTML in the home page.
>>>>>The second part used the URLs (saved in a table) to navigate to the pages and retrieve the text.
>>>>>
>>>>>Now I am having trouble merging the concept into one piece of code. Here's what's happening:
>>>>>
>>>>>I use this segment of code to get the href URLs:
>>>>>
>>>>>o = CREATEOBJECT("InternetExplorer.Application")
>>>>>o.VISIBLE=.F.
>>>>>o.NAVIGATE(lcURL)
>>>>>FOR EACH loLink IN o.DOCUMENT.Links
>>>>>lcURL = [ ] + loLink.Href + [ ]
>>>>>
>>>>>o.NAVIGATE(lcURL) 
>>>>>* The error generated is OLE Error code 0x80070005: Access is denied.
>>>>>
>>>>>* I want to navigate to each lcURL as it is retrieved, process the text and then navigate to the next href lcURL
>>>>>* until the website's href URLs have been processed, then move on to the next homepage and process its href URLs
>>>>>
>>>>>ENDFOR && below a lot of filtering code etc.
>>>>>
>>>>>Is there a solution for this or should I collect all the href URLs from all the sites into a table and then proceed with return visits to collect text?
>>>>
>>>>
>>>>Maybe you should wait until ReadyState = 4:
>>>>
>>>>...
>>>>o.NAVIGATE(lcURL)
>>>>DO WHILE NOT o.ReadyState == 4
>>>>   InKey(0.1)
>>>>ENDDO
>>>>..
>>>>
>>>
>>>ReadyState may never become 4, and the program will get stuck in an endless loop.
>>
>>OK, :-)
>>And some timeout.
>>
>>
>>ldDateTime = DATETIME()
>>
>>o.NAVIGATE(lcURL)
>>lnTimeOut = .f.
>>DO WHILE NOT o.ReadyState == 4
>>   IF DATETIME() - ldDateTime > 10 && timed out after 10 seconds
>>      lnTimeOut = .t.
>>      EXIT
>>   ENDIF
>>   InKey(0.1)
>>ENDDO
>>IF m.lnTimeOut
>>   ****
>>ENDIF
>>
>
>
>New error: "The requested resource is in use."
>I may have to destroy and recreate the object.
>More code below
>
>
>
>SELECT seedURLs
>GO TOP
>SCAN && FOR ALLTRIM(subject) = "business"
>	lcURL = Seed
>	TRANSFORM(lcURL)
>	o.NAVIGATE(lcURL)
>	WAIT 'Navigating!' WINDOW TIMEOUT 3
>	FOR EACH loLink IN o.DOCUMENT.Links
>		lcURL = [ ] + loLink.Href + [ ]
>
>ldDateTime = DATETIME()
>o.NAVIGATE(lcURL)
>
>lnTimeOut = .f.
>DO WHILE NOT o.ReadyState == 4
>   IF DATETIME() - ldDateTime > 10 && timed out after 10 seconds
>      lnTimeOut = .t.
>      EXIT
>   ENDIF
>   InKey(0.1)
>ENDDO
>IF m.lnTimeOut
>? "GOT HERE   ****"
>ENDIF		
>
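You loop with FOR EACH loLink IN o.DOCUMENT.Links

and then, inside that loop, you call o.NAVIGATE(lcURL), which replaces o.DOCUMENT while you are still iterating over its Links collection.

Maybe you should use a second instance of IE to get the child links?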
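
A rough sketch of that idea, untested: one IE instance keeps the home page loaded and is only used to enumerate the links, while a second instance does the navigating. The names loMain, loChild and WaitForPage, and the example.com URL, are made up for illustration and are not from the code above. It also assumes every Href is a normal http link (mailto: or javascript: links would need filtering) and that Document.body is available once ReadyState reaches 4.

```foxpro
* Sketch only: loMain lists the links, loChild visits them.
LOCAL loMain, loChild, loLink, lcSeedURL, lcHref
lcSeedURL = "http://www.example.com/"   && placeholder home page

loMain  = CREATEOBJECT("InternetExplorer.Application")
loChild = CREATEOBJECT("InternetExplorer.Application")
loMain.Visible  = .F.
loChild.Visible = .F.

loMain.Navigate(m.lcSeedURL)
IF WaitForPage(m.loMain, 10)
   FOR EACH loLink IN loMain.Document.Links
      lcHref = ALLTRIM(loLink.Href)
      * loMain.Document is never replaced here; only the child navigates
      loChild.Navigate(m.lcHref)
      IF WaitForPage(m.loChild, 10)
         * Filtering and text processing would go here
         ? m.lcHref, LEN(loChild.Document.body.innerText)
      ENDIF
   ENDFOR
ENDIF

loChild.Quit()
loMain.Quit()

* Hypothetical helper: .T. when the page has loaded, .F. on timeout.
FUNCTION WaitForPage(toIE, tnSeconds)
   LOCAL ltStart
   ltStart = DATETIME()
   DO WHILE toIE.Busy OR toIE.ReadyState <> 4
      IF DATETIME() - m.ltStart > m.tnSeconds
         RETURN .F.
      ENDIF
      INKEY(0.1)
   ENDDO
   RETURN .T.
ENDFUNC
```

The other option you mentioned should work too: copy all the Href values into a table or array first, and only start navigating after the FOR EACH has finished, so that a single instance is enough.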
Against Stupidity the Gods themselves Contend in Vain - Johann Christoph Friedrich von Schiller
The only thing normal about database guys is their tables.