Coding question
Date: 19/10/2012 14:22:22

General information
Forum: Visual FoxPro
Category: Coding, syntax and commands
Title: Miscellaneous
Thread ID: 01555334
Message ID: 01555358
Views: 46
>>>>>>With the help of some skilled VFP coders on the UT, I am having some fun with a web bot.
>>>>>>I had success by splitting it into two parts. The first, using a homepage URL, retrieved the sub-URLs from the href attributes in the home page.
>>>>>>The second part used the URLs (saved in a table) to navigate to the pages and retrieve the text.
>>>>>>
>>>>>>Now I am having trouble merging the concept into one piece of code. Here's what's happening:
>>>>>>
>>>>>>I use this segment of code to get the href URLs:
>>>>>>
>>>>>>o = CREATEOBJECT("InternetExplorer.Application")
>>>>>>o.VISIBLE=.F.
>>>>>>o.NAVIGATE(lcURL)
>>>>>>FOR EACH loLink IN o.DOCUMENT.Links
>>>>>>lcURL = [ ] + loLink.Href + [ ]
>>>>>>
>>>>>>o.NAVIGATE(lcURL) 
>>>>>>* The error generated is OLE Error code 0x80070005: Access is denied.
>>>>>>
>>>>>>* I want to navigate to each lcURL as it is retrieved, process the text and then navigate to the next href lcURL
>>>>>>* until the website's href URLs have been processed, then move on to the next homepage and process its href URLs
>>>>>>
>>>>>>ENDFOR && below a lot of filtering code etc.
>>>>>>
>>>>>>Is there a solution for this or should I collect all the href URLs from all the sites into a table and then proceed with return visits to collect text?
>>>>>
>>>>>
>>>>>Maybe you should wait till ReadyState = 4 :
>>>>>
>>>>>...
>>>>>o.NAVIGATE(lcURL)
>>>>>DO WHILE NOT o.ReadyState == 4
>>>>>   InKey(0.1)
>>>>>ENDDO
>>>>>..
>>>>>
>>>>
>>>>ReadyState may never become 4, and the program will get stuck in an endless loop.
>>>
>>>OK, :-)
>>>And some timeout.
>>>
>>>
>>>ldDateTime = DATETIME()
>>>
>>>o.NAVIGATE(lcURL)
>>>lnTimeOut = .f.
>>>DO WHILE NOT o.ReadyState == 4
>>>   IF DATETIME() - ldDateTime > 10 && 10 secs timeout
>>>      lnTimeOut = .t.
>>>      EXIT
>>>   ENDIF
>>>   InKey(0.1)
>>>ENDDO
>>>IF m.lnTimeOut
>>>   ****
>>>ENDIF
>>>
>>
>>
>>New error - The requested resource is in use.
>>I may have to destroy and recreate the object.
>>More code below
>>
>>
>>
>>SELECT seedURLs
>>GO TOP
>>SCAN && FOR ALLTRIM(subject) = "business"
>>	lcURL = Seed
>>	lcURL = TRANSFORM(lcURL) && a bare TRANSFORM() call discards its result
>>	o.NAVIGATE(lcURL)
>>	WAIT 'Navigating!' WINDOW TIMEOUT 3
>>	FOR EACH loLink IN o.DOCUMENT.Links
>>		lcURL = [ ] + loLink.Href + [ ]
>>
>>ldDateTime = DATETIME()
>>o.NAVIGATE(lcURL)
>>
>>lnTimeOut = .f.
>>DO WHILE NOT o.ReadyState == 4
>>   IF DATETIME() - ldDateTime > 10 && 10 secs timeout
>>      lnTimeOut = .t.
>>      EXIT
>>   ENDIF
>>   InKey(0.1)
>>ENDDO
>>IF m.lnTimeOut
>>? "GOT HERE   ****"
>>ENDIF		
>>
>
>You use FOR EACH loLink IN o.DOCUMENT.Links
>
>then you suddenly change o.DOCUMENT by navigating inside the loop
>
>Maybe you should use a second instance of IE to get the child links?

Yes, that's what worked for me.
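
For anyone following along, here is a minimal sketch of the two-instance idea, not the exact code from this thread. One IE object keeps the home page loaded so its Links collection stays valid, while a second one visits each child URL. The names loCrawler, loReader and WaitForPage(), the placeholder home page URL and the 10-second limit are all assumptions made for illustration.

```foxpro
* Sketch only: loCrawler lists the links, loReader visits them.
* WaitForPage() and the example.com URL are hypothetical.
LOCAL loCrawler, loReader, loLink, lcURL, lcText
loCrawler = CREATEOBJECT("InternetExplorer.Application")
loReader  = CREATEOBJECT("InternetExplorer.Application")
loCrawler.Visible = .F.
loReader.Visible  = .F.

loCrawler.Navigate("http://www.example.com/")   && placeholder home page
IF WaitForPage(loCrawler, 10)
   FOR EACH loLink IN loCrawler.Document.Links
      lcURL = ALLTRIM(loLink.Href)
      * Skip mailto:, javascript: and empty links
      IF NOT LOWER(LEFT(lcURL, 4)) == "http"
         LOOP
      ENDIF
      * Navigate the second instance, so loCrawler.Document is untouched
      loReader.Navigate(lcURL)
      IF WaitForPage(loReader, 10)
         lcText = loReader.Document.Body.innerText
         * ... filter and store lcText here ...
      ENDIF
   ENDFOR
ENDIF

loReader.Quit()
loCrawler.Quit()

* Returns .T. when the page finished loading, .F. on timeout.
FUNCTION WaitForPage(toIE, tnSeconds)
   LOCAL ltStart
   ltStart = DATETIME()
   DO WHILE toIE.Busy OR toIE.ReadyState <> 4
      IF DATETIME() - ltStart > tnSeconds
         RETURN .F.
      ENDIF
      INKEY(0.1)
   ENDDO
   RETURN .T.
ENDFUNC
```

The other route mentioned at the start of the thread, saving all the href URLs to a table first and visiting them afterwards, should also avoid the problem with a single instance, because the Links collection is no longer being read while the document changes.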
I ain't skeert of nuttin eh?
Yikes! What was that?