Coding question
Message
From
19/10/2012 09:23:12
 
General information
Forum:
Visual FoxPro
Category:
Coding, syntax and commands
Title:
Miscellaneous
Thread ID:
01555334
Message ID:
01555340
Views:
57
>>>>With the help of some skilled VFP coders on the UT, I am having some fun with a web bot.
>>>>I had success by splitting it into two parts. The first, using a homepage URL, retrieved the sub-URLs from the href HTML in the home page.
>>>>The second part used the URLs (saved in a table) to navigate to the pages and retrieve the text.
>>>>
>>>>Now I am having trouble merging the concept into one piece of code. Here's what's happening:
>>>>
>>>>I use this segment of code to get the href URLs:
>>>>
>>>>o = CREATEOBJECT("InternetExplorer.Application")
>>>>o.VISIBLE=.F.
>>>>o.NAVIGATE(lcURL)
>>>>FOR EACH loLink IN o.DOCUMENT.Links
>>>>lcURL = [ ] + loLink.Href + [ ]
>>>>
>>>>o.NAVIGATE(lcURL) 
>>>>* The error generated is OLE Error code 0x80070005: Access is denied.
>>>>
>>>>* I want to navigate to each lcURL as it is retrieved, process the text and then navigate to the next href lcURL
>>>>* until the website's href URLs have been processed, then move on to the next homepage and process its href URLs
>>>>
>>>>ENDFOR && below a lot of filtering code etc.
>>>>
>>>>Is there a solution for this or should I collect all the href URLs from all the sites into a table and then proceed with return visits to collect text?
>>>
>>>
>>>Maybe you should wait till ReadyState = 4 :
>>>
>>>...
>>>o.NAVIGATE(lcURL)
>>>DO WHILE NOT o.ReadyState == 4
>>>   InKey(0.1)
>>>ENDDO
>>>..
>>>
>>
>>ReadyState may never become 4, and the program will be stuck in an endless loop.
>
>OK, :-)
>And some timeout.
>
>
>ldDateTime = DATETIME()
>
>o.NAVIGATE(lcURL)
>lnTimeOut = .f.
>DO WHILE NOT o.ReadyState == 4
>   IF DATETIME() - ldDateTime > 10 && 10-second timeout
>      lnTimeOut = .t.
>      EXIT
>   ENDIF
>   InKey(0.1)
>ENDDO
>IF m.lnTimeOut
>   ****
>ENDIF
>
New error - The requested resource is in use.
I may have to destroy and recreate the object.
More code below
SELECT seedURLs
GO TOP
SCAN && FOR ALLTRIM(subject) = "business"
	lcURL = Seed
	TRANSFORM(lcURL)
	o.NAVIGATE(lcURL)
	WAIT 'Navigating!' WINDOW TIMEOUT 3
	FOR EACH loLink IN o.DOCUMENT.Links
		lcURL = [ ] + loLink.Href + [ ]

		ldDateTime = DATETIME()
		o.NAVIGATE(lcURL)

		lnTimeOut = .f.
		DO WHILE NOT o.ReadyState == 4
			IF DATETIME() - ldDateTime > 10 && 10-second timeout
				lnTimeOut = .t.
				EXIT
			ENDIF
			InKey(0.1)
		ENDDO
		IF m.lnTimeOut
			? "GOT HERE   ****"
		ENDIF
	ENDFOR
ENDSCAN
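
One possible cause of both errors is that the same browser object is told to navigate while the FOR EACH is still walking the Links collection of the page it is leaving. A minimal sketch of the alternative, assuming that is the problem: copy the hrefs into an array first, then visit them one by one. WaitForPage, lcSeedURL and the example.com address are hypothetical names used only for illustration, and the 10-second limit is carried over from the timeout code above.

```foxpro
* Sketch only: collect the hrefs first, then navigate to each one.
* lcSeedURL and WaitForPage are illustrative names, not from the thread.
LOCAL loIE, loLink, lnCount, lnI, lcSeedURL
LOCAL ARRAY laLinks[1]
lcSeedURL = "http://www.example.com/"   && placeholder seed URL

loIE = CREATEOBJECT("InternetExplorer.Application")
loIE.Visible = .F.
loIE.Navigate(lcSeedURL)

IF WaitForPage(loIE, 10)
   * Copy the hrefs out before the document is replaced
   lnCount = 0
   FOR EACH loLink IN loIE.Document.Links
      lnCount = lnCount + 1
      DIMENSION laLinks[lnCount]
      laLinks[lnCount] = loLink.Href
   ENDFOR
   loLink = .NULL.   && release the reference to the old page

   FOR lnI = 1 TO lnCount
      loIE.Navigate(laLinks[lnI])
      IF WaitForPage(loIE, 10)
         * Process loIE.Document.Body.innerText here
      ENDIF
   ENDFOR
ENDIF
loIE.Quit()

* Returns .T. when the page finished loading, .F. on timeout
FUNCTION WaitForPage(toIE, tnSeconds)
   LOCAL ltStart
   ltStart = DATETIME()
   DO WHILE toIE.Busy OR toIE.ReadyState <> 4   && 4 = complete
      IF DATETIME() - ltStart > tnSeconds
         RETURN .F.
      ENDIF
      =INKEY(0.1)
   ENDDO
   RETURN .T.
ENDFUNC
```

The same idea works with a table or cursor instead of the array, which is close to the two-pass approach that already worked.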
I ain't skeert of nuttin eh?
Yikes! What was that?