>>>>>With the help of some skilled VFP coders on the UT, I am having some fun with a web bot.
>>>>>I had success by splitting it into two parts. The first, given a homepage URL, retrieved the sub-URLs from the href attributes in the home page.
>>>>>The second part used the URLs (saved in a table) to navigate to the pages and retrieve the text.
>>>>>
>>>>>Now I am having trouble merging the concept into one piece of code. Here's what's happening:
>>>>>
>>>>>I use this segment of code to get the href URLs:
>>>>>
>>>>>o = CREATEOBJECT("InternetExplorer.Application")
>>>>>o.VISIBLE=.F.
>>>>>o.NAVIGATE(lcURL)
>>>>>FOR EACH loLink IN o.DOCUMENT.Links
>>>>>lcURL = [ ] + loLink.Href + [ ]
>>>>>
>>>>>o.NAVIGATE(lcURL)
>>>>>* The error generated is OLE Error code 0x80070005: Access is denied.
>>>>>
>>>>>* I want to navigate to each lcURL as it is retrieved, process the text and then navigate to the next href lcURL
>>>>>* until the website's href URLs have been processed, then move on to the next homepage and process its href URLs
>>>>>
>>>>>ENDFOR && below a lot of filtering code etc.
>>>>>
>>>>>Is there a solution for this or should I collect all the href URLs from all the sites into a table and then proceed with return visits to collect text?
>>>>
>>>>
>>>>Maybe you should wait till ReadyState = 4 :
>>>>
>>>>...
>>>>o.NAVIGATE(lcURL)
>>>>DO WHILE NOT o.ReadyState == 4
>>>> InKey(0.1)
>>>>ENDDO
>>>>..
>>>>
>>>
>>>ReadyState may never become 4, and the program will be stuck in an endless loop.
>>
>>OK, :-)
>>And some timeout.
>>
>>
>>ldDateTime = DATETIME()
>>
>>o.NAVIGATE(lcURL)
>>lnTimeOut = .f.
>>DO WHILE NOT o.ReadyState == 4
>> IF DATETIME() - ldDateTime > 10
>> lnTimeOut = .t.
>> EXIT
>> ENDIF
>> InKey(0.1)
>>ENDDO
>>IF m.lnTimeOut
>> ****
>>ENDIF
>>
>
>
>New error - The requested resource is in use.
>I may have to destroy and recreate the object.
>More code below
>
>
>
>SELECT seedURLs
>GO TOP
>SCAN
> lcURL = Seed
> TRANSFORM(lcURL)
> o.NAVIGATE(lcURL)
> WAIT 'Navigating!' WINDOW TIMEOUT 3
> FOR EACH loLink IN o.DOCUMENT.Links
> lcURL = [ ] + loLink.Href + [ ]
>
>ldDateTime = DATETIME()
>o.NAVIGATE(lcURL)
>
>lnTimeOut = .f.
>DO WHILE NOT o.ReadyState == 4
> IF DATETIME() - ldDateTime > 10
> lnTimeOut = .t.
> EXIT
> ENDIF
> InKey(0.1)
>ENDDO
>IF m.lnTimeOut
>? "GOT HERE ****"
>ENDIF
>
You iterate with FOR EACH loLink IN o.DOCUMENT.Links,
but then you call o.NAVIGATE inside the loop, which replaces o.DOCUMENT while you are still enumerating its Links.
Maybe you should use a second instance of IE to visit the child links?
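
Something along these lines might work. It is only a rough sketch of the two-instance idea, not tested against your sites. It assumes the seedURLs cursor and its Seed field from your code; the names loSeed, loChild and WaitForPage, the 10-second limit, and the extra check of the Busy property are my own choices for illustration.

```foxpro
* Sketch only: assumes a cursor seedURLs with a character field Seed.
* loSeed, loChild and WaitForPage are hypothetical names.
LOCAL loSeed, loChild, loLink, lcSeedURL, lcChildURL

loSeed  = CREATEOBJECT("InternetExplorer.Application")  && holds the home page
loChild = CREATEOBJECT("InternetExplorer.Application")  && visits each link
loSeed.Visible  = .F.
loChild.Visible = .F.

SELECT seedURLs
SCAN
    lcSeedURL = ALLTRIM(seedURLs.Seed)
    loSeed.Navigate(lcSeedURL)
    IF NOT WaitForPage(loSeed, 10)
        LOOP    && home page did not load in time, try the next seed
    ENDIF

    * loSeed.Document is never replaced inside this loop,
    * because only loChild navigates
    FOR EACH loLink IN loSeed.Document.Links
        lcChildURL = loLink.Href
        loChild.Navigate(lcChildURL)
        IF WaitForPage(loChild, 10)
            * process the text of loChild.Document here
        ENDIF
    ENDFOR
ENDSCAN

loChild.Quit()
loSeed.Quit()

* Returns .T. once the browser is idle and reports ReadyState 4,
* or .F. if tnSeconds elapse first.
FUNCTION WaitForPage(toIE, tnSeconds)
    LOCAL ltStart
    ltStart = DATETIME()
    DO WHILE toIE.Busy OR toIE.ReadyState <> 4
        IF DATETIME() - ltStart > tnSeconds
            RETURN .F.
        ENDIF
        =INKEY(0.1)
    ENDDO
    RETURN .T.
ENDFUNC
```

Hrefs such as mailto: or javascript: links would probably need to be filtered out before the child Navigate call. If two browser instances turn out to be too heavy, copying the hrefs into an array or table first and navigating afterwards should avoid the same problem.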