Extracting web page main text
Message
From
11/10/2018 12:42:21
 
 
To
11/10/2018 06:31:11
Dragan Nedeljkovich (Online)
Now officially retired
Zrenjanin, Serbia
General information
Forum:
Visual FoxPro
Category:
Coding, syntax & commands
Miscellaneous
Thread ID:
01662543
Message ID:
01662551
Views:
65
>>Hi All
>>
>>I want to retrieve a website page which I can do using the IE webcontrol. Is there a function which allows me to extract just the main body of the webpage, i.e. just extract the relevant text of the page which a user would see rather than all the html coding and formatting instructions, etc.?
>
>
lcBody=strextract(lcHtml, "<body>", "</body>",1,1)
>
>It's much easier if you go for the DOM of the IE, and do something like
>
>lcText=oIE.document.getElementByTag("body").innerText
>
>I may be slightly wrong in the names here (perhaps missing a hierarchy level, or the function is named differently), but you can find those yourself - the debugger and/or IntelliSense would know. Just suspend somewhere and look inside.

Hi Dragan, hope you are well. Yes, this is what I was trying to remember, the innerText thing. Thanks. Best regards.
In the End, we will remember not the words of our enemies, but the silence of our friends - Martin Luther King, Jr.
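
For reference, a minimal sketch of the innerText approach discussed above. It assumes the page is loaded through the InternetExplorer.Application COM object rather than a browser control on a form, and the URL is only a placeholder. In the IE DOM the body is reached as document.body (there is no getElementByTag method; the collection version is getElementsByTagName). Note also that the STREXTRACT() call with a literal "<body>" delimiter only matches a body tag that carries no attributes, and it returns markup rather than visible text.

```foxpro
* Sketch only: read the visible text of a page via the IE DOM.
* The URL is a placeholder and the variable names are illustrative.
LOCAL loIE, lcText
loIE = CREATEOBJECT("InternetExplorer.Application")
loIE.Navigate("https://www.example.com/")

* Wait until the document has finished loading (ReadyState 4 = complete)
DO WHILE loIE.Busy OR loIE.ReadyState # 4
    DOEVENTS
ENDDO

* innerText returns the rendered text without the HTML tags
lcText = loIE.Document.body.innerText
loIE.Quit()

? lcText
```

With a web browser control on a form, the same Document.body.innerText path should be reachable from the control once its page has finished loading; as suggested above, the debugger is the quickest way to confirm the exact object hierarchy.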