Extracting web page main text
Message
From
11/10/2018 06:31:11
Dragan Nedeljkovich (Online)
Now officially retired
Zrenjanin, Serbia
 
 
To
11/10/2018 02:40:32
General information
Forum:
Visual FoxPro
Category:
Coding, syntax & commands
Miscellaneous
Thread ID:
01662543
Message ID:
01662544
Views:
107
This message has been marked as the solution to the initial question of the thread.
>Hi All
>
>I want to retrieve a website page which I can do using the IE webcontrol. Is there a function which allows me to extract just the main body of the webpage, i.e. just extract the relevant text of the page which a user would see rather than all the html coding and formatting instructions, etc.?
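* "<body" rather than "<body>" so a body tag with attributes still matches; the last 1 makes the search case-insensitive
lcBody=strextract(lcHtml, "<body", "</body>",1,1)
lcBody=substr(lcBody, at(">", lcBody)+1)   && drop the rest of the opening tag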
It's much easier if you go through IE's DOM and do something like

lcText=oIE.Document.body.innerText

I may be slightly wrong in the names here (perhaps missing a hierarchy level, or the function is named differently), but you can find those yourself: the debugger and/or IntelliSense would know. Just suspend somewhere and look inside.
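
For what it's worth, here is a rough sketch of the DOM route from start to finish, assuming you automate Internet Explorer through COM rather than hosting the web control on a form. The URL, the variable names and the 30-second timeout are placeholders of mine, not anything from the original question, and the page has to finish loading before the document can be read.

```foxpro
* Minimal sketch: load a page in Internet Explorer and read its visible text.
* Assumptions: IE is installed, the URL is a placeholder, 30 s is an arbitrary timeout.
LOCAL loIE, lcText, lnStart
loIE = CREATEOBJECT("InternetExplorer.Application")
loIE.Visible = .F.
loIE.Navigate("https://www.example.com/")

* Wait until the document is fully loaded (ReadyState 4 = complete).
* SECONDS() restarts at midnight, so this simple timeout is not midnight-safe.
lnStart = SECONDS()
DO WHILE (loIE.Busy OR loIE.ReadyState <> 4) AND SECONDS() - lnStart < 30
    DOEVENTS
    INKEY(0.1)    && short pause so the loop does not spin the CPU
ENDDO

IF loIE.ReadyState = 4
    * innerText returns the rendered text of the body, without the markup
    lcText = loIE.Document.body.innerText
ELSE
    lcText = ""   && page did not finish loading in time
ENDIF

loIE.Quit()
? lcText
```

With the web control on a form, the same last step should apply to the control itself (something like thisform.oIE.Document.body.innerText, where oIE is whatever you named the control), again only once the page has finished loading.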

back to same old

the first online autobiography, unfinished by design
What, me reckless? I'm full of recks!
Balkans, eh? Count them.