Has anyone seen?
Message
From
08/05/1999 13:42:59
Dragan Nedeljkovich
Now officially retired
Zrenjanin, Serbia
To
29/04/1999 09:19:15
General information
Forum:
Visual FoxPro
Category:
Internet applications
Miscellaneous
Thread ID:
00213046
Message ID:
00216428
Views:
27
>>>>Has anyone seen any code to strip all of the HTML tags to just leave plain text. It seems I have seen it before. I could easily write it, but why reinvent the wheel.
>>>>
>>>>Thanks in advance.
>>>
>>>You could pretty easily automate IE to do this. Just open the app and use the Document's InsideText property.
>>
>>Sorry. Open the file in IE, and use the Document object's InsideText property.
>
>
>Can't do. The html exists only in a variable in Foxpro. This is also at client sites (about 150 of them). They don't all have IE. I was hoping for some VFP code that could do it. Thanks.

I wanted to do some linguistic statistics, and I have a CD with 200 books on it, so I'm planning to extract just text. You could simply build a list of tags used within the text, something like this:
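* hText contains the HTML
* Define the less-than and greater-than signs as constants; the raw characters may not be visible in an HTML page
#define C_LT "<"
#define C_GT ">"
do while occurs(C_LT, hText) # 0
   lnFirstLeft = at(C_LT, hText)       && find the first less-than sign
   * wordnum() is assumed to come from the FoxTools library (set library to foxtools)
   lcTag = wordnum(substr(hText, lnFirstLeft+1), 1, C_LT+C_GT)
   lcNew = strtran(hText, C_LT+lcTag+C_GT, "")
   if lcNew == hText
      exit      && nothing was removed (e.g. a "<" with no closing ">"), so stop instead of looping forever
   endif
   hText = lcNew
enddo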
When the loop finishes, hText should contain plain text, with everything between < and > removed. Next, you should replace entities such as ampersand+"nbsp;" with a space, and so on.
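For that second step, something along these lines might do. It is only a sketch: it assumes hText has already been through the tag-stripping loop, and the handful of entities listed is illustrative, not a complete table. The ampersand is kept as its own string so that VFP does not try macro substitution on whatever follows it.

```foxpro
* Sketch only: replace a few common HTML entities in hText.
* The list is illustrative; extend it for the entities your pages use.
hText = strtran(hText, "&" + "nbsp;", " ")
hText = strtran(hText, "&" + "lt;",   "<")
hText = strtran(hText, "&" + "gt;",   ">")
hText = strtran(hText, "&" + "quot;", '"')
* Do the ampersand last, so an escaped "&amp;lt;" ends up as "&lt;" and not "<"
hText = strtran(hText, "&" + "amp;",  "&")
```

Run this after the tags are gone, not before. Otherwise a decoded less-than sign would look like the start of a tag to the loop.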

back to same old

the first online autobiography, unfinished by design
What, me reckless? I'm full of recks!
Balkans, eh? Count them.