Removing tags from a long string
Message
From
19/01/2016 11:42:50
 
 
To
19/01/2016 09:20:57
General information
Forum:
ASP.NET
Category:
Coding, syntax and commands
Environment versions
Environment:
VB 9.0
OS:
Windows 8.1
Network:
Windows 2008 Server
Database:
MS SQL Server
Application:
Web
Miscellaneous
Thread ID:
01629858
Message ID:
01629942
Views:
33
>>You might want to look at an HTML parser like the one in HtmlAgilityPack. This will be more reliable, as it can parse the HTML into a DOM and then return the innerText. HtmlAgilityPack can be pretty fast for lots of things, but more importantly it will generally do a better job of pulling out properly formatted text, separating elements, etc.
>>
>>It's not easy to get that logic right.
>
>Thanks, that is another interesting approach.
>
>I am not sure however if it can perform as fast as a simple RegEx() command.

But it will probably be more reliable. Consider the following HTML:
<form onsubmit="return NumberOfValidEmailAddresses() > 0;"></form>
The RegEx will fail because it treats the > inside the onsubmit attribute as the end of the tag. An HTML parser will see it as part of the attribute value and handle it properly. A RegEx that finds tags correctly is much more complex. See http://haacked.com/archive/2004/10/25/usingregularexpressionstomatchhtml.aspx/
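
To make the difference concrete, here is a minimal sketch. It is in Python rather than VB, and Python's standard html.parser module stands in for HtmlAgilityPack, so treat it as an illustration of the idea and not as the HtmlAgilityPack API. The form markup is the one from this message with the stray spaces after < removed; the "Sign up" text inside the form and the helper names (strip_tags_regex, strip_tags_parser, TextExtractor) are made up for the example.

```python
import re
from html.parser import HTMLParser

# Markup from the message; "Sign up" is an assumed text node added so
# there is something to extract.
SAMPLE = '<form onsubmit="return NumberOfValidEmailAddresses() > 0;">Sign up</form>'


def strip_tags_regex(html):
    # Naive approach: everything from "<" up to the next ">" is a tag.
    return re.sub(r"<[^>]*>", "", html)


class TextExtractor(HTMLParser):
    """Collects text nodes only; tags and attributes are discarded."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags_parser(html):
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return "".join(extractor.parts)


regex_result = strip_tags_regex(SAMPLE)
parser_result = strip_tags_parser(SAMPLE)

print(repr(regex_result))   # ' 0;">Sign up'  (attribute debris leaks through)
print(repr(parser_result))  # 'Sign up'
```

The regex stops at the first > it meets, which sits inside the quoted onsubmit value, so the rest of the attribute ends up in the output. The parser knows it is inside a quoted attribute value and only hands back the text node.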