Removing tags from a long string
Message
From
18/01/2016 03:59:39
 
 
General information
Forum: ASP.NET
Category: Coding, syntax and commands
Environment versions
Environment: VB 9.0
OS: Windows 8.1
Network: Windows 2008 Server
Database: MS SQL Server
Application: Web
Miscellaneous
Thread ID: 01629858
Message ID: 01629866
Views: 39
>>I have this very old method to remove tags from a string:
>>
>>
>>    ' Extract the tags
>>    ' expC1 Content
>>    Public Function ExtractTag(ByVal tcContent As String) As String
>>        Dim lcContent As String = ""
>>        Dim lnStart As Integer = 1
>>        Dim lnStop As Integer = 0
>>
>>        lcContent = Trim(tcContent)
>>
>>        ' For as long as we have a beginning of a tag
>>        While lnStart > 0
>>            lnStart = InStr(lcContent, "<")
>>
>>            ' If we found it
>>            If lnStart > 0 Then
>>                lnStop = InStr(Mid(lcContent, lnStart), ">")
>>
>>                ' If we do not have the end of the tag
>>                If lnStop = 0 Then
>>                    Exit While
>>                End If
>>
>>                ' If this is at position 1
>>                If lnStart = 1 Then
>>                    lcContent = Mid(lcContent, lnStop + 1)
>>                Else
>>                    lcContent = Left(lcContent, lnStart - 1) + Mid(lcContent, lnStart + lnStop)
>>                End If
>>
>>            Else
>>                Exit While
>>            End If
>>
>>        End While
>>
>>        ' Now, for all the rest, we need to make sure, that we do not have a single < in the string
>>        lcContent = oApp.StrTran(lcContent, "<", "&lt;")
>>
>>        Return lcContent
>>    End Function
>>
>>
>>The goal is simply to show the text in an HTML page as is, without any tags.
>>
>>This was built a long time ago and needs to be optimized.
>>
>>Processing about 50 content items averaging 20 pages each takes about 15 seconds.
>>
>>I need to find a way to bring it under a second.
>
>Three options here (well, two really, since one just uses a compiled regex for speed): http://www.dotnetperls.com/remove-html-tags
>You'd still need the final StrTran().
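
As a rough illustration of the suggestion above, here is a minimal sketch of a compiled-regex replacement for ExtractTag that keeps the final "<" encoding step. The names TagStripper and StripTags are hypothetical, String.Replace is assumed to be an acceptable stand-in for oApp.StrTran(), and the pattern is one possible choice rather than the one from the linked page. It has not been timed against the 15-second case.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module TagStripper
    ' Built once and reused: "<", then any run of characters other than ">", then ">".
    Private ReadOnly TagPattern As New Regex("<[^>]*>", RegexOptions.Compiled)

    ' Hypothetical stand-in for ExtractTag: removes tags, then encodes any leftover "<".
    Public Function StripTags(ByVal tcContent As String) As String
        If String.IsNullOrEmpty(tcContent) Then Return ""

        ' One regex pass replaces the InStr/Mid loop of the original method.
        Dim lcContent As String = TagPattern.Replace(tcContent.Trim(), "")

        ' Assumption: String.Replace behaves like oApp.StrTran() for this purpose.
        Return lcContent.Replace("<", "&lt;")
    End Function

    Sub Main()
        ' Prints: Hello world 1 &lt; 2
        Console.WriteLine(StripTags("<p>Hello <b>world</b></p> 1 < 2"))
    End Sub
End Module
```

Like the original loop, this treats everything from a "<" to the next ">" as a tag, so a stray "<" followed later by a real tag is swallowed along with it.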

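For what it's worth:

The Regex solution in the link does not match a tag when there is a newline between the left bracket and the right bracket, because "." does not match a newline unless RegexOptions.Singleline is set.

This pattern does not match across newlines:
"<.*?>"
while this one does:
"<[^>]*>"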
Gregory
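
To make the newline difference concrete, here is a small sketch that runs both patterns over a tag whose attributes wrap onto a second line. The module and function names are hypothetical, and the sample string is made up for the demonstration. RegexOptions.Singleline is shown as a third variant on the assumption that keeping the "<.*?>" pattern is preferred.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module NewlineTagDemo
    ' "." does not match a line feed by default, so a tag split across lines is skipped.
    Public Function StripLazyDot(ByVal tcHtml As String) As String
        Return Regex.Replace(tcHtml, "<.*?>", "")
    End Function

    ' [^>] matches any character except ">", including line feeds.
    Public Function StripNegatedClass(ByVal tcHtml As String) As String
        Return Regex.Replace(tcHtml, "<[^>]*>", "")
    End Function

    ' Singleline makes "." match line feeds as well, so the lazy pattern works too.
    Public Function StripSingleline(ByVal tcHtml As String) As String
        Return Regex.Replace(tcHtml, "<.*?>", "", RegexOptions.Singleline)
    End Function

    Sub Main()
        ' An opening tag with a line feed between "<a" and its attribute.
        Dim lcHtml As String = "<a" & vbLf & "href=""x"">link</a>"

        Console.WriteLine(StripLazyDot(lcHtml))       ' opening tag is left in place
        Console.WriteLine(StripNegatedClass(lcHtml))  ' link
        Console.WriteLine(StripSingleline(lcHtml))    ' link
    End Sub
End Module
```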