>>>>Hi All:
>>>>
>>>>How do I remove duplicate characters in a string?
>>>>
>>>>
>>>>Bigstring = 'cattttccv'
>>>>NonDupeString = NonDupe(BigString)
>>>>
>>>>* Result = 'catv'
>>>>
>>>>
>>>>Thanks,
>>>>
>>>>Yossi
>>>
>>>I must say that I am not comfortable posting my solution with all the backlash going around from useless people who only come here to confront or incite worthless fights... Maybe Fernando can time it to see if it is worth pursuing, but I think it is very fast:
>>>
>>>
>>>lcText = REPLICATE('ccaaaaattttttttttvvvv', 102400)
>>>lnStart = SECONDS()
>>>loRegExp = Createobject('VBScript.RegExp')
>>>
>>>with loRegexp as VBScript.RegExp
>>> .IgnoreCase = .t.
>>> .Global = .t.
>>> .Multiline = .T.
>>> .Pattern = '(\w)(\1+)'
>>> lcNewText = .Replace(lcText, '$1')
>>>endwith
>>>
>>>? SECONDS() - lnStart, LEN(lcNewText)
>>>
>>>
>>>(and by the way, this solution is very generic, it can be implemented in any language)
>>
>>
>>As you have found, this does not work because
>>- the \1 matches one char at a time
>>- the chars are not necessarily consecutive
>>
>>I think you have to loop.
>>
>>This one is case sensitive:
>>
>>lcText = 'abcabcabcaaaaaaa.,aaaaaaaaaaaaaaaa.,'
>>
>>loRegExp = Createobject('VBScript.RegExp')
>>loRegexp.IgnoreCase = .f.
>>loRegexp.Global= .f.
>>loRegexp.Multiline = .f.
>>loRegexp.Pattern = '(.)(.*)(\1)'
>>
>>?lcText
>>do while .t.
>>    matches = loRegExp.Execute(m.lcText)
>>    if( matches.Count == 0 )
>>        exit
>>    endif
>>    lcText = strtran(m.lcText, matches.Item[0].SubMatches[0], '', 2)
>>enddo
>>?lcText
>>
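The difference between the two quoted approaches can be checked with a small sketch. It is written in Python rather than VFP, purely as an illustration; the function names collapse_runs and non_dupe are made up for this example, and unlike the first quoted snippet it does not set IgnoreCase. The first function uses the same backreference idea as '(\w)(\1+)' and only collapses adjacent repeats; the second is a plain loop that keeps the first occurrence of each character, which is what the original question asks for.

```python
import re

def collapse_runs(text):
    # Backreference approach: only repeats that are directly adjacent
    # to the captured character are collapsed.
    return re.sub(r'(\w)\1+', r'\1', text)

def non_dupe(text):
    # Loop approach: keep the first occurrence of each character
    # (case sensitive), wherever later duplicates appear.
    seen = set()
    kept = []
    for ch in text:
        if ch not in seen:
            seen.add(ch)
            kept.append(ch)
    return ''.join(kept)

print(collapse_runs('cattttccv'))  # catcv (the second 'c' survives)
print(non_dupe('cattttccv'))       # catv
```

On the original example the regex alone leaves 'catcv', because the second run of 'c' is not adjacent to the first 'c'; the loop gives the requested 'catv'.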
>
>Hi Gregory,
>
>this thread has some pearls, unbelievable.
>
>Have you taken any steps outside the VBScript RegExp? I think there are some limitations compared to other versions. I just never seem to find the time to start testing ...
>
>Lutz
Yes, I have taken some steps outside VBScript RegExp, in .NET.
Limitations (non-exhaustive, and off the top of my head; no time for testing now):
(1) in VBScript a word char (\w) only covers the ASCII range [A-Za-z0-9_]; accented chars won't match
(2) (positive and negative) lookbehinds don't work. I think lookaheads do
(3) there's no way to name a (sub)match. The only thing you can do is work with submatches and count the parentheses,
or write a group-like class (I have done that; it counts the parentheses)
(4) No balancing groups
http://msdn.microsoft.com/en-us/library/bs2twtah.aspx#balancing_group_definition
Gregory
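
For readers who have not met the features listed above, here is a minimal sketch of three of them in Python's re module. This is neither VBScript nor .NET: Python spells named groups as (?P<name>...) where .NET uses (?<name>...), and Python has no balancing groups, so item (4) is not shown. The sample strings and variable names are invented for the illustration.

```python
import re

# (1) On str patterns, \w is Unicode-aware by default in Python 3,
#     so an accented letter counts as a word character.
word = 'caf\u00e9'  # "cafe" with an accented final e, written as an escape
accented = re.findall(r'\w+', word)

# (3) Named group: refer to the captured character by name, not by position.
m = re.search(r'(?P<ch>\w)(?P=ch)+', 'cattttccv')
run_char = m.group('ch')  # the character that repeats
run_text = m.group(0)     # the whole run that was matched

# (2) Positive lookbehind: match digits only when preceded by '$'.
prices = re.findall(r'(?<=\$)\d+', 'cost $15, qty 3')

print(accented, run_char, run_text, prices)
```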