VFP vs C# String handling
Message
From
23/05/2011 07:42:59
 
 
General information
Forum: Visual FoxPro
Category: Other
Environment versions
Visual FoxPro: VFP 9 SP2
OS: Windows 7
Miscellaneous
Thread ID: 01511379
Message ID: 01511467
Views: 66
>>>>>>I remember about 10 years ago Steve Black giving a presentation at VFE Devcon in Las Vegas about string handling in VFP, using War and Peace from Project Gutenberg and I remember all of us going "Oooooh". Since that time I have taken it as an article of faith that VFP had string handling powers beyond the reach of other languages.
>>>>>>
>>>>>>So in light of my surprising results on the loop in VFP vs Other Languages(Python/Ruby) I thought I'd try something similar.
>>>>>>
>>>>>>First, the task was to load War and Peace ( over 3.2 million characters ) into a string and replace war with PEACE and peace with WAR. To give VFP the edge, I made the finding case insensitive ( for some perverse reason C# does not have a case insensitive string replace )
>>>>>>
>>>>>>Here's the VFP code : ( my VFP is rusty so if there is a better way to do this, I'm listening )
>>>>>>
>>>>>>
>>>>>>CD c:\users\pagan\documents\test
>>>>>>fname = "war and peace.txt"
>>>>>>
>>>>>>SET SAFETY OFF
>>>>>>
>>>>>>nStart = SECONDS()
>>>>>>
>>>>>>x = FILETOSTR(fname)
>>>>>>
>>>>>>x = STRTRAN(x,"war","www",-1,-1,1)
>>>>>>x = STRTRAN(x,"peace","ppp",-1,-1,1)
>>>>>>x = STRTRAN(x,"www","PEACE")
>>>>>>x = STRTRAN(x,"ppp","WAR")
>>>>>>
>>>>>>=STRTOFILE(x,"peace and war.txt",0)
>>>>>>
>>>>>>nStop = SECONDS()
>>>>>>
>>>>>>? nstop - nstart
>>>>>>
>>>>>>
>>>>>>
>>>>>>On my box this takes about 2.3 seconds. Pretty cool.
>>>>>>
>>>>>>Here's my C# - compensating as best I can for the case sensitivity :
>>>>>>
>>>>>>
>>>>>>        static void Main(string[] args)
>>>>>>        {
>>>>>>            Stopwatch stopwatch = new Stopwatch();
>>>>>>            stopwatch.Start();
>>>>>>
>>>>>>            string fname =@"C:\Users\Pagan\Documents\test\war and peace.txt";
>>>>>>            string result = @"C:\Users\Pagan\Documents\test\csharp result.txt";
>>>>>>            string foo = File.ReadAllText(fname);
>>>>>>            StringBuilder sb = new StringBuilder(foo);
>>>>>>
>>>>>>            sb.Replace("war","www");
>>>>>>            sb.Replace("War","www");
>>>>>>            sb.Replace("WAR", "www");
>>>>>>            sb.Replace("peace","ppp");
>>>>>>            sb.Replace("Peace","ppp");
>>>>>>            sb.Replace("PEACE", "ppp");
>>>>>>            sb.Replace("www","PEACE");
>>>>>>            sb.Replace("ppp","WAR");
>>>>>>
>>>>>>            foo = sb.ToString();
>>>>>>            File.WriteAllText(result, foo);
>>>>>>
>>>>>>            stopwatch.Stop();
>>>>>>
>>>>>>            Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
>>>>>>            Console.Read();
>>>>>>        }
>>>>>>
>>>>>>six tenths of a second.
>>>>>>
>>>>>>It gets even more interesting when a loop is added to build the string to over 32 million characters by concatenating the original string ten times before doing the replacements, resulting in an output file of 35 MB or so.
>>>>>>
>>>>>>C# - about 6 seconds
>>>>>>VFP just dies, shows as not responding in the task manager and has to be killed.
>>>>>>
>>>>>>It should be mentioned that the 10x test may not be entirely fair, as the VS is 64-bit and the VFP is 32-bit, and there are 6 GB of RAM on this box. But the single-file version seems dispositive in any case.
>>>>>>
>>>>>>I'd encourage others to try this and tell me what I'm doing wrong.
>>>>>
>>>>>A thought: In VFP a case sensitive search will probably be quite a bit faster than a case insensitive one - and, in the C# version you aren't covering all possible case versions. Restricting both versions to the same set of case sensitive searches should reduce the difference - but I'd predict C# would still be way ahead.
>>>>>Oh - and in the C# using just File.WriteAllText(result,sb.ToString()) would avoid creating an extra 3 million character string.
>>>>
>>>>I may have this wrong, but I modified the VFP to be case sensitive à la C# and got a return of over 3.5 seconds vs the 2.6 using the single case insensitive replace:
>>>>
>>>>
>>>>x = STRTRAN(x,"war","www",-1,-1,2)
>>>>x = STRTRAN(x,"War","www",-1,-1,2)
>>>>x = STRTRAN(x,"WAR","www",-1,-1,2)
>>>>
>>>>x = STRTRAN(x,"peace","ppp",-1,-1,2)
>>>>x = STRTRAN(x,"Peace","ppp",-1,-1,2)
>>>>x = STRTRAN(x,"PEACE","ppp",-1,-1,2)
>>>>
>>>>x = STRTRAN(x,"www","PEACE",-1,-1,2)
>>>>x = STRTRAN(x,"ppp","WAR",-1,-1,2)
>>>>
>>>Odd. It's surely axiomatic that writing a low level string search that is case sensitive would require less processing than a case insensitive one. I can't think of a case insensitive algorithm that would outperform a good case sensitive one?
>>
>>True on one pass. But to get all three possibilities using case sensitive takes three passes. ( the way C# has to do it )
>
>Umm, no. If it's a case *sensitive* search you just look for, eg, 'war' - no other combination matters. Case *insensitive* on the same word has 8 (?) possible combinations and that increases dramatically for longer sequences.
>

My point was that in C#, since there is no case-insensitive replace, you have to do in effect three searches (yes, I realize there are other possible combinations of upper and lower case, but these are in fact the three most likely).
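
To put a number on the "8 (?)" above, here is a minimal sketch in Python (picked only because it is short and neutral between VFP and C#; the helper name case_variants is made up for illustration). It lists every upper/lower spelling of a word and shows which ones the three-search approach does not cover. It assumes the word contains only letters, so each position has exactly two spellings.

```python
from itertools import product


def case_variants(word):
    """Return every upper/lower-case spelling of word (letters only assumed)."""
    # Each letter contributes two choices, so a word of n letters has 2**n spellings.
    choices = [(ch.lower(), ch.upper()) for ch in word]
    return ["".join(combo) for combo in product(*choices)]


variants = case_variants("war")

# The three spellings the case-sensitive passes actually search for.
covered = {"war", "War", "WAR"}
missed = [v for v in variants if v not in covered]

if __name__ == "__main__":
    print(len(variants), "spellings of 'war':", variants)
    print("not covered by the three searches:", missed)
    print(len(case_variants("peace")), "spellings of 'peace'")
```

So 'war' does have 8 spellings and 'peace' has 32. The three searches pick up the likely ones and skip oddities such as "wAr", which a true case-insensitive pass would also replace.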

You were saying that in VFP the case insensitive search would take longer than the case sensitive search which is true. But to get the same outcome, one needs to do as I did in C# - search three times ( sensitively <s> ). So, while each pass may be faster than the one insensitive pass, the cumulative time is still greater. No?
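
For anyone who wants to try the comparison outside VFP and C#, here is a rough sketch of the two approaches in Python. It is only an illustration of the shape of the test, not a statement about how either VFP or .NET performs: the function names and the sample string are invented here, and timings from Python's re module and str.replace say nothing about STRTRAN() or StringBuilder.Replace.

```python
import re
import time


def swap_insensitive(text):
    """One case-insensitive pass that swaps war and peace directly."""
    table = {"war": "PEACE", "peace": "WAR"}
    # The callback looks up the replacement by the lower-cased match,
    # so no www/ppp placeholders are needed.
    return re.sub(r"war|peace", lambda m: table[m.group(0).lower()],
                  text, flags=re.IGNORECASE)


def swap_three_spellings(text):
    """Eight case-sensitive passes, mirroring the C# version in this thread."""
    for old in ("war", "War", "WAR"):
        text = text.replace(old, "www")
    for old in ("peace", "Peace", "PEACE"):
        text = text.replace(old, "ppp")
    return text.replace("www", "PEACE").replace("ppp", "WAR")


def timed(fn, text):
    """Run fn(text) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(text)
    return result, time.perf_counter() - start


# Made-up stand-in for the book; substitute the real file contents to test properly.
sample = "War and Peace, war and peace, WAR and PEACE. " * 1000

if __name__ == "__main__":
    for fn in (swap_insensitive, swap_three_spellings):
        _, elapsed = timed(fn, sample)
        print(f"{fn.__name__}: {elapsed:.4f} s")
```

On text that only uses the three common spellings the two functions return the same string, so the timing comparison is like for like. They differ only on spellings such as "wAr", which the multi-pass version leaves untouched.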


Charles Hankey

Though a good deal is too strange to be believed, nothing is too strange to have happened.
- Thomas Hardy

Half the harm that is done in this world is due to people who want to feel important. They don't mean to do harm-- but the harm does not interest them. Or they do not see it, or they justify it because they are absorbed in the endless struggle to think well of themselves.

-- T. S. Eliot
Democracy is two wolves and a sheep voting on what to have for lunch.
Liberty is a well-armed sheep contesting the vote.
- Ben Franklin

Pardon him, Theodotus. He is a barbarian, and thinks that the customs of his tribe and island are the laws of nature.