VFP vs C# String handling
Message
From
23/05/2011 07:26:11
 
 
General information
Forum:
Visual FoxPro
Category:
Other
Environment versions
Visual FoxPro:
VFP 9 SP2
OS:
Windows 7
Miscellaneous
Thread ID:
01511379
Message ID:
01511466
Views:
51
>>>>>>I remember about 10 years ago Steve Black giving a presentation at VFE Devcon in Las Vegas about string handling in VFP, using War and Peace from Project Gutenberg and I remember all of us going "Oooooh". Since that time I have taken it as an article of faith that VFP had string handling powers beyond the reach of other languages.
>>>>>>
>>>>>>So in light of my surprising results on the loop in VFP vs Other Languages(Python/Ruby) I thought I'd try something similar.
>>>>>>
>>>>>>First, the task was to load War and Peace ( over 3.2 million characters ) into a string and replace war with PEACE and peace with WAR. To give VFP the edge, I made the finding case insensitive ( for some perverse reason C# does not have a case insensitive string replace )
>>>>>>
>>>>>>Here's the VFP code : ( my VFP is rusty so if there is a better way to do this, I'm listening )
>>>>>>
>>>>>>
>>>>>>CD c:\users\pagan\documents\test
>>>>>>fname = "war and peace.txt"
>>>>>>
>>>>>>SET SAFETY OFF
>>>>>>
>>>>>>nStart = SECONDS()
>>>>>>
>>>>>>x = FILETOSTR(fname)
>>>>>>
>>>>>>x = STRTRAN(x,"war","www",-1,-1,1)
>>>>>>x = STRTRAN(x,"peace","ppp",-1,-1,1)
>>>>>>x = STRTRAN(x,"www","PEACE")
>>>>>>x = STRTRAN(x,"ppp","WAR")
>>>>>>
>>>>>>=STRTOFILE(x,"peace and war.txt",0)
>>>>>>
>>>>>>nStop = SECONDS()
>>>>>>
>>>>>>? nstop - nstart
>>>>>>
>>>>>>
>>>>>>
>>>>>>On my box this takes about 2.3 seconds. Pretty cool.
>>>>>>
>>>>>>Here's my C# - compensating as best I can for the case sensitivity :
>>>>>>
>>>>>>
>>>>>>        static void Main(string[] args)
>>>>>>        {
>>>>>>            Stopwatch stopwatch = new Stopwatch();
>>>>>>            stopwatch.Start();
>>>>>>
>>>>>>            string fname =@"C:\Users\Pagan\Documents\test\war and peace.txt";
>>>>>>            string result = @"C:\Users\Pagan\Documents\test\csharp result.txt";
>>>>>>            string foo = File.ReadAllText(fname);
>>>>>>            StringBuilder sb = new StringBuilder(foo);
>>>>>>
>>>>>>            sb.Replace("war","www");
>>>>>>            sb.Replace("War","www");
>>>>>>            sb.Replace("WAR", "www");
>>>>>>            sb.Replace("peace","ppp");
>>>>>>            sb.Replace("Peace","ppp");
>>>>>>            sb.Replace("PEACE", "ppp");
>>>>>>            sb.Replace("www","PEACE");
>>>>>>            sb.Replace("ppp","WAR");
>>>>>>
>>>>>>            foo = sb.ToString();
>>>>>>            File.WriteAllText(result, foo);
>>>>>>
>>>>>>            stopwatch.Stop();
>>>>>>
>>>>>>            Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
>>>>>>            Console.Read();
>>>>>>        }
>>>>>>
>>>>>>six tenths of a second.
>>>>>>
>>>>>>It gets even more interesting when a loop is added to build the string to over 32 million characters by concatenating the original string ten times before doing the replacements, resulting in an output file of 35 MB or so.
>>>>>>
>>>>>>C# - about 6 seconds
>>>>>>VFP just dies, shows as not responding in the task manager and has to be killed.
>>>>>>
>>>>>>It should be mentioned that the 10x test may not be entirely fair, as the VS is 64-bit, the VFP is 32-bit, and there are 6 GB of RAM on this box. But the single file version seems dispositive in any case.
>>>>>>
>>>>>>I'd encourage others to try this and tell me what I'm doing wrong.
>>>>>
>>>>>A thought: In VFP a case sensitive search will probably be quite a bit faster than a case insensitive one - and, in the C# version you aren't covering all possible case versions. Restricting both versions to the same set of case sensitive searches should reduce the difference - but I'd predict C# would still be way ahead.
>>>>>Oh - and in the C# using just File.WriteAllText(result,sb.ToString()) would avoid creating an extra 3 million character string.
>>>>
>>>>I may have this wrong, but I modified the VFP to be case sensitive à la C# and got a return of over 3.5 seconds vs the 2.6 using the single case insensitive replace:
>>>>
>>>>
>>>>x = STRTRAN(x,"war","www",-1,-1,2)
>>>>x = STRTRAN(x,"War","www",-1,-1,2)
>>>>x = STRTRAN(x,"WAR","www",-1,-1,2)
>>>>
>>>>x = STRTRAN(x,"peace","ppp",-1,-1,2)
>>>>x = STRTRAN(x,"Peace","ppp",-1,-1,2)
>>>>x = STRTRAN(x,"PEACE","ppp",-1,-1,2)
>>>>
>>>>x = STRTRAN(x,"www","PEACE",-1,-1,2)
>>>>x = STRTRAN(x,"ppp","WAR",-1,-1,2)
>>>>
>>>Odd. It's surely axiomatic that a low level string search that is case sensitive would require less processing than a case insensitive one. I can't think of a case insensitive algorithm that would outperform a good case sensitive one?
>>
>>
>>The Boyer-Moore algorithm comes to mind - setting up the tables takes the same time for case sensitive or case insensitive
>>
>>http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm
>>
>>Visual: http://www-igm.univ-mlv.fr/~lecroq/string/node18.html
>
>Interesting. That might also explain why, in Charles' test, 'Anna' was slower than 'Russia' (shorter word)


Correct
- the longer the word, the faster the search
- a letter that occurs more than once in the word slows down the search

For example:
Anna is slower than Anxy
Anxy is slower than Anxyz
Gregory
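
To make the two points above concrete, here is a minimal C# sketch of the Horspool simplification of Boyer-Moore. It is an illustration only: nothing in this thread confirms which algorithm VFP's STRTRAN actually uses, and the names HorspoolSketch, CountMatches and alignments are made up for this example. The skip table is built in one pass over the pattern whether or not case is folded, and the method reports how many window positions were examined, so longer words and words without repeated letters can be compared directly.

```csharp
using System;

static class HorspoolSketch
{
    // Hypothetical helper: counts occurrences of pattern in text and reports
    // how many window alignments were examined (fewer = bigger skips).
    public static int CountMatches(string text, string pattern, bool ignoreCase, out long alignments)
    {
        alignments = 0;
        int m = pattern.Length;
        if (m == 0 || text.Length < m) return 0;

        // Case folding is applied to both the table keys and the text,
        // so building the table costs one pass over the pattern either way.
        Func<char, char> fold = c => ignoreCase ? char.ToUpperInvariant(c) : c;

        // Skip table: characters not in the pattern allow a skip of the full
        // pattern length; a repeated letter overwrites its own entry with a
        // smaller skip, which is why repeats slow the search down.
        var shift = new int[char.MaxValue + 1];
        for (int i = 0; i < shift.Length; i++) shift[i] = m;
        for (int i = 0; i < m - 1; i++) shift[fold(pattern[i])] = m - 1 - i;

        int count = 0;
        int pos = 0;
        while (pos <= text.Length - m)
        {
            alignments++;
            // Compare right to left inside the current window.
            int j = m - 1;
            while (j >= 0 && fold(text[pos + j]) == fold(pattern[j])) j--;
            if (j < 0) count++;
            // Skip ahead based on the text character under the window's last position.
            pos += shift[fold(text[pos + m - 1])];
        }
        return count;
    }

    static void Main()
    {
        // Sample text invented for this sketch.
        string text = "Anna Pavlovna spoke of Russia and of banners, and Anna smiled.";
        foreach (string word in new[] { "Anna", "Anxy", "Anxyz" })
        {
            long alignments;
            int hits = CountMatches(text, word, true, out alignments);
            Console.WriteLine("{0}: {1} hits, {2} alignments", word, hits, alignments);
        }
    }
}
```

For "Anna" with case folded, the table entry for N ends up as 1 because the second N overwrites the first, while for "Anxy" the entry for N stays at 2. Running the same comparison against the full War and Peace text would show whether the difference is large enough to matter in practice.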