VFP vs C# String handling
Message
From: 22/05/2011 03:46:49
To: 22/05/2011 01:01:44
General information
Forum: Visual FoxPro
Category: Other
Environment versions
Visual FoxPro: VFP 9 SP2
OS: Windows 7
Miscellaneous
Thread ID: 01511379
Message ID: 01511385
Views: 79
>I remember about 10 years ago Steve Black giving a presentation at VFE Devcon in Las Vegas about string handling in VFP, using War and Peace from Project Gutenberg and I remember all of us going "Oooooh". Since that time I have taken it as an article of faith that VFP had string handling powers beyond the reach of other languages.
>
>So in light of my surprising results on the loop in VFP vs Other Languages(Python/Ruby) I thought I'd try something similar.
>
>First, the task was to load War and Peace ( over 3.2 million characters ) into a string and replace war with PEACE and peace with WAR. To give VFP the edge, I made the finding case insensitive ( for some perverse reason C# does not have a case insensitive string replace )
>
>Here's the VFP code : ( my VFP is rusty so if there is a better way to do this, I'm listening )
>
>
>CD c:\users\pagan\documents\test
>fname = "war and peace.txt"
>
>SET SAFETY OFF
>
>nStart = SECONDS()
>
>x = FILETOSTR(fname)
>
>x = STRTRAN(x,"war","www",-1,-1,1)
>x = STRTRAN(x,"peace","ppp",-1,-1,1)
>x = STRTRAN(x,"www","PEACE")
>x = STRTRAN(x,"ppp","WAR")
>
>=STRTOFILE(x,"peace and war.txt",0)
>
>nStop = SECONDS()
>
>? nstop - nstart
>
>
>
>On my box this takes about 2.3 seconds. Pretty cool.
>
>Here's my C# - compensating as best I can for the case sensitivity :
>
>
>        // Requires: using System; using System.Diagnostics; using System.IO; using System.Text;
>        static void Main(string[] args)
>        {
>            Stopwatch stopwatch = new Stopwatch();
>            stopwatch.Start();
>
>            string fname =@"C:\Users\Pagan\Documents\test\war and peace.txt";
>            string result = @"C:\Users\Pagan\Documents\test\csharp result.txt";
>            string foo = File.ReadAllText(fname);
>            StringBuilder sb = new StringBuilder(foo);
>
>            sb.Replace("war","www");
>            sb.Replace("War","www");
>            sb.Replace("WAR", "www");
>            sb.Replace("peace","ppp");
>            sb.Replace("Peace","ppp");
>            sb.Replace("PEACE", "ppp");
>            sb.Replace("www","PEACE");
>            sb.Replace("ppp","WAR");
>
>            foo = sb.ToString();
>            File.WriteAllText(result, foo);
>
>            stopwatch.Stop();
>
>            Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
>            Console.Read();
>        }
>
>six tenths of a second.
>
>It gets even more interesting when a loop is added to build the string to over 32 million characters by concatenating the original string ten times before doing the replacements, resulting in a 35 mb or so output file.
>
>C# - about 6 seconds
>VFP just dies, shows as not responding in the task manager and has to be killed.
>
>It should be mentioned that the 10x test may not be entirely fair as the VS is 64 bit and the VFP is 32 bit and there are 6gb of ram on this box. But the single file version seems dispositive in any case.
>
>I'd encourage others to try this and tell me what I'm doing wrong.

One thing to point out is that the maximum size of a string variable in VFP is 16MB, so it's unsurprising that your attempt to process a 35MB string failed. If anything, you should arguably have gotten some sort of "string too large" error instead of a hang.

I don't think you're doing anything wrong, or that there's anything unexpected or magic in your findings.
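
One aside on the quoted remark that C# lacks a case-insensitive replace: string.Replace and StringBuilder.Replace are indeed case-sensitive, but Regex.Replace accepts RegexOptions.IgnoreCase. Below is a minimal, untimed sketch that swaps the two words in a single pass, which also avoids the temporary "www"/"ppp" placeholders. The class and method names are made up for illustration, and no claim is made about how it compares in speed to the StringBuilder version.

```csharp
using System;
using System.Text.RegularExpressions;

static class SwapDemo
{
    // Swap "war" and "peace" in one pass, ignoring case.
    // Like STRTRAN, this also matches inside longer words (e.g. "warm").
    public static string SwapWarAndPeace(string text)
    {
        return Regex.Replace(
            text,
            "war|peace",
            m => m.Value.Equals("war", StringComparison.OrdinalIgnoreCase) ? "PEACE" : "WAR",
            RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        // Prints: PEACE and WAR, PEACE and WAR
        Console.WriteLine(SwapWarAndPeace("War and Peace, war and peace"));
    }
}
```

Because each match is replaced exactly once, text that already contains "www" or "ppp" passes through untouched, which the placeholder approach cannot guarantee.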

String-handling functions are hugely important to most programming languages. Algorithms have been heavily studied and extensively optimized. Probably the first optimized string-handling functions were hand-coded in assembler, but it wasn't long before good C/C++ optimizing compilers produced equivalent output and performance.

The gold standard for maximum performance of any algorithm is the output of a modern optimizing C/C++ compiler targeted at the exact CPU on which it will run. It certainly isn't VFP.

I imagine MS has put a lot of effort into optimizing string-handling algorithms in the .Net framework. My guess is that string handling called from C# is close to the performance of gold-standard C/C++. Your C# test was 64-bit, but a 32-bit address space is 4GB (say 2GB effectively usable in Win32), which far exceeds the largest 35MB test string you were using, so I wouldn't expect 64-bit to inherently give your C# results any advantage.
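
The size arguments so far can be checked with a little arithmetic. This is a rough sketch: the character counts are round approximations of the "over 3.2 million" and "over 32 million" figures in the quoted post, the VFP limit is the 16,777,184-character figure from the VFP system capacities list, and the class and method names are hypothetical.

```csharp
using System;

static class SizeCheck
{
    // VFP system capacities: maximum characters per string or memory variable.
    public const long VfpMaxStringChars = 16777184;

    public static bool FitsInVfpString(long chars)
    {
        return chars <= VfpMaxStringChars;
    }

    // .NET strings are UTF-16, so each char occupies 2 bytes in memory.
    public static long Utf16Bytes(long chars)
    {
        return chars * sizeof(char);
    }

    static void Main()
    {
        long single = 3200000;       // approximate size of the single file, in characters
        long tenfold = single * 10;  // approximate size of the 10x test string
        long win32UserSpace = 2L * 1024 * 1024 * 1024; // the 2GB figure used above

        Console.WriteLine("1x fits in a VFP string:  {0}", FitsInVfpString(single));
        Console.WriteLine("10x fits in a VFP string: {0}", FitsInVfpString(tenfold));
        Console.WriteLine("10x as a .NET string: {0} bytes of {1} available",
            Utf16Bytes(tenfold), win32UserSpace);
    }
}
```

Even allowing for two bytes per character and a few working copies, the 10x string sits far below the 2GB mark, while it is roughly double what a single VFP string variable can hold.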

The real kicker is the .Net JIT compiler. That's what's producing the binary code that actually runs on the CPU. It can potentially do a ton of cool stuff, optimizing differently:

- if your 64-bit CPU is Intel or AMD
- depending on how much L1, L2 and (if present) L3 cache your CPU has. Optimizations that keep code and data in CPU cache, avoiding cache misses, can yield large performance improvements
- if your CPU has advanced features (e.g. SSE1/2/3/4)
- another big kicker - if you have multiple CPU cores available, it could potentially parallelize parts of your code
- etc. etc.

So, with a JIT compiler in place you're always going to get code that runs at a pretty good fraction of maximum theoretical performance of your exact hardware.

Now, look at poor ol' VFP. It doesn't have a JIT; each version was compiled to a fixed binary before general release. That single binary has to run on a wide variety of hardware, so the compiler optimizations used to build VFP's executables were limited to the lowest common denominator of the time and could not be aggressive.

That VFP build was 32-bit only. When run on a 64-bit system it's run under WoW64, so there's a certain amount of thunking overhead going on translating calls between 32-bit and 64-bit. I have no idea what that overhead might be.
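
To see which of these situations a given .Net process is actually in, the runtime exposes a few properties. A small sketch, assuming .Net 4.0 or later on Windows (where Environment.Is64BitProcess and Environment.Is64BitOperatingSystem are available); the class and method names are invented for illustration. It reports bitness and core count only and says nothing about how large the WoW64 overhead is.

```csharp
using System;

static class BitnessProbe
{
    // On Windows, a 32-bit process on a 64-bit OS runs under WoW64.
    public static bool IsRunningUnderWow64()
    {
        return Environment.Is64BitOperatingSystem && !Environment.Is64BitProcess;
    }

    static void Main()
    {
        Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);
        Console.WriteLine("64-bit OS:          {0}", Environment.Is64BitOperatingSystem);
        Console.WriteLine("64-bit process:     {0}", Environment.Is64BitProcess);
        Console.WriteLine("Under WoW64:        {0}", IsRunningUnderWow64());
    }
}
```

Building the same test once as x86 and once as x64 and comparing timings would be one way to measure whether bitness matters for this workload.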

Finally, the impression I've got over the years is there is considerable variation in the performance of different VFP string-handling functions. The best ones would be those that are simply thin wrappers around high-quality C/C++ string handling library functions. The worst might be a lot of poorly-optimized custom C/C++, perhaps calling low-quality C/C++ string functions. I think that's why some VFP string handling functions show a surprisingly high fraction of the performance of native C/C++, while others are abysmal.

So, given all that, I'm mildly pleased that single-threaded 32-bit VFP displayed as much as 25% of 64-bit C# performance in your tests on your multi-core system (about 0.6 seconds for C# against about 2.3 seconds for VFP).
Regards. Al

"Violence is the last refuge of the incompetent." -- Isaac Asimov
"Never let your sense of morals prevent you from doing what is right." -- Isaac Asimov

Neither a despot, nor a doormat, be

Every app wants to be a database app when it grows up