Fragmentation of native VFP data - it's mostly fine
From:
14/01/2003 13:38:36
To: All
Forum: Visual FoxPro
Category: Databases, Tables, Views, Indexing and SQL syntax
Thread ID: 00741561
Message ID: 00741561
Views: 59
I've completed (I think) my testing of the effects of fragmentation on VFP performance. This entailed over 150 test runs on two different machines, mostly standalone but some networked (peer-to-peer between Win2000 Pro (SP3) and XP Home (SP1)).

It occurred to me long ago that most VFP applications necessarily spend the majority of their running time with significant fragmentation in their tables and related files. Given the axiom in the PC world that all fragmentation is bad, I wondered how VFP's performance could be so fantastic in the face of this "fact".
This anomaly reminded me of my days as a mainframe system programmer responsible for performance/service levels, and of one specific situation where I was able to demonstrate that deliberately 'fragmenting' the temporary files of a complete overnight workload had a dramatic positive effect on its elapsed time. Of course, I was fully conversant with the innards of that system's allocation algorithms, whereas I knew (and know) very little about the workings of NTFS.

Finally I decided that over the Christmas holidays I would examine this question for VFP native data in the PC world. My testing was restricted to NTFS; one system is a desktop with 512 MB of RAM, the other a notebook with 384 MB.
The only tool available to examine fragmentation was the "Disk Defragmenter" utility supplied with both systems' OSes. Every new test run started by deleting all of the test tables/files from any prior run and then defragmenting the HD using the OS utility. Note that this step itself presented a somewhat unique case, in that it eliminated any (fragmented) free clusters that might normally be interspersed within the occupied file space. In other words, the only free clusters (space) available during these tests lay after all of the occupied space.
It was because of the way the Disk Defragmenter utility operates that I felt compelled to adopt this approach. It works exclusively on a "best fit" basis - it gives no consideration to directory (folder) residency, creation date, last-update date, file-name similarity or anything else!

OBSERVATIONS
First, I was able to confirm that VFP will always fragment .DBFs, .CDXs and .FPTs whenever records are added individually (INSERT INTO ... / APPEND BLANK plus REPLACE) AND the tables in question are USEd in SHARED mode. Of course it is not VFP itself that fragments the files in question - it is the OS file system, based on what VFP directs it to write.
This also means that files which are USEd EXCLUSIVE, or are intrinsically opened in exclusive mode - cursors, SELECT ... INTO TABLE, internal temp files, etc. - will have minimal fragmentation, governed by RAM availability (VFP's cache and buffers). I would also bet that "bulk" additions to tables, like APPEND FROM ..., leave the added data minimally fragmented too. A small sketch of the two modes follows.
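
To make the distinction concrete, here is a minimal sketch - the table, field and source-file names are made up for illustration:

* Shared mode: each individual append extends the .DBF and .FPT in
* small increments, so the file system scatters the new clusters.
LOCAL i
USE customer SHARED
FOR i = 1 TO 10000
    APPEND BLANK
    REPLACE cName WITH "Test " + TRANSFORM(i), ;
            mNotes WITH "memo text"  && memo write extends the .FPT
ENDFOR
USE

* Exclusive mode: writes pass through VFP's own cache and reach the
* OS in larger chunks, so the extensions stay mostly contiguous.
USE customer EXCLUSIVE
APPEND FROM bulkdata   && bulk addition, minimally fragmented
USE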

Secondly, and this was a surprise, I found that .FPT files are always written fragmented! Even a PACK command or a PACK MEMO command, though run in exclusive mode, leaves the .FPT file fragmented! The .DBF and .CDX files subject to a PACK are DEfragmented (as much as RAM permits) provided that there was at least one DELETEd record in the subject .DBF. If there were no deletions to remove, then the original .DBF and .CDX files are left untouched (the .FPT may still be compressed - though not defragmented - if processing analysis deems it warranted).
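
For reference, the commands in question are simply these (the table name is made up):

USE mytable EXCLUSIVE   && PACK demands exclusive use
PACK                    && rewrites .DBF/.CDX defragmented if any
                        && records were DELETEd, but the rebuilt
                        && .FPT comes out fragmented regardless
USE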

It was conclusive - in the configuration described above - that SEEKs benefit significantly from table/file fragmentation, to the tune of running better than twice as fast as against defragmented tables in standalone runs. However, I could not conclude the same for Select-SQL, which ran moderately faster against defragmented tables/files. I do not know what to attribute this difference to, beyond guessing that reading in the SQL case uses a different technique.
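
A rough sketch of this kind of timing comparison, for anyone who wants to try it - the table, tag and key values are made up, and this is not my actual test harness:

LOCAL nStart, i
USE orders ORDER TAG custid SHARED
nStart = SECONDS()
FOR i = 1 TO 1000
    SEEK "CUST" + PADL(i, 6, "0")   && repeated indexed lookups
ENDFOR
? "SEEK loop:", SECONDS() - nStart, "seconds"

nStart = SECONDS()
SELECT * FROM orders WHERE custid = "CUST000001" INTO CURSOR crsHits
? "Select-SQL:", SECONDS() - nStart, "seconds"
USE IN orders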

Oddly, I found that running networked (tables/files resident on the desktop system, accessed from the notebook system) produced significantly faster run times than running standalone on the notebook. The network used is 100 Mbps (peer-to-peer, don't forget).

Finally, I believe that I was able to conclude that the Disk Defragmenter utility can, indeed, result in poor placement of tables/files as regards their processing relationship to each other.

CONCLUSIONS
It is worth considering "manually" defragmenting .FPT files, and a good time to do so would be immediately after running a PACK or PACK MEMO on the table in question. But I would say that doing it periodically at any time would be helpful.
I say this because testing showed that every run requiring memo fields in its output (SEEK and Select-SQL where the output fields include memo fields) ranged from 20% to 50%+ faster when the .FPT had been defragmented - most especially when it had been rewritten using FILETOSTR()/STRTOFILE() versus being left alone. This was so regardless of the fragmentation levels of the related .DBF and .CDX. A sketch of that rewrite follows.
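
Here is a minimal sketch of that FILETOSTR()/STRTOFILE() rewrite, assuming the table is closed (the table name is made up). Note that FILETOSTR() pulls the entire memo file into memory, so this only suits .FPTs that fit comfortably in RAM:

LOCAL cMemo, nBytes
IF USED("mytable")
    USE IN mytable   && the table must be closed first
ENDIF
cMemo = FILETOSTR("mytable.fpt")   && read the fragmented memo file whole
nBytes = STRTOFILE(cMemo, "mytable_new.fpt")   && one write, contiguous
ERASE mytable.fpt
RENAME mytable_new.fpt TO mytable.fpt

Writing to a new name and renaming, rather than overwriting in place, gives the file system the chance to allocate fresh contiguous space instead of reusing the old fragmented clusters.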

I wouldn't rely on the Disk Defragmenter utility to defragment my VFP tables/files, particularly in performance-sensitive applications. This is especially so when the HD in question is shared with other, non-VFP applications, and particularly when those applications write their files in full and delete the old copy on "save" (as MS Word, Excel and most other standard PC applications do).

If the performance of your application is satisfactory, then I would leave it alone and refrain from running any "file maintenance" or defragmentation utility against its tables/files. There is a good chance that the resulting table/file layout will be less optimized than it was before such an action.

At least I now have a better understanding of fragmentation and its impact on performance, and my general conclusion is that fragmentation of native VFP tables is not harmful. So I consider the old axiom about fragmentation not to be a true axiom at all.

I hope this is informative and useful. I am glad to be done with this subject!