PACK - what to expect?
Message
From
14/01/2003 23:01:46
To
14/01/2003 21:50:45
Dragan Nedeljkovich
Zrenjanin, Serbia
General information
Forum: Visual FoxPro
Category: Databases, Tables, Views, Indexing and SQL syntax; Miscellaneous
Thread ID: 00740999
Message ID: 00741742
Views: 21
Lots here Dragan <s>. I'll relate my observations and inferred conclusions as we go along...

>>From what I saw yesterday, there is NO writing of used blocks into a new file at all - the fragment count changed (it dropped) but the file remained highly fragmented. A real surprise to me, to be sure.
>
>I don't know how to find out whether it's a new file or not - short of using something to find the beginning cluster on the disk before and after the Pack Memo.

Here's why I say it is the same file; the operations I did were (a rough code sketch of steps 2 and 4 follows the list)...
1) See, using the defragger's Analyze, that the .FPT is fragmented into 1500 pieces
2) Change the memo field content of 25,000 records
3) See that the fragment count is now 1525
4) Run PACK MEMO
5) See that the fragment count is now 1505
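
In VFP terms, steps 2 and 4 amount to roughly this (the table and memo field names are placeholders, not the actual test data):

USE TestTable EXCLUSIVE                          && placeholder table with a memo field m_notes
REPLACE ALL m_notes WITH m_notes + " changed"    && step 2 - touch the memo content of every record
PACK MEMO                                        && step 4 - rebuild the .FPT (requires exclusive use)
USE

The defragger's Analyze was re-run before and after each step to get the fragment counts above.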
>
>Try with a Copy To - if the copy is also fragmented, that's it: the disk space on your disk is fragmented.

Well, I used FILETOSTR()/STRTOFILE() regularly to create unfragmented files in these tests, checking the defragger report each time to verify there were no fragments. Confirmed in every case. And there was lots of free space (around 20 gig, split into 2 large chunks according to the graphic).
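
For clarity, that trick is nothing more than reading the whole file into a string and writing it back out in one call, so the OS is handed the full length up front (the file names here are only examples):

cData = FILETOSTR("bigtable.fpt")          && pull the whole file into memory
STRTOFILE(cData, "bigtable_contig.fpt")    && write it back out in a single call
* then re-run the defragger's Analyze to confirm the new file shows no fragments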

>
>>The Disk Defragmenter utility is said by MS to defrag used space, but of course there is no way of knowing for sure.
>
>I see the white space being scattered in several chunks when Defrag finishes, and it also still has something to do if you run it repeatedly. So it doesn't really try to do everything, it just tries to do as much as it can, or is allowed to do (or how else would you be motivated to buy a real one).

I see this too, and it is a puzzler. Yet the stats that accompany the defrag report consistently show between 0 and 2 files with some fragments, and the details suggest those are NTFS-owned files. Fragment counts in these cases are always small (2-4 typically). In addition, the free-clusters section of the report always says 0% fragmented.

>
>>Well my observation strongly suggests that a big file WILL be written in one piece provided that the length given to the OS as needing writing is the whole size of the file. In other words, it does look like NTFS looks for a contiguous set of clusters to hold everything specified by the length of the write in question.
>>BUT it also looks like NTFS will take the FIRST such piece of contiguous space that exists - it does not restrict its initial search for space to the known large chunk of free space. I believe it would make a significant difference overall if it did restrict the search that way, rather than operating as it does now.
>>I've also seen that VFP, in EXCL mode, will hold the physical write until it exhausts its cache/buffer RAM. I have seen files of 6 frags, 2 frags and a single frag in such circumstances. I say 'single frag' only on the basis that the file is not seen in the defrag report (Analyze).
>
>The point here is that we're not writing a big file. We're starting with a small file (actually two or three small files - the dbf, fpt and cdx) and growing them. The OS will be asked for additional clusters for each of them in turn, and even if the .cdx is created post festum, I have a strong feeling (can't know for sure) that the dbf and fpt are written sequentially - add a record to the dbf, add the blocks to the fpt, do so while !eof(). Whenever a record in either dbf or fpt overflows the currently allocated space, Fox asks OS for more - and gets the first available chunk of free space. I suppose if we had a real file mapping utility, that we'd find the dbf and fpt clusters alternating, and then followed by the cdx - unless the cdx is also built sequentially while appending the not-deleted records.

Yes, the point that started all this for me a few years ago was exactly that - writing single records at a time.
Now I do NOT see the OS 'being asked for clusters' in any way. Rather, it strongly appears to me that VFP simply says to the OS 'write nnnn bytes to file Xxxx.xxx starting at (relative) byte bbbbbbb in that file'. The OS takes over from there: it fills the remainder of the last allocated cluster, deducts the count already written to determine exactly how many more clusters it needs to complete the balance of the write, and then allocates those clusters in a contiguous block.
I am also quite certain that the CDX is written (roughly) the same way - as it needs to be. This is apparent, again, from the defrag reporting, but it is also logically necessary so that other users can see the records as soon as possible.
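
In VFP's own low-level file functions, that 'write nnnn bytes starting at byte bbbbbbb' idea looks roughly like this (the file name, offset and length are invented purely for illustration - the point is that the caller never mentions clusters at all):

nHandle = FOPEN("Xxxx.xxx", 12)            && open read/write, unbuffered
IF nHandle > 0
   FSEEK(nHandle, 1048576)                 && position to (relative) byte bbbbbbb - 1 MB here, just an example
   FWRITE(nHandle, REPLICATE("x", 8192))   && 'write nnnn bytes' - cluster allocation is entirely the OS's job
   FCLOSE(nHandle)
ENDIF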

From Fox's point of view, it is really quite simple (except for the case of .FPTs, for some reason - maybe they are saving that for some later performance-improvement magic).
The allocation scheme I'm assuming also nicely explains why VFP will write very large fragments when a table is used EXCL - in this case it can buffer/cache the data to be written until RAM constraints force a write, so each write is large compared to shared single-record writes. Again, I have observed this in my tests.
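
A rough way to see that difference for yourself (the table name is again a placeholder): run the same bulk change once shared and once EXCLUSIVE, then compare what the defragger's Analyze reports for the .FPT afterwards.

* Pass 1 - shared: one record at a time, so each write reaches the disk more or less on its own
USE TestTable SHARED
SCAN
   REPLACE m_notes WITH m_notes + " changed"   && automatic record lock, write, release - record by record
ENDSCAN
USE

* Pass 2 - exclusive: VFP can buffer the writes and flush them in much larger pieces
USE TestTable EXCLUSIVE
REPLACE ALL m_notes WITH m_notes + " changed"
USE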

>
>That's, at least, how it worked in DOS days, but I doubt it's too different today. Maybe the allocation goes in bigger chunks nowadays, but I think the table is still grown cluster by cluster.

The table and its related files are indeed grown cluster by cluster in a regular shared environment. It is similar, save for larger chunks, in an EXCL situation.
Also, in NTFS (standard) the clusters are actually smaller - 4096 bytes today compared to up to 64K some time back in FAT.

But 4096 is perfectly nice for VFP (in the vast majority of cases, at least)!
What's nicer too about modern drives is that they have a huge number of clusters per track - I suspect that 200 clusters per track is on the low side. Allowing 4 DBFs and their related files (totalling 12 files), with the writes being done "concurrently" for each (and of course each write of each file is for 'related' records), VFP has immediate access to better than 16 clusters for each file without moving the heads at all! And it gets even better!! That is for just one track of a cylinder. Allow 4 tracks (2 platters on the HD) and you get access to 64 clusters, again for each of the 12 files, before the heads have to be moved. And then the extreme likelihood is that they will move only to the next track!!! Keep in mind in all this that each cluster itself most probably holds a few records of the file it houses.
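
Spelling that arithmetic out (all of these numbers are the rough assumptions above, not measurements): 200 clusters per track / 12 files is roughly 16 clusters per file per track; 16 clusters x 4096 bytes is about 64K per file before the heads move at all; and 4 tracks per cylinder gives 64 clusters, i.e. about 256K per file, before a real seek is needed.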

Cheers