Does index tag on DELETED() help? - Level Extreme

Plateforme Level Extreme

Abonnement

Profil corporatif

Produits & Services

Support

Légal

English

Does index tag on DELETED() help?

Message

De

24/11/2004 06:43:11

Mike Yearwood
Toronto, Ontario, Canada

À

23/11/2004 09:37:14

Walter Meester
HoogkarspelPays-Bas

Information générale

Forum:

Visual FoxPro

Catégorie:

Autre

Titre:

Re: Does index tag on DELETED() help?

Divers

Thread ID:

00958911

Message ID:

00964197

Vues:

10

Walter

FWIW I agree with you. The DELETED() tag works fine on a local disk because the bandwidth is so high. Everywhere else it is suspect.

The deleted tag across a LAN is less advantageous because the bandwidth is both lower than the disk and irregular. In a LAN or RAS or VPN setup, no deleted tag should be better. If you have a slow PC and a lousy drive, the network may be faster, and so the deleted tag may seem to help, but that might just be abberation.

If FoxPro was used like SQL Server, end-user queries executed remotely by a dedicated FoxPro Server, then the Deleted() tag may be beneficial, especially with a large number of users. It may be beneficial in a web context where vfp is serving up data - if the data is in the same machine that VFP is on.

>David,
>
>>Let me say what I've been saying all along perhaps a little more plainly.
>
>>If the time to retrieve the binary index and apply it to the current Rushmore bitmap is less than the time it takes to retrieve the excess records and evaluate if they should remain in the final result set then there will be a net gain in performance.
>
>Yes, we can agree upon that. However do you realize this in general will only be the case with high selective indexes as oposed to low selective. The time to retrieve about a whole index tag is seldom smaller than downloading a few excess records in the rushmore optimizable bitmap.
>
>>How long to you "think" this stuff is taking?
>
>Depends on a large number of factors, Buffering, Network speed, Avialable memory, Indexvalue size.
>
>>Your 1999 tests are not valid for binary indexes. Back on 11/10 you told Thomas you'd be posting the results of new tests and I've not seen any numbers posted.
>
>See: Re: Does index tag on DELETED() help? Thread #958911 Message #959902
>
>>> Whether or not it has been getting faster (with the binary index) is not the point:
>>I on the other hand think it is the whole point.
>>>It still is overhead.
>>more on your neglected overhead later in the reply...
>
>>>If I'm querying for a PK Value in a million record table, it essentially only need to examine the PK index, getting one record. If you drag a DELETED() tag of a million nodes into the mix, whether or not it is going to filter out the deleted index, is just overhead. The only index you needed was the PK index to have to fastest access in about all circumastances.
>
>>Since you diverge, I'll play the devils advocate here, if I was doing a one record SELECT I'd be inclined to just do an indexseek().
>
>Which fails under certain circumstances and cannot be used in SET ORIENTED solutions like SQL views: Not a good alternative.
>
>>>Sure, but retrieving no DELETED() index file from what kind of I/O still is worse than no I/O at all, no matter what.
>
>>Yes and doing the I/O to retrieve the binary index will pretty quickly outperform the addtional record I/O you are causing to happen.
>
>Do the tests and be surprised. No it will not. Certainly not in cases when only a small subset of data is to be retrieved from a large table. Point me to an example which shows otherwise (a significance difference, (not a few hundreds of a second).
>
>>Bottom line removing I/O has a lot better performance gain than a few microseconds it takes to build a bitmap and AND it.
>
>An bitmap index requires I/O itself and in mosts case is bigger than the I/O needed for getting the record info itself. In my test it took 0.166 seconds to get a particular query done on one record and only 0.023 without any deleted flag index at all. Note that the only record retrieved was in fact deleted(). Also, don't forget the processing an index, even if it is a bitmap index takes memory which in itself could drag your applications performance down.
>
>>>Note that the 18 times faster is not a constant factor. In my tests it was about 5 times or so, while in certain cases it could be slower than a traditional DELETED() tag.
>>
>>What tests are those? Must be with uninterestingly small tables.
>
>See link above: a 130K record (length 1400 bytes) table
>
>
>>N records, regular index size, binary index size, ratio of regular index size to binary index size
>>

>>      10,     3072,    3072, 1
>>     100,     3072,    3072, 1
>>    1000,     5632,    3072, 0.546
>>   10000,    24756,    4608, 0.186
>>  100000,   307712,   16384, 0.053
>> 1000000,  3205632,  135168, 0.042
>>10000000, 38513152, 1314816, 0.034
>>

>
>>At 100,000 rows the difference is a factor of 18, as the tables get bigger the difference become greater. At 10,000,000 the difference is a factor of 33 times. So yes in pre-VFP9 versions the cost to bring a 38mb index across the wire is quite high, the cost to bring 1mb in VFP9 is much less.
>
>Well take you 1,000,000 record table, with an indexsize of 135K. On a 10Mbps network dragging this index arround probably takes arround .15 seconds. So there is no way any rushmore optimizable query would complete below 0.15 second (on unbuffered data), just because that is the transfer time. And we did not reserve any memory for 1 MB of bitmap and processed the BITAND.
>
>Normally a query on one or on few records could perform very quickly, but now the minimum time probably is arround .2 seconds. Pretty irritating if you expect a resonse of about .02 seconds.
>
>>>It beats me. Maybe because they have been requested by some people who do not see the insanity of it. Go over the SQL Server forum and ask about always using a low selective index. You might get a different picture.
>
>>Again I'm not saying always use a low selectivity index. I am saying use indexes that improve performance within the confines of a particular application.
>
>Sure, but then I want to see a CASE where the deleted TAG index would make sense. because up to now I've seen:
>
>- Worse performance if bandwidth is limited in many cases.
>- Increased memory usage
>- Risk of corruption (You can't currupt and index that does not exist)
>- Increased time maintaining the index (APPEND and DELETE commands)
>
>What I have not seen is a case where it matters, appart from the cases I already outlined in 1999.
>
>>You are fixated on "micro" selectivity of the index, and that's just not always the case.
>
>>I am talking about reducing I/O. If using a binary index can help reduce actual record I/O it's a good thing. You also neglect the "overhead" of removing the excess records from the final result set.
>
>Yes I do, indeed as DELETED() records should not occur in large numbers in a production environment. And even if it contains, lets say 50% of deleted record, it would only take something less twice as long (probably arround 180% or so). On the other side queries on small amounts of data (E.g. DELETES and UPDATES in SQL views on large tables) take significantly longer.
>
>>ummmm totally irrelevant to this discussion.
>
>In so far that this case is not the only case where isanity applies
>
>>Again it is the difference between zero selectivity and micro selectivity and low selectivity and any other level of selectivity. If the selectivity is enough to reduce overall I/O then there is a gain. You advocate the retrieval of a 3000 byte record to get the one byte of deleted information I say there is a high probability that pulling 1 bit of data from a binary index tag can be done a lot faster.
>
>That is the whole problem. Rushmore does not take one bit, it takes the whole index value set to be downloaded. So if your table containes 1,000,000 records and the index value of .F. (not deleted()) is arround 90K, it will only get (thoretical) beneficial when weeding more than 30 records. And then we did not even talk about reserving memory and the overhead in the rushmore process itself.
>
>>For example if the record size is 3000 bytes and the table size is 100,000 rows. It only takes the binary index to eliminate 6 records ( 16384 / 3000 ) before there is a net reduction in total I/O. 6 rows out of 100,000 is pretty low on the selectivity scale.
>
>That would be in theory when you disregard the reserving memory and doing the actual rushmore process. A quick test did not indicate any significance in a 200K table weeding out 36 out of 128 records.
>
>>>Well we know that story: It is left to the individual developer. The fact is that the average developer does not have a clue what he is testing and it is by no means representative to the actual production environment. The threads about this topic in the past is just a plain prove of that.
>
>>So hopefully threads like this can educate those developers that don't do adequate performance testing.
>
>I hope so...
>
>>>Then I don't understand why you come to different conclusions that I do. Please react on my testing report.
>
>
>My whole take on this is that people should NOT apply a TAG on DELETED() in general. It is a pure myth that it speeds up your queries significantly, and even the binary index does not change much here.
>
>However, there might be individual cases where it might improve performance a bit when dealing with significant amounts of deleted() records. But be aware that adding them might improve one specific case and decrease in another esspecially when tables grow larger. So you're warned.
>
>Walter,

Click here to load this message in the networking platform