Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Does index tag on DELETED() help?
Message
From
10/11/2004 05:21:55
Walter Meester
HoogkarspelNetherlands
 
 
To
10/11/2004 02:33:40
Walter Meester
HoogkarspelNetherlands
General information
Forum:
Visual FoxPro
Category:
Other
Miscellaneous
Thread ID:
00958911
Message ID:
00959902
Views:
8
Hi david

The following is the best cases for the DELETED() tag thing, but it shows no about advantage.

In this example I used a 130K record (length 1400 bytes) table and tested this on a local workstation (no network involved). There always is an index on Cycleno. Cycleno 1327 does contain 128 record of which 36 were deleted. None of the indexes trying to optimize on the DELETED() tag helped to speed up beyond the speed of without.

However the test below did crash my VFP9 a couple of times, maybe because of lots of memory use by adding and deleting tags.
USE LAbdetails

SET DELETED ON

SELECT LabDetails
INDEX ON DELETED() TAG DEL

WAIT WINDOW 
nSec= SECONDS()

SELECT * FROM labdetails WHERE cycleno = "1327      " NOFILTER INTO CURSOR x

? SECONDS() - nSec

SELECT LabDetails
DELETE TAG Del
INDEX ON DELETED() TAG del BINARY

WAIT WINDOW 
nSec= SECONDS()

SELECT * FROM labdetails WHERE cycleno = "1327      " NOFILTER INTO CURSOR x

? SECONDS() - nSec

SELECT LabDetails
DELETE TAG Del
INDEX ON Cycleno TAG cycleno2 FOR !DELETED()

WAIT WINDOW 
nSec= SECONDS()

SELECT * FROM labdetails WHERE cycleno = "1327      " NOFILTER INTO CURSOR x

? SECONDS() - nSec

SELECT LabDetails
DELETE TAG cycleno2

WAIT WINDOW 
nSec= SECONDS()

SELECT * FROM labdetails WHERE cycleno = "1327      " NOFILTER INTO CURSOR x

? SECONDS() - nSec
Doing more or less in a network environment shows a big difference when the query is run for the first time: the whole index tag has to be dragges accross the network. This resulted in something like the following results.

Query on 1 deleted record in SET DELETE ON on a 130K record table with clean buffers:
Network: 54 Mbps wireless.

With INDEX ON DELETED() TAG del
0.512 Seconds
With INDEX ON DELETED() TAG del BINARY
0.166 Seconds
With INDEX ON Cycleno TAG cycleno2 FOR !DELETED()
0.847 Seconds
Without a DELETED() flag optimizable index
0.023 Seconds

In all cases the query did not return any record at all.
The tests were done 3 times to get the average above.

You see that the binary index indeed is smaller and is dragged accross the network much faster than a regular index.

Though the INDEX ON Cycleno TAG cycleno2 FOR !DELETED() *IS* optimizable (SYS(3054,2) reports full optimization, I suspect that it drags the whole Cycleno2 index to the workstation (a char(10) field), explaining the poor results.

Even worse is that the optimizer could use any index with the FOR NOT DELETED() filter to optimize the query. So that *if* you've got one index with such filter it is getting dragged down the workstation when SET DELETE = ON, no matter what your query looks like (Can be proven with SYS(3054,2)).

Doing the same test on querying 128 records with cycleno = "1327" of which 36 are deleted (meaning an endresult of 92 record) the difference were much less convincing as the results varied a lot. It seemed that in this case there was not much difference between the using Binary index and without using an DELETED() optimizable index (note that we are talking about 28% (36 out of 128) deleted records here)

I did not test the other possiblities yet, but I don't expect them to show an total different look on this.

PLEASE BEAR IN MIND THAT THOSE TEST WERE DONE WITH A CLEAN CACHE. RUNNING QUERIES MULTIPLE TIMES MIGHT SHOW DIFFERENT RESULTS AS DATA CAN BE CHACHED ON THE WORKSTATION AS LONG AS THE UNDERLYING TABLES AND INDEXES WHERE NOT ALTERED BY OTHER USERS ON THE NETWORK.

Are my tests complete and sufficient? Absolutely not, there are a lot scenarios thinkable. Esspecially if portions of data are beeing chached it might show an advantage in one case as oposed to another.

This message is only ment as a backup to my technical rant on why a DELETED() tag should be avoided, and the common wisdom and article saying it is generally wise to index them with such index should be taken with a lot of sceptisism, because either way you look at it the problem is about low selectivity causing overhead in use of resources in the majority of cases.

People should not take this example as the prove why one should be carefull in blindly applying a DELETED() Tag (binary or not), but read the theoretical rant in order to understand what is happening when optimizing. From this knowledge you can test you own applications.

Walter,
Previous
Reply
Map
View

Click here to load this message in the networking platform