Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Does index tag on DELETED() help?
Message
From
09/11/2004 07:05:35
Walter Meester
HoogkarspelNetherlands
 
 
To
08/11/2004 14:23:10
General information
Forum:
Visual FoxPro
Category:
Other
Miscellaneous
Thread ID:
00958911
Message ID:
00959489
Views:
11
Hi Tamar,

First of all, I've written why the DELETED() tag was a myth one or two months before the article of Chris Probst. You'll find these discussions on the UT on thread #198031 and Thread #196021.


However, the statement in HackFox 7 still is wrong or at least misleading.

>"Even in many complex queries and FOR clauses, Rushmore performs its magic almost entirely on the relatively small and compact CDX file, a file structured with nodes, branches and leaves to be searched efficiently. When DELETED is ON, FoxPro has to check each and every record in a result set (whether from a query, a filter, or FOR) to see if it's deleted—even if no records are actually deleted. This sequential reading of the entire cursor or file completely defeats the benefits of Rushmore. Don't do it!

This is a wrong perception of what is happening. With partial optimization the following happens.
1. Determine the candidate records based on the rushmore optimizable part.

Now, from this point I'm not sure what happens, but either it is

2a. It downloads all records meeting the rushmore optimizable part and filters out those not meeting the non-optimizable part

2b. Download the fields needed to perform the non-rushmore optimizable part from the rusmore optimized result set and determine the records that meet the condition and download the result.

With the use of the deleted tag you've got only an advantage if there are significant numbers of deleted records that are weeded out in step 1. If not having the tag this will be done in step2. Since you already have to download the record, the argument of lost time checking the deleted status is a false one. It only matters if there are lots of deleted records that have to be weeded out in step2.

>"By creating a tag on DELETED(), you let Rushmore do the checking instead of looking at each record sequentially, which makes the whole thing much faster. The larger the result set, the more speed-up you'll see."

This logic was totally wrong as the records had to be downloaded to the client anyways if they met the condition (which is the far majority in a table without many deleted records).

>Sounds really good, doesn't it?

>The walls came crashing down on this piece of wisdom with an article by Chris Probst in the May '99 issue of FoxPro Advisor. Probst was working with an extremely large data set in a network environment and found that some of the queries were just too slow. Monitoring network traffic found the problem: The portion of the index file related to the DELETED() tag was huge (at least in some cases), and transferring it across the network was bogging down the whole query.

>Probst's experiments determined that, when an expression has only a few discrete values (like .T. and .F.), and the values are unevenly distributed (as is typically the case with DELETED(), since in general, few records are deleted), queries run faster without a tag on the expression. In those cases, it's better to let VFP narrow things down first based on the other expressions involved, and then do a sequential check of the remaining records.

This is always the case. I don't see why this only applies to large tables. Though the problem is less evident in small tables, there is no way a DELETE() tag could speed up a query if there are no deleted records. People forget that downloading and processing an index tag information also costs network load and processor resources. Downloading a tag (like the DELETED() tag) that does not narrow down a result set significantly (low selectivity) is about useless in all but specific cases.


>The phenomenon Chris witnessed is not unheard of in the database world. The index with few values is called one with "low selectivity" and is considered suspect in most database designs.

As I've been saying in the threads above just before the article was released.

>But this was the first documented case of the low-selectivity issue appearing in FoxPro's remarkable optimization.

Nope, see the links.

>What does all this boil down to for you? For small tables in a desktop or LAN situation, we generally think having a tag on DELETED() is a better choice. As tables get larger or when operating in a WAN situation, skip it. And just for good measure, you should test the performance, in your specific production (as opposed to development) environment, with and without the DELETED() tag.

I think this should be corrected because it is wrong. It does not speed up anything with any significance, besides the exceptions I gave in the previous message. Why having an index that does not give you any performance enhancement and could potentially hurt you real bad?

I challenge anyone to come up with a case that clearly shows where a DELETED() tag has anysignificance (thus not a few hundreds of a second), which has not already been identified as the exceptions.

Walter,
Previous
Reply
Map
View

Click here to load this message in the networking platform