>I think you're right. Indices must have the expression you're trying to match and must know what record that match goes with. I don't understand the part about a funny three-byte integer format. One would think that the offset or record pointer would have to be more than three bytes (actually four bytes would be fine) since three would give a maximum 16.7 million different values. We know we can have tables with millions more records than that.
It may be jumping to 4 bytes when it reaches that limit, or having a limitation to "only the records with the same MSB in one block", or whatever. The great thing with CDXes was their compactness, right? Did you ever take a look at the innards of a .cdx? Imagine you had three names, "Peter Pan", "Peter Panowsky", "Peter Paulsen" and "Peter Peterson". In the CDX, you could find "Peter Petersonaulsennowsky". Yes, it's going from last to first and shoot me if I know where it stores the offsets and lenghts. It's full of neat coding tricks.
>Rushmore does this even when it's not a good idea, and it's up to the developer to design the queries to limit this.
>If you had tags on Name and Sex
>SELECT * FROM MyTable WHERE Name = "Charlie " AND Sex = "M"
>would probably run better as
>SELECT * FROM MyTable WHERE Name = "Charlie " HAVING Sex = "M"
>or even
>SELECT * FROM MyTable WHERE Name = "Charlie " AND UPPER(Sex) = "M"
>since the last 2 are not fully optimized...which is better...because of the traffic generated by the badly skewed tag where "M" and "F" dominate the expressions.
Now it would really be nice to know some good benchmarking on this. My guess would be that it would retrieve the parts of the index for Name tag for "Charlie" only, and then pull the Sex index for those which are in the already selected part... but then this should mean that each tag is internally indexed by recno() as well... which is unlikely, IMO, so my bet is on your 2nd example. The third one is worse - it would run on the result set (as being unoptimizable) just like the second, but it requires a function call so it would be slower.
BTW, what's the idea of "charlie having sex = M"? I expected the optimized one would be "charlie having sex = feeling good"...
>The LOCATE with a FOR and SET ORDER TO Something is still wrong in that it shouldn't care about the active tag. It's as if the old code before Rushmore optimization didn't quite all get examined.
It has to take care of the active tag, because that's the expected behavior: it should jump to the first located record - but first according to the active tag, so it actually has to build the complete bitmap and then arrange it to the active tag. That's why the Continue works like a snap after this, and only the first Locate takes a while.
I'm wandering if Locate Rest or Locate While are retrieving records physically according to the active tag, or does it still build a bitmap, just made shorter by using the info from the current tag (current record's key and higher).
Boy, I think they had lots of fun nine or ten years ago while they created Rushmore...