SEEK(), INDEXSEEK() or KEYMATCH() or SELECT-SQL?
Message
From: Thomas, 18/04/2005 05:45:34

General information
Forum: Visual FoxPro
Category: Databases, Tables, Views, Indexing and SQL syntax

Environment versions
Visual FoxPro: VFP 8 SP1
OS: Windows XP SP2

Miscellaneous
Thread ID: 01002645
Message ID: 01005706
Views: 39
Hi Nadya,

>I think I would need to create more and better test cases, but I'm a little bit tired of all this testing...
Definitely FIRST order of things!
>
>BTW, I had another idea yesterday which may speed up the execution. Basically, if the UPDATE trigger didn't change any of the PK/FK fields, you don't need to fire any logic. So, in the trigger's code we may analyze what GETFLDSTATE() returns and check
>
>LOCATE FOR (UPPER(ParentTB) = m.lcSearch AND m.lcChangedField $ ParentKey) OR (UPPER(ChildTB) = m.lcSearch AND m.lcChangedField $ ChildKey)
>
>for all fields that were changed. If we don't find any, just return out of the trigger.
>
>What do you think? Should I add such logic?

I don't think so. First of all, the code already partly takes care of that (at least before calling the "low-level" functions) in the CASE statement comparing the old and new key values.
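For reference, that early-out typically looks something like the following - a minimal sketch only, assuming buffered access and a hypothetical single character key field cCustID (a compound key would need its whole expression compared instead):

* Sketch of the existing old/new comparison - not the book code:
IF OLDVAL("cCustID") == cCustID
   RETURN .T.      && key value unchanged - nothing for RI to do
ENDIF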

You mentioned that I failed to take compound and complex keys into account - the same applies here. Any index expression that just calls a UDF would be missed by that approach. So if the possibility of creating such indexes kept you from my other idea, the objection holds here as well. [And to be honest, I think it is a good reason; I hadn't thought of that angle ;-) for the key expressions.]

Possible gains of the proposed approach:
First of all, if implemented, the check shouldn't use LOCATE. If it stays table-based, use COUNT or some other technique that works off the indexes alone, without reading the records.
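To make that concrete, here is a minimal sketch of such an early-out check. Everything structural here is an assumption: the RIdef layout (ParentTB/ParentKey/ChildTB/ChildKey per relation), the tag names ParTB on UPPER(ParentTB) and ChiTB on UPPER(ChildTB), and m.lcSearch holding the upper-cased trigger table name.

* Sketch only, not drop-in code. GETFLDSTATE(-1) returns the delete flag
* followed by one state digit per field; "2"/"4" mean the field was edited.
LOCAL lcStates, lnField, llKeyTouched
lcStates     = GETFLDSTATE(-1)
llKeyTouched = .F.
FOR lnField = 1 TO FCOUNT()
   IF SUBSTR(m.lcStates, m.lnField + 1, 1) $ "24"
      * SEEK() positions via the index instead of walking records the way
      * an unoptimized LOCATE would. (A full version would SCAN REST WHILE
      * over every relation of the table, not just the first hit.)
      IF SEEK(m.lcSearch, "RIdef", "ParTB") AND FIELD(m.lnField) $ UPPER(RIdef.ParentKey)
         llKeyTouched = .T.
         EXIT
      ENDIF
      IF SEEK(m.lcSearch, "RIdef", "ChiTB") AND FIELD(m.lnField) $ UPPER(RIdef.ChildKey)
         llKeyTouched = .T.
         EXIT
      ENDIF
   ENDIF
ENDFOR
IF NOT m.llKeyTouched
   RETURN .T.      && no key field changed - skip the RI logic entirely
ENDIF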

Nice scenario: just one field not used in any index is updated on a table, with many relations in the RIdef table [many relations defined for that table]. Switching from reading the relations out of the DBC to defining your own table makes it possible to define better data types to compare against, but the best reason in my eyes is the possibility of using table/index-optimized ways to reduce the number of elements "tried" by the RI code. Hence the earlier-mentioned approach of making the two scans highly optimized. [On a side note: the original code in Effective Techniques uses a slightly redundant, loop-unrolling way of coding: the RI type is tested outside the FOR loop over the RI array, thereby saving comparisons. I'd go for that approach here as well; it is similar to the check before the second scan loop and won't give you much duplicated code. Read up on it there if you can - I hesitate to post the book code, but it should be easy to envision <g>; the sketch below shows the shape.]
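Not the book code, of course - just the shape of the unrolling, with laRelations and lcRIType as hypothetical placeholders:

* The RI type is tested once, outside the FOR loop over the relation
* array, instead of once per iteration inside it.
LOCAL lnRel
DO CASE
CASE m.lcRIType = "R"                     && restrict
   FOR lnRel = 1 TO ALEN(laRelations, 1)
      * restrict-specific handling of laRelations[m.lnRel, ...] only
   ENDFOR
CASE m.lcRIType = "C"                     && cascade
   FOR lnRel = 1 TO ALEN(laRelations, 1)
      * cascade-specific handling only
   ENDFOR
ENDCASE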

Ugly scenario: 50 fields get updated in a single statement, and only the last two checked define relations for child and parent. Each of the changed fields forces a table lookup, and the rest of the RI logic is still needed.

So at best, try such a thing only after a usage pattern can be examined (hook some counters into the RI code), give it a few dynamic optimizations (run the check only if OCCURS(ccChangedIdentifier, lcFromGetFldState) > lnThreshold; by table-definition convention put the key fields first, so the loop can exit early when one is encountered), and have a stable measuring framework for the RI differences - something like the switch sketched below.
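A sketch of such a dynamic switch; which way the threshold should cut is exactly what the hooked-in counters would have to confirm, so the branch direction here is a guess:

* In GETFLDSTATE(-1), "2" and "4" mark edited fields.
LOCAL lcStates, lnChanged
lcStates  = GETFLDSTATE(-1)
lnChanged = OCCURS("2", m.lcStates) + OCCURS("4", m.lcStates)
IF m.lnChanged > m.lnThreshold
   * many fields touched: a key field is probably among them, so the
   * per-field lookups would cost more than they save - run RI directly
ELSE
   * few fields touched: the early-exit key-field check may pay off
ENDIF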

For an easy way to compare different approaches:
The current RI code can be split into three parts: setup code run once on each trigger, the rubber-to-road RI functions, and the dispatching logic in between that loops or scans. I'd create some switching logic to run different implementations side by side, comparing performance under different numbers of defined RI relations as well as differing table sizes. That would mean first switching back to low-level RI functions with identical function signatures, to eliminate unnecessary coding. Keep as baselines 1) the Effective Techniques code, 2) the code from the net you started from, and 3) the table-based logic. Duplicate the code for each alternative and optimize that version, measuring performance as you go for the six alternatives in each of the three splits, always running the whole enchilada. That should give you an idea of which part has to be optimized and which approach is best for your data [it might be different for other table/relation structures and data sizes!]. After having found the best approach, micro-optimize it by cutting parameters and changing the data types used in comparisons. A harness along the lines sketched below would do.
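A minimal timing harness under those assumptions - RI_Book, RI_Net and RI_Table are placeholder names for the three baselines, and the parameter list is invented:

* Each alternative is assumed to be wrapped in a function with an
* identical signature, so only the name needs to be switched.
LOCAL laVersions[3], lnI, lnStart
laVersions[1] = "RI_Book"        && 1) Effective Techniques code
laVersions[2] = "RI_Net"         && 2) the code from the net
laVersions[3] = "RI_Table"       && 3) the table-based logic
FOR lnI = 1 TO 3
   lnStart = SECONDS()
   =EVALUATE(laVersions[m.lnI] + "(m.lcTable, m.lcTrigger)")
   ? laVersions[m.lnI], SECONDS() - m.lnStart, "seconds"
ENDFOR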

My guess is that for your "real" data, given the scenarios already discussed, the best approach would be a combination of a) unrolling the CASE to outside the scan and b) defining an index for each of the possible trigger types, table names and relation types. Since the RIdef table is app-static, having an index for each alternative is useful (a read-only scenario), and a SEEK()-scan approach should be the fastest way to go - roughly as sketched below.
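The SEEK()-scan pattern, sketched with a hypothetical compound tag TrigTab on RIType + UPPER(ParentTB) in the RIdef table:

SELECT RIdef
SET ORDER TO TAG TrigTab
IF SEEK(m.lcRIType + m.lcSearch)
   * SCAN REST WHILE walks only the contiguous group of matching index
   * entries - no record outside the group is ever visited
   SCAN REST WHILE RIType + UPPER(ParentTB) = m.lcRIType + m.lcSearch
      * handle exactly one relation per pass
   ENDSCAN
ENDIF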

There is also the possibility of shifting some of the work done in the setup code (run once on each trigger) into app-startup code that analyzes the common definitions and creates cached data structures. This should nearly eliminate the time spent in setup on each trigger call, which is perhaps a worthwhile endeavor - but that you can tell only after a few measurements. Here you would trade some more memory for runtime through aggressive run-time caching.
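A sketch of that startup step, with all names (gaRICache, the RIdef columns) assumed:

* The RIdef table is app-static, so read it once into a global array at
* startup; every trigger call then works against memory instead of
* re-opening and re-seeking the table.
PUBLIC gaRICache[1], gnRICount
SELECT UPPER(ParentTB), ParentKey, UPPER(ChildTB), ChildKey, RIType ;
   FROM RIdef ;
   INTO ARRAY gaRICache
gnRICount = _TALLY               && number of cached relations
* in the trigger, ASCAN()/ASUBSCRIPT() on gaRICache replaces the disk access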

Perhaps we can combine our efforts (at least for building the code timing structures) and even plug in the approaches I mentioned in the mail.

regards

thomas