Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
UDF's second rate citizen in Rushmore ?
Message
From
06/09/2006 07:16:22
 
 
To
06/09/2006 04:51:40
General information
Forum:
Visual FoxPro
Category:
Databases,Tables, Views, Indexing and SQL syntax
Miscellaneous
Thread ID:
01151219
Message ID:
01151250
Views:
45
>Hi Aleksey,
>
>I am not sure the following is a bug or a feature <g>.
>I have read the parts in the help file warning about using a UDF inside a index expression, for instance in "Considerations for Creating Index Expressions".
>I have a situation with a large free tables (.dbf and .fpt both >1GB, .cdx 300 MB) and a very stable procedure file always loaded when working with the table.
>
>When I join on index expressions based on pure dbf-table fields or index expressions using only vfp functions on dbf/fpt fields (padded to stable length in case of fpt-fields) sys(3054) shows the already created index is used directly in the join. If index expressions involving UDF's are used, a temporary index is calculated, even if the UDF is certain to be of fixed length: either returning a PADR() or wrapping the udf in the index expression like PADR(myUdf(field1), 20).
>
>I do think that functions in a .fll don't show that annoying behaviour, but haven't verified - perhaps I am mixing it up with the level Type() can check, as type() will return the type of a fll-function but not of a vfp-udf.
>
>What I would like to do is something similar to
>set proc to myPrc additive && includes myUdf1 and myUdf2
>select newData
>index on Padr(myUdf1(Field1), 20) + Padr(myUdf2(Field2), 15) tag F_1_2_new
>
>select oldData
>index on Padr(myUdf1(Field1), 20) + Padr(myUdf2(Field2), 15) tag F_1_2_Old
>
>Select ;
>    Nw.Field3, Ol.Field4 ;
>    from newData Nw ;
>    join oldData Ol ;
>        on  Padr(myUdf1(ol.Field1), 20) + Padr(myUdf2(ol.Field2), 15) ;
>          = Padr(myUdf1(ne.Field1), 20) + Padr(myUdf2(ne.Field2), 15)
>
>*-- and have sys(3054) show:
>*-- Joining table ne and ol using index tag F_1_2_new
>*-- instead of
>*-- Joining table ne and ol using temp index
>*-- or
>select newData
>local lcSearchMe
>lcSearchme = Padr("....", 35)
>Locate for Padr(myUdf1(Field1), 20) + Padr(myUdf2(Field2), 15) = m.lcSearchMe
>
>*-- and be sure rushmore is working in the above locate!
>
>as there is ample space in the .cdx left without balloning up either dbf or fpt with redundant/calculable data. Since the .cdx could be reused instead of always creating a temp index, it would be a real time saver, as always creating a temp index wit a udf on a table ths size takes time. I also have doubts if such an index would be used in xBase-situations by rushmore. So any xBase code involving/using udf-indices is mostly seek()-scan rest in my code and not even trying to use rushmore.
>
>I currently see the following alternatives:
>create another field in the dbf/fpt, fill it using the udf()'s and index on that field (I hesitate because of table size)
>perhaps port the code into a fll (as it is mostly simple string massage, probably faster executing but I have no dire need otherwise)
>check if moving to stored procedures makes indices with vfp-udf accessible for rushmore
>hope there is some switch/syntax to force an existing index to be used directly<g>
>
>Any thoughts / hints from your side ? Is this a bug, a feature or should I formulate a feature request ?
>If anything is unclear, pls. don't hesitate to ask for clarification. If you say an index created with a vfp-udf is expected to be reused in joins, I'll try to extract some example code showing that this is (at least not always) the case.
>
>regards
>
>thomas

this is a bug because what VFPT has thought is incomplete.

I explain:
When you define an index, you have two conditions:
- the key's datatype is fixed or variable
- the key's is deterministic or not deterministic

Datatype issue:
VFP don't support index keys with variable datatype,
and for optimization the datatype is irrelevant; the point is closed.

Deterministic issue:
you can define a not deterministic index, BUT THE ENGINE CANNOT USES IT,
because with the optimization the result can to be different.

Well, VFP have not a clause for to inform the Engine that an index is deterministic or not;
( this is a bug design for me ).
The engine ( VFPT ) have to guess this, but it cannot.

Then it assume this:
- a field is deterministic, this is correct
- a VFP's native expression is deterministic, this is wrong
- every exp that contain an UDF() is not deterministic,
and then it cannot put the tag in the optimized plan
CREATE CURSOR testudfindex ( aa i )
APPEND BLANK

CLEAR
SYS(3054,12)

INDEX on aa			TAG tfield_TAG

SELECT * FROM testudfindex WHERE aa =1 INTO ARRAY dummy

*  RAND() IS A NOT DETERMINISTIC FUNCTION ( AA=1 OR 0 )
INDEX on RAND(aa)	TAG VFPF_TAG	&&	Clause: DETERMINISTIC OFF    Default is ON
?
*BUG:  : correct is : don't use the index
SELECT * FROM testudfindex WHERE RAND(aa) =1 INTO ARRAY dummy

INDEX ON UDF(aa)	TAG udf_tag	&& Clause:	DETERMINISTIC ON  default is OFF
?
* RESTRICTION: correct is : use the index
SELECT * FROM testudfindex WHERE UDF(aa) =1 INTO ARRAY dummy

PROCEDURE UDF(x)
	RETURN x
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform