Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
VFP versus C++
Message
From
29/10/2003 02:17:21
Al Doman (Online)
M3 Enterprises Inc.
North Vancouver, British Columbia, Canada
 
General information
Forum:
Visual FoxPro
Category:
Other
Title:
Miscellaneous
Thread ID:
00842594
Message ID:
00843929
Views:
26
>Al,
>
>>I now have a theory:
>>
>>ALINES() is fast, and it's a relatively new function in VFP.
>>AT(), SUBSTR() etc. are slow, these are old functions, around since FoxBASE+ days.
>
>Not true. at() is just a thin wrapper around a strstr() family function.
>
>The performance of code degrades because of using the 3rd parameter to the at() ala:
>
>
for i = 1 to 10
>   j = at( x, y, i )
>endfor
>
>which is an O(n2) algorithm. It makes the at() function internally run several times, the next iteration of the loop forces it into redoing the all same string seeks that it already found in the iteration before.
>
>This sort of code construct:
>
>
for i = 1 to 10
>   j = at( x, y )
>   y = substr( y, j+1)
>endfor
>
>reduces the seek time by removing the repeated seeks, at the expense of moving lots of string memory around and some memory management in deallocating the used memory of y as it shortens. This also is O(n2) performance.
>
>>I wonder if the "old" functions are still mired in the 16-bit world and are losing performance with 16/32 bit conversions ("thunking") and/or hitting limitations at 64K chunks? The moral - for fast VFP performance, use new functions??
>
>No, at(x,y) in VFP is linear in performance O(n) where n length of string for n anywhere from a few thousand characters upto 16mb.
>
>The alines() sort of code is O(n) because you only need to make 2 linear passes through the string, once to put it into the array and once to iterate the array.

Thanks for the heads-up. Just for grins I wrote a program to test just how linear AT() really is. Basically, it tests AT() with varying-length strings (from about 1024 bytes up to either 8MB or 16MB). Each test string is a filler that does not match the search string, plus the search string appended at the very end. By default, the search string is our old buddy [http://]. The first test is the easiest possible, the filler is a character not present at all in the search string ([A]). Subsequent tests get progressively harder - the filler is repeated [h], then repeated [ht], [htt], ... up to [http:/].

The program takes a few minutes to run - fairly independent of (reasonable) hardware, it tries to calculate number of iterations to run to keep it close to a user-definable minimum number of seconds per test. A cursor with results is BROWSEd at the end of the run.

The column of most interest is the last one, iFillCharsPerSec. This is the number of fill characters per second that AT() rejects before finding the search string at the end of the test string.

There are two interesting findings on my system:

- A significant dropoff in performance (to about half or less) with test string lengths somewhere between 128K and 256K. I wonder if this is due to processor L1/L2 cache.

- More interestingly, the performance of a fill of repeated [http:] was a lot better than with a fill of [http] ! I couldn't believe this, I ran it twice to be sure. All I can think of here is maybe it's related to the processor grabbing chunks of 32 bits/4 bytes/4 chars at a time.

My system is a 1GHz Athlon processor on an Asus A7V133 mobo, 512MB PC133 SDRAM, W2K SP4, VFP7 SP1.
#DEFINE hcSearchFor		[http://]

#DEFINE hnCharVarMaxLength				16777184
#DEFINE hnFillerLengthMultiplier	2
#DEFINE hnMinimumTestSeconds			2

#DEFINE hcResultsCursorName				[Results]

LOCAL ;
	lnIx

* Set SETs appropriately:
DO SetSets

* Build the results cursor:
DO BuildCursor

* This test measures the speed of the AT() function on strings of varying length.
* Each test string is built from a number of "filler" characters, followed by
* the test string hcSearchFor.

* For "easy" tests, the filler characters will be different from any character
* in the test string so AT() can reject them at maximum speed.

lcFiller = GetFillerString(hcSearchFor, 0)

* We'll run one set of tests based on filler length of 1023 so we can get
* close to the maximum allowable string length in our testing
* (which is a little less than 16MB):
DO RunMultiTests WITH [Easy], lcFiller, 1023
DO RunMultiTests WITH [Easy], lcFiller, 1024

* Now we make it harder - the filler string starts to look increasingly
* like the string we're looking for:
FOR lnIx = 1 TO LEN(hcSearchFor) - 1 STEP 1
	lcFiller = GetFillerString(hcSearchFor, lnIx)
	
	DO RunMultiTests WITH [Stress], lcFiller, 1024

ENDFOR

SELECT (hcResultsCursorName)
GO TOP
BROWSE NOWAIT

RETURN
* EOM() ****************************************
PROCEDURE SetSets

SET TALK OFF
SET ESCAPE ON

RETURN
* EOM() ****************************************
PROCEDURE BuildCursor

IF USED(hcResultsCursorName)
	USE IN (hcResultsCursorName)

ENDIF

CREATE CURSOR (hcResultsCursorName) ;
	(cTest C(6), ;
	cFiller C(LEN(hcSearchFor) - 1), ;
	iFillerLength I, ;
	iTestIterations I, ;
	nSecondsElapsed N(7, 3), ;
	iFillCharsPerSec I)

RETURN
* EOM() ****************************************
FUNCTION GetFillerString

LPARAMETERS ;
	tcSearchFor, ;
	tnFillFactor

LOCAL ;
	lcRetVal

IF tnFillFactor = 0
	* We want a filler string that is not present at all in the SearchFor string:
	lcRetVal = [A]
	
	DO WHILE lcRetVal $ tcSearchFor
		lcRetVal = CHR(ASC(lcRetVal) + 1)
	
	ENDDO

ELSE
	* We want a filler string that is some substring of the SearchFor string:
	lcRetVal = SUBSTR(tcSearchFor, 1, tnFillFactor)

ENDIF

RETURN lcRetVal
* EOM() ****************************************
PROCEDURE RunMultiTests

LPARAMETERS ;
	tcType, ;
	tcFiller, ;
	tnStartFillerLength

LOCAL ;
	lnFillerLength

lnFillerLength = tnStartFillerLength

DO WHILE lnFillerLength + LEN(hcSearchFor) <= hnCharVarMaxLength
	DO RunOneTest WITH tcType, tcFiller, lnFillerLength
	
	lnFillerLength = lnFillerLength * hnFillerLengthMultiplier

ENDDO

RETURN
* EOM() ****************************************
PROCEDURE RunOneTest

LPARAMETERS ;
	tcType, ;
	tcFiller, ;
	tnFillerLength

LOCAL ;
	lnFillerLength, ;
	lnSecondsStart, ;
	lnSecondsElapsed, ;
	lnTestIterations, ;
	lnIx, ;
	lcTest

* We'll round down the filler length to the nearest full multiple
* of the length of the fill string:
lnFillerLength = INT(tnFillerLength / LEN(tcFiller)) * LEN(tcFiller)

lcTest = REPLICATE(tcFiller, lnFillerLength / LEN(tcFiller)) + hcSearchFor

lnTestIterations = 1

lnSecondsStart = SECONDS()
=AT(hcSearchFor, lcTest)
lnSecondsElapsed = SECONDS() - lnSecondsStart

DO WHILE lnSecondsElapsed < hnMinimumTestSeconds
	* If the elapsed time was very short try to figure out a number of test iterations
	* that will be close to hnMinimumTestSeconds:
	DO CASE
	CASE lnSecondsElapsed = 0
		* Unmeasurably short, arbitrarily set a number of iterations:
		lnTestIterations = lnTestIterations * 1000 * hnMinimumTestSeconds
	
	CASE lnSecondsElapsed < 0.01
		* May be under the resolution of SECONDS() on NT/W2K/XP,
		* multiply the calculated iterations by 150%:
		lnTestIterations = lnTestIterations * 1.5 * (hnMinimumTestSeconds / lnSecondsElapsed)
	
	OTHERWISE
		* Assume the resolution of SECONDS() is OK:
		lnTestIterations = lnTestIterations * hnMinimumTestSeconds / lnSecondsElapsed
	
	ENDCASE
	
	lnSecondsStart = SECONDS()
	
	FOR lnIx = 1 TO lnTestIterations STEP 1
		=AT(hcSearchFor, lcTest)
	
	ENDFOR
	
	lnSecondsElapsed = SECONDS() - lnSecondsStart

ENDDO

* Write the results to the cursor:
INSERT INTO (hcResultsCursorName) ;
	(cTest, ;
	cFiller, ;
	iFillerLength, ;
	iTestIterations, ;
	nSecondsElapsed, ;
	iFillCharsPerSec) ;
	VALUES ;
		(tcType, ;
		tcFiller, ;
		lnFillerLength, ;
		lnTestIterations, ;
		lnSecondsElapsed, ;
		INT(lnTestIterations * lnFillerLength / lnSecondsElapsed))

WAIT WINDOW NOWAIT [Filler String: ] + "[" + tcFiller + "]" ;
	+ [  Fill Length: ] + LTRIM(STR(lnFillerLength))

RETURN
* EOM() ****************************************
Regards. Al

"Violence is the last refuge of the incompetent." -- Isaac Asimov
"Never let your sense of morals prevent you from doing what is right." -- Isaac Asimov

Neither a despot, nor a doormat, be

Every app wants to be a database app when it grows up
Previous
Reply
Map
View

Click here to load this message in the networking platform