Error 1691 - Weird but Often Reproducible - Memory ? - Level Extreme

Message

From: Thomas Ganss, Main Trend, Frankfurt, Germany (06/08/2007 10:35:33)
To: Aleksey Tsingauz, Microsoft, Seattle, Washington, United States

General information

Forum: Visual FoxPro
Category: Visual FoxPro Beta
Title: Error 1691 - Weird but Often Reproducible - Memory ?

Miscellaneous

Thread ID: 01246146
Message ID: 01246146
Views: 74

Hi Aleksey,

I have a weird error situation that is reproducible more than half of the time. When running the largest data set through one of my host-update programs, I recently hit error 1691 at the very end of the run.

The FLL being loaded at that moment is on a network share, but the program has often run with the smaller data sets, loading the library from this share on different machines.

When I ran into the error again on the same data set, first on the same machine and then on another of my data-crunching machines, I became interested. Running the exe inside the VFP IDE, I could verify in the command window that the failing process could not set this particular library, whereas a freshly started process ***on the same machine*** could SET LIBRARY TO this particular FLL. I copied the lines over from the command window, so I am sure no typing error occurred during that test.

Testing this repeatedly on the machine made clear that there was no temporary network error, as the fresh process had no trouble setting and releasing that FLL. Working assumption: the program had somehow clobbered the VFP process into a state in which it could not SET LIBRARY TO another library.
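The manual check described above can be wrapped in a small program so that the exact error number and message are captured instead of being read off the screen. This is only a sketch: the FLL path is a hypothetical placeholder, and TRY...CATCH requires VFP 8 or later.

```foxpro
* Sketch: test whether this VFP process can still load and release an FLL.
* The path below is a hypothetical placeholder, not the library from this post.
LOCAL lcFll, loErr
lcFll = "C:\Temp\Some.fll"

TRY
	* Macro substitution as in the logging code; assumes a path without spaces
	SET LIBRARY TO &lcFll ADDITIVE
	? "Loaded, SET('Library') is now:", SET("Library")
	RELEASE LIBRARY &lcFll
CATCH TO loErr
	* In the failing process this branch should report error 1691
	? "SET LIBRARY failed:", loErr.ErrorNo, loErr.Message
ENDTRY
```

Running the same program in the suspect process and in a freshly started one gives a like-for-like comparison.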

The data set being processed was my largest, we have quite beefy hardware, and we have had most of the other problems connected with large physical RAM sizes at one time or another. So the first thing I tried was varying the amount of physical memory the OS (XP WS SP2) can see, via /MAXMEM in boot.ini.

Result: the program showed the error at 3 GB, 2 GB, 1.5 GB, 998 MB and 513 MB.
With MaxMem set to 512 MB, 511 MB or 510 MB, VFP could set the library during the whole program run with the data set that had errored out at the larger RAM sizes. SYS(3050) is already set to a relatively low level for such beefy machines: we set it to a maximum of 504 MB for indexing and some SQL, but for "normal code" it is set dynamically depending on the physical memory the OS can see, and on the 3 GB machine this is a measly 256 MB.

Physical RAM	High SYS(3050)	Low SYS(3050)
3096 MB		504 MB		256 MB
 996 MB		384 MB		238 MB
 512 MB		196 MB		154 MB

So I don't think this is related to the errors caused by a SYS(3050) setting that is too large.
The swap file size is usually ample (1.5 GB), but the error also occurred with larger sizes and with no swap file set.
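For readers unfamiliar with the setting, here is a rough sketch of how a "low" and a "high" SYS(3050) value might be switched around heavy operations. The two values are the ones quoted for the 3 GB machine; how the low value is derived from physical memory is not shown here, and the variable names are made up for illustration.

```foxpro
* Sketch: switch foreground buffer memory between a low and a high limit.
* Values are those quoted for the 3 GB machine; names are illustrative only.
LOCAL lnLowMB, lnHighMB
lnLowMB  = 256
lnHighMB = 504

* "Normal code": keep the foreground buffer small (second argument 1 = foreground)
= SYS(3050, 1, m.lnLowMB * 1024 * 1024)
? "Foreground buffer in bytes:", SYS(3050, 1)

* Raise the limit just before indexing or heavy SQL ...
= SYS(3050, 1, m.lnHighMB * 1024 * 1024)
* ... the INDEX ON / SELECT statements would go here ...

* ... and drop back to the low setting afterwards
= SYS(3050, 1, m.lnLowMB * 1024 * 1024)
? "Foreground buffer in bytes:", SYS(3050, 1)
```

SYS(3050) returns the value actually in effect as a string, so printing it after each change shows whether VFP adjusted the requested size.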

As the next step I tried to identify the point at which VFP became unable to set the library. I added a couple of lines to one of the logging routines, which is called a couple of thousand times during a program run (depending on available memory, the program takes between 6 and 28 hours to run). On every call to the log, these lines test whether VFP is still able to SET LIBRARY TO that specific FLL at that moment.

I also tried to make sure in the code that it is not a network error. The latest version of this code is:

	#IF .T.
		*-- Runs inside the logging routine; m.lcMess is that routine's log text
		LOCAL lcFll
		lcFll = "T:\Copy_Aed\Prog3rd\Vfx.fll"
		*-- First call: remember the FLL's content as the reference copy
		IF TYPE("goKubiProtokollTimer.lCheckFll") == "U"
			= ADDPROPERTY(goKubiProtokollTimer, "lCheckFll", .T.)
			= ADDPROPERTY(goKubiProtokollTimer, "cFll", FILETOSTR(m.lcFll))
		ENDIF
		IF goKubiProtokollTimer.lCheckFll
			LOCAL laFll1[1], laFll2[1] ;
				, laFll3[1], laDll[1] ;
				, lnLib1, lnLib2, lnLib3 ;
				, lcErrMess, llStopAtAssert

			IF !goKubiProtokollTimer.cFll == FILETOSTR(m.lcFll)
				*-- make CERTAIN it is not a network/path/OS problem!
				lcErrMess = "Fll unequal! "
				llStopAtAssert = .T.
			ELSE
				lcErrMess = ""
				*-- Count loaded libraries before load, after load, after release
				lnLib1 = ALINES(laFll1, SET("Library"), 1, ",")
				SET LIBRARY TO &lcFll ADDITIVE
				lnLib2 = ALINES(laFll2, SET("Library"), 1, ",")
				IF UPPER(m.lcFll) $ UPPER(SET("Library"))
					RELEASE LIBRARY &lcFll
				ELSE
					*-- "Fll fehlt" = "Fll missing"
					lcErrMess = "Fll fehlt " + m.lcErrMess
				ENDIF
				lnLib3 = ALINES(laFll3, SET("Library"), 1, ",")
				*-- Report changes in the number of loaded FLLs between calls
				IF TYPE("goKubiProtokollTimer.nFll") == "U"
					= ADDPROPERTY(goKubiProtokollTimer, "nFll", m.lnLib1)
				ELSE
					IF goKubiProtokollTimer.nFll # m.lnLib1
						lcErrMess = "Fll-Change " + m.lcErrMess
						goKubiProtokollTimer.nFll = m.lnLib1
					ENDIF
				ENDIF
				*-- Same for the number of declared DLL functions
				IF TYPE("goKubiProtokollTimer.nDll") == "U"
					= ADDPROPERTY(goKubiProtokollTimer, "nDll", ADLLS(laDll))
				ELSE
					IF goKubiProtokollTimer.nDll # ADLLS(laDll)
						lcErrMess = "DLL-Change " + m.lcErrMess
						goKubiProtokollTimer.nDll = ADLLS(laDll)
					ENDIF
				ENDIF
				*-- Loading must add exactly one library, releasing must remove it
				DO CASE
					CASE m.lnLib1 + 1 <> m.lnLib2
						lcErrMess = "FLL-Error1 " + m.lcErrMess
						llStopAtAssert = .T.
					CASE m.lnLib3 + 1 <> m.lnLib2
						lcErrMess = "FLL-Error2 " + m.lcErrMess
						llStopAtAssert = .T.
				ENDCASE
			ENDIF
			IF !EMPTY(m.lcErrMess)
				ASSERT m.llStopAtAssert = .F. MESSAGE m.lcErrMess
				*-- Jump over this line in debugger for continuing check after first error
				goKubiProtokollTimer.lCheckFll = !m.llStopAtAssert
				lcMess = m.lcMess + " " + m.lcErrMess
				= justProt(m.lcMess, "FLL_Err.Asc")
			ENDIF
		ENDIF
	#ENDIF

During the runs with different memory sizes it became apparent that the bug always first appeared at the very same location in the post-processing of the run. Between the last successful call and the failing call there are a couple of lines with nothing "tricky" in them; the only heavy processing is a PACK.
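One way to narrow this down further would be to bracket the PACK directly with the library test and write the outcome to a log. The following is a minimal sketch under stated assumptions: the table, log file and FLL path are hypothetical placeholders, and LogFllState is a helper invented for this illustration, not part of the application.

```foxpro
* Sketch: check FLL loading immediately before and after a PACK.
* All three paths are hypothetical placeholders.
LOCAL lcFll, lcTable, lcLog
lcFll   = "C:\Temp\Some.fll"
lcTable = "C:\Temp\BigTable.dbf"
lcLog   = "C:\Temp\Fll_Check.log"

USE (m.lcTable) EXCLUSIVE
= LogFllState("before PACK", m.lcFll, m.lcLog)
PACK
= LogFllState("after PACK", m.lcFll, m.lcLog)
USE

* Hypothetical helper: tries to load and release the FLL, appends the
* result to the log file and returns .T. if the load succeeded.
FUNCTION LogFllState(tcStep, tcFll, tcLog)
	LOCAL llOk, lcInfo, loErr
	llOk = .T.
	lcInfo = "ok"
	TRY
		SET LIBRARY TO &tcFll ADDITIVE
		RELEASE LIBRARY &tcFll
	CATCH TO loErr
		llOk = .F.
		lcInfo = "FAILED " + TRANSFORM(loErr.ErrorNo) + " " + loErr.Message
	ENDTRY
	* Third argument .T. appends to the file instead of overwriting it
	= STRTOFILE(TTOC(DATETIME()) + " " + m.tcStep + ": " + m.lcInfo ;
		+ CHR(13) + CHR(10), m.tcLog, .T.)
	RETURN m.llOk
ENDFUNC
```

If "before PACK" logs ok and "after PACK" logs a failure, the PACK itself is the step that leaves the process unable to load libraries.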

I verified a couple of times that the error happens in this particular post-processing code, which is called only once in the whole application, and verified that the program finishes without error with memory settings below 513 MB in boot.ini.

Then I ran the program on "my" machine (the one I use to develop; every machine is accessed remotely), and the error did NOT appear even with available memory set to 2 GB. When checking for differences in OS settings I found nothing very striking, but I realized that on "my" machine I had regularly defragged the disk the application runs on, whereas the other machines (sometimes used by others) had not received such particular care. The OS and swap file are on a physically different disk.

After defragging the disks on all machines, the program ran without problems at all memory sizes, so I had lost my reproducible error scenario. So I fragmented one of the disks again and was able to get the error again using large physical memory sizes...

I am unable to produce a simple repro script for this error, as the data set consists of a couple of GB of sensitive data. But I have cleared it so that I could give someone from MS who wants to check on this weird behaviour remote access to a testing machine over a dedicated phone line. So if you are willing to work under such a remote scenario, I will gladly help in any way I can.

I think this error needs to be fixed, as larger RAM sizes will be the norm in the coming years, especially in 64-bit OS versions, and not everybody will take the time to set up defragmentation, whatever the connection between those two items is. So far I have not tested it under VFP SP2 Beta, as the whole thing is fragile enough. If you think there has been a specific fix for such a situation, I will try to get a machine temporarily switched over to SP2 for testing, but as these are all production machines I hesitate to do so without a compelling reason.

I am certain I have found something, even if the description sounds weird and flaky ;-).

Regards

thomas
