Hi Aleksey,
I have a weird and more than half-reproducible error situation. When running the largest data set in one of my host-update programs, I recently encountered error 1691 at the very end of the run.
The FLL which is read at this moment is on a network share, but the program has often run with the smaller data sets, loading from this share on different machines.
When I ran into the error again on the same data set, first on the same machine, then on another of my data-crunching machines, I became interested. I could verify in the command window (running the exe in the VFP IDE) that the erroring process could not set this particular library, whereas a freshly started process ***on the same machine*** could SET LIBRARY TO this particular FLL. I copied the lines over from the command window, so I am sure no typing error occurred during that test.
Testing this repeatedly on the machine made clear that there was no temporary network error, as the fresh process had no trouble setting and releasing that FLL. Working assumption: the program had somehow clobbered the VFP process into a state in which it could not SET LIBRARY TO another library.
As the data set processed was my largest, and as we have quite beefy hardware and have had most of the other problems connected with large physical RAM sizes at one time or another, the first thing I tried was varying the amount of physical memory the OS (XP WS SP2) can see (via /MaxMem in Boot.ini).
Result: the program showed the error at 3 GB, 2 GB, 1.5 GB, 998 MB and 513 MB.
When setting MaxMem to 512 MB, 511 MB or 510 MB, VFP could set the library during the whole program run with the data set which had errored out with larger RAM sizes. SYS(3050) is already set to a relatively low level for such beefy machines: we set it to a maximum of 504 MB for indexing and some SQL, but for "normal code" it is set dynamically depending on the physical memory the OS can see, and on the 3 GB machine this is a measly 256 MB.
Physical RAM   High SYS(3050)   Low SYS(3050)
3096 MB        504 MB           256 MB
0996 MB        384 MB           238 MB
0512 MB        196 MB           154 MB
So I don't think this is related to the errors caused by a SYS(3050) setting that is too large.
The swap file size is usually ample (1.5 GB), but the error also occurred with larger sizes and with no swap file set.
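For readers unfamiliar with SYS(3050), here is a minimal sketch of how the foreground and background buffer limits can be set and read back. The 256 MB value is taken from the "Low" setting for the 3 GB machine above; the variable names are hypothetical, and the logic that derives the value from the physical memory is not shown because it is not part of this report.

```foxpro
* Minimal sketch, not the production code: cap VFP's buffer memory.
* lnLimitMB is a hypothetical name; 256 is the "Low" value quoted above.
LOCAL lnLimitMB, lnBytes
lnLimitMB = 256
lnBytes   = m.lnLimitMB * 1024 * 1024

* Second argument: 1 = foreground buffer memory, 2 = background buffer memory
= SYS(3050, 1, m.lnBytes)
= SYS(3050, 2, m.lnBytes)

* Without a third argument SYS(3050) returns the current setting as a string
? "Foreground buffer (bytes):", SYS(3050, 1)
? "Background buffer (bytes):", SYS(3050, 2)
```

VFP may round the requested size, so the value read back is not necessarily identical to the value passed in.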
As the next step I tried to identify the place at which VFP became unable to set the library. I added a couple of lines to one of the logging routines, which is called a couple of thousand times during a program run (depending on available memory, the program takes between 6 and 28 hours to run). On every call to the log, these lines test whether VFP is still able to SET LIBRARY TO that specific FLL at that moment.
I also tried to make sure that it is not a network error, by comparing the FLL's content on every call with the content read at the first call. The latest version of this code is:
#if .t.
    * Self-check: can this process still load and release the FLL?
    * (lcMess belongs to the surrounding logging routine.)
    LOCAL lcFll
    lcFll = "T:\Copy_Aed\Prog3rd\Vfx.fll"
    * First call: remember the FLL's content to detect read problems later
    IF TYPE("goKubiProtokollTimer.lCheckFll")=="U"
        = ADDPROPERTY(goKubiProtokollTimer, "lCheckFll", .t.)
        = ADDPROPERTY(goKubiProtokollTimer, "cFll", FILETOSTR(m.lcFll))
    ENDIF
    IF goKubiProtokollTimer.lCheckFll
        LOCAL laFll1[1], laFll2[1];
            , laFll3[1], laDll[1];
            , lnLib1, lnLib2, lnLib3 ;
            , lcErrMess, llStopAtAssert
        IF !goKubiProtokollTimer.cFll == FILETOSTR(m.lcFll)
            * File content differs from the first read: network or file problem
            lcErrMess = "Fll unequal! "
            llStopAtAssert = .t.
        ELSE
            lcErrMess = ""
            * Count loaded libraries before loading, after loading, after releasing
            lnLib1 = ALINES(laFll1, SET("Library"),1,",")
            SET LIBRARY TO &lcFll ADDITIVE
            lnLib2 = ALINES(laFll2, SET("Library"),1,",")
            IF UPPER(m.lcFll) $ UPPER(SET("Library"))
                RELEASE LIBRARY &lcFll
            ELSE
                * "Fll fehlt" = "Fll missing"
                lcErrMess = "Fll fehlt " + m.lcErrMess
            ENDIF
            lnLib3 = ALINES(laFll3, SET("Library"),1,",")
            * Report changes in the number of loaded FLLs between calls
            IF TYPE("goKubiProtokollTimer.nFll")=="U"
                = ADDPROPERTY(goKubiProtokollTimer, "nFll",m.lnLib1)
            ELSE
                IF goKubiProtokollTimer.nFll # m.lnLib1
                    lcErrMess = "Fll-Change " + m.lcErrMess
                    goKubiProtokollTimer.nFll = m.lnLib1
                ENDIF
            ENDIF
            * Report changes in the number of declared DLL functions between calls
            IF TYPE("goKubiProtokollTimer.nDll")=="U"
                = ADDPROPERTY(goKubiProtokollTimer, "nDll", ADLLS(laDll))
            ELSE
                IF goKubiProtokollTimer.nDll # ADLLS(laDll)
                    lcErrMess = "DLL-Change " + m.lcErrMess
                    goKubiProtokollTimer.nDll = ADLLS(laDll)
                ENDIF
            ENDIF
            * Loading must add exactly one library, releasing must remove it again
            DO CASE
            CASE m.lnLib1 + 1 <> m.lnLib2
                lcErrMess = "FLL-Error1 " + m.lcErrMess
                llStopAtAssert = .t.
            CASE m.lnLib3 + 1 <> m.lnLib2
                lcErrMess = "FLL-Error2 " + m.lcErrMess
                llStopAtAssert = .t.
            ENDCASE
        ENDIF
        IF ! EMPTY(m.lcErrMess)
            ASSERT m.llStopAtAssert = .f. MESSAGE m.lcErrMess
            * Stop checking after a hard failure so the log is not flooded
            goKubiProtokollTimer.lCheckFll = ! m.llStopAtAssert
            lcMess = m.lcMess + " " + m.lcErrMess
            = justProt(m.lcMess, "FLL_Err.Asc")
        ENDIF
    ENDIF
#endif
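A smaller variant of the same check, in case somebody wants to drop a probe into other code: the sketch below wraps the load/release pair in TRY...CATCH so that the error number and message are captured instead of being raised. It assumes VFP 8 or later (for TRY...CATCH); the function name ProbeFll and its parameter are hypothetical and not part of my application.

```foxpro
* Hypothetical probe: returns "" if the FLL could be loaded and released,
* otherwise the error number and message as a string.
FUNCTION ProbeFll(tcFll)
    LOCAL loErr, lcResult
    lcResult = ""
    TRY
        * Same load/release pair as in the logging routine above
        SET LIBRARY TO (m.tcFll) ADDITIVE
        RELEASE LIBRARY &tcFll
    CATCH TO loErr
        * ErrorNo and Message are properties of the VFP Exception object
        lcResult = TRANSFORM(loErr.ErrorNo) + ": " + loErr.Message
    ENDTRY
    RETURN m.lcResult
ENDFUNC
```

Called as lcErr = ProbeFll("T:\Copy_Aed\Prog3rd\Vfx.fll"), an empty result means the process could still set and release the library at that point. Like the macro in the original code, the RELEASE LIBRARY line assumes a path without spaces.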
During the runs with different memory sizes it became apparent that the bug always appeared for the first time at the very same location in the post-processing of the run. Between the last successful call and the failing call are a couple of lines which have nothing "tricky" in them; the only heavy processing is a PACK.
I verified a couple of times that the error happens at this particular post-processing code, which is called only once in the application, and verified that the program finishes without error with memory settings <513 MB in Boot.ini.
Then I ran the program on "my" machine (the one I use for development; every machine is accessed remotely), and the error did NOT appear even with available memory set to 2 GB. When checking for different OS settings I found nothing very striking, but I realized that on "my" machine I had regularly defragged the disk the application was running on, whereas the other machines (sometimes used by others) had not received such care. The OS and swap file are on a physically different disk.
After defragging the disks on all machines the program ran without problems at all memory sizes, so I had lost my reproducible error scenario. So I fragmented one of the disks again and was able to get the error again using large physical memory sizes...
For this error I am unable to produce a simple script, as the data set consists of a couple of GB of sensitive data. But I have gotten clearance to give someone from MS who wants to check on this weird behaviour remote access to a testing machine over a dedicated phone line. So if you are willing to work under such a remote scenario, I will gladly help in any way I can.
I think this error needs to be fixed, as larger RAM sizes will be the norm in the next years, especially in 64-bit OS versions, and not everybody will take the time to set up defragmentation, whatever the connection between those two items is. At the moment I have not tested it under VFP SP2 Beta, as the whole thing is fragile enough. If you think there has been a specific fix for such a situation, I will try to get a machine temporarily switched over to SP2 for testing, but as these are all production machines I hesitate to do so if there is no compelling reason.
I am certain I have found something, even if the description sounds weird and flaky ;-).
Regards
thomas