Still, there are times when I feel limited by VFP. One of the biggest limitations I find is low-level programming. For many this is not a problem, but more than one developer has at some point faced a challenge that cannot be solved with Fox alone, and that is where a DLL or an FLL comes in. Building one is what we will do today.
In order to be able to develop this kind of library, we have to use VC++. In my case, I will use VC++ 7.0 in Spanish, but the same can be done, without any limitation, with VC++ 6.0, which I even prefer.
Some history
Since version 2.0, Fox has been able to use PLB libraries in the DOS environment and FLL libraries in Windows. With these libraries, Fox can be extended a great deal, to the point where you can write low-level routines that would otherwise be practically impossible to develop.
Objectives
As an objective for this issue, I want to propose the following points, to complete the first part of the development of an FLL.
Advanced development with an FLL
I will assume that you have read the article in the previous issue, where I explained all the concepts needed to follow this one. This is important, since this article includes a lot of VC++ code and takes quite a few things for granted about how an FLL skeleton is built. For those who haven't read it yet, the article is available at the following link: Developing FLL libraries - Part one
Access to VFP data
One of the great advantages an FLL has over a DLL is the capacity of modifying data in native VFP tables or cursors, from C++ code. VFP provides several functions for this purpose, so that we can do this without major complexity.
Let's create a project called "FllWithData", whose only purpose is to append records to a cursor previously created in VFP and to store a given value in a field that we specify.
The code will look like this:
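#include <windows.h>   // header name lost in the source; <windows.h> is assumed
#include "pro_ext.h"

// Appends p[0] records to the current work area and stores
// the value p[2] in the field referenced by p[1].
void FAR ProcessRecords(ParamBlk FAR *parm)
{
    int i, nAdd = 0;

    nAdd = parm->p[0].val.ev_long;
    for( i = 0; i < nAdd; i++ )
    {
        _DBAppend(-1, 0);                                // new record in the current work area
        _DBReplace(&parm->p[1].loc, &parm->p[2].val);    // write the value into the field
    }
    _RetLogical(1);
}

FoxInfo myFoxInfo[] = {
    {"ProcessRecords", (FPFI) ProcessRecords, 3, "NR,?"},
};

extern "C" {
    FoxTable _FoxTable = {
        (FoxTable *) 0, sizeof(myFoxInfo)/sizeof(FoxInfo), myFoxInfo
    };
}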
Which will be called from VFP like this:
SET LIBRARY TO "FllWithData.fll"
CREATE CURSOR Roberto ( MyField C(10) )
? ProcessRecords( 9000000, @MyField, "Hello" )
SET LIBRARY TO
We can see how easy it is to implement, in C++, a function that generates data in native VFP cursors, thanks to the functions provided by VFP itself.
If we analyze the function "ProcessRecords", we see that it calls two other basic functions, which are in charge of modifying the data in the cursor: _DBAppend and _DBReplace.
int _DBAppend(int workarea, int carryflag)
This function appends a new record to the table open in the work area specified by the parameter workarea.
workarea Work area where the record should be inserted; -1 represents the current work area.
carryflag This parameter can assume the following values:
Here we can see that it is quite easy to manipulate VFP data from an FLL, obtaining great performance, and easily doing things that, from Fox, would have been very difficult, or which, perhaps, we couldn't have done at all.
VFP provides a much greater set of functions than the ones we use in this example, which can be used to manipulate Fox data; for instance, locking a table, searching within a table, etc.
Executing VFP commands from an FLL
The FLL gives us another great feature, one that has no equivalent in a DLL: the function called "_Execute", which allows us to execute any VFP instruction from C++ code, compiling and running it as if it had been written in a normal VFP code segment.
int _Execute(char FAR *stmt)
stmt VFP code segment that we wish to execute.
I imagine that many of you are asking yourselves what the advantage of using this function within an FLL is. The answer is simply "greater control". Where else do you see a development tool that lets you create an add-on so tightly integrated that it can modify the tool's environment, define variables in the tool that invoked it, and even run lines of code from the add-on as if you were inside the tool?
To illustrate this, we develop an example called "FllCommands".
#include <windows.h>   // header name lost in the source; <windows.h> is assumed
#include "pro_ext.h"

// Stores p[1] in the field p[0] of every record and then lets
// VFP code compute the CRC for that record.
void FAR GenCRC32(ParamBlk FAR *parm)
{
    long i, nLeng;

    nLeng = _DBRecCount(-1);
    _DBRewind(-1);
    for( i = 0; i < nLeng; i++ )
    {
        _DBReplace(&parm->p[0].loc, &parm->p[1].val);
        _Execute("FunCalcCRC32()"); // VFP function of the PRG FuncCalcCRC32.prg
        _DBSkip(-1, 1);
    }
    _RetLogical(1);
}

FoxInfo myFoxInfo[] = {
    {"GenCRC32", (FPFI) GenCRC32, 2, "R,?"},
};

extern "C" {
    FoxTable _FoxTable = {
        (FoxTable *) 0, sizeof(myFoxInfo)/sizeof(FoxInfo), myFoxInfo
    };
}
Function in VFP:
FUNCTION FunCalcCRC32
    REPLACE CRC32 WITH SYS(2007, Data, 1)
ENDFUNC
Call from VFP:
SET PROCEDURE TO "FuncCalcCRC32.prg"
SET LIBRARY TO "FLLCommands.fll"
CREATE CURSOR Roberto( Data C(10), CRC32 C(20) )
APPEND BLANK
APPEND BLANK
? GenCRC32( @Data, "Hello" )
BROWSE
SET LIBRARY TO
Here we can see how, from VFP, we can call a function located within an FLL. This function processes our data (in our case, it only saves a value) and then calls another function written in VFP and stored in a PRG.
This really increases our options, a lot.
MultiThreading
If there is something that really thrills me when I develop a system, or part of a system, it is the performance I can give it. Parallelism helps a lot here, especially when a routine has to be executed "n" times for different data sets. VC++ is well suited to this kind of task, since it interacts directly with the OS to ask it to run threads and associate them with a parent process (the process that created them). Each thread executes a copy of the routine, but processes a different data set.
Nowadays there are really powerful servers that have 6 or more processors, and I really believe that it is a waste not to make use of them. Without going so far as to think of a server, we can consider that today there are hardware companies that sell, as a desktop PC, machines which are capable of running two operations at the same time, with a single processor.
Figure 1: Process distribution with different processors
This is a very simplified explanation of how the OS distributes the load among the processors. Sometimes the OS has no more than one processor over which to distribute the load, as in the usual case of a single-processor machine. In this case, the OS does time-sharing.
That is to say, the OS lends the processor to one application for a certain time, then it takes it away and lends it to another application that is waiting for it.
In order to do this, the OS provides us with an API function called "CreateThread". What this does is create a thread and assign it to the process that created it. This function has the following declaration in C++:
HANDLE CreateThread(
    LPSECURITY_ATTRIBUTES lpThreadAttributes, // SD
    SIZE_T dwStackSize,                       // initial stack size
    LPTHREAD_START_ROUTINE lpStartAddress,    // thread function
    LPVOID lpParameter,                       // thread argument
    DWORD dwCreationFlags,                    // creation option
    LPDWORD lpThreadId                        // thread identifier
);
dwStackSize This is the initial size of the stack, in bytes. This parameter can be 0; if it is, the thread uses the default stack size of the executable.
lpStartAddress Pointer to the function that contains the routine which you want to execute.
lpParameter Specifies a single parameter that will be passed to the thread.
dwCreationFlags This parameter specifies the creation mode of the thread: for example, 0 to start it running immediately, or CREATE_SUSPENDED to create it suspended.
lpThreadId Pointer to a variable that will receive the ID of the thread created.
Calling this function from VC++ is already enough to run the same routine in parallel, with each thread working on a different data set.
However, if we want to improve our function, this is not enough; so far, our FLL function starts its life cycle executing the API function CreateThread, and after executing it, it returns to VFP, so that the threads created work asynchronously, and VFP never finds out when the job is finished. In order to know when the threads have really finished their job, we have to wait after creating them. For this purpose, we need three more API functions: WaitForMultipleObjects, CreateEvent and SetEvent.
WaitForMultipleObjects blocks until one or all of the objects in a given set of handles become signaled. CreateEvent creates an event object, and SetEvent puts an event into the signaled state. The declarations of these functions in VC++ are as follows:
DWORD WaitForMultipleObjects(
    DWORD nCount,             // number of handles in array
    CONST HANDLE *lpHandles,  // object-handle array
    BOOL bWaitAll,            // wait option
    DWORD dwMilliseconds      // time-out interval
);
lpHandles Pointer to the array of objects you want to wait for.
bWaitAll Specifies the type of wait desired. If this parameter is TRUE, the function returns when all the objects in lpHandles are signaled. If it is FALSE, it returns as soon as any one of them is signaled.
dwMilliseconds Specifies the maximum waiting time, in milliseconds. If this parameter is INFINITE, the wait never times out; the function returns only when the condition selected by bWaitAll is met.
HANDLE CreateEvent(
    LPSECURITY_ATTRIBUTES lpEventAttributes, // SD
    BOOL bManualReset,                       // reset type
    BOOL bInitialState,                      // initial state
    LPCTSTR lpName                           // object name
);
bManualReset Specifies if the event is reset manually by the program, or automatically by the OS.
bInitialState Specifies the initial state of the event. If it is TRUE, the event is created signaled; otherwise, it is created non-signaled.
lpName Specifies the name of the event. The name is limited to MAX_PATH characters. If this parameter is NULL, the event is created without a name.
BOOL SetEvent(
    HANDLE hEvent // handle to event
);
With these functions provided by the OS, we are now ready to create a function that will be called from VFP and that parallelizes the work.
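Before moving to the Win32 version, the create / signal / wait-for-all pattern can be sketched in portable standard C++. This is only an analogy under stated assumptions: the class Event and the function runAndWaitAll are hypothetical names invented for this illustration, they are not part of the Win32 API or of the FLL toolkit, and the event here simply stays signaled once it has been set.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an event object (not the Win32 type).
class Event {
public:
    // Counterpart of SetEvent: mark the event as signaled and wake any waiter.
    void set() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_all();
    }

    // Counterpart of waiting with INFINITE: block until the event is signaled.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return signaled_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;  // created non-signaled, like bInitialState = FALSE
};

// Starts `count` workers, waits for every event (like bWaitAll = TRUE)
// and returns how many workers reported that they finished.
int runAndWaitAll(int count) {
    assert(count > 0);
    std::vector<Event> events(count);
    std::vector<int> done(count, 0);
    std::vector<std::thread> workers;

    for (int i = 0; i < count; ++i) {
        workers.emplace_back([&events, &done, i] {
            done[i] = 1;       // the "work" of this thread
            events[i].set();   // tell the creator that we are finished
        });
    }

    for (auto& e : events) e.wait();    // wait for all events
    for (auto& w : workers) w.join();   // release the thread objects

    int total = 0;
    for (int d : done) total += d;
    return total;
}
```

Calling runAndWaitAll(2) mirrors what the library function does: two threads are started, and the caller does not continue until both have signaled their event.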
For this experiment, we will build a very simple "Wait Window", which just runs a FOR loop to show the message.
In this example, the "Wait Window" appears over the main VFP window; it is drawn directly on it, without VFP being aware of it, producing a result similar to the following:
Figure 2: FllMultiThread
The code in the library will be like this:
#include <windows.h>   // header name lost in the source; <windows.h> is assumed
#include <stdio.h>     // sprintf
#include <string.h>    // strlen
#include "pro_ext.h"

// Data handed to each drawing thread.
struct stParam {
    HANDLE hEvent;   // signaled by the thread when it has finished
    int    iFor;     // number of repetitions
    int    iSleep;   // pause between refreshes, in milliseconds
    int    nPos;     // 0 = first message row, 1 = second message row
    HWND   vfpHWND;  // main VFP window
};

DWORD WINAPI ThreadDrawing( LPVOID lpParam )
{
    stParam *ptParam = (stParam*)lpParam;
    char sMsg[256];
    RECT rtRect;
    int i, sx, sy;   // declared once so the code also compiles with VC++ 6.0
    HDC hdcVFP = GetWindowDC( ptParam->vfpHWND );

    rtRect.left = 20;
    rtRect.right = 270;
    rtRect.top = 140;
    rtRect.bottom = 160;
    if( ptParam->nPos == 1 )
    {
        rtRect.top += 25;
        rtRect.bottom += 25;
    }

    // Save Screen (the rectangle is 250 x 20 pixels)
    DWORD dwScreen[250][20];
    for( sx = rtRect.left; sx < rtRect.right; sx++ )
        for( sy = rtRect.top; sy < rtRect.bottom; sy++ )
            dwScreen[sx-rtRect.left][sy-rtRect.top] = GetPixel( hdcVFP, sx, sy );

    HBRUSH hbrBkgnd = CreateSolidBrush( RGB(236,233,216) );
    int nBkMode = GetBkMode( hdcVFP );
    SetBkMode( hdcVFP, TRANSPARENT );

    for( i = 0; i < ptParam->iFor; i++ )
    {
        sprintf( sMsg, "I am in Thread:%6lu - %i of %i",
                 (unsigned long) GetCurrentThreadId(), i+1, ptParam->iFor );
        HGDIOBJ h = SelectObject( hdcVFP, hbrBkgnd );
        Rectangle( hdcVFP, rtRect.left, rtRect.top, rtRect.right, rtRect.bottom );
        SelectObject( hdcVFP, h );
        // DT_VCENTER only takes effect together with DT_SINGLELINE
        DrawText( hdcVFP, sMsg, (int)strlen( sMsg ), &rtRect,
                  DT_CENTER | DT_VCENTER | DT_SINGLELINE );
        Sleep( ptParam->iSleep );
    }

    // Restore Screen
    for( sx = rtRect.left; sx < rtRect.right; sx++ )
        for( sy = rtRect.top; sy < rtRect.bottom; sy++ )
            SetPixel( hdcVFP, sx, sy, dwScreen[sx-rtRect.left][sy-rtRect.top] );

    DeleteObject( hbrBkgnd );
    SetBkMode( hdcVFP, nBkMode );
    ReleaseDC( ptParam->vfpHWND, hdcVFP );   // every GetWindowDC needs a ReleaseDC

    // Signal last: once the event is set, DrawInVFP may return and
    // the stParam block it owns goes out of scope.
    SetEvent( ptParam->hEvent );
    return 0;
}

void FAR DrawInVFP(ParamBlk FAR *parm)
{
    HANDLE hObjThreads[2];
    HANDLE hThread;
    DWORD dwThreadId;
    stParam ParamThread[2];
    HWND hWnd;
    int iFor = parm->p[0].val.ev_long;
    int iSleep = parm->p[1].val.ev_long;
    int i;

    hWnd = _WhToHwnd( _WMainWindow() );
    for( i = 0; i < 2; i++ )
    {
        ParamThread[i].hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
        ParamThread[i].iFor = iFor;
        ParamThread[i].iSleep = iSleep;
        ParamThread[i].nPos = i;
        ParamThread[i].vfpHWND = hWnd;
        hObjThreads[i] = ParamThread[i].hEvent;

        hThread = CreateThread(
            NULL,                     // no security attributes
            0,                        // use default stack size
            ThreadDrawing,            // thread function
            (LPVOID)&ParamThread[i],  // argument to thread function
            0,                        // use default creation flags
            &dwThreadId);             // returns the thread identifier

        if( hThread != NULL )
            CloseHandle( hThread );               // the thread keeps running; only our handle is released
        else
            SetEvent( ParamThread[i].hEvent );    // no thread was created: do not wait for it
    }

    // Wait until both threads have signaled their event
    WaitForMultipleObjects( 2, hObjThreads, TRUE, INFINITE );
    CloseHandle(ParamThread[0].hEvent);
    CloseHandle(ParamThread[1].hEvent);
    _RetLogical(1);
}

FoxInfo myFoxInfo[] = {
    {"DrawInVFP", (FPFI) DrawInVFP, 2, "NN"},
};

extern "C" {
    FoxTable _FoxTable = {
        (FoxTable *) 0, sizeof(myFoxInfo)/sizeof(FoxInfo), myFoxInfo
    };
}
What this library does is export a function called DrawInVFP, which receives two parameters. The first one is the number of repetitions of the FOR loop in the function, and the second is the time, in milliseconds, that the function waits idle between one message refresh and the next.
Then, when the function is invoked, it creates two threads, and both start drawing a message on the VFP screen at the same time.
The VFP code would be as follows:
SET LIBRARY TO "FllMultiThread.fll"
?DrawInVFP(150,100)
SET LIBRARY TO
Figure 3: Result of executing the library FllMultiThread
Figure 3 shows the output produced by the library.
The possibility of parallelizing a task is really a tool that can help us drastically reduce processing time, if used adequately. The processes also have some other features, for instance:
However, explaining these features would take much more than one article.
What you should remember when parallelizing a process is that on the one hand, you have a set of data to process, and on the other hand, a function, as autonomous as possible, capable of processing a subset of the data, without noticing that this data is only part of a greater whole.
If you manage to do this, you can reduce processing time drastically, for instance in a payroll system. On the one hand, you have to obtain all the employees in the company who are due a salary; on the other hand, you have a function within the library that splits these employees into small data subsets, so that another library routine can process each subset without noticing that it is running in parallel with 20 other copies of the same function. Don't you think the required time will be less?
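The idea of an autonomous function working on a slice of the data can be sketched in portable standard C++. This is a minimal illustration, not code from the library: the names sumSlice and parallelPayroll are hypothetical, the "salary calculation" is reduced to a sum, and std::thread stands in for the Win32 thread calls used elsewhere in this article.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Autonomous worker: processes the slice [begin, end) and knows
// nothing about the rest of the data set.
long long sumSlice(const std::vector<long long>& salaries,
                   std::size_t begin, std::size_t end) {
    long long total = 0;
    for (std::size_t i = begin; i < end; ++i) total += salaries[i];
    return total;
}

// Splits the data into `workers` contiguous slices, processes each
// slice in its own thread and combines the partial results.
long long parallelPayroll(const std::vector<long long>& salaries,
                          std::size_t workers) {
    assert(workers > 0);
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> threads;
    const std::size_t chunk = (salaries.size() + workers - 1) / workers;

    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(w * chunk, salaries.size());
        const std::size_t end = std::min(begin + chunk, salaries.size());
        threads.emplace_back([&salaries, &partial, w, begin, end] {
            partial[w] = sumSlice(salaries, begin, end);  // each thread writes its own slot
        });
    }
    for (auto& t : threads) t.join();  // wait until every slice is done

    long long total = 0;
    for (long long p : partial) total += p;
    return total;
}
```

Each thread writes only to its own element of partial, so no locking is needed; the only synchronization point is the join before the results are combined.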
FLL with VFP 9.0
When I had finished writing the first article about FLL with VFP, VFP 9.0 beta became public. I would really have liked to talk about new features in this new version of our fox, but simultaneously with the publication of the first part of my articles on FLL, Fábio Vazquez presented an article explaining new features. Here, I will only mention some of the features which he didn't name, and which have some impact in the development of FLLs.
The VFP 9.0 beta supports arrays of more than 65,000 elements. This directly affects FLL development, since the files Pro_Ext.h and WinAPIMS.lib change in order to support this new limit.
The differences between the two .H files are the following:
Previous VFP version

typedef long MENUID; // A Menu id.

typedef struct {
    char l_type;
    short l_where,  /* Database number or -1 for memory */
          l_NTI,    /* Variable name table offset */
          l_offset, /* Index into database */
          l_subs,   /* # subscripts specified 0 <= x <= 2 */
          l_sub1, l_sub2; /* subscript integral values */
} Locator;

VFP 9.0 beta

#ifndef FOXMENU_INCLUDED
typedef long MENUID; // A Menu id.
#endif

typedef struct {
    char l_type;
    SHORT l_where;   /* Database number or -1 for memory */
    USHORT l_NTI,    /* Variable name table offset */
           l_offset, /* Index into database */
           l_subs,   /* # subscripts specified 0 <= x <= 2 */
           l_sub1, l_sub2; /* subscript integral values */
    // these are not changed to 4 bytes so FLL compatibility is maintained
} Locator;
As you can see, there is no significant change in the header file; the support for the larger arrays lies within the .lib file.
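One visible effect of the header change can still be illustrated: moving a 16-bit field from signed to unsigned keeps the structure the same size while raising the largest value the field can hold. The sketch below uses simplified copies of the two layouts; the names LocatorBefore, LocatorVfp9 and toSubscript are hypothetical, and the fixed-width types std::int16_t and std::uint16_t are assumed stand-ins for the 16-bit short, SHORT and USHORT of the real header.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Simplified copy of the older layout: every field after l_type is signed.
struct LocatorBefore {
    char l_type;
    std::int16_t l_where, l_NTI, l_offset, l_subs, l_sub1, l_sub2;
};

// Simplified copy of the VFP 9.0 beta layout: same fields, unsigned after l_where.
struct LocatorVfp9 {
    char l_type;
    std::int16_t l_where;
    std::uint16_t l_NTI, l_offset, l_subs, l_sub1, l_sub2;
};

// The fields keep their width, so the structure size does not change.
static_assert(sizeof(LocatorBefore) == sizeof(LocatorVfp9),
              "both layouts occupy the same number of bytes");

// Largest value a subscript field can hold in each layout.
int maxSubscriptBefore() { return std::numeric_limits<std::int16_t>::max(); }
int maxSubscriptVfp9()   { return std::numeric_limits<std::uint16_t>::max(); }

// Converts an int to the unsigned field type, checking that it fits.
std::uint16_t toSubscript(int value) {
    assert(value >= 0 && value <= maxSubscriptVfp9());
    return static_cast<std::uint16_t>(value);
}
```

This matches the comment in the new header: the fields were not widened to 4 bytes, so a library compiled against either layout sees a structure of the same size.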
I think it is not worthwhile to go into more details than this; I believe that for FLLs, this is the only change brought by the new beta version of VFP 9.0.
Executing Assembler from VFP
When we develop an FLL, we are not limited to coding C++ code. Here, we can be more ambitious and put assembler code into our FLL project.
For many, perhaps, Assembler is a dead language, but the truth is that I know of nothing faster. You might say that C++ has made Assembler almost obsolete, since there is little that cannot be done much more easily in C++ than in Assembler. However, I repeat, it is still necessary for many things, and it is much faster.
Now, let's see how to code an example called "AdvancedFll", which will determine the speed of the CPU on which it is running, just with a few instructions in Assembler.
We will also use another function, which reads the value left by the OS in the Registry, to check whether the speed found by our function is correct. We will find, to our surprise, that our function, written in Assembler, is more precise than the value provided by the OS.
#include <windows.h>   // header name lost in the source; <windows.h> is assumed
#include <stdio.h>     // sprintf
#include "pro_ext.h"

// Returns the CPU speed in MHz that Windows stored in the Registry.
void FAR SpeedInRegistry(ParamBlk FAR *parm)
{
    char sMsg[MAX_PATH];
    DWORD dwMHz = 0;
    DWORD BufSize = sizeof(dwMHz);   // size of the buffer that receives the value
    HKEY hKey;
    long lError = RegOpenKeyEx(HKEY_LOCAL_MACHINE,
        "HARDWARE\\DESCRIPTION\\System\\CentralProcessor\\0",
        0, KEY_READ, &hKey);

    if(lError != ERROR_SUCCESS)
    {
        sprintf( sMsg, "N/A" );
    }
    else
    {
        lError = RegQueryValueEx(hKey, "~MHz", NULL, NULL, (LPBYTE) &dwMHz, &BufSize);
        if(lError == ERROR_SUCCESS)
            sprintf( sMsg, "%lu", (unsigned long) dwMHz );
        else
            sprintf( sMsg, "N/A" );
        RegCloseKey(hKey);           // release the key opened above
    }
    _RetChar(sMsg);
}

// Measures the CPU speed by counting clock cycles during one second.
void FAR SpeedTest(ParamBlk FAR *parm)
{
    __int64 BeginCycle = 0, EndCycle = 0;
    unsigned __int64 nCtr = 0, nFreq = 0, nCtrStop = 0;

    // Obtains the counter frequency (ticks per second)
    if(QueryPerformanceFrequency((LARGE_INTEGER *) &nFreq))
    {
        // Obtains the value of the current counter
        QueryPerformanceCounter((LARGE_INTEGER *) &nCtrStop);
        // Add the frequency to the counter: the value it will have in one second
        nCtrStop += nFreq;

        _asm
        {
            // Obtain the beginning of the cycle from the CPU clock
            _emit 0x0f
            _emit 0x31
            mov DWORD PTR BeginCycle, eax
            mov DWORD PTR [BeginCycle + 4], edx
        }

        do {
            // Retrieve the value of the performance counter
            // until 1 sec has gone by
            QueryPerformanceCounter((LARGE_INTEGER *) &nCtr);
        } while (nCtr < nCtrStop);

        _asm
        {
            // Obtains the CPU clock cycles again, but after one second
            _emit 0x0f
            _emit 0x31
            mov DWORD PTR EndCycle, eax
            mov DWORD PTR [EndCycle + 4], edx
        }
    }

    // EndCycle-BeginCycle is the speed in Hz,
    // which, divided by 1,000,000, is the speed in MHz.
    // The subtraction is done on the integers first; converting each
    // counter to float beforehand would discard its low-order digits.
    _RetFloat( (double)(EndCycle - BeginCycle) / 1000000.0, 20, 8 );
}

FoxInfo myFoxInfo[] = {
    {"SpeedInRegistry", (FPFI) SpeedInRegistry, 0, ""},
    {"SpeedTest", (FPFI) SpeedTest, 0, ""},
};

extern "C" {
    FoxTable _FoxTable = {
        (FoxTable *) 0, sizeof(myFoxInfo)/sizeof(FoxInfo), myFoxInfo
    };
}
This is called from VFP as follows:
SET LIBRARY TO "AdvancedFll.fll"
?SpeedInRegistry() && 1996
?SpeedTest() && 1996.97867596
SET LIBRARY TO
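With this, we can see that the second function, "SpeedTest", is more precise than the first one, and it needs only four lines of Assembler.
_emit 0x0f
_emit 0x31
These two directives emit the bytes 0x0F 0x31, the opcode of RDTSC, which loads the CPU time-stamp counter into the EDX:EAX register pair.
mov DWORD PTR BeginCycle, eax
mov DWORD PTR [BeginCycle + 4], edx
These two instructions copy the low half (EAX) and the high half (EDX) of the counter into the 64-bit variable BeginCycle.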
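The arithmetic that turns two counter readings into a speed in MHz can be looked at in isolation. The function below is a hypothetical helper written for this illustration (cyclesToMHz is not part of the library); it assumes the two readings were taken `seconds` apart and that the counter did not wrap in between.

```cpp
#include <cassert>
#include <cstdint>

// Converts two cycle-counter readings taken `seconds` apart into MHz.
double cyclesToMHz(std::uint64_t beginCycle, std::uint64_t endCycle, double seconds) {
    assert(endCycle >= beginCycle && seconds > 0.0);
    // Subtract as integers first: the counters themselves can be very
    // large, and converting them to floating point before subtracting
    // could discard the low-order digits we are interested in.
    const std::uint64_t elapsed = endCycle - beginCycle;
    // Cycles per second is the frequency in Hz; divide by 1,000,000 for MHz.
    return static_cast<double>(elapsed) / seconds / 1000000.0;
}
```

For example, 1,996,978,676 cycles counted during one second correspond to about 1996.98 MHz, whatever the absolute values of the two readings were.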
Conclusion
I really hope that you liked this article and that it captivated you as you read it, which is what happened to me the first time I saw the possibilities that VFP gave me with FLL extensions.
The only negative point which I can see in this practice is the complexity of having to code in C++, but I believe that many are starting to get interested in learning the language, thanks to the widespread use of C#, which, although not identical, is quite similar.