Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
What if
Message
From
09/04/2007 19:56:07
Walter Meester
HoogkarspelNetherlands
 
 
General information
Forum:
Visual FoxPro
Category:
Other
Title:
Miscellaneous
Thread ID:
01213261
Message ID:
01213748
Views:
21
Kevin,

I'm going to do something I seldom would do... I'm going to appologise for my agressive stance here. I'll send you a personal reply later. From this point on I'm going to be purely technical and try to avoid emotion (if I can handle that) as much as possible.

>You'll notice that I never said anything derogatory about your VFP solution involving seeks (though I'm sure you'll agree that it gets more complicated once you get into nested hierarchies). There's often more than one way to handle the problem -CTEs and recursive queries are another. But the argument loses relevance when an organization chooses a database with all the benefits of SQL Server over VFP as a database.

Here I think you miss an important aspect of the problem. The argument is about RECORD ORIENTED vs SET ORIENTED and not about SQL Server and VFP per se. You could do set oriented in SQL server as well, though I did not analyse yet how this scales and performs as well. An example:
DECLARE finance CURSOR FOR
SELECT fin_pk, fin_amount FROM finance WHERE debno = '70058' ORDER BY Date 

OPEN Finance

DECLARE @sum int, @fin_pk int, @fin_amount decimal(9,2), @treshhold
SET @treshhold = 1000

FETCH NEXT FROM Finance INTO @fin_pk, @fin_amount
SET @Sum = @fin_amount

WHILE @@FETCH_STATUS = 0 AND @sum < @Treshhold
BEGIN
	FETCH NEXT FROM Finance INTO @fin_pk, @fin_amount
	SET @Sum = @sum + @fin_amount
END

IF @sum >= @Treshhold
	SELECT @fin_pk, @fin_amount
ELSE
	SELECT NULL, NULL
Again, it seemed a bit quicker than the earlier SQL statement, but I really have to test this more closely. From the theory, it only requests the needed records (Depending on how the FETCH is optimized) and does not do more work than neccesary, though of course it is a bit more coding than a:
SELECT TOP 1 * FROM finance A WHERE Debno = '70058' AND 
	(SELECT SUM(fin_amount) FROM t_finance B WHERE Debno '70058' AND fin_date <= b.Fin_date) > @Treshhold
But back to the issue. RECORD ORIENTED and SET ORIENTED are two different approaches that do have their value in certain problems. Here is a little theory:


RECORD ORIENTED systems were the first database systems. They originate from the punchcard age. The structure of dml closely matches the physical sotrage implementation. Data was found with the help of pointers to physical locations. The dml for looking up a single record with the help of a pointer was optimal in order to get the best performance. Programming for such systems required a deep knowledge of how the data was stored. The mechanism to look up a record was equivalent to what we in VFP know as a SEEK. If records had to be search on other criteria it had to perform a LOCATE (which was of course slow as it has to search all records).

SET ORIENTED systems only became popular in the 80-ties. Under the development of SQL which was based on Codds relational model a new type of database was invented. Unlike its predecessors, it decoupled its technical implementation from the logical database model. They became (almost) totally independed. They are called the internal (technical implementations) and external (logical database) schema. Indexes belong to the internal schema so they are not explicitely (with a very few exceptions) used in the SET ORIENTED dml. Between the external and internal schema, the database optimizer is going to translate all DML from the logical schema to the technical schema. This, esspecially in the early days had the dramatic consequence that SQL-RDBMS were notorious for their bad performance. With the advances in optimizer alghoritms and hardware this today is not an much of an issue anymore.

The other difference between the two DMLs is that SET ORIENTED dml is regarding data as collections of data without any order. Its DML is concerned SETs of data in stead on individual records, though of course a single record is also a SET of data.

From a technical pov, this means that if you're encountering a problem that needs handling data record by record, you might be better of with a RECORD ORIENTED dml, as shown in the example we had. If you can use the knowledge of the internal schema (indexes, exact location, order) you have the key to the best performing solution. The problem with the SET ORIENTED solution is that you first have to find your way to use the SET ORIENTED dml to perform an RECORD ORIENTED approach, which as shown in both our efforts is a lot more difficult. A second related problem is that in the execution of the SET ORIENTED approach, you can't take advantage of the internal schema, because they are seperated: There is no direct SEEK. You'll have to let it process it by the optimizer and hope it will take the most efficient plan for you and not consume too much resources (CPU and memory).

This explain the HUGE difference in performance and resource consumption in the two approaches for this particular problem.

Now this is only a very small example. The more complex the problem becomes the more important the decision whether to do a SET ORIENTED dml or a RECORD ORIENTED dml. One of the obvious struggles IT is trying to tackle right now is to build a file system on a database, and though I'm certainly not an expert in all the problems they've encountered here, this certainly is a factor.

In my daily work, I frequently have to deal with hierarchical tree problems, and though I also use SQL2000 to solve the problem in a recursive UDF function to get the full pathnames, it is not the optimum but can live with it.

In some other case, I had to do it differently. In our hierarchical autorisation trees where the autorisation of access to certain data is depened on rules and roles of the logged on user (Dependencies, on user, departments, positions, sites, qualifications, specialties etc), I had to heavily rely on the local database using SEEK and other DML as permissions inherit down the tree. There is no way you could use a SQL Server for that as there might be a few hundred calls in succession (Accessing user rights for a whole menu system) for a few hundred objects. This would certainly drag the SQL server down with a hundred workstations online or if not the network. In VFP, its very quick and easy. So what we did, is get the whole security table (Typically only a 10K or so), index it and calculate from there. All in all a very eleborate and granular authorisation system, performing very well. Impossible to implement in ADO.NET ? No, but certainly is going to be a lot more messy and performance would be significantly lower

If the code interests you:
LPARAMETERS nObjectId, nUserId, cAcu_ID, nEventType, nId2
LOCAL nLevel, nSel, nParentId, aGroupValues[7], aPatValues[6]

* nId2 is used for determining authority for queries on category and 
* tasks on events and eventid´s

nSel=SELECT(0)

nUserId = EVL(nUserId, Applic.Userid)
nLevel = 5

* Level description
* 1 Full access
* 2 Read Only
* 3 No access
* 4 Not visible (e.g. for menuitems)
* 5 No info

** The code below is optimized for performance

SELECT User
SCATTER FIELDS prs_pk, prs_position, prs_unit, prs_department, prs_station, prs_physqualification, ;
	prs_physspecialty TO aGroupValues


SELECT Auth
DO WHILE nLevel = 5 AND nObjectId # 0 
	DO CASE
		CASE !SEEK(nObjectId)
		
		CASE nObjectId = 22800
			** Security on queries
			CALCULATE MIN(sec_seclevel) WHILE sec_objfk = nObjectId ;
				FOR sec_seclevel # 0 AND BETWEEN(Sec_grpid,1,8) AND ;
					(sec_grpId = 8 OR Sec_grpValue = aGroupValues[sec_grpId]) AND ;
					sec_usrgrpfk = 1 AND sec_usrgrpvalue = nId2 ;
				TO nLevel
		
				nLevel = NVL(EVL(nLevel,5),5)

		CASE EMPTY(cAcu_id) 
			** userrights to objects 

			CALCULATE MIN(sec_seclevel) WHILE sec_objfk = nObjectId ;
				FOR sec_seclevel # 0 AND BETWEEN(Sec_grpid,1,8) AND ;
					(sec_grpId = 8 OR Sec_grpValue = aGroupValues[sec_grpId]) ;
				TO nLevel
				
				nLevel = NVL(EVL(nLevel,5),5)

		OTHERWISE
			** userrights to patients data

			CALCULATE MIN(sec_seclevel) WHILE sec_objfk = nObjectId ;
				FOR sec_seclevel # 0 AND BETWEEN(Sec_grpid,1,8) AND ;
					(sec_grpId = 8 OR Sec_grpValue = aGroupValues[sec_grpId]) AND ;
					BETWEEN(sec_usrgrpfk, 1, 6) AND (sec_usrgrpfk = 2 OR sec_usrgrpvalue = aPatValues[sec_usrgrpfk]) ;
				TO nLevel
		
				nLevel = NVL(EVL(nLevel,5),5)
	ENDCASE
        
        ** If no applicable rules have been found at this level, walk down the tree.
	nObJectId = IIF(nLevel=5 AND SEEK(nObjectID,"secobjects","sco_objId"), secobjects.sco_parent, 0)
ENDDO
SELECT (nSel)
RETURN nLevel
This is just one example, but could give you a few others where LOCAL processing of data, just does make more sense than remotely. One that comes in mind is processing meta data for real time input data validation. If you're interested I could give you the details about that.

The bottom line is that this is about meta data. Meta data just like normal data is handled best by a dml, not by examining arrays. I'd like to have a full dml available to me whether the data is local or remote. I'd like to have a choice whether I attack the problem in a SET ORIENTED approach or a RECORD oriented approach either locally or remotely.

The combination of SQL and VFP allows you the following combinations:
LOCAL SO dml, LOCAL RO dml, REMOTE SO dml, REMOTE RO dml

The combination of ADO.NET (excluding LINQ for this moment),
REMOTE SO dml, REMOTE RO dml and very limited LOCAL dml that has a bit of both


>I'm curious why you would suggest that John (or anyone else) should take your challenge, when you've already shown a proclivity for making up your mind beforehand (oh, wait...I forgot....your "intuitions")

Not going to bite that one right now. I'm offering you to learn something.

>For someone who has accused others of spreading FUD about VFP (I guess we all know how that one turned out), you certainly seem to be attempting it yourself, evidenced by claims that uses of certain SQL functions are inherently "resource intensive".

FUD, stands for Fear, Uncertainty and Doubt. The end of line of VFP has in itself nothing to do with FUD. As for the SQL functions, I encourage people to check how resource intensive their data manipulation is. Not only for SQL (Using the profiler), but also for VFP (using filemon). A lot of grief, misunderstanding could have been saved if people understand what a certain DML actually is doing on the detailed level. I don't think I'll be telling anyone any news that a lot of projects suffer from performance problems that could have been avoided if those problems where identified and tracked earlier in the design process. Therefore I accuse heaps of developers to have insufficient knowledge of what the consequences are of their requests on a database. And this applies to both VFP and SQL developers.


>Walter, I've learned one thing from you - your prejudices involving VFP and .NET go far far deeper than I ever realized. You may find this hard to believe, but in specific private conversations I've had with others, I've actually defended a few of your core beliefs, even if I didn't agree with a number of specifics. (And I've been criticized for doing so). That day has passed.

I've never asked for that, and I expect anyone to draw conclusions for themselves and NEVER, NEVER, NEVER take a word from anyone else without validating the claims him/herself.

>Your statement that VFP is better at implementing design patterns than .NET, because of the way VFP handles data is probably the most insightful comment you've ever made.

That is a whole different topic, and if you'd ask me to explain why, I would give you the reasons why it is easier (better is a different/subjective word), to implement design patterns in VFP. The hints I already gave you was: Using UDF code in meta data tables, EXECSCRIPT(), dynamic typing.
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform