Dotnetpro database performance contest
Message
From
12/05/2007 14:04:55

General information
Forum: Visual FoxPro
Category: VFP Compiler for .NET
Miscellaneous
Thread ID: 01224231
Message ID: 01224995
Views: 12
Hi Samuel,

>You really know several tricks!
Thanks - but Cetin's post made me think about where to draw the line between "honest database operations" and "cheating because we know the scenario" <bg>. For instance, the gain you plan to get by allowing the delimiter to be a string could also be reached by applying the knowledge that the freshly imported table has no deletion marks and no nulls: instead of replacing every record, you could just redefine the table structure in the header with low-level functions, moving the field borders in such a way that the data is left-trimmed. From the description of the contest this should work, but would it be honest? Then again, what if such a field alignment is part of the database tools, to be used just in case the DBA knows it is appropriate?

Another dirty trick might be to "restructure" the header to be only one big field, create a unique index on it, then "correct" the header info and the info in the index structure. Even if the adding of the fields is done at the speed of the C engine, building a B-tree on one large field only is probably so much faster that the time spent on the low-level file manipulations will be more than offset.
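
To make the field-border idea concrete, here is a rough sketch of the header patch - untested, the offsets come from the documented DBF layout, the field position and the count of leading blanks are made-up values for the example, and the table must be closed before patching:

* shift one field border in the DBF header with low-level functions,
* so the stored data starts left-trimmed without touching any record
lcDbf    = "table1.dbf"
lnField  = 2                          && 1-based position of the field to shift
lnShift  = 3                          && assumed fixed count of leading blanks
lnHandle = FOPEN(lcDbf, 12)           && open read/write, unbuffered
lnSubRec = 32 + (lnField - 1) * 32    && field subrecords start at byte 32
= FSEEK(lnHandle, lnSubRec + 12)      && bytes 12-15: displacement in record
lcDisp = FREAD(lnHandle, 4)
lnDisp = ASC(SUBSTR(lcDisp, 1, 1)) + ASC(SUBSTR(lcDisp, 2, 1)) * 256 ;
       + ASC(SUBSTR(lcDisp, 3, 1)) * 65536 + ASC(SUBSTR(lcDisp, 4, 1)) * 16777216
lnLen  = ASC(FREAD(lnHandle, 1))      && byte 16: field length
lnDisp = lnDisp + lnShift             && move the left border to the right...
lnLen  = lnLen - lnShift              && ...and shorten the field to match
= FSEEK(lnHandle, lnSubRec + 12)
= FWRITE(lnHandle, CHR(lnDisp % 256) + CHR(INT(lnDisp / 256) % 256) ;
       + CHR(INT(lnDisp / 65536) % 256) + CHR(INT(lnDisp / 16777216) % 256))
= FWRITE(lnHandle, CHR(lnLen))
= FCLOSE(lnHandle)
* the record length and the other displacements stay untouched, so a gap
* remains between the fields - whether the engine tolerates that is exactly
* the kind of thing one would have to test first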

So pseudo code for the "borderline" implementation:

CREATE TABLE table1 (...)
APPEND FROM ... DELIMITED
[restructure table header to 1 big field]
INDEX ON BigField TO BigField COMPACT UNIQUE
[restructure table header so that the field offsets make LTRIM() unnecessary]
[restructure index header accordingly]
SCAN
  INSERT INTO table2 ...
ENDSCAN
SCAN
  FPUTS()
ENDSCAN

where low-level replacements for the lines inside the two scans are possible as well - a sketch follows below - but the effect there will be smaller than in the other cases.
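
As a sketch of such a low-level replacement for the last scan - the file names, the field widths and the split into two fields are assumptions for the example - the records can be read with FREAD() instead of going through the cursor engine:

* read table2 record by record with low-level functions and write the
* output lines with FPUTS(); header offsets follow the documented DBF layout
lnIn  = FOPEN("table2.dbf", 10)       && read-only, unbuffered
lnOut = FCREATE("result.txt")         && made-up output file name
= FSEEK(lnIn, 8)
lcHdr = FREAD(lnIn, 4)
lnHdr = ASC(SUBSTR(lcHdr, 1, 1)) + ASC(SUBSTR(lcHdr, 2, 1)) * 256   && header size
lnRec = ASC(SUBSTR(lcHdr, 3, 1)) + ASC(SUBSTR(lcHdr, 4, 1)) * 256   && record size
= FSEEK(lnIn, lnHdr)                  && jump to the first record
lcRec = FREAD(lnIn, lnRec)
DO WHILE LEN(lcRec) = lnRec
  * byte 1 is the deletion flag; the split after 10 data bytes is assumed
  = FPUTS(lnOut, SUBSTR(lcRec, 2, 10) + "," + SUBSTR(lcRec, 12))
  lcRec = FREAD(lnIn, lnRec)
ENDDO
= FCLOSE(lnIn)
= FCLOSE(lnOut)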

@Markus: Do you think such an approach would be considered cheating?

>Our scanning with our TableLayer turns out to be faster (for now I would not dare to say a lot faster) than VFP, but only on big tables; I'm doing benchmark comparisons with a table with 800 million records and 926 bytes per record.
>Because this table has nearly 800 MB, on my test machine with 1 GB RAM, VFP has a hard time caching the table and arrives later than our TableLayer.

Sounds good, but somewhere there must be an error: either it is 800,000 records of 926 bytes each (800,000 x 926 is roughly 740 MB, matching the "nearly 800 MB"), which VFP and the TableLayer can handle, or it is 800 million records of 926 bytes each, which comes to roughly 740 GB?

>With easily cached tables (a few MB in relation to the available memory), VFP beats us for now, because it caches extensively... Just wait until we also enable these caching tricks in our TableLayer and things will get better.

I hope I am not sounding like a broken record, but get a working implementation out of the door and add the tricks/caching later - people like me have memory-stuffed machines, which already help via the normal disk caching, and are willing to search for shortcuts themselves. You might even get some tricks sent back to you <bg>...
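
One such shortcut, just as an example of what I mean - SYS(3050) tells VFP how much memory it may grab for its own buffering, and the 512 MB below is a made-up value for a well-stuffed machine:

* let VFP cache with up to 512 MB of foreground buffer memory
= SYS(3050, 1, 512 * 1024 * 1024)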

regards

thomas