Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
ARGHH! big table killing VFP - need strategy help!
Message
From
10/08/1999 13:47:26
 
 
To
10/08/1999 12:06:22
General information
Forum:
Visual FoxPro
Category:
Other
Miscellaneous
Thread ID:
00250826
Message ID:
00252099
Views:
11
Yesterday I posted a message here about the KDB database. I know that the goal of this community is share Visual FoxPro solutions but to solve your problem, I suggest that you visit www.kx.com and download the KDB demo. The engine is just little 160 KB and it can handle massive amounts of data. Its also possible to perform data series analysis (without extra data!) like you wants. When you finish the installation, simply right click your players.dbf table (ensure that is in DBF version 3, or export that to xls or csv formats) and click the KDB option when the menu appears. You will able to perform queries with KDB Query Tool, SQL 92 scripts or KSQL scripts and you will have ODBC and JDBC interfaces to KDB. I hope you enjoy it !

>>Hi,
>>
>>I've ot a 15.6 million row table (play.dbf) - (only 16 small columns) - that I need to do some data analysis on. Problem is that some aggregate functions drive VFP to an absolute standstill - not just slow - but down to nothing! I need help with a strategy to deal with this.
>>
>>Example: two of the columns are pnumber (patron number) and gamedate. Every patron can have multiple records for each gamedate.
>>
>>First, I wanted a count of how many records per day existed (this goes over a little less than 2 years) so I ran:
>>
>>"select gamedate, count(gamedate) from play group by gamedate" -
>>
>>It was slow compared to what I'm used to, but ran fine and gave me proper results. There were about 600 separate days where play happened - so thus about 600 rows returned. NOW - I want a result with the unique patrons in it - in other words - I wanna see all the different pnumbers in the table? So I try both:
>>
>>select distinct pnumber from play ..... and .... select pnumber from play group by 1
>>
>>both drive VFP to a halt! - Waited fifteen minutes and barely go to 1% on the query thermometer! tried it several times and had to force quit each time. As a note, I would expect a result set of about 100,000 rows in this case.
>>
>>An attempt to do aggregate like "Select pnumber, gamedate, count(*) from play group by 1, 2" never even made it to 1% after fifteen minutes! ARGGHHHH! I've got a lot of these kind of aggregates to do as well as cross tabs and there is no way I can wait twenty or thirty minutes on each one.
>>
>>Any advice on strategy would be greatly appreciated. (BTW - this is on VFP 6 SP3, Dell PII-450 128kRAM - both fields mentioned above are indexed.)
>>
>>TIA!
>
>Isn't there a logical way to split up the data more? I would try to categorize your players or perform the query on a subset of data mothly, weekly, daily, etc.
>
>I can't do an effective analysis on those kind of huge aggregate values anyway. There are so many variables contributing that it is hard to glean any meaningful information from such huge data sets. In other words, do you really care what day in the last two years is the most active? Or would it be more helpful to plot the daily or weekly activity of the last 3 months?
Previous
Reply
Map
View

Click here to load this message in the networking platform