Distributing and using VERY large tables.
Date: 05/08/2001 21:01:14
Forum: Visual FoxPro
Category: Databases, Tables, Views, Indexing and SQL syntax (Miscellaneous)
Thread ID: 00539842
Message ID: 00540001
Views: 17
Nancy,

Replies included inside.

>Edmond-
>
>Thanks for the additional information. Can we recap the requirements?
>
>1) You will receive a CD from the PO of fixed length text files that come compressed. Is it WinZip? You said it's 450 mb compressed. What is the P.O. data uncompressed (unzipped)? Not after you've pulled it into Fox DBFs.

REPLY: I haven't unzipped all of the PO data at once to see what it would be. The data comes in 927 files, ranging from 002.zip to 999.zip, one file for each 3-digit zipcode. I had written a program to unzip a file from the CD to a text file on the hard drive, add a CRLF at the end of each record (the PO data comes with no CRLFs), append the data to a VFP table, delete the text file, and then move on to the next file. I got to 347.zip when I hit the 2 Gig limit on the VFP table. Unzipping one of the largest files gives this: 770.zip (3.9 MB) -> 770.txt (41.6 MB), so we are looking at roughly a 10X increase. That is exactly the kind of size reduction I am trying to achieve. Jim Nelson had suggested ZipMagic, which zips a file but does not require it to be unzipped for an application to access it. This may be what everyone is using. The royalties are $25 per copy, but in thinking about it more, that would only be a one-time fee for each customer, not for each CD that is mailed out. That may make it feasible.
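For illustration only, the per-file loop described above (unzip, insert CRLFs, append, delete, next file) might be sketched in Python rather than VFP. The 182-byte record length comes from elsewhere in this thread; the final VFP APPEND step is replaced here by appending to a plain text file:

```python
import zipfile

RECORD_LEN = 182  # fixed-length PO records; the raw data has no CRLFs

def process_zip(zip_path, out_path):
    """Extract the single text member from one 3-digit-zipcode archive,
    insert a CRLF after each fixed-length record, and append the result
    to out_path (the APPEND into a VFP table would follow from there)."""
    with zipfile.ZipFile(zip_path) as zf:
        member = zf.namelist()[0]          # e.g. '770.txt'
        raw = zf.read(member)
    with open(out_path, "ab") as out:
        for i in range(0, len(raw), RECORD_LEN):
            out.write(raw[i:i + RECORD_LEN] + b"\r\n")
```

Looping this over 002.zip through 999.zip reproduces the workflow described, and runs into the same 2 Gig ceiling once the appended output is pulled into a single VFP table.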

>
>2) There are two parts to this task: monthly data updates, and an application you will write that processes the data in the monthly updates and provides queries according to some rules. The application will run at your client's site and will run on workstations, or off a LAN.
>
>3) You will send to your clients, one time with periodic updates, an application.
>
>4) You will send data updates to your client monthly, and you'd like to send the data via one CD. This is probably a fixed requirement but you're having trouble satisfying it given the size of the resulting FoxPro data tables.
>
>5) You would like to use the Fox data engine, but that's not a fixed requirement.
>
>6) You tried time tests with unzipping the 6 gigs (even though it was still too big) and the time was too slow.
>
>Notes and suggestions:
>
>You have 3 competitors all of whom are able to distribute their applications and data on a single CD. You don't know what format their data is in, and have been unable to determine it by opening the tables in Fox, hexedit, or Paradox. Did you try, by any chance, WinZip? Have you tried dBASE? Have you considered that they *might* be sending the same data in compressed text format that the P.O. sends, but encrypted? The similar sizes make me wonder about that.

REPLY: The PO data is PKZip, but the other vendors could be rezipping it using anything. I've tried to open one of their files with WinZip but had no luck. They are obviously not unzipping the data when the application runs, because they access the data immediately, and unzipping such a large file would take a fair bit of time.

>
>In testing you've discovered that creating the Fox DBFs results in 6 gb of data. You, I understand, simply pulled all the data in as regular strings. You have not stated if you created indices, memos, or if the tables are free, so I don't know what all files are included. For example, you could reindex the data after it's installed on the target system.

REPLY: No memo fields, no indexes, and all character fields. I went through 336.zip before I hit the 2 Gig limit, so I am just extrapolating that finishing the task would require 6 Gig of DBF files.

>
>It's been suggested that you _not_ use all character fields or numerics if possible. You haven't indicated what if any effect that would have.

REPLY: I haven't messed with this yet because I can see that knocking the record size down from 182 bytes to maybe 150 bytes would not solve my immediate problem. That might get me down to 5 Gig of DBF files, which is still way too big.
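For what it's worth, that back-of-the-envelope estimate checks out. A quick Python sketch, using the 182-byte record length and the extrapolated 6 Gig total from this thread:

```python
TOTAL_BYTES = 6 * 1024**3   # extrapolated size of the full DBF set
OLD_REC = 182               # current record length in bytes
NEW_REC = 150               # hoped-for trimmed record length

records = TOTAL_BYTES // OLD_REC        # roughly 35 million records
trimmed = records * NEW_REC / 1024**3   # still about 4.9 Gig
```

Trimming fields only scales the total linearly, so it can't close a gap that needs a 10X reduction.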

>
>It's been suggested that you consider DVD, but that is not an option.
>
>Since you haven't been able to determine how the other vendors are doing it, your choices seem to be corporate espionage, which I don't recommend, and forgetting them and figuring out your own solution for good reasons so that you have the force behind you when you justify it.

REPLY: If it comes to figuring out my own algorithm, then I will just have to do that, but I want to do as much research as possible first. My only real exposure is to FoxPro tables, so I just want to make sure that there is not an easy solution out there that I am not aware of before I start out reinventing the wheel.

>
>For example, I suggested you consider DVD, but with extra value over what your competitors do. But if that's not an option...
>
>- Are CDs compressible? I don't know.
>
>- Are you _sure_ DVD is not an option just because your competitors don't use it? After all, DVDs are much more common now. And what you may be giving up in shipping media you'll be gaining by the awesome power of the Fox engine.
>
>- Could you distribute just the records from the P.O. that are different from the previous. The first install might be a few CDs but subsequent shouldn't be very many at all.

REPLY: If worst comes to worst I may consider just making the application Web-enabled and keeping all of the data on my server. This would solve a lot of distribution problems but create a few others for users who may have 100,000 records to correct and only a dial-up modem connection. There will most certainly be customers who just won't be able to run the data on-line and will insist on CD distribution.
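The delta idea quoted above, shipping only the records that changed since the previous month, could be sketched roughly as follows in Python. This is a hypothetical whole-record comparison, since the thread doesn't say which field keys a record:

```python
def changed_records(old_path, new_path, rec_len=182):
    """Return the set of fixed-length records that appear in the new
    monthly file but not in the old one; only these would need to go
    on an update CD."""
    def records(path):
        with open(path, "rb") as f:
            data = f.read()
        # Split the file into fixed-length records and dedupe.
        return {data[i:i + rec_len] for i in range(0, len(data), rec_len)}
    return records(new_path) - records(old_path)
```

If the monthly churn is a small fraction of the 6 Gig total, the delta would fit on one CD easily; only the initial install would need the full data set.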

I will probably make this Web-enabled no matter what, but I don't want to start writing code that assumes VFP tables just to find three months from now that the solution to the size problem requires accessing the data using low-level file (LLF) functions or something else.

What is driving me right now is the fact that everybody else is able to use this massive amount of data stored in just 450 MB. If they can do it then there must be a way. As you mentioned, their sizes are so close to the compressed sizes that my gut reaction now is that they are using something like ZipMagic, or maybe some shareware version that does the same thing.
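If the vendors really are leaving the data compressed and decompressing only what a query needs, the access pattern might look like this. A Python sketch (not VFP, and not ZipMagic's actual mechanism) using the standard zipfile module; the NNN.zip/NNN.txt naming follows the PO CDs described in this thread:

```python
import zipfile

def load_zipcode(cd_dir, zip3):
    """Decompress just one 3-digit-zipcode file into memory on demand,
    instead of expanding the whole data set up front. File naming
    (NNN.zip containing NNN.txt) follows the PO CDs in this thread."""
    archive = "%s/%s.zip" % (cd_dir, zip3)
    with zipfile.ZipFile(archive) as zf:
        return zf.read("%s.txt" % zip3)
```

Because each archive covers one 3-digit zipcode, a query only ever pays to decompress a few MB, which would explain why the competitors' applications can hit the data immediately without a visible unzip step.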

Thanks for your attention to this. It really helps to have someone to bounce things off of and to have ideas coming in. Especially when it deals with something that I have no experience in.

Ed