Distributing and using VERY large tables.
To
04/08/2001 22:13:54
Gerry Schmitz
GHS Automation Inc.
Calgary, Alberta, Canada
General information
Forum:
Visual FoxPro
Category:
Databases,Tables, Views, Indexing and SQL syntax
Miscellaneous
Thread ID:
00539842
Message ID:
00539890
>>The original zipped fixed-length text files take up 487 Megs (there are actually 927 text files in total, in zip form).
>>
>>Each record has 30 fields and the total length of each record is 182 bytes. There are some fields that can be discarded and doubtless the other vendors have done away with some of them. At the most I could drop about 50 bytes from each record so that really doesn't help much. I'd still end up with multiple Gigs.
>>
>>I need to run searches, indexes, SQL queries, etc. on this data. I'm not sure Word would help get this data down to 450 Megs, or help with complex searches, would it?
>
>I guess the next question is: what data types are you dealing with? Numbers, dates, text, all of the above?
>
>FoxPro "numerics", for example, take up a lot of space; these are candidates for binary fields. Even dates, given a range, could be converted from 8 bytes to perhaps 2 bytes. If this is a "static" DB, consider BLOCKSIZE 0 for memos; there will then be no padding.
>
>Haven't had any experience with "PHBase" (sic); it's supposed to be able to search memos. Perhaps your text can be put in memo fields.
>
>What about a massive "help file"? Help compilers and "engines" like the MSDN KB "pack" and "index" information ...
>
>With "help", you can even hyper-link to subsets of your data.
>
>Part of your research (I imagine) should be to identify the "types of queries" you will need to satisfy; or is it all ad hoc?
>
>Offer to supply a DVD drive?

Gerry,

The data comes in as all text. Even “numeric” data is in text format. Getting rid of the padding doesn’t seem to help much: when I created an ASCII delimited text file from my 2 Gig dbf, the text file was almost 2 Gig itself.
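For reference, the binary-field idea quoted above could be sketched roughly as follows. This is only an illustration in Python, not anything FoxPro-specific: the `YYYYMMDD` date layout, the base date, and the function names are all assumptions, and the saving only applies to whichever of the 30 fields really hold dates or numbers.

```python
import struct
from datetime import date, timedelta

# Assumption: every date in the file falls within ~179 years of this base date,
# so a day offset fits in an unsigned 16-bit integer (0..65535).
BASE_DATE = date(1990, 1, 1)

def pack_date(text: str) -> bytes:
    """Pack an 8-character 'YYYYMMDD' text field into 2 bytes."""
    d = date(int(text[:4]), int(text[4:6]), int(text[6:8]))
    days = (d - BASE_DATE).days
    if not 0 <= days <= 0xFFFF:
        raise ValueError(f"date {text!r} is outside the 2-byte range")
    return struct.pack("<H", days)

def unpack_date(raw: bytes) -> str:
    """Reverse of pack_date: 2 bytes back to 'YYYYMMDD' text."""
    (days,) = struct.unpack("<H", raw)
    return (BASE_DATE + timedelta(days=days)).strftime("%Y%m%d")

def pack_number(text: str) -> bytes:
    """Pack a padded numeric text field (up to 9 digits) into 4 bytes."""
    return struct.pack("<I", int(text.strip() or "0"))

def unpack_number(raw: bytes, width: int) -> str:
    """Reverse of pack_number, zero-padded back to the original field width."""
    (value,) = struct.unpack("<I", raw)
    return str(value).zfill(width)
```

An 8-byte date becomes 2 bytes and a 9-digit number becomes 4, but text fields such as street names are untouched, so the overall gain depends entirely on the record layout.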

That Help file idea bears thinking about. Interesting idea. The data comes from the post office. It is a list of every street, building, apartment, etc. in the country and the range of deliverable addresses. It also has the information needed to match street nicknames, add the zip+4, add the carrier route, etc. The idea is to take an address from the client’s mailing list, massage it into standard format and try to fix any errors, and then match it against the national database in order to supply the zip+4 extension, carrier route, and other necessary information. There will have to be a number of queries against the data by state, or city, or street name, etc., so I’m not sure how the Help idea would work in this case. The queries would all be worked out on the fly, depending on what information is in the original address, how bad a form it is in, and whether any matches can be made early on.
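To make the kind of query concrete, here is a minimal sketch in Python of the matching step: normalize the address parts, find the street, then find the delivery range that contains the house number. The record layout, the sample values, and the names `RANGES`, `normalize`, and `lookup` are all made up for illustration; the real post office files carry far more fields (odd/even ranges, street nicknames, and so on) that this ignores.

```python
from bisect import bisect_right

# Hypothetical sample data: (state, city, street) maps to a list of
# (low house no., high house no., zip+4 add-on, carrier route),
# sorted by low house number. The values are invented.
RANGES = {
    ("NY", "SPRINGFIELD", "MAIN ST"): [
        (100, 198, "1234", "C001"),
        (200, 298, "1235", "C002"),
    ],
}

def normalize(part: str) -> str:
    """Upper-case and collapse runs of whitespace."""
    return " ".join(part.upper().split())

def lookup(state: str, city: str, street: str, house_no: int):
    """Return (zip+4 add-on, carrier route) for an address, or None if no range matches."""
    key = (normalize(state), normalize(city), normalize(street))
    ranges = RANGES.get(key, [])
    lows = [r[0] for r in ranges]
    # Find the last range whose low bound is <= house_no.
    i = bisect_right(lows, house_no) - 1
    if i >= 0 and house_no <= ranges[i][1]:
        return ranges[i][2], ranges[i][3]
    return None

print(lookup("ny", " springfield ", "Main  St", 150))
```

Each lookup is a keyed search on state, city, and street followed by a range test, which is why the data needs real indexes rather than a packed, read-only format.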

A DVD would be out of the question. I would have to stay competitive with the other vendors, and they are all distributing the data on CD. Delivering 10 CDs every other month (that’s how often the post office publishes new files) would also be out of the question.

Do you know of any other database whose data is this compressed? Paradox (DB) or whatever?

Thanks for considering my problem.

Ed