Distributing and using VERY large tables.
Message
To: Nancy Folsom, Pixel Dust Industries, Washington, United States
Date: 04/08/2001 18:36:52
General information
Forum: Visual FoxPro
Category: Databases, Tables, Views, Indexing and SQL syntax - Miscellaneous
Thread ID: 00539842
Message ID: 00539889
Views: 20
>>>>I have to create a table of 35 million plus records. The table is to be populated from 850 fixed-length text files. I get about a third of the way through the imports and I hit the 2 Gig limit for Visual FoxPro tables.
>>>>
>>>>I took the 2 Gig dbf file and exported it to a delimited text file to see how much smaller it might be since there are a lot of empty fields in the table. To my surprise the 2,079,274 KB table made a 2,066,947 KB delimited text file.
>>>
>>>Did you use SDF or CSV? I'm finding CSV to be smaller because it removes the spaces (though not for numbers). I don't think it's really going to solve the problem, though.
>>
>>Nancy,
>>
>>I had done a COPY TO ... DELIMITED. I was surprised that I didn't get a larger drop in file size. I opened the resulting txt file just to make sure it wasn't accidentally a fixed-length file, but it was a delimited file without the spaces. Even had that worked and reduced the file size to something manageable, I would still have had the problem of how to do the complicated searches on a text file that I would normally do in VFP using indexes and Rushmore.
>
>Apparently, then, it's not entirely clear exactly what your question is. Do you want to know how to compress 2 gigs down to 650 MB and still have it accessible as a Fox backend?
>
>When the application was designed, was the requirement that it fit on a CD known? Was the data size evaluated at that point, and were backend tools evaluated against it?

Nancy,

The data comes from the post office. It is a list of every street, building, apartment, etc., in the country and the range of deliverable addresses. It also has the information needed to match street nicknames, add the ZIP+4, add the carrier route, and so on. A few fields would not be needed by everybody, such as congressional district and county number, but for the most part the majority of the fields are needed, and it would be the same information distributed by all vendors. The type of software that needs this information is called CASS software, and it is used to clean up a mailing list in order to do Automation-compatible bulk mail. Some of the vendors of CASS software are Semaphore Corp. (ZP4), MyMailer, Melissa Software (Mailer's Plus), DataTech (AccuMail), and GroupOne Software.
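To make that concrete, here is a rough VFP sketch of the kind of record layout I am describing. Every table name, field name, and width below is a guess at what the ZIP+4 data carries, not the actual USPS spec:

* Hypothetical layout -- every name and width below is a guess,
* not the real spec. zip5 = 5-digit ZIP, plus4lo/plus4hi = the
* +4 range, rangelo/rangehi = the deliverable address range,
* crrt = carrier route, county = one of the optional fields.
CREATE TABLE zip4 FREE ( ;
    zip5 C(5), ;
    plus4lo C(4), ;
    plus4hi C(4), ;
    street C(28), ;
    suffix C(4), ;
    rangelo C(10), ;
    rangehi C(10), ;
    crrt C(4), ;
    county C(3))
INDEX ON zip5 + UPPER(street) TAG stzip

Even a lean layout like this, at 35 million records, runs to several gigabytes uncompressed, which is the heart of the problem.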

I have looked at the disks of three different companies, and the “data” always seems to be in the range of about 350,000 to 450,000 KB. The file extensions range from .dat and .d01 through .d05 to .cas. One company (DataTech’s AccuMail) seems to be written in dBase from what I can see in the FAQ and Help section of their Web site, but the data is not in dbf format, or at least VFP can’t open it.

My application hasn’t been designed yet; this data problem is my first hurdle. It will have to be distributed on a single CD, though, so I need to get the data down to no more than 650 MB. If need be, I can write a DLL in something else like VB or C# if that’s what it takes to get the data down and read it back. Normally I would look at 6 Gig of data and say there is no way to knock it down to a tenth of its size and still have it accessible as live data, but since I see that the other vendors have done this, it must be possible. I just don’t know what format they are putting the data in or what language/engine they are using to access it.
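One plausible approach, and it is only a guess at what those vendors are doing: keep the fixed-width records in compressed blocks inside one big data file, with a small DBF that maps each block's first key to its file offset and sizes. Only the tiny index table is ever SEEKed, and only one block at a time is decompressed, so the data still behaves like live data. A rough VFP sketch, assuming a zlib DLL built with stdcall exports (VFP's DECLARE cannot call cdecl entry points) and hypothetical file and field names (addrdata.bin; blockidx.dbf with firstkey, offset, compsize, plainsize):

* Sketch only: blockidx.dbf holds one row per compressed block.
DECLARE INTEGER uncompress IN zlibstd.dll ;
    STRING @ dest, INTEGER @ destlen, STRING source, INTEGER srclen

FUNCTION GetBlock
    LPARAMETERS tcKey
    LOCAL lnH, lcComp, lcPlain, lnPlainLen
    SELECT blockidx
    SET ORDER TO TAG firstkey
    SET NEAR ON
    SEEK tcKey
    IF !FOUND() AND !BOF()
        SKIP -1    && last block whose first key is <= tcKey
    ENDIF
    lnH = FOPEN("addrdata.bin")
    = FSEEK(lnH, blockidx.offset)
    lcComp = FREAD(lnH, blockidx.compsize)
    = FCLOSE(lnH)
    lcPlain = REPLICATE(CHR(0), blockidx.plainsize)
    lnPlainLen = blockidx.plainsize
    = uncompress(@lcPlain, @lnPlainLen, lcComp, LEN(lcComp))
    RETURN lcPlain    && fixed-width records; bisect inside the block
ENDFUNC

Since the records are mostly text and blank padding, something like 10:1 under deflate-style compression is not far-fetched, which would land 6 Gig comfortably on a CD.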

Thanks for your feedback.

Ed