I've been trying a few things to get the best performance I can out of importing a lot of data. I've got some really big data files, up to around 500 MB; these need to be pulled into tables where they are sorted and matched, then output into pipe-delimited files for some other software down the line.
Due to the amount of data (when it was all in just one DBF I was hitting the 2 GB limit) and the fact that only a small amount of it matters for the sorting and matching, I have tables with the useful fields plus memo fields for the rest. Our data comes in as text files where every 70-80 lines is a record, terminated by an ASCII character 12. I've sped up the reading of the files by replacing my FGETS() calls: I wrote a C++ DLL that uses memory-mapped file I/O and returns a full record at a time, which I can then use ALINES() on to extract the useful fields and put the rest in the memo fields.
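For reference, the VFP side of that loop looks roughly like this (the DLL name and the GetNextRecord entry point are just stand-ins for my actual exports, and the field extraction is simplified):

* Declare the reader exported by the C++ DLL (names are placeholders)
DECLARE STRING GetNextRecord IN recreader.dll STRING cFileName

LOCAL cRecord, nLines
LOCAL ARRAY aRec[1]
DO WHILE .T.
    * One full record back per call, everything up to the next CHR(12)
    cRecord = GetNextRecord("bigfile.txt")
    IF EMPTY(cRecord)
        EXIT
    ENDIF
    * Split the 70-80 lines of the record into an array, one line per element
    nLines = ALINES(aRec, cRecord)
    * ... pull the useful fields out of aRec, keep the rest for the memos ...
ENDDO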
The reading and splitting part takes just over a second per 100 MB, i.e. as fast as the HD array can manage.
However, writing that data back to the DBF adds 14 minutes to the process.
Admittedly, when all is said and done the DBF+FPT comes to around a gig when the file being read in is only 500 MB, but surely there is a way of doing better than 14 minutes without me writing another DLL to write the table and FPT directly, bypassing FoxPro's internal I/O?
Initially I was outputting a record at a time from my array using GATHER FROM ... MEMO.
Every 5,000 records took about 20 seconds.
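Roughly, the record-at-a-time version looked like this (table and array names here are placeholders):

SELECT mytable
FOR nRec = 1 TO nRecCount
    * ... fill aRow[] with the parsed fields for this record ...
    APPEND BLANK
    GATHER FROM aRow MEMO    && MEMO clause so the memo fields are written too
ENDFOR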
I thought that batching would be the solution, so I read 50 records at a time into the array and used INSERT INTO blah FROM ARRAY.
Every 5,000 records then took... 20 seconds! No difference.
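That version, again with placeholder names:

* aBatch is a 2D array: one row per record, one column per field
LOCAL ARRAY aBatch[50, nFieldCount]
* ... fill the 50 rows from the parsed records ...
INSERT INTO blah FROM ARRAY aBatch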
I've profiled the program, and 95% of all the time taken is in that write to the table/FPT. Is there anything I can do?
Ken.