>>>>I work for an auditing firm where we are looking for duplicate financial transactions spanning approx 50M records across 21 separate tables. We are looking for matches based on, say, ABS(GROSS) and INVOICE across all 21 tables. Basically, each record in each table has to be checked against the current table and against every record in all the other tables. The results are compiled into a separate table with a blank record between groupings. Currently this takes 72 hours. There's got to be a better way! Any suggestions?
>>>
>>>Would it be faster to UNION ALL the tables together and then run GROUP BY + HAVING queries?
>>
>>Each table is ~1.5 GB in size (~30 GB in all). Wouldn't the UNION ALL put me over the 2 GB file limit?
>
>You said 50M, so I assumed the size. What are you using now?
50M referred to the number of records. Currently using 21 tables with ~2M records in each (~1.5 GB file size per table), and simply running procedural code that checks each record against every other record.
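For what it's worth, the UNION ALL + GROUP BY/HAVING suggestion above can be sketched like this. This is only a sketch of the query shape, using sqlite3 in Python with two toy tables and made-up table/column names standing in for the real 21 tables; it assumes each table exposes INVOICE and GROSS columns:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Two toy tables standing in for the 21 real ones (hypothetical names).
for t in ("t1", "t2"):
    cur.execute(f"CREATE TABLE {t} (invoice TEXT, gross REAL)")
cur.executemany("INSERT INTO t1 VALUES (?, ?)",
                [("A100", 500.00), ("A101", -75.25)])
cur.executemany("INSERT INTO t2 VALUES (?, ?)",
                [("A100", -500.00),   # offsetting credit for t1's debit
                 ("B200", 10.00)])

# One pass: stack all tables with UNION ALL, then group on the match key.
# ABS(gross) lets a debit and its offsetting credit count as a match.
cur.execute("""
    SELECT invoice, ABS(gross) AS abs_gross, COUNT(*) AS hits
    FROM (SELECT invoice, gross FROM t1
          UNION ALL
          SELECT invoice, gross FROM t2)
    GROUP BY invoice, ABS(gross)
    HAVING COUNT(*) > 1
""")
dupes = cur.fetchall()
print(dupes)
```

The engine does one grouped scan instead of record-by-record comparisons, which is where the speedup comes from. If the combined file would blow past a 2 GB limit, grouping on a narrow extract (just INVOICE and GROSS from each table, tagged with a source-table column) keeps the intermediate result small while still finding cross-table matches.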
Duty is ours, the results are God's - John Quincy Adams