Randomly select n unique codes from 5mln table

Level Extreme platform

Subscription

Corporate profile

Products & Services

Support

Legal

Français

Randomly select n unique codes from 5mln table

Message

From

10/11/2001 05:36:30

Cetin Basoz
Engineerica Inc.
Izmir, Turkey

09/11/2001 13:42:13

Mike Yearwood
Toronto, Ontario, Canada

General information

Forum:

Visual FoxPro

Category:

Databases,Tables, Views, Indexing and SQL syntax

Title:

Re: Randomly select n unique codes from 5mln table

Miscellaneous

Thread ID:

00579211

Message ID:

00580103

Views:

>>Hi everybody,
>>
>>How can I randomly select n (110) codes from some huge table, so these codes would be unique? I'm trying to create a test case now...
>
>Every suggestion you've been given will involve moving through more records than you need. The rand() function will be a performance hit if its used in a SQL. There is a method I learned years ago called a fixed interval random start sample.
>
>If you have 1,000,000 records and you want 110 records, you divide 1M by 110 yielding 9090. You then pick a random number between 1 and 9090 and then you select records from that random start by skipping forward 9090 records per jump.
>
>Using a loop, you get the records so fast your head will spin.
>
>lnTotal = 110
>lnInterval = int(1000000 / m.lnTotal)
>lnFirst = int(RAND()* m.lnInterval)
>lnRec = m.lnFirst
>go m.lnRec
>for lnCount = 1 to m.lnTotal
> ?recno()
> skip m.lnInterval
>endfor
>
>This technique was used to select people for telephone interviews from a list of 13 million numbers. Every query we tried took too much time as they involved some kind of calculation against each record in the table.
>
>The system was much more complex. Certain area codes were deemed less likely to participate, so we had a factor which increased the number of records per area code.
>
>Records that were selected were ineligible to be selected again until after it was used in an interview, etc etc.

Mike,
"The rand() function will be a performance hit if its used in a SQL."
Absouletely agree and my original reply was a pitfall.
For a moment thinking codes were unique does really all suggestions move pointer more than needed compared to this code ?

"Every suggestion you've been given will involve moving through more records than you need." Every ??? I don't think so even if the codes were unique.
This approach has assumptions that target set is huge and draw count is relatively small. Not a big deal of course.
Cetin

Çetin Basöz

The way to Go
Flutter - For mobile, web and desktop.
World's most advanced open source relational database.
.Net for foxheads - Blog (main)
FoxSharp - Blog (mirror)
Welcome to FoxyClasses

LinqPad - C#,VB,F#,SQL,eSQL ... scratchpad

Map

View

Click here to load this message in the networking platform