Normal distribution - Level Extreme

Normal distribution

Message

From

31/07/2003 11:44:15

Peter De Valença
Haarlem, Netherlands

30/07/2003 22:46:01

Godfrey Nicholson
Ofek Technologies Ltd
Auckland, New Zealand

General information

Forum:

Visual FoxPro

Category:

Coding, syntax & commands

Title:

Re: Normal distribution

Miscellaneous

Thread ID:

00813885

Message ID:

00815437

Hi Godfrey,

Here's how I made things visible with a 'bell' graph: instead of SMALL(), use COUNTIF() with the same parameters. Use as many rows as there are possible values (e.g. 100 rows for the values 1-100).
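For readers without Excel at hand, the COUNTIF() idea can be sketched in Python. This is only an illustration under assumptions: the function name `frequency_table` and the sample data are made up, and the values are assumed to be whole numbers in a known range.

```python
from collections import Counter

def frequency_table(values, lowest, highest):
    """Return (value, count) pairs for every integer from lowest to highest.

    Each pair plays the role of one COUNTIF() row: how often does this
    particular value occur in the data?
    """
    counts = Counter(values)
    return [(v, counts[v]) for v in range(lowest, highest + 1)]

# Hypothetical sample data, just to show the shape of the output.
sample = [3, 1, 2, 3, 3, 5]
for value, count in frequency_table(sample, 1, 5):
    # One star per occurrence gives a crude text version of the graph.
    print(f"{value:3d} {'*' * count}")
```

Values that never occur still get a row with a count of zero, so gaps in the distribution stay visible in the graph.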

Your tip that the combination of skewness and kurtosis makes the test has inspired me. You didn't give any significance values, so I searched the internet. The best text I found: http://www.jalt.org/test/bro_1.htm

In my test case the sample is 1000 values. The text suggests these limits (twice the standard error):

skewness: SQRT(  6/1000 ) * 2 ---> .154919334
kurtosis: SQRT( 24/1000 ) * 2 ---> .309838668

So, the observed skewness should lie between -.154919334 and +.154919334 to justify the interpretation that the distribution is not skewed (any skew found is then small enough to be due to chance). Likewise, the observed kurtosis should lie between -.309838668 and +.309838668 to justify the interpretation that the distribution is not kurtic.
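The two limits can be recomputed for any sample size with a few lines of Python. This is a minimal sketch of the formulas quoted above, nothing more; the function name `two_se_limits` is my own invention.

```python
import math

def two_se_limits(n):
    """Return (skewness limit, kurtosis limit) for a sample of n values.

    Uses the approximations quoted above: standard error of skewness
    sqrt(6/n), standard error of kurtosis sqrt(24/n), each times 2.
    """
    skew_limit = 2 * math.sqrt(6.0 / n)
    kurt_limit = 2 * math.sqrt(24.0 / n)
    return skew_limit, kurt_limit

skew_limit, kurt_limit = two_se_limits(1000)
print(f"skewness limit: {skew_limit:.9f}")  # 0.154919334
print(f"kurtosis limit: {kurt_limit:.9f}")  # 0.309838668
```

Because n sits under the square root, ten times as many values narrows both limits by a factor of about 3.16.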

In Excel, I found the following values (new data, each group has 1000 values):

method     skewness     kurtosis       interpretation
--------   ----------   ------------   --------------------------------
1xRAND()   .014784091   -1.158600547   Not skewed. Platykurtic (too flat!). Sounds logical.
4xRAND()   .026446135   -0.346086353   Not skewed. Still somewhat platykurtic (still slightly flat).
7xRAND()   .036880559   -0.299542361   Not skewed. Borderline platykurtic (just within the limit).

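So, the values that were generated with 7xRAND() pass this test: both skewness and kurtosis fall within the limits, which is consistent with a normal distribution (the kurtosis only just). I guess an increase to, let's say, 10 will give an even better result. A larger number of values (e.g. 10,000) should also give more reliable estimates, although the limits then become narrower too.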
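The experiment in the table can be repeated outside Excel. The Python sketch below rests on two assumptions: that "k x RAND()" means the sum of k independent RAND() values, and that the bias-corrected sample formulas used here are the ones behind Excel's SKEW() and KURT(), as I understand them to be. The function names and the seed are arbitrary, and the numbers will differ from run to run and from the table.

```python
import math
import random

def sample_skew_kurt(xs):
    """Bias-corrected sample skewness and excess kurtosis of xs (needs n > 3)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))  # sample std dev
    z3 = sum(((x - mean) / s) ** 3 for x in xs)
    z4 = sum(((x - mean) / s) ** 4 for x in xs)
    skew = n / ((n - 1) * (n - 2)) * z3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return skew, kurt

def sum_of_uniforms(k, n, rng):
    """Return n values, each the sum of k uniform random numbers in [0, 1)."""
    return [sum(rng.random() for _ in range(k)) for _ in range(n)]

rng = random.Random(2003)            # arbitrary seed, only for repeatability
n = 1000
skew_limit = 2 * math.sqrt(6 / n)    # limits as quoted from the article
kurt_limit = 2 * math.sqrt(24 / n)

for k in (1, 4, 7):
    skew, kurt = sample_skew_kurt(sum_of_uniforms(k, n, rng))
    skew_ok = abs(skew) <= skew_limit
    kurt_ok = abs(kurt) <= kurt_limit
    print(f"{k}xRAND()  skewness {skew:+.4f} ({'ok' if skew_ok else 'outside'})  "
          f"kurtosis {kurt:+.4f} ({'ok' if kurt_ok else 'outside'})")
```

For comparison, the theoretical excess kurtosis of a sum of k uniform values is -1.2/k: -1.2 for k = 1, -0.3 for k = 4 and about -0.17 for k = 7. So the flatness shrinks as more values are added, but never disappears completely.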

My first impression, that the distribution looked somewhat flat in the middle, was correct.

The algorithm that I got from someone (4xRAND()) is nice, but appears not to be the perfect one. On the internet I found references to the Box-Muller algorithm and others. There's also a lively group of algorithms that go under the heading of 'rational approximations'. I assume that such algorithms give results that are not as perfect as the official ones, but that are good enough for certain less-critical situations (like mine), with the advantage of, for example, high speed and simplicity of implementation. The 4xRAND() method might be regarded as such an algorithm, although it wasn't invented by a mathematician :).
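For reference, the basic form of the Box-Muller transform is short enough to sketch in Python. This is only an illustration of the textbook version, not something I have tuned or compared for speed; the function name `box_muller` is my own.

```python
import math
import random

def box_muller(rng):
    """Turn two uniform random numbers into two independent standard normals."""
    u1 = 1.0 - rng.random()   # in (0, 1], so log(u1) is always defined
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

rng = random.Random(815437)   # arbitrary seed, only for repeatability
z0, z1 = box_muller(rng)
print(z0, z1)

# Scale and shift for another mean and standard deviation (example values):
mean, std_dev = 50.0, 10.0
print(mean + std_dev * z0)
```

Unlike a sum of a few RAND() values, this has no built-in limit on how far a value can fall from the mean, so the tails are not cut off.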

I'm not sure why my customer needs these values. All I know is that he'll add this field to the values in another field. The other field may contain missing values. Reading that field on its own into his OLAP tool would generate errors when values are missing. The concatenation should prevent such errors. Something like that... :)

Regards,
Peter de Valença

Constructive frustration is the breeding ground of genius.
If there’s no willingness to moderate for the sake of good debate, then I have no willingness to debate at all.
Let's develop superb standards that will end the holy wars.
"There are three types of people: Alphas and Betas", said the beta decisively.
If you find this message rude or offensive or stupid, please take a step away from the keyboard and try to think calmly about a possible alternative explanation of my message.
