>
> >>Say I have a set of data that is the price of a home.
> >>
> >>100,000
> >>125,000
> >>92,000
> >>175,000
> >>135,000
> >>10,000
> >>500,000
> >>
> >>Well, I know that the 10,000 figure is really just for a lot. So when
> >>figuring trends, I need to throw it out. I also know that the 500,000
> >>figure is also way off, so it too needs to be thrown out.
> >>
> >>If each of these numbers were in a field in a dbf (one per record), how
> >>could I programmatically throw out numbers that were way off? I have tried
> >>different forms of average and standard deviation, but nothing seems quite
> >>right. I need some high school algebra or statistics lessons, I think.
> >
No. It depends on how you want to look at this. As a programmer, I would
say that the specs are not complete and let the client sort this out.
As an (honest, Bruce, are there? :-)) statistician, you would have to
justify your sample. If it is representative, you should not have to
leave anything out; if it is not, ah, then why conclude anything on its
basis? Leaving something out is acceptable only if you can justify it,
meaning that by doing so you can better demonstrate the
representativeness of your data. I doubt that this could be done
programmatically, since by definition it is a judgement call, which
until lately could not be made by a computer :-).
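
That said, a program can at least flag candidates for a human to review.
Here is a minimal sketch in Python (not dbf code), assuming the prices
have already been read into a list. It scores each price by its distance
from the median, measured in units of the median absolute deviation
(MAD). That is one common robust rule among several, swapped in for
illustration, not something the question prescribes. The function name
flag_far_from_median and both cutoff values are made up for this example.

```python
from statistics import median


def flag_far_from_median(values, cutoff):
    """Return the values lying more than cutoff * MAD away from the median.

    MAD is the median of the absolute deviations from the median.
    The cutoff is an assumption the caller must choose and justify.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values are identical: no spread to measure against.
        return []
    return [v for v in values if abs(v - med) / mad > cutoff]


# The prices from the question, in their original order.
prices = [100000, 125000, 92000, 175000, 135000, 10000, 500000]

print(flag_far_from_median(prices, 3.0))  # [10000, 500000]
print(flag_far_from_median(prices, 3.5))  # [500000]
```

With a cutoff of 3.0 both the lot and the 500,000 house are flagged; at
3.5 only the 500,000 house is. Picking the cutoff is exactly the
judgement call that stays with you.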
My 2 BEF of wisdom,
Marc
If things have the tendency to go your way, do not worry. It won't last. Jules Renard.