>>Say I have a set of data that is the price of a home.
>>
>>100,000
>>125,000
>>92,000
>>175,000
>>135,000
>>10,000
>>500,000
>>
>>Well, I know that the 10,000 figure is really just for a lot, so when figuring trends I need to throw it out. I also know that the 500,000 figure is way off, so it needs to be thrown out too.
>>
>>If each of these numbers were in a field in a dbf (one per record), how could I programmatically throw out numbers that are way off? I have tried different forms of average and standard deviation, but nothing seems quite right. I think I need some high school algebra or statistics lessons.
>
>Well, you are leaving some things out. Will there always be a low and a high that need to be dropped? Can you set up a simple range to compare values record to record? Set up an index on value and determine the differences between the numbers to establish a reasonable range. If a number falls way outside the range, drop it. Is there another field that can be used to identify records to exclude? How do you know that $10,000 is a lot value? htwh
Yes, most of the time there would be both a high and a low, but sometimes just one or the other. I know logically that the low number is a bad price: if all the homes next door to one another are selling for around some average x, then one at 10,000 is way below. No, there is not another field I can use.
My basic question is: what is an acceptable range to include statistically? The middle 80%, the middle 40%, etc.?
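For what it is worth, a fixed "middle 80%" cut always throws something away, even when every price is fine. One common convention that avoids this is Tukey's fences: keep anything within 1.5 times the interquartile range of the quartiles. The sketch below is only an illustration in Python, not dbf code. The function name `trim_outliers`, the multiplier `k=1.5`, and the choice to work on the logarithm of the price are all assumptions made for this example. The log step is there because, on the raw prices in this sample, the lower fence comes out negative and the 10,000 lot would not be flagged.

```python
import math
import statistics

def trim_outliers(prices, k=1.5):
    """Keep prices that fall inside Tukey's fences computed on log(price).

    Assumptions for this sketch: every price is positive, there are at
    least two prices, and k=1.5 is the conventional (not mandatory) multiplier.
    """
    logs = [math.log(p) for p in prices]
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(logs, n=4)
    spread = q3 - q1                      # interquartile range on the log scale
    low, high = q1 - k * spread, q3 + k * spread
    # Preserve the original record order; drop anything outside the fences.
    return [p for p in prices if low <= math.log(p) <= high]

# The sample prices from the first message in this thread.
prices = [100_000, 125_000, 92_000, 175_000, 135_000, 10_000, 500_000]
kept = trim_outliers(prices)
print(kept)
print(statistics.mean(kept))
```

On the seven sample prices this drops both the 10,000 and the 500,000 records and keeps the other five. When there is only a high or only a low outlier, the same rule drops just that one, and when nothing is far off it drops nothing. A larger `k` makes the rule more forgiving.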