Question

Remove outliers in a list

7 years ago
September 20, 2017
8 replies
113 views

stihe
13 replies

I have a vector dataset containing buildings and a raster dataset containing elevation data.

I am looking for a way to find the mean and maximum roof elevation of each building.

So far I have converted the raster dataset to points, extracted the raster value to an attribute, and clipped the points with the building polygons. The StatisticsCalculator can help me find the mean and max value in each building, but before doing that I would like to remove the outliers.

So the question is: how can I remove the top and bottom 5% of the values in a list?
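Outside of any particular transformer, the literal "drop the top and bottom 5%" operation can be sketched in plain Python as below. This is only an illustration: the function name `trim_extremes` and the sample elevations are made up, and the count dropped at each end is rounded down.

```python
import math

def trim_extremes(values, fraction=0.05):
    """Return the values sorted, with the lowest and highest `fraction` dropped."""
    ordered = sorted(values)
    # Number of values to drop at each end, rounded down.
    k = math.floor(len(ordered) * fraction)
    return ordered[k:len(ordered) - k]

# Hypothetical elevation values for one building (40 points).
elevations = [float(v) for v in range(1, 41)]

trimmed = trim_extremes(elevations)
mean_height = sum(trimmed) / len(trimmed)
max_height = max(trimmed)
```

With 40 values, two are dropped at each end, and the mean and maximum are then taken from the remaining 36.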


redgeographics
Celebrity
3639 replies
7 years ago
September 20, 2017

The StatisticsCalculator can also calculate the standard deviation. In a random sample you would expect 95% of the values to be within 2 standard deviations of the mean. If you filter the rest out you will probably have gotten rid of the most extreme outliers.
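As a rough sketch of that idea in plain Python (not FME itself): compute the mean and standard deviation, then keep only values within 2 standard deviations of the mean. The function name, the sample heights, and the choice of the population standard deviation (`statistics.pstdev`) are assumptions for illustration; whether the StatisticsCalculator reports the sample or population figure is not stated here.

```python
import statistics

def within_two_stdev(values, limit=2.0):
    """Keep the values whose distance from the mean is at most `limit` stdevs."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # assumption: population standard deviation
    if stdev == 0:
        # All values are identical, so there is nothing to filter.
        return list(values)
    return [v for v in values if abs(v - mean) / stdev <= limit]

# Hypothetical roof heights with one obvious outlier (25.0).
heights = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 10.1, 9.9, 25.0]
kept = within_two_stdev(heights)
```

Here the 25.0 value sits roughly 3 standard deviations from the mean and is dropped, while the other nine values are kept.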

jneujens
189 replies
7 years ago
September 20, 2017

Totally agree with redgeographics! Another way might be to set a maximum and use a TestFilter to filter out everything larger than your maximum parameter. This can have major consequences, so I still advise you to use the approach with the StatisticsCalculator.

stihe
Author
13 replies
7 years ago
September 21, 2017

redgeographics wrote:

Thanks for the answer.

How do I filter the 5% out? Do I use the TestFilter? I can't get it to work with lists, so I welcome any suggestions.


redgeographics
Celebrity
3639 replies
7 years ago
September 21, 2017

stihe wrote:

Thanks for the answer.

How do I filter the 5% out? Do I use the TestFilter? I can't get it to work with lists, so I welcome any suggestions.

This should do it. I quickly tested it with a bunch of random numbers (which obviously don't follow a normal distribution); you would be testing the z-values.

stihe
Author
13 replies
7 years ago
September 21, 2017

redgeographics wrote:

This works with attributes, but not with lists. Should I convert the list to attributes before running the TestFilter?


redgeographics
Celebrity
3639 replies
7 years ago
September 21, 2017

stihe wrote:

This works with attributes, but not with lists. Should I convert the list to attributes before running the TestFilter?

Yes: use a ListExploder to turn the list into features, then that TestFilter, and then if you want you can rebuild the list again.
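The same explode, test, rebuild sequence can be sketched in plain Python to show the data flow. This is not FME code: the building ids, the sample values, and the `passes` helper are hypothetical, and the test used is the 2-standard-deviation rule discussed earlier in the thread.

```python
from collections import defaultdict
import statistics

# Hypothetical input: one list of elevations per building id.
buildings = {
    "A": [12.0, 12.1, 11.9, 12.2, 12.0, 11.8, 12.1, 12.0, 11.9, 30.0],
    "B": [8.0, 8.1, 7.9, 8.0, 8.2, 7.8, 8.1, 8.0, 7.9, 8.0],
}

# Step 1: "explode" each list into one (building id, value) row per element.
rows = [(bid, v) for bid, values in buildings.items() for v in values]

# Step 2: per-building mean and stdev, then test each row against them.
stats = {
    bid: (statistics.mean(values), statistics.pstdev(values))
    for bid, values in buildings.items()
}

def passes(bid, value, limit=2.0):
    """True if the value lies within `limit` stdevs of its building's mean."""
    mean, stdev = stats[bid]
    return stdev == 0 or abs(value - mean) <= limit * stdev

# Step 3: rebuild one list per building from the rows that passed.
rebuilt = defaultdict(list)
for bid, v in rows:
    if passes(bid, v):
        rebuilt[bid].append(v)

# Mean and maximum per building, computed on the filtered lists.
summary = {bid: (statistics.mean(vs), max(vs)) for bid, vs in rebuilt.items()}
```

In this made-up data, building A loses its 30.0 value and building B keeps all ten values.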

stihe
Author
13 replies
7 years ago
September 21, 2017

redgeographics wrote:

Yes: use a ListExploder to turn the list into features, then that TestFilter, and then if you want you can rebuild the list again.

Perfect! Thanks :)

takashi
7694 replies
7 years ago
September 21, 2017

redgeographics wrote:

Note: you can say that approximately 95% of the points are within the range between (mean - stdev x 2) and (mean + stdev x 2) only if the distribution of the elevations of the points can be considered a normal distribution.

See also here: Normal distribution | Wikipedia
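A small Python check can illustrate the point. The two datasets below are invented and deliberately not normal; the function name is hypothetical and the population standard deviation is assumed.

```python
import statistics

def share_within_two_stdev(values):
    """Fraction of the values lying within 2 standard deviations of the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # assumption: population standard deviation
    inside = [v for v in values if abs(v - mean) <= 2 * stdev]
    return len(inside) / len(values)

# Evenly spread values (roughly uniform): nothing lies beyond 2 stdev.
flat = [float(v) for v in range(1, 101)]
flat_share = share_within_two_stdev(flat)

# Skewed values: mostly low, with a few high ones on one side only.
skewed = [10.0] * 90 + [20.0] * 10
skewed_share = share_within_two_stdev(skewed)
```

For the evenly spread values the rule removes nothing, and for the skewed values it removes 10% from the upper side only, so in neither case does it cut 5% overall or 2.5% from each end.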
