Question

Searching complete dataset for null values

7 years ago
April 3, 2018
2 replies

dustin
Influencer
625 replies

I'm trying to calculate the percentage of null values per feature class in a dataset. A value counts as null if it is '-99999', 'noinformation', or 'No Information'.

I have hundreds of feature classes, each with hundreds of differently named attributes, so building a workbench that calls out each attribute explicitly would be very time-consuming. This needs to be a dynamic solution.

So far I've tried an AttributeExploder followed by a ListSearcher, but I'm having trouble calculating totals per feature class.

david_r
8355 replies
7 years ago
April 3, 2018

Use the Group By in the StatisticsCalculator to calculate totals by group, in this case using the feature class name (fme_feature_type).

If your data is stored in a SQL database, a much faster solution would be to use a SQLExecutor with something like

-- 100.0 forces decimal division; integer / integer truncates in many databases
select sum(
    case when your_attribute in ('-99999', 'noinformation', 'No Information') then 1
    else 0 end) * 100.0 / count(*) as percentage_empty
from your_feature_class

You can, for example, use the "Schema (any format)" reader to extract a list of all the feature classes and their attribute names to feed into the SQLExecutor.
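
As a rough illustration of how a schema listing could drive that query, here is a minimal Python sketch that builds one statement per feature class. It is not FME functionality: the function name `build_null_percentage_sql` is hypothetical, and it assumes "percentage of null values" means null cells divided by all cells (rows times attributes). It also assumes the attribute and table names are safe to use unquoted and that the database accepts comparing each column against string literals.

```python
# The null markers named in the question.
NULL_MARKERS = ("-99999", "noinformation", "No Information")


def build_null_percentage_sql(feature_class, attributes):
    """Build one query returning the percentage of null cells in a table.

    Hypothetical helper: feature_class and attributes would come from a
    schema listing; they are inserted unquoted, so they must be trusted.
    """
    markers = ", ".join("'{}'".format(m) for m in NULL_MARKERS)
    # One conditional sum per attribute, added together.
    cases = " + ".join(
        "sum(case when {a} in ({m}) then 1 else 0 end)".format(a=a, m=markers)
        for a in attributes
    )
    # Denominator: number of rows times number of attributes.
    return (
        "select ({cases}) * 100.0 / (count(*) * {n}) as percentage_empty "
        "from {fc}"
    ).format(cases=cases, n=len(attributes), fc=feature_class)
```

Calling `build_null_percentage_sql("roads", ["a", "b"])` gives a single statement covering both attributes, so each feature class needs only one round trip to the database.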

gio
Contributor
2252 replies
7 years ago
April 3, 2018

If you use an AttributeExploder, you can create a list grouped by attr_value.

Sum all the list element counts to get the total. (It's better to count the features before building the list, though; that saves some effort.)

Then test attr_value against the criteria you mentioned, add up the counts of the matching values, and divide by the total.
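
The counting logic described above can be sketched outside FME in plain Python. This is only an illustration of the arithmetic, not a transformer configuration: the function name and the input shape (pairs of feature class name and attribute dictionary) are assumptions, and values are compared as strings so that a numeric -99999 also matches.

```python
from collections import Counter

# The null markers named in the question.
NULL_MARKERS = {"-99999", "noinformation", "No Information"}


def null_percentage_by_feature_class(features):
    """Return {feature_class: percentage of attribute values that are null}.

    features: iterable of (feature_class, attributes_dict) pairs, a
    hypothetical stand-in for features leaving an AttributeExploder.
    """
    total = Counter()  # all attribute values seen per feature class
    nulls = Counter()  # values matching a null marker per feature class
    for feature_class, attributes in features:
        for value in attributes.values():
            total[feature_class] += 1
            if str(value) in NULL_MARKERS:
                nulls[feature_class] += 1
    return {fc: 100.0 * nulls[fc] / total[fc] for fc in total}
```

Grouping by the feature class name here plays the same role as grouping by fme_feature_type in the workspace.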
