Hi, I want to count values for individual fields in excel. There are 110 fields. I want total counts column by column for all values that are not missing, empty, or null.

Count Values in Excell fields

+23

warrendev
Enthusiast
119 replies
4 years ago
November 14, 2020

Hi @jencinas ,

I think the statistics calculator transformer will work for what you want. Set the statistics calculator mode to Sum.

excel statistics_calculator summary

J

jencinas
Author
14 replies
4 years ago
November 16, 2020

warrendev wrote:

Hi @jencinas ,

I think the statistics calculator transformer will work for what you want. Set the statistics calculator mode to Sum.

excel statistics_calculator summary

@Chris Warren , Hey this tool works pretty good for the numeric features in my excel spreadsheet. I was able to create a workbench to calculate the count and numeric count. However, for the fields with strings, I can only calculate the total count of features. This doesn't work for me as I want to determine the number of values available in those fields as well. In some cases, I might have 828 records in excel. But a given field may only have 642 values entered in total. Any ideas how I can get this "true" value for these fields as well as the numeric ones? Thanks. Joe

+32

ebygomm
Influencer
3254 replies
Best Answer
4 years ago
November 16, 2020

There is a custom transformer on the hub - NullAttributeCounter that will count all the nulls and output a count of Null and Not Null.

Depending on how you are reading your excel file you may need to use a NullAttributeMapper first (or choose to Read Blank cells as Null in the excel reader parameters)

J

jencinas
Author
14 replies
4 years ago
November 16, 2020

Thanks @ebygomm .

I got this to work. It accomplished the task I was after. I was able to create an output with the null versus not null values for each field. I did this for the original and processed spreadsheets so I can compare them and find data that didn't successfully move through one of my ETLs. I have numerous spreadsheets to migrate from one format to another. I needed a method to assess and compare the original to the processed data so I could quickly find the problems and address them. Thanks. Joe

+32

ebygomm
Influencer
3254 replies
4 years ago
November 16, 2020

jencinas wrote:

Thanks @ebygomm .

I got this to work. It accomplished the task I was after. I was able to create an output with the null versus not null values for each field. I did this for the original and processed spreadsheets so I can compare them and find data that didn't successfully move through one of my ETLs. I have numerous spreadsheets to migrate from one format to another. I needed a method to assess and compare the original to the processed data so I could quickly find the problems and address them. Thanks. Joe

If you're looking at doing a before and after comparison, the changedetector might be worth a look

J

jencinas
Author
14 replies
4 years ago
November 17, 2020

ebygomm wrote:

If you're looking at doing a before and after comparison, the changedetector might be worth a look

@ebygomm , Thanks for the suggestion. I am trying to compare two datasets, (Before & After). I looked into the ChangeDetector transformer. The issue I determined was that I can't readily apply that transformer to my situation. My ETL changes the headers in the original spreadsheet to headers that will import into a Personal Geodatabase. The ChangeDectector seems to rely on those headers being the same. What I want to compare is the ratio of Null to NotNull numbers so I can determine which fields in the original spreadsheet aren't getting through my ETL. It is a QAQC approach. Instead, I used a writer to place both the before and after spreadsheets into an MS Access db. Then, I set up a crosswalk table that connects the original field headers to the new ones. Then I was able to write a query that displays only the records where the null value fields don't match. It works. But it is more work than I want to go through. Since I have numerous spreadsheets to process, I don't want this workflow to be more arduous than doing a simple visual inspection of the two spreadsheets which would be faster. Any other ideas? Joe

J

jencinas
Author
14 replies
4 years ago
November 19, 2020

In short, I developed a workbench which calculates "empty" and "missing" values in the spreadsheet cells. I do this for for the original and final spreadsheets. Then the workbench applies the NullAttributeCounter transformer to generate an output which distinguishes the number of cells with Null values versus Non-Null values for the range of records in each spreasdsheet. The writer publishes the resultant tables (before and after) to an MS Access. I use the same naming convention for the tables in MS Access so they match the spreadsheets. Be sure to specify all three values in the "Update/Delete Key columns" parameter. I specified "insert" for the writer mode. This publishes the attributes for the AttrName, Null,NotNull fields that are generated from the transformer. Then, I have a canned query that pieces the fields of both tables together and finds the records where the "Null" fields between the two tables don't match. This is a QAQC query that pin points any fields that are not being processed in an earlier ETL I use to migrate the data from one format to another. A little work is involved as the headers in the before and after spreadsheets don't match. So, I had to create a crosswalk table and use it to join the other two in the Relationship feature of Access. Fortunately, I only have to do this once. I will be able to reuse this MS Access db for other groups of datasets. Thank you to @ebygomm and @Chris Warren for suggestions! Joe

J

jencinas
Author
14 replies
4 years ago
November 19, 2020

jencinas wrote:

In short, I developed a workbench which calculates "empty" and "missing" values in the spreadsheet cells. I do this for for the original and final spreadsheets. Then the workbench applies the NullAttributeCounter transformer to generate an output which distinguishes the number of cells with Null values versus Non-Null values for the range of records in each spreasdsheet. The writer publishes the resultant tables (before and after) to an MS Access. I use the same naming convention for the tables in MS Access so they match the spreadsheets. Be sure to specify all three values in the "Update/Delete Key columns" parameter. I specified "insert" for the writer mode. This publishes the attributes for the AttrName, Null,NotNull fields that are generated from the transformer. Then, I have a canned query that pieces the fields of both tables together and finds the records where the "Null" fields between the two tables don't match. This is a QAQC query that pin points any fields that are not being processed in an earlier ETL I use to migrate the data from one format to another. A little work is involved as the headers in the before and after spreadsheets don't match. So, I had to create a crosswalk table and use it to join the other two in the Relationship feature of Access. Fortunately, I only have to do this once. I will be able to reuse this MS Access db for other groups of datasets. Thank you to @ebygomm and @Chris Warren for suggestions! Joe

I spoke too soon. On a different dataset, I ran into a python error with the NullAttributeCounter transformer. Apparently, the pythoncode in the transformer doesn't handle fields that have both string and numerical values, which my dataset does. Now, I am rethinking how I can modify the data so I can still account for the nulls versus non-nulls. When this transformer works, it makes a really useful output. I have FME 2018 on a Citrix server.

+2

chrisatsafe
Contributor
606 replies
4 years ago
November 19, 2020

jencinas wrote:

@Chris Warren , Hey this tool works pretty good for the numeric features in my excel spreadsheet. I was able to create a workbench to calculate the count and numeric count. However, for the fields with strings, I can only calculate the total count of features. This doesn't work for me as I want to determine the number of values available in those fields as well. In some cases, I might have 828 records in excel. But a given field may only have 642 values entered in total. Any ideas how I can get this "true" value for these fields as well as the numeric ones? Thanks. Joe

Hi @jencinas ,

One way you could do this would be to create a temporary attribute and set the value of that attribute conditionally. Since you want to determine how many features have an attribute value, you can create a temporary attribute that sets the value to 1 if the attribute has a value and 0 if the attribute does not have a value. Then calculate the sum of the temp attribute in the StatisticsCalculator.

2020-11-19_15-04-51

+32

ebygomm
Influencer
3254 replies
4 years ago
November 20, 2020

jencinas wrote:

I spoke too soon. On a different dataset, I ran into a python error with the NullAttributeCounter transformer. Apparently, the pythoncode in the transformer doesn't handle fields that have both string and numerical values, which my dataset does. Now, I am rethinking how I can modify the data so I can still account for the nulls versus non-nulls. When this transformer works, it makes a really useful output. I have FME 2018 on a Citrix server.

Are you sure that is the issue? The null counter will count a mix of string and numerical values for me

J

jencinas
Author
14 replies
4 years ago
November 20, 2020

jencinas wrote:

I spoke too soon. On a different dataset, I ran into a python error with the NullAttributeCounter transformer. Apparently, the pythoncode in the transformer doesn't handle fields that have both string and numerical values, which my dataset does. Now, I am rethinking how I can modify the data so I can still account for the nulls versus non-nulls. When this transformer works, it makes a really useful output. I have FME 2018 on a Citrix server.

@ebygomm , I am not sure now. I did run it successfully against another spreadsheet with similar design. I will try again to make it run and troubleshoot the problem. The NullAttributeCounter does exactly what I want. So, when it failed this time and I got the Python error, I researched it on this knowledgebase. The closes example to the problem was what I already posted. Yet, that doesn't preclude that something else is causing the tool to fail. I will give it another try. Joe

J

jencinas
Author
14 replies
4 years ago
November 20, 2020

jencinas wrote:

I spoke too soon. On a different dataset, I ran into a python error with the NullAttributeCounter transformer. Apparently, the pythoncode in the transformer doesn't handle fields that have both string and numerical values, which my dataset does. Now, I am rethinking how I can modify the data so I can still account for the nulls versus non-nulls. When this transformer works, it makes a really useful output. I have FME 2018 on a Citrix server.

@ebygomm , Here is the error I am getting. I will troubleshoot this further. Joe

Python Exception <IndexError>: list index out of range

Traceback (most recent call last):

File "<string>", line 44, in close

IndexError: list index out of range

NullAttributeCounter_2_PythonCaller(PythonFactory): PythonFactory failed to close properly

NullAttributeCounter_2_PythonCaller(PythonFactory): A fatal error has occurred. Check the logfile above for details

A fatal error has occurred. Check the logfile above for details

Count Values in Excell fields

12 replies

Reply

Helpful Members This Week

Recently Solved Questions

Geometry Validator not repairing Self Intersection

FeatureReader failes to retrieve featuretypes from ArcGIS Portal Service

Performance Scaling

Neighbourfinder - One to one Relationship

Automatically change year in new attribute

Community Stats

Latest FME

Cookie policy

Cookie settings

Reply

Related Topics

Change subsequent question text based on selected choiceicon

Dynamically change question text based on selected choices

Mod question text based on previous choice - did I do it the right way?icon

Update same page question text with Javascripticon

Helpful Members This Week

Recently Solved Questions

Geometry Validator not repairing Self Intersection

FeatureReader failes to retrieve featuretypes from ArcGIS Portal Service

Performance Scaling

Neighbourfinder - One to one Relationship

Automatically change year in new attribute

Popular Tags

Community Stats

Latest FME

Sign up

An FME Account is required to contribute

Login to the community

An FME Account is required to contribute

Scanning file for viruses.

This file cannot be downloaded

Cookie policy

Cookie settings