
Hi, 

 

I would like to create a table showing all the attributes available per feature type and the count for each one. 

 

For example, I have these datasets: Trails, Activities, Landscape, which contain similar attributes but also different ones (similar = objectid, name...; different = trail size, activity type).

 

I want to generate an overview with all the attributes and which feature type they belong to.

 

Any idea how I can do that without using too many transformers?

 

I have already tried many options:

1) AttributeExploder, then unique values (but then I can't trace back which feature type each attribute belongs to)

2) AttributeManager to change values to the feature type, then StatisticsCalculator (but then I have to use one AttributeManager per feature type, which is redundant).

 

Any ideas how I can make this script dynamic?

 

Thanks

What I would do:

  • AttributeExploder
  • Tester to exclude attributes starting with fme_ and objectid.
  • Aggregator, Mode: Attributes Only, group by fme_feature_type, _attr_name, _attr_value and Count Attribute = _count.

This will result in a table with columns fme_feature_type, _attr_name, _attr_value, _count.
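Outside FME, that explode/filter/aggregate logic can be sketched in plain Python. This is only an illustration of what the three transformers do; the function name and sample data are made up:

```python
# Hypothetical stand-in for the AttributeExploder -> Tester -> Aggregator flow:
# count, per feature type, how often each attribute/value pair occurs.
from collections import Counter

def attribute_counts(features):
    """features: list of (feature_type, attribute_dict) pairs (made-up input shape)."""
    counts = Counter()
    for feature_type, attrs in features:
        for name, value in attrs.items():
            # Tester step: drop housekeeping attributes
            if name.startswith("fme_") or name == "objectid":
                continue
            counts[(feature_type, name, value)] += 1
    # Aggregator step: one row per (fme_feature_type, _attr_name, _attr_value)
    return [
        {"fme_feature_type": ft, "_attr_name": n, "_attr_value": v, "_count": c}
        for (ft, n, v), c in sorted(counts.items())
    ]

sample = [
    ("Trails", {"objectid": 1, "name": "Ridge", "trail_size": "long"}),
    ("Trails", {"objectid": 2, "name": "Loop", "trail_size": "long"}),
    ("Activities", {"objectid": 3, "name": "Hike", "type_activity": "sport"}),
]
rows = attribute_counts(sample)
```

Each row maps directly onto one output feature of the Aggregator in the workflow above.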


I have tried that, but I am not seeing the feature type, just the attribute names.


For your situation, you could:

  1. Use a FeatureReader connected to the database with your features in question, and feed it from a Directory and File Pathnames reader.
  2. Use the FEATURE_TYPES user parameter to make the workspace dynamic, so you can select which feature classes you wish to run.
  3. Remember to read in just the schema (to avoid reading all the geometry and other unnecessary data).
  4. Use a SchemaScanner to expose the table schema info.
  5. Make sure to populate the Schema Definition Name, as this will become the name of the feature class or table (I think this is the part you were missing previously).
  6. Expose the schema list and then keep just the dynamic keys to standardize it (or adjust as needed).
  7. Add an attribute counter to count the number of attributes found.
  8. Then aggregate everything on element_id() if you wish to get everything into one row.
  9. Finish.
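To make steps 4–8 concrete, here is a rough Python sketch of what the flattened result looks like once the schema is scanned: one summary row per feature class with its attribute list and count. The input shape and function name are assumptions for illustration, not FME API calls:

```python
# Hypothetical sketch of steps 4-8: flatten per-table schema info
# (as a SchemaScanner would expose it) into one summary row per table.
def summarize_schemas(schemas):
    """schemas: {schema_definition_name: {attr_name: data_type}} (made-up shape)."""
    summary = []
    for table, attrs in schemas.items():
        summary.append({
            "fme_feature_type_name": table,       # Schema Definition Name (step 5)
            "attributes": sorted(attrs),          # the exposed schema list (step 6)
            "attribute_count": len(attrs),        # attribute counter (step 7)
        })
    return summary

schemas = {
    "Trails": {"objectid": "int", "name": "varchar", "trail_size": "varchar"},
    "Activities": {"objectid": "int", "name": "varchar", "type_activity": "varchar"},
}
result = summarize_schemas(schemas)
```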

 

 

My use case:

I had a situation where I needed to look at a bunch of different feature classes: one database was the master with the correct schema, and I had to compare it against 20+ other databases where the same feature classes existed but the schema (attributes, types, etc.) could be different. I needed to know what was the same and what was different between the master database and each database being tested.

 

The first part was just a directory path reader driven by a user parameter, so that I could change the directory of the test database I wanted to read in and compare against the master.

 

So I set up the workspace as: path reader > FeatureReader (schema only) > SchemaScanner to retrieve the schema information (output) > ListExploder to expose the schema list > AttributeKeeper, keeping only the following:

_element_index,fme_feature_type_name,name,native_data_type

[Image: example schema reading and attribute setup]

Then I just used a ChangeDetector to route things and find the differences and similarities based on the attribute schema information:

[Image: ChangeDetector on name and data type]

Then route to the Excel output (3 sheets):

  1. the test database attributes
  2. the differences between the two
  3. the master database attributes
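The ChangeDetector comparison can be sketched in plain Python as a set comparison on (name, data type) pairs. This is only a rough equivalent with made-up data, not the transformer itself:

```python
# Hypothetical sketch of the ChangeDetector step: compare a test schema
# against the master on (name, data_type) and route into three buckets,
# roughly matching the three Excel sheets.
def diff_schemas(master, test):
    """master/test: {attr_name: data_type} for one feature class (made-up shape)."""
    master_items = set(master.items())
    test_items = set(test.items())
    return {
        "test_only": sorted(test_items - master_items),    # added or changed in test
        "master_only": sorted(master_items - test_items),  # missing or changed in test
        "unchanged": sorted(master_items & test_items),    # identical name + type
    }

master = {"objectid": "int", "name": "varchar", "length_m": "double"}
test = {"objectid": "int", "name": "varchar", "length_m": "int"}
diff = diff_schemas(master, test)
```

An attribute whose type changed (like length_m here) shows up on both sides of the diff, which is exactly what you want to flag in the report.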

Example Excel Output:

[Image: Excel report output with the test database schema, master schema, and the diffs]

I later added in a couple more user parameters to allow the workbench to be automated (dynamic) through Python (advanced):

  • Path1 = test database directory
  • Path2 = master database directory
  • FEATURE_TYPES = feature class or table to run
  • DestinationExcel = where the Excel report should write to


Hope this helps!

