Solved

How do you dynamically remove attributes of missing columns?

  • 4 November 2022
  • 6 replies
  • 69 views

I am building transformations that send data to Azure Table Storage. When the attribute value is missing there is a blank string inserted into the table. Due to the nature of the feature, there will only be one row of data but different columns may have missing or null values. These values we do not want written to the table. I have tried using Python and NullAttributeMappers, but these still leave the <missing> value which is causing issues. Is there a way to dynamically remove these columns? The reason for wanting to remove the attribute columns before the Azure Table Storage writer is that "Automatic" Attribute definition appears to be required in order to dynamically pass different columns to Azure.

icon

Best answer by markatsafe 7 November 2022, 17:58

View original

6 replies

Badge +2

@svteshipley​ I think you need to expand on the explanation of the problem. Are you only writing one row per table? If that's the case you can try using BulkAttributeRemover NullValueMapper to drop any null attributes and then use SchemaScanner to find the new schema of the feature and use that to create your table.

@svteshipley​ I think you need to expand on the explanation of the problem. Are you only writing one row per table? If that's the case you can try using BulkAttributeRemover NullValueMapper to drop any null attributes and then use SchemaScanner to find the new schema of the feature and use that to create your table.

For the Azure table we are only righting one row with say columns "High", "Medium", "Low". But given the data, there may be no values for "Medium" and therefore do not want to write to that column in Azure. I have wrestled with BulkAttributeRemover for a bit (probably user error) but have been unable to dynamically remove say "High" and "Low" without the attribute still saying "<missing>".

Badge +3

For the Azure table we are only righting one row with say columns "High", "Medium", "Low". But given the data, there may be no values for "Medium" and therefore do not want to write to that column in Azure. I have wrestled with BulkAttributeRemover for a bit (probably user error) but have been unable to dynamically remove say "High" and "Low" without the attribute still saying "<missing>".

use python caller.

if the attribute is null

  • fmeobjects.FMEFeature.removeAttribute

then pass the feature to py output

For the Azure table we are only righting one row with say columns "High", "Medium", "Low". But given the data, there may be no values for "Medium" and therefore do not want to write to that column in Azure. I have wrestled with BulkAttributeRemover for a bit (probably user error) but have been unable to dynamically remove say "High" and "Low" without the attribute still saying "<missing>".

The code i currently have in my PythonCaller is as follows. The issue that I am having it that I then manually have to select "Attributes to Hide" as that attribute will still be visible with values of <missing> and sent to the Azure Table Storage writer, leading it to enter blank strings into the table.

 

import fme

import fmeobjects

 

class FeatureProcessor(object):

def __init__(self):

pass

 

def input(self, feature):

for tmpAttrName in feature.getAllAttributeNames():

tmpAttrVal = feature.getAttribute(tmpAttrName)

if(tmpAttrVal is None) or (tmpAttrVal == 99):

feature.removeAttribute(tmpAttrName)

self.pyoutput(feature)

 

def close(self):

pass

Badge +2

@svteshipley​ If an attribute is flagged as <missing> then the SchemaScanner will ignore it.  So if you start with this:

_creation_instance	0
attr1 a
attr2 b
attr3 <null>

and pass it through NullAttributeMapper and set <null> to <missing> then the SchemaScanner will give you this:

attribute{0}.fme_data_type	fme_varchar(2)
attribute{0}.name attr1
attribute{1}.fme_data_type fme_varchar(2)
attribute{1}.name attr2
attribute{2}.fme_data_type fme_uint8
attribute{2}.name _creation_instance

Note the absence  of attr3 in the schema list

@svteshipley​ If an attribute is flagged as <missing> then the SchemaScanner will ignore it.  So if you start with this:

_creation_instance	0
attr1 a
attr2 b
attr3 <null>

and pass it through NullAttributeMapper and set <null> to <missing> then the SchemaScanner will give you this:

attribute{0}.fme_data_type	fme_varchar(2)
attribute{0}.name attr1
attribute{1}.fme_data_type fme_varchar(2)
attribute{1}.name attr2
attribute{2}.fme_data_type fme_uint8
attribute{2}.name _creation_instance

Note the absence  of attr3 in the schema list

The SchemaScanner was exactly what was needed. I had to do some research into it to understand. For future reference, I created an Azure Table Storage writer and selected dynamic. From the SchemaScanner I connected the <Schema> and Output to the writer. Within the writer I manually entered the table name to write to, had Schema Sources as "Schema From Schema Feature", Schema Definition Name as "fme_feature_type_name" and for Attributes to Remove as "azure_table_partitionkey, and azure_table_rowkey". These attributes were selected to remove as I had them setup custom and otherwise they would also be written to the table. Now in the User Attributes tab it had the default row of type text.

Reply