Question

Help with understanding difference between missing and empty

  • 16 June 2020
  • 8 replies
  • 70 views

Badge

Hello,

I'm reading a shapefile created by ArcGIS and I noticed something strange (or is it?).

In ArcGIS, if I create a new attribute (for example, a text attribute) but don't enter any value, then when I read it with FME the field is shown as empty in the reader's visual preview table (there is nothing in that field). But if I enter a value in ArcGIS, save it, then delete the value and save again, FME reads the field as <missing>. Why? Why not empty again? Is this the correct interpretation by FME?

From help:

  • Empty – The attribute exists and has an empty string as its value.
  • Null – The attribute exists and has a value of null.
  • Missing (Selected Attributes Only) – The attribute does not exist.
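To make the three definitions above concrete, here is a minimal sketch in plain Python. It is only an analogy and does not use FME's API: a feature is modelled as a dictionary, and the function name `classify` and the attribute name `note` are made up for illustration.

```python
from typing import Any, Dict


def classify(feature: Dict[str, Any], name: str) -> str:
    """Return 'missing', 'null', 'empty' or 'value' for one attribute."""
    if name not in feature:
        return "missing"  # the attribute does not exist on this feature
    value = feature[name]
    if value is None:
        return "null"     # the attribute exists and holds an explicit null
    if value == "":
        return "empty"    # the attribute exists and holds a zero-length string
    return "value"


# Hypothetical features, one per state (not taken from the shapefile in question)
features = [
    {"id": 1, "note": ""},
    {"id": 2, "note": None},
    {"id": 3},
    {"id": 4, "note": "abc"},
]
states = [classify(f, "note") for f in features]
print(states)  # ['empty', 'null', 'missing', 'value']
```

In this model the difference between empty and missing is whether the key is present at all, which is the distinction the help text draws.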

8 replies

Userlevel 4
Badge +26

To be honest, I would expect FME to read the data as either 'Empty' or 'null' if the shapefile being read actually has the field referenced in the schema.

If, on the other hand, the shapefile being read did not have the column (but it was defined in FME), I would expect it to be missing.

The only situation where I *might* expect the attribute to be missing is when no feature in the whole shapefile has any data in that column.

Which version of FME are you using? FME 2020 has a new reader.

Perhaps you can share the example shapefiles?
Badge

Hello,

Thanks for the reply. I'm sharing test_shp.zip. I'm using Version: FME(R) 2019.2.0.0 (20191105 - Build 19801 - WIN64).

This is what I have done:

In ArcGIS I added a new attribute and didn't fill it with any values. When I read it with the FME reader I get this:

Step 1: reading the shapefile created in ArcGIS

Step 2: If I go back to ArcGIS and add a value, then in the FME reader I get this:

Step 3: If I go back to ArcGIS and delete this value, then I get this:

In any of the above steps, if I export a new shapefile from ArcGIS, I get <missing> everywhere there is no data.

Exported step 1:

Exported step 2:

 

Exported step 3:

Badge

Hello,

Sorry, it seems I replied to myself rather than to you, so the answers to your questions are in my post above.

Userlevel 4
Badge +26

OK, so I looked at your files and it does seem to be related (in part) to the new Shapefile reader in FME. In FME 2019 there are two shapefile readers: a tech preview and the 'normal' one.

The normal one reads the values as you would expect (empty strings), but the new one maps empty strings to 'missing'.

In FME 2020 there is a new parameter in the shapefile reader to control how you want the empty string to be represented (either 'null' or 'missing', but not empty string).

I suspect that the change in behaviour is related to the introduction of "Bulk Mode" into the shapefile reader. Perhaps Bulk Mode doesn't support the 'empty string'?

Hopefully that answers your question a bit. My advice would be to always assume that an empty value could be either Empty, Missing or null.
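As a rough illustration of that advice, here is a minimal sketch in plain Python, not FME code. It treats a missing key, an explicit null and an empty string all as "no value". The dictionary model, the function name `value_or_default` and the placeholder "n/a" are assumptions made only for this example.

```python
from typing import Any, Dict


def value_or_default(feature: Dict[str, Any], name: str, default: Any) -> Any:
    """Return the attribute value, or `default` if it is missing, null or empty."""
    value = feature.get(name)  # None covers both a missing key and an explicit null
    if value is None or value == "":
        return default
    return value


# Hypothetical rows: empty, null, missing, and a real value
rows = [{"note": ""}, {"note": None}, {}, {"note": "abc"}]
resolved = [value_or_default(r, "note", "n/a") for r in rows]
print(resolved)  # ['n/a', 'n/a', 'n/a', 'abc']
```

This deliberately throws away the difference between the three states, so it only suits checks where that difference does not matter.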

Badge

Thank you for your effort!

In the meantime I have discovered something else. If you take the shapefile from step 3 above, this one:

and put a NullAttributeMapper after the reader, set to map all three (empty, null and missing) to a new value "Bla", like this:

then you get this:

It seems FME doesn't see this empty field as empty. Or does it?

And both readers end up with the same result after this NullAttributeMapper...

Userlevel 1
Badge +21

"My advice would be to always assume that an empty value could be either Empty, Missing or null."

Empty, missing and null are different scenarios, and there are cases where the distinctions matter, schema checks for example. It seems odd that the new shapefile reader only allows you to treat empty values as missing or null and not just as empty.

Badge

Yes, but the value that is read as "empty" does not seem to be empty. At least the NullAttributeMapper doesn't see it as "empty". In fact, if I compare two of these seemingly identical "empty" values from such shapefiles, FME says they are different (in a Tester, for example)...

Userlevel 4
Badge +26

Yes, the NullAttributeMapper should certainly treat this as empty or null. This is likely a bug which "may" have been fixed in FME 2020. What should 'fix' the issue for you here is to add a Decelerator (set to 0 seconds) before the NullAttributeMapper. I think this is also a bug related to Bulk Mode; the Decelerator should turn the features back into 'normal' slow features.

Bulk Mode is great, but a few bugs and small changes like this one have popped up and caused a few headaches.
