Skip to main content

When I read data from a PostgreSQL/PostGIS database and I inspect the features using the Inspector, all attributes seems to be of data type "string: (UTF-8)" when looking at the "Feature information" panel. But when I look at the "Table view" panel, the id column (defined as serial4 in the database) is formatted as a number (the value is aligned to the right).

 

Boolean values are read as either "t" or "f" (still "string: (UTF-8)") while when reading boolean values from a GeoPackage they appear as "Yes" or "No" (and the data type says boolean).

 

If I look at the debug logging I see that a boolean postgres attribute has an entry like:

Attribute(string: UTF-8): `my-field' has value `bool,BTREE'

 

While a GeoPackage attribute has an entry like:

Attribute(string: UTF-8): `my-field' has value `boolean'

 

Is this really the expected outcome? I would expect FME to correctly identify the data types and treat boolean values identically no matter what the data source is.

Check out this video which explains it pretty well:

 

https://www.youtube.com/watch?v=_MoalhW8zlA&t=212s

 

How FME treats the attribute "internally" and how FME things of the schema "externally" is different here.

 

You can see the difference here in this screenshot

 

imageThe left shows the tables schema from postgres and you can see the types match what's in the database. On the right we can see how FME is treating them internally - it shouldn't make a different to how FME works with it's transformers in the workspace.

 

when it comes to the writer FME tries to carry the data types through from the readers feature type to the output FeatureType.

 

You can read a little more on it here: https://docs.safe.com/fme/html/FME-Form-Documentation/FME-Form/!FeatureTypeProperties/FME-Data-Types.htm

and here:

https://docs.safe.com/fme/html/FME-Form-Documentation/FME-Form/!FeatureTypeProperties/attribute_types.htm

 

 

 

 

 


Check out this video which explains it pretty well:

 

https://www.youtube.com/watch?v=_MoalhW8zlA&t=212s

 

How FME treats the attribute "internally" and how FME things of the schema "externally" is different here.

 

You can see the difference here in this screenshot

 

imageThe left shows the tables schema from postgres and you can see the types match what's in the database. On the right we can see how FME is treating them internally - it shouldn't make a different to how FME works with it's transformers in the workspace.

 

when it comes to the writer FME tries to carry the data types through from the readers feature type to the output FeatureType.

 

You can read a little more on it here: https://docs.safe.com/fme/html/FME-Form-Documentation/FME-Form/!FeatureTypeProperties/FME-Data-Types.htm

and here:

https://docs.safe.com/fme/html/FME-Form-Documentation/FME-Form/!FeatureTypeProperties/attribute_types.htm

 

 

 

 

 

Hi Matt, I just found an vacant hour to watch the video and thanks for the link.

 

I don't get why the internal transport types should be irrelevant to the users. I think they are extremely relevant since that is how I determine why something doesn't work, because the data type is something different than I expect.

 

In my case above I have two input formats and both contains a boolean attribute with the same name. The values of one says t/f the other Yes/no. The internal transport format should be irrelevant to me, but if I feed the features from both of them into a ChangeDetector, it doesn't treat the fields as boolean comparable (probably strings instead). In this case it is very relevant for me to know the internal transport type of the two boolean fields, since the comparison fails and I how should I know why it fails?

My workaround so far is to use an AttributeManager to remap the values of one of the inputs and then the ChangeDetector works. But since FME knows from the input that both of them are boolean I would expect this to be a situation where I shouldn't need to care about the internal transport type - its just a boolean and very easy to compare.

 

So I guess my question is, is this a problem with the reader (I hope so since having to work with several value sets for a boolean data type is annoying from the user perspective) or is it the ChangeDetector which isn't capable of comparing booleans?


Hi Matt, I just found an vacant hour to watch the video and thanks for the link.

 

I don't get why the internal transport types should be irrelevant to the users. I think they are extremely relevant since that is how I determine why something doesn't work, because the data type is something different than I expect.

 

In my case above I have two input formats and both contains a boolean attribute with the same name. The values of one says t/f the other Yes/no. The internal transport format should be irrelevant to me, but if I feed the features from both of them into a ChangeDetector, it doesn't treat the fields as boolean comparable (probably strings instead). In this case it is very relevant for me to know the internal transport type of the two boolean fields, since the comparison fails and I how should I know why it fails?

My workaround so far is to use an AttributeManager to remap the values of one of the inputs and then the ChangeDetector works. But since FME knows from the input that both of them are boolean I would expect this to be a situation where I shouldn't need to care about the internal transport type - its just a boolean and very easy to compare.

 

So I guess my question is, is this a problem with the reader (I hope so since having to work with several value sets for a boolean data type is annoying from the user perspective) or is it the ChangeDetector which isn't capable of comparing booleans?

I see your point here in this case, however, on the flip side I should be able to use a tester and test where some attribute = t (rather than testing when it's true)

 

I think also a ChangeDetector or Matcher should also treat an integer 5 the same way as a string 5 when matching - This is the main use case where the user shouldn't care about the type.

 

Perhaps something else which is interesting is that even with User Parameters the internal type seems to be a string (Even if specifically specifying a parameter as a float, int or a Yes/No checkbox type.

 

It does seem to make sense to me though for boolean values to at least be treated/imported into FME the same way across all formats. If that were the case then the you wouldn't have noticed.

 

Something else which you might find interesting is with the AttirbuteValidator and how it works with booleans. In the case of Geopackage and Postgres both pass the test for being booleans, also they are both strings BUT only the Geopackage boolean also passed the int test.

It means that data from a CSV file which is just text can also be a Boolean based just on the content of the string but again, on the flipside it can be confusing because in the csv itself it's a sting I guess.

 

My guess as to why Postgres and GeoPackage are different here is probably because Postgres was likley created much earlier before FME Decided to have more than one datatype. After bringing it initially in a string with value 't'/'f' it becomes difficult to change without breaking existing workflows. Shapefile is different again (it's true/false)

 

Boolean does seem to be one of the cases where the type is just as important as the value. Especially in your case when your doing a change detection.


Hi Matt, I just found an vacant hour to watch the video and thanks for the link.

 

I don't get why the internal transport types should be irrelevant to the users. I think they are extremely relevant since that is how I determine why something doesn't work, because the data type is something different than I expect.

 

In my case above I have two input formats and both contains a boolean attribute with the same name. The values of one says t/f the other Yes/no. The internal transport format should be irrelevant to me, but if I feed the features from both of them into a ChangeDetector, it doesn't treat the fields as boolean comparable (probably strings instead). In this case it is very relevant for me to know the internal transport type of the two boolean fields, since the comparison fails and I how should I know why it fails?

My workaround so far is to use an AttributeManager to remap the values of one of the inputs and then the ChangeDetector works. But since FME knows from the input that both of them are boolean I would expect this to be a situation where I shouldn't need to care about the internal transport type - its just a boolean and very easy to compare.

 

So I guess my question is, is this a problem with the reader (I hope so since having to work with several value sets for a boolean data type is annoying from the user perspective) or is it the ChangeDetector which isn't capable of comparing booleans?

Thank you for your reflections, I think they are good and I hope Safe is listening in on it.

 

There are so many use cases and input data formats that I can't comprehend if treating all data types as string will suffice. I do however appreciate if that is the case since then I don't need to worry about what data type an attribute has - it is definitely a string!

 

In case of the boolean values I guess they need to determine what a boolean value should be and for backwards compatibility they could provide a switch in the affected readers allowing the user to specify how a boolean value should be read. Hopefully over time old workflows will die out or users will need to cast the values just like I do at the moment. I expect data-type-less formats can be trickier to handle.


Reply