Solved

FeatureWriter silently dropping bad features and total features written count is wrong


Hi,

I'm using a FeatureWriter to insert data into a PostgreSQL database. When certain features can't be converted into the desired data types, FME drops them without terminating the translation.

The logger gives a message in blue:

"Value of attribute 'test' could not be converted to type 'bool'. Feature will be logged and skipped"

 

However, the issue lies in trying to capture this scenario and terminate the translation.

Despite dropping some features, the FeatureWriter's summary feature still reports the FULL COUNT in the '_total_features_written' attribute. I intended to compare this number with a count of incoming features to check for a difference. Now the only other way I can think of is to query the database immediately afterwards and get the written feature count that way.

 

I guess I have two questions:

1. Is there an inbuilt way to halt the translation when a feature is dropped like this?

2. Is the _total_features_written count including the dropped features a bug or by design? It clearly does not reflect the actual number written to the database so it's at least misleading. Looking at other topics on this matter it seems like it should not be including the dropped features?

 

Thanks

Tested on FME 2019.1.0.0 and FME 2019.1.1.0, with PostgreSQL 11.4

 

Best answer by markatsafe 5 December 2019, 22:11

12 replies

Just out of curiosity - can you see if the same behaviour exists when you use a regular Writer?

@shravan15 To answer @sigtill - the regular writer does behave in the same way.

FME is not telling the truth in these cases: the feature counts in the translation log are the counts of features that FME hoped to write, so they include the features rejected by the database.

You have two broad choices here I think:

  • AttributeValidator: Try to catch the problem records before writing to the database
  • Use the approach that you described and that we presented in a webinar: Dive in with Databases. Use SQLExecutor to count the features in the database before the load, use Counter & Sampler (Last N Features) to identify how many features you're loading, and then use SQLExecutor again to count after the load. Do the math and then decide if you need to roll back the transaction. I've attached the workspace that more-or-less illustrates these steps. EsriGeodb_to_Geopackage V01.zip
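
The count-before, load, count-after arithmetic in the second option can be sketched outside FME. The snippet below is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for the real database, the table `target`, its columns, and the helper names are hypothetical, and a CHECK constraint plays the role of the failed type conversion. It is not the attached workspace.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table whose CHECK constraint rejects bad flags.

    The table name and columns are hypothetical stand-ins for the real target.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE target ("
        "id INTEGER PRIMARY KEY, "
        "flag INTEGER NOT NULL CHECK (flag IN (0, 1)))"
    )
    return conn


def count_rows(conn):
    """Return the number of rows currently visible in the target table."""
    return conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]


def load_with_count_check(conn, rows):
    """Insert rows, then compare the row-count delta with len(rows).

    Rolls back and raises if fewer rows arrived than were sent.
    """
    before = count_rows(conn)
    for row in rows:
        try:
            conn.execute("INSERT INTO target (id, flag) VALUES (?, ?)", row)
        except sqlite3.IntegrityError:
            # Mimic the behaviour in this thread: the bad row is skipped
            # and the load carries on without failing.
            pass
    written = count_rows(conn) - before
    if written != len(rows):
        conn.rollback()  # undo the partial load
        raise RuntimeError(
            f"sent {len(rows)} rows but the table gained only {written}"
        )
    conn.commit()
    return written
```

Calling `load_with_count_check(conn, [(1, 1), (2, 5)])` on a fresh demo database raises, because the second row violates the constraint, and the rollback leaves the table empty. In a workspace the same comparison would be made on the two SQLExecutor counts and the Counter value.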

To add to @markatsafe's note about the feature counts in the log... this is a known issue and I've linked your report here to our internal ticket (reference FMEENGINE-46951). I'm sorry you came across this!

@sigtill yes, it does behave the same way. I believe there is just one writer implementation behind the scenes no matter how it is invoked (Writer or FeatureWriter).

@markatsafe @nampreetatsafe Thank you - I have got around this issue by using an SQLExecutor immediately afterwards and comparing the counts

Hi @Mark Stoakes, I stumbled across this issue and @1spatial UK support helped me and pointed me at this thread.

 

I must admit I cannot believe this is still an issue 2 years after this thread, and presumably for longer internally. Almost more to the point, the suggested 'workaround' doesn't appear to acknowledge that this is a critical issue.

 

We stumbled across this issue when a customer realised that a single record hadn't been inserted into our MS SQL table from a simple Excel load.

The issue here is that the dataset is a critical Covid dataset that we get daily as our organisation supports the Covid response.

We have no idea how many records we've lost over the last year that have hindered our Covid response.

 

An application that reports on records it "hoped" to write and doesn't properly notify/fail.

 

We are now in a situation where we have to review many workbenches to see where we may be affected.

In a lot of our data flows we would not necessarily spot the odd record that doesn't get written.

 

Thanks

@stevejames It is frustrating that FME does not account for the rejected features in the Features Written count.

All the rejected features are logged, for example:

WARN |Value of attribute 'transmit_freq' could not be converted to type 'float4'. Feature will be logged and skipped

Any feature that is logged in the log file is also recorded in an FFS file that you can track. The file also has an FME log message attribute: fme_log_previous_message

I've attached an example of one of the FFS log files. By default the file name is the same as the workspace name with _log.ffs appended
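
Since every rejected feature leaves a line in the log, one way to turn the silent skip into a hard failure is to scan the log text after the translation. The sketch below assumes the log is available as plain text lines and that the message wording matches the examples quoted in this thread; the function names are hypothetical, and the wording may differ between FME versions and formats.

```python
import re

# Phrase quoted in this thread; treated here as the marker of a skipped feature.
SKIP_PHRASE = "Feature will be logged and skipped"

# Optional detail: which attribute failed to convert, and to which type.
DETAIL_PATTERN = re.compile(
    r"Value of attribute '(?P<attr>[^']+)' could not be converted "
    r"to type '(?P<type>[^']+)'"
)


def find_skipped(log_lines):
    """Return one (attribute, type) pair per skipped-feature message.

    Both values are None when the line carries the phrase but no detail.
    """
    hits = []
    for line in log_lines:
        if SKIP_PHRASE not in line:
            continue
        match = DETAIL_PATTERN.search(line)
        if match:
            hits.append((match.group("attr"), match.group("type")))
        else:
            hits.append((None, None))
    return hits


def fail_if_skipped(log_lines):
    """Raise if the log shows that any feature was skipped."""
    hits = find_skipped(log_lines)
    if hits:
        raise RuntimeError(f"{len(hits)} feature(s) skipped: {hits}")
```

A script that runs after the workspace could pass the lines of the log file to `fail_if_skipped` and treat the exception as a failed load.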

I agree with @stevejames that this is potentially catastrophic behaviour as it's far too easy to lose data.

 

Is there an idea/issue raised for this, @Mark Stoakes? I've found the following ideas about rejected ports for writers to handle failures, but I think this is different, as the main issue is that the process doesn't fail at all.

 

https://community.safe.com/s/idea/0874Q000000Tl3kQAC/detail

https://community.safe.com/s/bridea/a0r4Q00000Hbs7vQAB/add-a-rejected-port-to-featurewriter

Hi @tomcolley, as @nampreetatsafe mentioned above, the issue is FMEENGINE-46951. We will update this thread when it has been resolved.

Thanks for getting back to me @danatsafe, sorry I missed that previous comment mentioning the issue. It sounds as though that issue is just about fixing the _total_features_written count. I'd say that is less important because, once the user is aware of this behaviour, it is trivial to check which features have actually made it into the table after the writer.

The bigger issue is that if the writer fails to write features, then either the workspace should fail or the features should be output via a rejected port to make it clear what has happened.

Thanks

Hi there,

I'm facing a similar issue. Some of my data is not written because of this warning:

Feature will be logged and skipped

As @shravan15 asked, is there a way to halt the translation when a feature is dropped like this?

Best regards,

Rémi

Is this still an issue? We have just noticed that we are getting this problem when writing to Snowflake using a DB writer on FME 2021.2. Because this is happening silently, with just a message written to the log file, it is hard to know how often this is happening, which could be a huge problem!

Hi @mmeyers, unfortunately FMEENGINE-46951 (now FMEFORM-17470) is still unresolved.
