Solved

Getting duplicate features when writing to ArcGIS Online


When writing to a hosted feature layer in ArcGIS Online, we occasionally get duplicate features. In the latest run where this happened, the dataset contained 15,304 features, all of which were reported as written to ArcGIS Online, yet the feature layer actually ended up with 16,304 features. Features with OBJECTID 9001 through 10000 were exactly duplicated, making it appear that a batch of 1,000 features was written twice during the process: OBJECTID 10001 was an exact duplicate of 9001, 10002 of 9002, and so on. Features after this batch were not duplicated (nor were features before OBJECTID 9001).


Best answer by mattmatsafe 20 June 2023, 23:21


40 replies


Hi @jasonschroeder, I hope you don't mind, but I'm going to escalate your question to a support case here. We'd like to dive into this a little deeper to get a clear picture of what might be causing this behaviour.

Has there been any resolution on this issue? We've seen similar behavior, but were unable to determine whether the base table was not completely truncated or duplicate rows were being written.


@dnfox The problem seems to be related to how the Feature Services writer handles retries. If there is any connection break during the data load, a retry is attempted, and this can cause duplicate features. When a batch of data is uploaded but the response either times out or returns an error, the writer retries sending the same data. This is supposed to happen only if the batch completely fails to upload, but it appears it can also happen when all the data was successfully uploaded and the writer simply never received a response from the server. Try reducing the Features per Request value in the writer's Advanced Parameters.
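To make the failure mode concrete, here is a minimal plain-Python sketch of a naive retry loop against a hosted layer's addFeatures endpoint. The layer URL and token are hypothetical, and this is an illustration of the behaviour described above, not FME's actual writer code:

```python
import json
import requests

# Hypothetical hosted feature layer endpoint and token.
LAYER_URL = "https://services.arcgis.com/ORG/arcgis/rest/services/Demo/FeatureServer/0"
TOKEN = "hypothetical-token"

def add_features(features, timeout=60):
    # POST one batch of features to the standard addFeatures endpoint.
    return requests.post(
        f"{LAYER_URL}/addFeatures",
        data={"features": json.dumps(features), "f": "json", "token": TOKEN},
        timeout=timeout,
    )

def write_batch(features, retries=3):
    for _ in range(retries):
        try:
            resp = add_features(features)
            resp.raise_for_status()
            return resp.json()
        except requests.Timeout:
            # The server may already have committed the batch before the
            # response was lost. Blindly re-sending the same features here
            # makes the service insert them a second time: exact duplicates.
            continue
    raise RuntimeError("Batch failed after all retries")
```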

@Mark Stoakes @dnfox - FYI, I was still experiencing the duplicate issues even after reducing "Features per Request" to 250 at a time. It seems to me this is more about an Internet hiccup (the response gets lost) than about the number of features being sent.


@Mark Stoakes​ Has a BUG# been assigned to this issue by Esri and/or Safe Software? Are there any efforts on the part of Esri and/or Safe Software to address this?


@ssharp Yes, we are talking with our colleagues at Esri to try to find a resolution to this problem.


Thanks for the reply @Mark Stoakes​. Do you have a Safe or Esri BUG# we can use to track progress AND receive notification once resolved?


Here is something we found effective while we wait for the underlying issue to be addressed. If your workflow is a truncate followed by a bulk insert of all records regardless of whether rows are unchanged, instead additionally read in all the target feature service features, perform change detection, and then write only the inserts, updates, and deletes (see the sketch below). Keep the features per transaction small too, say 100. This helps with what we think is an underlying timeout issue.
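A rough sketch of that diff step, assuming hypothetical read_source() and read_target() helpers that each return a dict of records keyed by a stable ID:

```python
def diff(source, target):
    # Split records into inserts, updates, and deletes by comparing keys
    # and values between the source data and the target feature service.
    inserts = [rec for key, rec in source.items() if key not in target]
    deletes = [rec for key, rec in target.items() if key not in source]
    updates = [rec for key, rec in source.items()
               if key in target and rec != target[key]]
    return inserts, updates, deletes

# Hypothetical usage: write each list in small transactions (say, 100
# features per transaction) rather than truncating and reloading everything.
# inserts, updates, deletes = diff(read_source(), read_target())
```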


@bruceharold​ thanks for the suggestion. Will this work if the target FS and source data lack unique stable IDs?


Yes, use a set of attributes to detect changes, excluding any floating-point data like Shape_Length; for geometry, you can compare the actual geometry instead. It may take some trial and error if your data come from differing systems of record, for example rounding coordinates to 6 decimal places, standardizing datetime fields, and so on. Preserve the ObjectID on updates and deletes from the target features so the service can match records. If you have no match key fields you will not have any updates, only inserts and deletes (a logical update becomes a delete/insert pair).
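For example, here is a hedged sketch of a comparison key built purely from attributes, with coordinates rounded to 6 decimal places and datetimes standardized. The field names are hypothetical and the rounding will need tuning to your data:

```python
import hashlib
from datetime import datetime

def change_key(rec):
    # Fingerprint a record from non-floating-point attributes plus rounded
    # coordinates and a normalized datetime, so the same real-world record
    # hashes identically on both the source and target sides.
    updated = rec.get("updated")
    parts = [
        str(rec.get("name", "")).strip().lower(),
        f"{rec.get('x', 0.0):.6f}",  # 6 decimal places, as suggested above
        f"{rec.get('y', 0.0):.6f}",
        updated.strftime("%Y-%m-%dT%H:%M:%S") if isinstance(updated, datetime) else "",
    ]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()

# Keys present only in the source are inserts and keys only in the target
# are deletes; with no separate match key, a changed record shows up as a
# delete/insert pair, exactly as described above.
```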


OK, it sounds like a reasonable amount of time/effort would be required to develop, debug, test, and roll this "workaround" into production, especially if stable ObjectIDs are not an option.

 

Has Esri and/or Safe Software recognized the underlying problem as a bug yet?


Not yet.


:(


Hi @ssharp​ and @bruceharold​,

It looks like Mark filed an issue for us on the Safe Software side, tracked internally as FMEENGINE-68673, and our team will be reaching out to Esri's development team to work together towards a solution.


Great, thanks Jovita.


@jovitaatsafe Thank you for confirming that Safe Software is working on this bug. Hopefully Esri will respond to your team's efforts.

 

I logged this as a case with Esri on 3/5/2021 (#02755502).

@jovitaatsafe​ have Safe Software and Esri been able to make any headway regarding the logged FMEENGINE-68673 bug?


Hi @ssharp, unfortunately there isn't a resolution on the issue just yet. I'll add a note to the ticket to bring it to the team's attention and hopefully bump it. I see you have a case attached to the issue, so you'll be informed there as soon as it's been addressed, and we'll update this thread as well.


Here is another option, a pattern you can use to do bulk appends (prior truncation is optional): https://community.esri.com/t5/arcgis-data-interoperability-blog/building-a-data-driven-organization-part-4/ba-p/1086265
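For anyone who wants to try that route, here is a minimal sketch of the bulk-append pattern using the ArcGIS API for Python. The credentials, item details, and staging path are all hypothetical; the linked post covers the full workflow:

```python
from arcgis.gis import GIS
from arcgis.features import FeatureLayer

gis = GIS("https://www.arcgis.com", "user", "password")  # hypothetical credentials

# Upload a prepared file geodatabase as a temporary portal item.
fgdb_item = gis.content.add(
    {"type": "File Geodatabase", "title": "staging_upload"},
    data="staging.gdb.zip",  # hypothetical path
)

layer = FeatureLayer(
    "https://services.arcgis.com/ORG/arcgis/rest/services/Demo/FeatureServer/0",
    gis,
)
# layer.manager.truncate()  # optional prior truncation, as noted above

# Append runs server-side as a single operation, so there is no per-batch
# retry loop to duplicate features; rollback=True undoes it all on failure.
layer.append(
    item_id=fgdb_item.id,
    upload_format="filegdb",
    source_table_name="staging",
    rollback=True,
)
fgdb_item.delete()  # clean up the temporary staging item
```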

 


Thanks for the reply @bruceharold​  and @jovitaatsafe​ 

Was this ever resolved? We are having the same issue.


Hi @achamber_ak

Unfortunately it has not been resolved just yet. I have added your comment to the issue, and when it is resolved we will be sure to let you know on this thread. Apologies for any inconvenience caused.

 


Hi - is there any more progress on this and has it been registered as a bug?


No action on the part of Safe Software or Esri to the best of my knowledge to date....

 

https://support.esri.com/en/bugs/nimbus/QlVHLTAwMDEzODE2Mw==

 


@jasonschroeder​  @bruceharold​ @ssharp​ @dnfox​ 

FME 2021.1 and higher has improved logging that might help us trace the cause of this issue. Turn on debug logging under Tools -> FME Options -> Translation: open the Log Message disclosure panel and check Log Debug. If you can send us a log file from a job that results in duplicates, that might help us get this resolved. You can post the log file here or send it directly to me at mark @ safe.com.
