
I have a question about integrating FME with ArcGIS Online (which I will be using to create an Open Data site). I am exploring using FME as the method for pushing data updates to our feature services. I have tested this successfully with small datasets, but ran into issues (understandably) with our large parcel dataset (~400,000 parcels). It is very slow to run with both the Update Detector method and a Truncate-and-repopulate method. Has anyone run into this issue, and do you have any suggestions for handling larger datasets like this? Any tips or tricks I'm missing?

I am thinking that for large datasets we may have to manually publish from ArcMap or Pro (since that takes less than 5 minutes), but it would be ideal for us to use an FME workspace for each dataset to keep things clean and consistent. However, if processing time is 2+ hours, that is definitely not the best solution.

Here is a screenshot of my (currently running) workspace. I have limited the revised source to only updates from the past 7 days (assuming I would run this weekly). I am using fme_db_operation to pass the type of update to the writer.
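For reference, the logic that stamps fme_db_operation can be as simple as this (a minimal PythonCaller sketch; the change_type attribute name is a placeholder for whatever your change-detection step actually outputs):

```python
# Function-entry PythonCaller: modifies each feature in place.
# 'change_type' is a hypothetical attribute name; map it to the
# output of your own change-detection step.
def set_db_operation(feature):
    change_type = feature.getAttribute("change_type")
    mapping = {"new": "INSERT", "modified": "UPDATE", "deleted": "DELETE"}
    if change_type in mapping:
        feature.setAttribute("fme_db_operation", mapping[change_type])
```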

Hello @kirstynalex

I suspect the main culprit of the slowness is the truncating. There is a known issue where truncating with AGOL is quite slow. I will attach this question to the outstanding PR we have, so that any updates can be reported back here.

With that in mind, we are somewhat at a standstill for working around this, because the Truncate option is essentially a semi-delete statement. You may be able to improve the speed by increasing the value of the advanced parameter 'Features Per Request'. The caveat is that setting it too high will cause the translation to fail ungracefully, without an informative message (for example, the server may time out). It really depends on the individual server's specs and load.

I hope this helps.


Thanks for the info @trentatsafe. However, I'm not trying to truncate (that was just another method I had tested); I'm trying to do change detection to find inserts, updates, and deletes. With both methods I tried changing the 'Features Per Request' value, and as you mentioned, that caused the translation to fail and the system to run out of memory.


@kirstynalex

 

Oh okay, sorry about that. Would you mind attaching a log from a successful translation? It will let me see exactly where the translation is being slowed down. It may just be the volume of features, but I cannot confirm that until I can take a look at the log file.
No worries @trentatsafe! Sure, just give me a couple of hours, because that's how long it takes to run and I don't have a previous log from it anymore :)

 


I can confirm the exact same behavior with smaller datasets as well, and I have a pending support issue with Safe about it.

In my particular case I ended up skipping the insert/update/delete methodology; now I export all the data to a new FGDB, upload it to AGOL, and then use the AGOL REST API to replace the data source of the feature service. It's much faster, and so far I've had zero problems.

I do this with a PythonCaller based on the code I found here: https://github.com/Esri/overwrite-hosted-features

 

It's relatively easy for someone proficient in Python to integrate this into FME; a rough sketch is below. To avoid messing with Python, you can also just use the code as-is and run it as a separate batch job.
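For anyone wanting a starting point, the same overwrite can also be done with the ArcGIS API for Python rather than the linked script. This is a minimal sketch, not the repo's code; the credentials, item ID, and zip path are all placeholders, and it assumes the service was originally published from a file geodatabase:

```python
# Minimal sketch: replace a hosted feature layer's data source
# using the ArcGIS API for Python (the 'arcgis' package).
from arcgis.gis import GIS
from arcgis.features import FeatureLayerCollection

gis = GIS("https://www.arcgis.com", "username", "password")  # placeholder credentials

item = gis.content.get("ITEM_ID")  # placeholder: the hosted feature layer item
flc = FeatureLayerCollection.fromitem(item)

# Overwrite with the freshly exported, zipped file geodatabase.
# The FGDB schema must match what the service was published from.
flc.manager.overwrite(r"C:\temp\parcels_fgdb.zip")  # placeholder path
```

You would run this after the FME workspace writes and zips the FGDB, either from a PythonCaller/shutdown script or as a separate scheduled job.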

 



I think a writer parameter to override the default HTTP timeout would be very helpful. Sometimes 60 seconds is just too short.

 



@trentatsafe I let this run overnight but the system ran out of memory. I'm going to seek a different method for larger datasets as I think it is simply due to the volume of features. Thanks for your help!

 



Hello @kirstynalex

 

 

Okay, no problem. I think @david_r's suggestion might be the most helpful new method. Good luck!

 


One thought is to try chunking your reads using the 'Max Features to Read' and 'Start Feature' reader parameters.

You could truncate on your first run, then just keep appending on subsequent runs; a rough sketch of how to drive that from the command line is below.
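As an illustration (not a tested setup), you could drive the chunking from outside FME by running the same workspace repeatedly and overriding two hypothetical published parameters, START_FEATURE and MAX_FEATURES, that you would link to the reader's 'Start Feature' and 'Max Features to Read' settings:

```python
# Sketch: run the workspace in chunks via the FME command line.
# START_FEATURE and MAX_FEATURES are assumed published parameters;
# assumes fme(.exe) is on PATH.
import subprocess

TOTAL = 400_000   # approximate parcel count
CHUNK = 50_000    # features per run; tune to what AGOL tolerates

for start in range(0, TOTAL, CHUNK):
    subprocess.run(
        [
            "fme", r"C:\workspaces\parcels_to_agol.fmw",  # placeholder path
            "--START_FEATURE", str(start),
            "--MAX_FEATURES", str(CHUNK),
        ],
        check=True,
    )
```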

