
Hi All,

I have created a workflow that reads a few large GIS datasets in vector format across different areas. To avoid reading all the records in each dataset and to reduce time, I have used the FeatureReader transformer, which is connected to the layers and an Area of Interest. The Spatial Filter option is turned on (Bounding Boxes OGC-Intersect) so that only the records within the Area of Interest are read. The problem is that it still takes ages to read.

I have a few FeatureReaders in the workflow and I was thinking of using only one FeatureReader (as all the data comes from the same GDB) to improve performance.
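
For reference, and purely as an illustration of what the spatial filter is doing (this is not my actual workspace; paths and layer names below are placeholders), the equivalent bounding-box read outside FME would look roughly like this in Python with GeoPandas:

import geopandas as gpd

# Area of Interest polygon(s) and its overall bounding box (minx, miny, maxx, maxy)
aoi = gpd.read_file("area_of_interest.gpkg")
bbox = tuple(aoi.total_bounds)

# Only features whose bounding box intersects the AOI's bounding box are read,
# instead of scanning every record in the layer.
parcels = gpd.read_file("data.gdb", layer="parcels", bbox=bbox)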

 

Any ideas to improve the performance of the workflow?

I doubt merging the FeatureReaders into one will make performance any better, as processing is done in series, not in parallel.

A performance win for production runs is to disable feature caching, as caching generates a lot of I/O, and reading/writing usually takes a big chunk of the pie.

Performance will also benefit from having the data close by, so having the gdb on an SSD in the machine that does the processing is a big improvement over having it on a network share.

And if you see “optimizing memory” messages in the log, the RAM on your machine is maxed out. If I see this message I redesign the workspace, as optimizing memory is essentially swapping: writing data from RAM to disk, which is slow.


Thanks @nielsgerrits, good advice. I would need feature caching ON as I need to view the features in Visual Preview, but your suggestions are definitely good practices.


Also, developing workspaces with smaller but representative subsets is more convenient, as rerunning the process is so much quicker. More iterations per day.


If you do not need to use visual preview at all steps, you could also collapse the bookmarks for the steps that do not need to be viewed so that they are not cached.


This is a good point. I never use it as it ruins my workspace layout, which I need to fit the way my brain works. It would be great if bookmarks just had a feature caching switch.


I really like that idea; maybe submit a post in the Ideas section. I rarely use it for the same reason.


I did put it on a board of post-its at the last User Conference. I thought I had already created an Idea as well, but I can’t find it, so here is a new one:

 


I created the idea, but I can not link to it here :)


Assuming that it’s the process of actually reading the data that is slow (not transforming/processing it once it’s in FME), check whether there are appropriate indexes on your feature classes:

https://pro.arcgis.com/en/pro-app/latest/help/data/geodatabases/overview/attribute-indexes-in-the-geodatabase.htm

https://pro.arcgis.com/en/pro-app/latest/help/data/geodatabases/overview/an-overview-of-spatial-indexes-in-the-geodatabase.htm
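
If the geodatabase is maintained with ArcGIS, a rough arcpy sketch for adding an attribute index and rebuilding the spatial index would be something like the following (the feature class path, field name and index name are placeholders, not taken from your data):

import arcpy

fc = r"C:\data\data.gdb\parcels"

# Attribute index on a field used for selections or joins
arcpy.management.AddIndex(fc, ["PARCEL_ID"], "parcel_id_idx", "NON_UNIQUE", "NON_ASCENDING")

# (Re)build the spatial index so bounding-box filters can use it
arcpy.management.AddSpatialIndex(fc)

With the indexes in place, the FeatureReader’s spatial filter can be resolved against the spatial index instead of scanning every record.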


Do I see this correctly?

Do you have 4,000 initiators? Is there any way you could reduce those 4,000 initiators to about 100?

If you could dissolve and tile your bounding boxes, this would cut down the number of requests.

Which optimization is right really depends on your dataset, but 4,000 spatial requests looks pretty heavy for a file geodatabase.
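
To illustrate the dissolve idea outside FME (inside FME it could be a Dissolver in front of the FeatureReader), here is a minimal GeoPandas sketch; the file name is a placeholder and I am assuming the initiators are simple parcel polygons:

import geopandas as gpd

initiators = gpd.read_file("initiators.gpkg")  # the ~4,000 parcel polygons

# Dissolve everything into a single (multi)polygon, then split it back into
# its contiguous parts; adjacent or overlapping parcels collapse into one area.
dissolved = initiators.dissolve()
request_areas = dissolved.explode(index_parts=False).reset_index(drop=True)

print(f"{len(initiators)} initiators -> {len(request_areas)} spatial requests")

Each row of request_areas then becomes one initiator, so the number of spatial queries against the file geodatabase drops from thousands to a handful.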


Thanks @jkr_da - that’s what I finally did to improve the performance: I dissolved the 4,000 individual parcels (initiators) into 1 parcel, and that trick reduced the processing time from hours to only 20 minutes in total :)

