Question

Performance enhancements with UpdateDetector and FeatureMerger

  • 23 April 2018
  • 7 replies
  • 13 views


I have a workbench that has to process about 600,000 lines and apply updates to a master dataset (.gdb). The attached workbench is a very small sample of what has to be completed. The main issue I am running into is that the UpdateDetector is extremely slow: a sample run of about 1,000 lines took just over 2 hours. Any suggestions for things I can do to speed it up would be much appreciated.

 

I have also found that when I try to run the workbench with the complete dataset, the FeatureMerger rejects the suppliers with the rejection code EXTRA_REFERENCEE_FEATURE. This only seems to occur when the number of suppliers gets quite large. I have seen this rejection code mentioned in reference to duplicates, but I don't have any duplicates in the complete supplier dataset.

Any help on either issue would be much appreciated.

Sample of bench

Pic of whole bench


7 replies


As you're dealing with polyline data in GDB, then at the risk of the Safe moderators' ire, you might like to take a look at Esri's own Detect Feature Changes tool.


Thanks @bruceharold, fair point, but unfortunately not all of the inputs are GDB polylines.

 


I'm surprised 1,000 lines would take that long. Any chance you could send us a sample so we could examine it (via support@safe.com or through this forum)?

Without digging into your workflow: if there is any way to use the new FeatureJoiner to reduce the amount of data going into the UpdateDetector, that might help. FeatureJoiner is drastically faster than FeatureMerger. It uses a different model for doing the work, but in practice most FeatureMerger problems can be expressed with a FeatureJoiner (if you have FME 2018 handy).
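The performance gap largely comes down to the matching model. As a rough illustration only (this is not FME internals, and `hash_join` is a hypothetical helper), a join-style transformer can index the suppliers once and then probe that index in constant time per requestor instead of re-scanning:

```python
def hash_join(requestors, suppliers, key):
    """Sketch of hash-join matching: build an index over the suppliers
    once, then look up each requestor in O(1) instead of re-scanning."""
    index = {}
    for s in suppliers:
        index.setdefault(s[key], []).append(s)
    for r in requestors:
        for s in index.get(r[key], []):
            # Merge attributes; requestor values win on key conflicts.
            yield {**s, **r}

# Only the supplier with k == 1 matches the single requestor.
rows = list(hash_join([{"k": 1, "name": "req"}],
                      [{"k": 1, "src": "sup"}, {"k": 2, "src": "other"}],
                      "k"))
```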


@cj You might try using CRCCalculator as described here.

 

If you are matching geometry for your updates, then reducing the geometry to a single CRC attribute value lets you use an attribute match instead of a geometry match, which is generally more efficient.
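The idea can be sketched outside FME in a few lines of Python (assuming the vertices are available as (x, y) tuples; `geom_crc` is a hypothetical helper, not an FME function):

```python
import zlib

def geom_crc(coords, ndigits=3):
    """Collapse a vertex list into one CRC32 integer. Matching on this
    single attribute replaces a full (and slower) geometry comparison."""
    # Round first so tiny floating-point noise doesn't change the CRC.
    rounded = [(round(x, ndigits), round(y, ndigits)) for x, y in coords]
    return zlib.crc32(repr(rounded).encode("utf-8"))

# Two geometries that differ only by numeric noise get the same CRC.
a = geom_crc([(1.0000001, 2.0), (3.0, 4.0)])
b = geom_crc([(1.0, 2.0), (3.0, 4.0)])
assert a == b
```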
Thanks @MarkAtSafe. I am already using a CRC code for the geometry matching, as you describe: I extract the true geometry into the _geom attribute, then round the coordinates before creating a geometry-only CRC.

 


Thanks @daleatsafe. I need to take a look at the FeatureJoiner, as that sounds like a good option. Since I was using the CRC code as the join key / key attribute in both the FeatureMerger and the UpdateDetector, I figured (and tested) that any features exiting the FeatureMerger through the Unmerged-requestor port did not need to go through the UpdateDetector: it performs the same match, which will fail, so they will exit via the Insert port anyway. So I by-passed the UpdateDetector for those features, which improved the performance a bit.
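The by-pass amounts to routing features on the join result before change detection ever runs. A minimal Python sketch, with hypothetical field names (`crc`, `id`):

```python
def route(incoming, master_crcs):
    """Split incoming features before change detection: anything whose
    CRC has no match in the master is a guaranteed Insert, so only the
    matched features need the expensive update check."""
    inserts, candidates = [], []
    for feat in incoming:
        if feat["crc"] in master_crcs:
            candidates.append(feat)   # merged: may be Update or NoChange
        else:
            inserts.append(feat)      # unmerged requestor -> Insert
    return inserts, candidates

master = {101, 102}
new = [{"id": "a", "crc": 101}, {"id": "b", "crc": 999}]
ins, cand = route(new, master)
assert [f["id"] for f in ins] == ["b"]
assert [f["id"] for f in cand] == ["a"]
```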

 


Thanks for your suggestions @bruceharold, @MarkAtSafe, and @daleatsafe. I found some significant performance improvements by creating a second CRC code based only on the selected attributes I wanted to compare. I create this CRC code between the FeatureMerger and the UpdateDetector, so in the UpdateDetector I can use the geometry-based CRC field as the key attribute and check for updates only in the geometry CRC field and the attribute CRC field. It then only has to check two fields instead of the 60+ fields in the full dataset. That, plus by-passing the UpdateDetector for all features that exit the FeatureMerger through the Unmerged-requestor port, got the processing time down from 2+ hours to 3 minutes for about 1,000 lines. I would still like to look into the FeatureJoiner, as that sounds like it could improve things further.
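The resulting two-CRC comparison can be sketched like this (hypothetical field names `geom_crc` and `attr_crc`; a plain dict keyed by the geometry CRC stands in for the master dataset):

```python
def classify(feat, master_by_geom_crc):
    """Classify one incoming feature against the master using just two
    hash attributes instead of comparing 60+ raw fields."""
    old = master_by_geom_crc.get(feat["geom_crc"])
    if old is None:
        return "Insert"          # no geometry match in the master
    if old["attr_crc"] != feat["attr_crc"]:
        return "Update"          # same geometry, changed attributes
    return "NoChange"

master = {111: {"attr_crc": 7}}
assert classify({"geom_crc": 111, "attr_crc": 7}, master) == "NoChange"
assert classify({"geom_crc": 111, "attr_crc": 8}, master) == "Update"
assert classify({"geom_crc": 222, "attr_crc": 7}, master) == "Insert"
```

Note that with the geometry CRC as the key, a feature whose geometry changed will not match at all and so surfaces as an Insert rather than an Update, which is consistent with keying the match on geometry.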
