How to debug Matcher vs DuplicateFilter vs FeatureMerger = different answers!

Question

Again, not really a question; more something that might help anyone trying to understand why these transformers give different results.

Two source File Geodatabases with identical schemas.

2152 features in one, 148 in the other.

FeatureMerger (merge Attributes and Geometry) using the unique ID field for the data (not ObjectID) = 93 Merged (exists in both), 2059 NotMerged (only in Requestor), 55 Unreferenced (only in Supplier), plus 93 Referenced which, as I understand it and as Inspector appears to confirm, should be the same features that come out of the Merged port.

Merged, NotMerged and Unreferenced are sent to Sorter then DuplicateFilter and Matcher, neither of which finds any duplicates, which is what I was hoping for. This results in 2207 unique features (93 + 2059 + 55).
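As a rough cross-check of those port counts, the FeatureMerger result can be modelled as set operations on the unique ID. This is only a sketch: the ID values below are synthetic (only the set sizes mirror the post), and it assumes the ID is unique within each dataset.

```python
# Synthetic IDs only: the sizes mirror the counts in the post, the values are made up.
requestor_ids = set(range(2152))                            # 2152 features
supplier_ids = set(range(93)) | set(range(10_000, 10_055))  # 93 shared + 55 new = 148

merged = requestor_ids & supplier_ids        # ID present in both (Merged / Referenced)
not_merged = requestor_ids - supplier_ids    # only in Requestor (NotMerged)
unreferenced = supplier_ids - requestor_ids  # only in Supplier (Unreferenced)

unique_total = len(merged) + len(not_merged) + len(unreferenced)
print(len(merged), len(not_merged), len(unreferenced), unique_total)
# prints: 93 2059 55 2207
```

Note that this view only looks at the key, so it says nothing about whether the other attributes or the geometry of the 93 shared features agree.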

However, I also sent Merged and Referenced to Matcher which says that 186 (93+93) were NotMatched! I used Match Selected Attributes and ticked everything except Object ID and numReferences. You might ask why? Changing Matcher to NOT differentiate between Empty, Missing and Null makes no difference, nor does Lenient Geometry Matching. I know Matcher compares geometry and the other two only compare attributes, but I compared some features in Inspector and couldn't see any differences in field types, values or geometry.

To confuse things even more, I sent the two source datasets straight to Matcher. This time, 182 were Matched (so 91 SingleMatched) and 2118 were NotMatched. This results in 2209 unique features.

I also sent the two source datasets direct to Sorter then DuplicateFilter and got 2207 unique features.

My next move was to send the SingleMatched (91) and NotMatched (2118) features via Sorter to DuplicateFilter. This gave me the 2 features that made the difference between 2207 and 2209. For simplicity let's say these have unique IDs of UID1 and UID2.
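The idea behind that step can be sketched outside FME: count unique features once by ID only and once by the whole record, then list the IDs that occur more than once among the "unique" records. The records and attribute values here are hypothetical, and the sketch assumes the DuplicateFilter was keyed on the unique ID.

```python
from collections import Counter

# Hypothetical records as (uid, value); two of them hide a trailing-space difference.
dataset_a = [("UID1", "10"), ("UID2", "20"), ("UID3", "30")]
dataset_b = [("UID1", "10 "), ("UID2", "20 "), ("UID3", "30")]
all_records = dataset_a + dataset_b

# Key-only view (DuplicateFilter-style on the ID): one feature per ID.
unique_by_id = {uid for uid, _ in all_records}

# Whole-record view (Matcher-style on the selected attributes): one per distinct record.
unique_by_record = set(all_records)

# IDs that survive more than once in the whole-record view are the ones to inspect.
id_counts = Counter(uid for uid, _ in unique_by_record)
suspects = sorted(uid for uid, n in id_counts.items() if n > 1)

print(len(unique_by_id), len(unique_by_record), suspects)
# prints: 3 5 ['UID1', 'UID2']
```

The gap between the two counts (5 - 3 here, 2209 - 2207 in the post) equals the number of IDs whose copies differ somewhere other than the key.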

I then applied a Tester to the Unique output port of the DuplicateFilter to get the other copies of those 2 records (Passed = where unique ID In UID1,UID2) and sent the results (4 features) to Inspector.

I still couldn't see any differences so I copied the Feature Information for each feature out of Inspector into separate text files in TextPad (any text editor that does good file compare will do).
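If no text editor with file compare is to hand, Python's standard `difflib` module can do the same job. This is a minimal sketch: the two dumps below are invented stand-ins for the Feature Information text copied out of Inspector, and the attribute names and layout are assumptions.

```python
import difflib

# Hypothetical Feature Information dumps; names, values and layout are invented.
feature_a = """UID: UID1
NAME: Example
NOTES: abc
"""
feature_b = """UID: UID1
NAME: Example
NOTES: abd
"""

# Line-by-line unified diff of the two dumps.
diff = list(difflib.unified_diff(
    feature_a.splitlines(), feature_b.splitlines(),
    fromfile="feature_a.txt", tofile="feature_b.txt", lineterm="",
))

# Keep only the changed lines, dropping the '---' / '+++' file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed))
# prints:
# -NOTES: abc
# +NOTES: abd
```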

This revealed that a couple of fields were utf-16 with a value of <null> in one feature and utf-16e with a blank or empty value in the other, hence NotMatched. Mystery solved! I now know that unless I'm bothered about the difference between <null> and blank/empty I can take the 2207 features output from FeatureMerger (Merged + NotMerged + Unreferenced) as the unique features from the 2 source datasets.
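The null versus empty distinction can be illustrated with a small attribute comparison that runs either strictly or leniently. This is a sketch under assumptions, not Matcher's actual logic: the attribute names are hypothetical, `None` stands in for <null>, and the encoding difference mentioned above is not modelled.

```python
MISSING = object()  # sentinel for "attribute not present on the feature"

def normalise(value):
    """Collapse missing, null (None) and empty string into a single value."""
    if value is MISSING or value is None or value == "":
        return None
    return value

def differing_attributes(a, b, lenient=False):
    """Return the sorted attribute names whose values differ between two features."""
    differing = []
    for name in sorted(set(a) | set(b)):
        va, vb = a.get(name, MISSING), b.get(name, MISSING)
        if lenient:
            va, vb = normalise(va), normalise(vb)
        if va != vb:
            differing.append(name)
    return differing

# Hypothetical copies of the same feature from the two source datasets.
feature_a = {"UID": "UID1", "NOTES": None, "OWNER": None}  # <null> values
feature_b = {"UID": "UID1", "NOTES": "", "OWNER": ""}      # empty strings

print(differing_attributes(feature_a, feature_b))                # ['NOTES', 'OWNER']
print(differing_attributes(feature_a, feature_b, lenient=True))  # []
```

In strict mode the two copies count as different features; in lenient mode they collapse into one, which is the choice between 2209 and 2207 unique features.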

I've still no idea why my other Matcher output 186 NotMatched rather than 93 Matched but it's not so important now I've reconciled the different answers between FeatureMerger, DuplicateFilter and Matcher.

takashi · Answer

Hi @tim_wood, regarding the first question: "However, I also sent Merged and Referenced to Matcher which says that 186 (93+93) were NotMatched! I used Match Selected Attributes and ticked everything except Object ID and numReferences."

A possible reason I can think of is that you have ticked an exposed format attribute such as fme_feature_type. Naturally its value could be different between the Merged feature (from Requestor) and the Referenced feature (from Supplier).
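To illustrate that suggestion, here is a minimal sketch of a selected-attribute comparison with and without the format attribute. The feature dictionaries and the feature type values are hypothetical; only the attribute name `fme_feature_type` comes from the answer above.

```python
def matches(a, b, selected):
    """True when every selected attribute has the same value on both features."""
    return all(a.get(name) == b.get(name) for name in selected)

# Hypothetical pair: same data attributes, different source feature type.
merged = {"UID": "UID1", "NAME": "Example", "fme_feature_type": "SourceA"}
referenced = {"UID": "UID1", "NAME": "Example", "fme_feature_type": "SourceB"}

with_format_attr = matches(merged, referenced, ["UID", "NAME", "fme_feature_type"])
without_format_attr = matches(merged, referenced, ["UID", "NAME"])

print(with_format_attr, without_format_attr)
# prints: False True
```

If that is the cause, one ticked format attribute would be enough to push every one of the 93 pairs to NotMatched, giving 186 NotMatched features.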
