Solved

Change Detection adds duplicates


Badge +1

Hi all,

I have a workspace where I am detecting the changes between my original data in an SDE and revised data coming from a public source.

 

I'm using the fme_db_operation to push Inserts, Updates and Deletes to my SDE. However, it apparently just keeps adding the data as duplicates.

 

I have 6 identical polygons for each feature, with the only difference being a Version ID...

 

Any thoughts?


Best answer by nielsgerrits 21 May 2021, 10:07


12 replies

Userlevel 6
Badge +33

First you have to check what the detected changes are. You can do this by generating a detailed changes list in the ChangeDetector:

2021-05-18_13h00_37

If the changes are about geometry, comparing data from SDE with a non-SDE source is always tricky, because in SDE the geometry is snapped to a grid. One thing you can try is to snap the non-SDE data to the same grid before change detection. You can do this using the ArcSDEGridSnapper. Also see this article.

Badge +1

I'm still fairly new to FME, so forgive me if my questions are a bit daft! :-)

What will that list give me?

Shouldn't it be straightforward, just connecting the Updated, Deleted and Inserted ports to the writer, perhaps running them through an AttributeCreator first?

ChangeDetection

Userlevel 6
Badge +33

The list will tell you why the ChangeDetector marked the feature that is now duplicated as inserted / new.

Badge +1

The funny thing is that it does not seem to update anything at any point; it merely deletes the old and inserts the new. I have checked my settings and the Update Detection Key attribute, and it is identical. So it should be able to read the changes to the data.

Or at least just insert a few and delete a few records, instead of deleting everything and inserting the new.

I can't see what should be changed in the writer:

ChangeDetection2

Badge +2

@larsec​ Can you export a small part of your data to a File Geodb and include that and your workspace and the 'public' data in an attachment? That would give the community something to work with.

If you are trying to match geometry then, as @nielsgerrits​ suggests, this can be tricky. Try a CoordinateRounder in front of the ChangeDetector inputs. Use Feature Caching / Partial Runs to carefully inspect the features and compare them, including the coordinates. You might find some tips in the links in this article.

Badge +1

Hi Mark,

First of all, thank you for your input. I have read through everything that I can find related to the change detection, including the article you refer to, and still no luck. Hopefully I'm just missing some silly setting or something :-)

 

I have attached a zip file containing my fmw file and a file gdb containing one of my layers (the original).

 

Within my workspace, the first part downloads and extracts a file gdb from the public source, which my feature reader reads. The paths will have to be updated to a local path, but other than that, it should run straight through.

 

I have made 4 small attribute changes in the Original dataset, so it should, theoretically, just update those 4 features. Thanx :-)

 

With regard to the CoordinateRounder: if my Update Detection Key Attribute is set and matches in the original and revised data, then even changes in the coordinates should still be treated as Updates, rather than deleting all the original data and inserting all the revised data, shouldn't they?

 

Userlevel 6
Badge +33

This is true, but as you can see in the inspector, the format of the ID "OBJEKT_ID" is different in the 2 sets.

GEN_ARTSFUND_LN_GDB.gdb

  • A36BE99A-F167-4045-9639-F9C964948332

Danske Grunddata Test.gdb

  • {A36BE99A-F167-4045-9639-F9C964948332}

 

As these differ the ChangeDetector sees 81 inserts and 81 deletes instead of 81 updates.

 

So if you add an AttributeCreator where you set OBJEKT_ID = {@Value(OBJEKT_ID)} in the stream from GEN_ARTSFUND_LN_GDB.gdb it returns 81 updates.
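To illustrate why the braces matter, here is a minimal Python sketch of key matching, outside FME. It is not how the ChangeDetector is implemented; the helper names normalize_guid and classify are made up for this example, and only the GUID value comes from the data above.

```python
def normalize_guid(value: str) -> str:
    """Return the GUID upper-cased and wrapped in exactly one pair of braces."""
    return "{" + value.strip().strip("{}").upper() + "}"


def classify(original_keys, revised_keys):
    """Split keys into (matched, deleted, inserted) by exact text comparison."""
    original, revised = set(original_keys), set(revised_keys)
    return original & revised, original - revised, revised - original


original = ["A36BE99A-F167-4045-9639-F9C964948332"]    # no braces
revised = ["{A36BE99A-F167-4045-9639-F9C964948332}"]   # with braces

# Compared as plain text the keys never match: one delete plus one insert.
matched, deleted, inserted = classify(original, revised)

# After bringing both sides to the same format the key matches.
matched_n, deleted_n, inserted_n = classify(
    map(normalize_guid, original), map(normalize_guid, revised)
)
```

The AttributeCreator expression above does the same job as normalize_guid for the stream that lacks the braces.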

 

Next, if you add the changelist you can see all the differences.

  • VERSION_ID is also missing curly braces.
  • STARTDATO 20080219 differs from 2008-02-19 00:00:00.0000000
  • etc.
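
The date difference is purely one of formatting, so both values can be brought to one form before comparing. A rough Python sketch, assuming only the calendar date matters and the time part can be dropped (normalize_date is a hypothetical helper; in a workspace you would more likely use a date formatting transformer):

```python
import re
from datetime import datetime


def normalize_date(value: str) -> str:
    """Reduce a date or timestamp string to ISO form YYYY-MM-DD."""
    # Keep only the leading YYYYMMDD digits; any time part is discarded.
    digits = re.sub(r"\D", "", value)[:8]
    return datetime.strptime(digits, "%Y%m%d").date().isoformat()


a = normalize_date("20080219")
b = normalize_date("2008-02-19 00:00:00.0000000")
```

Both calls give the same string, so the attribute would no longer show up as changed.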

 

If you drill down to the geometry, you see that the coordinates differ between the two datasets. Take the first point of OBJEKT_ID {01FEE9E4-945C-443D-9EAF-A29F84AC66B8}:

 

GEN_ARTSFUND_LN_GDB.gdb

  • 649969.0008001328, 6203418.002200127

Danske Grunddata Test.gdb

  • 649969, 6203418

 

But with {57CAEB1E-064E-437C-A6E3-7669E8BDFD53} it is the other way around:

 

GEN_ARTSFUND_LN_GDB.gdb

  • 658306.7469000816, 6195865.634500027

Danske Grunddata Test.gdb

  • 658306.7000000002, 6195865.6

 

This is why @Mark Stoakes​ suggests adding CoordinateRounders.
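
As a small illustration in plain Python (not FME), using the two points listed above: the choice of one decimal place is an assumption for this example, and round_point is a made-up helper.

```python
def round_point(point, decimals=1):
    """Round every ordinate of an (x, y) tuple to the given number of decimals."""
    return tuple(round(c, decimals) for c in point)


# (GEN_ARTSFUND_LN_GDB.gdb, Danske Grunddata Test.gdb) first points from the thread
pairs = [
    ((649969.0008001328, 6203418.002200127), (649969.0, 6203418.0)),
    ((658306.7469000816, 6195865.634500027), (658306.7000000002, 6195865.6)),
]

# Unrounded, neither pair compares equal.
raw_equal = [a == b for a, b in pairs]

# Rounding BOTH sides the same way makes the pairs comparable.
rounded_equal = [round_point(a) == round_point(b) for a, b in pairs]

# Rounding only one side can still leave a difference.
one_sided_equal = [round_point(a) == b for a, b in pairs]
```

The last line is the reason to put a rounder with the same settings on both ChangeDetector inputs, not just one.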

 

I personally prefer to check for changes with rounded geometry and then restore the original geometry. This can easily be done with the GeometryExtractor (geometry to attribute) before the CoordinateRounder and a GeometryReplacer (attribute to geometry) after the ChangeDetector, to prevent putting rounded data in my source database.

 

I also see you have added fme_db_operation manually, but the ChangeDetector generates this attribute so you don't have to do that.

 

In the Writer, make sure you set Feature Operation to fme_db_operation and Match Columns to OBJECTID. In the sample workspace Feature Operation is set to Insert.
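
A conceptual model of that writer setting in plain Python may help. This is only a sketch of the idea, not FME's implementation: the table is a list of dictionaries, apply_features is a made-up function, and the NAME attribute and its values are hypothetical.

```python
def apply_features(table, features, match_column, feature_operation="fme_db_operation"):
    """Apply features to a list-of-dicts table and return it.

    feature_operation is either a fixed operation such as "INSERT", or the
    string "fme_db_operation", meaning: read the operation from each feature.
    """
    for feature in features:
        row = {k: v for k, v in feature.items() if k != "fme_db_operation"}
        if feature_operation == "fme_db_operation":
            op = feature["fme_db_operation"]
        else:
            op = feature_operation
        if op == "INSERT":
            table.append(row)
        elif op == "UPDATE":
            for existing in table:
                if existing[match_column] == row[match_column]:
                    existing.update(row)
        elif op == "DELETE":
            table[:] = [r for r in table if r[match_column] != row[match_column]]
    return table


start = [{"OBJECTID": 1, "NAME": "old"}]
changes = [{"OBJECTID": 1, "NAME": "new", "fme_db_operation": "UPDATE"}]

# Writer honours the per-feature operation: the existing row is updated.
honoured = apply_features([dict(r) for r in start], changes, "OBJECTID")

# Writer forced to Insert: the same feature is appended as a second row.
forced_insert = apply_features([dict(r) for r in start], changes, "OBJECTID", "INSERT")
```

With a fixed Insert, every run adds another copy of each feature, which matches the duplicates described in the question.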

Badge +1

Hehe, total newbie: I didn't realize that the curly braces were actually perceived as a change. I focused solely on the data itself in the OBJEKT_ID. Now it makes a lot more sense why everything is seen as new data.

 

With regard to the changelist: I simply don't know how to add it and where to find it afterwards. Should I just write changelist in the List Name? I haven't had any training in FME, so I'm learning as I go along. :-)

 

About the fme_db_operation comment: does that mean that I don't need the three AttributeCreators with Update, Insert and Delete, and can just connect the ChangeDetector to the writer directly?

 

Thank you for your quick response Niels :-)

Userlevel 6
Badge +33

Most of us started this way, no problem :)

 

Yes, just add the list name. Then add an Inspector transformer to the ChangeDetector's Updated output port. After every run you then get a view with the result. Select one of the features in this view and look for the list in the Feature Information window.
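
To give an idea of what such a list holds, here is a rough Python sketch that compares two attribute sets and records every difference. The dictionary keys "attribute", "original" and "revised" are illustrative only and are not the ChangeDetector's actual list element names; build_change_list is a made-up helper.

```python
def build_change_list(original: dict, revised: dict) -> list:
    """Return one entry per attribute whose value differs between the two features."""
    changes = []
    for name in sorted(set(original) | set(revised)):
        before, after = original.get(name), revised.get(name)
        if before != after:
            changes.append({"attribute": name, "original": before, "revised": after})
    return changes


original = {
    "OBJEKT_ID": "{A36BE99A-F167-4045-9639-F9C964948332}",
    "STARTDATO": "20080219",
}
revised = {
    "OBJEKT_ID": "{A36BE99A-F167-4045-9639-F9C964948332}",
    "STARTDATO": "2008-02-19 00:00:00.0000000",
}

changes = build_change_list(original, revised)
```

Here the only entry is STARTDATO, which tells you exactly which attribute to fix or exclude from the comparison.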

Badge +1

Hi Niels,

Thank you for your valuable input so far.

I have a follow up question to the changelist. I have either excluded or fixed all the differences in the changelist but one.

 

Apparently there is a field called geometryObject. It is not something I can see anywhere, but it is the only thing left. However, the Original and Revised values are both empty. So why does it come up in the changelist? (see attachment).

 

I have added an AttributeExposer to see if I can find the field, but no luck. Do you have any bright ideas? :-)

 

Thanks.

Changelist_GeometryObject

Userlevel 6
Badge +33

Hej Lars, geometryObject means the detected change is in the geometry, not in an attribute.

What I would do is isolate the feature with this ID with a Tester in both streams (Original and Revised) before feeding it into the ChangeDetector. Then add Inspectors to the Passed output ports of both Testers. After a run, select both features in the Data Inspector and switch back and forth between them, looking at the geometry at the bottom of the Feature Information window.

You will probably see differences in coordinate precision, which the ChangeDetector identifies as a modification.

Now you have to choose what kind of strategy you want to follow:

  • Correct the geometry in the original set to the geometry in the revised set. (Send Updates to writer.)
    • If the original dataset stores geometry at a different precision than the revised one, this is probably not going to work. If 10.123, 11.456 is saved as 10.1, 11.5, the geometries will differ again on the next run, meaning the feature's geometry is always identified as a change and always updated. Not ideal.
  • Round the geometry before detecting the change.
    • 10.123 and 10.1 both round to 10 at zero decimal places, so that difference won't be detected as a change. Values near a rounding boundary can still differ: 11.456 rounds to 11, but 11.5 rounds to 12. How coarsely you round depends on the data and what you want to detect.
    • If you don't want to write rounded data, you can save the geometry as an attribute on the feature (GeometryExtractor), then round the geometry, detect changes and restore the geometry (GeometryReplacer).
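
The last option (compare on rounded geometry, write the unrounded geometry) can be sketched in plain Python as follows. This is an illustration of the idea only, not of the transformers themselves; detect_geometry_changes, the keys "a" and "b", the moved coordinate and the one-decimal tolerance are all assumptions made for the example.

```python
def detect_geometry_changes(original, revised, decimals=1):
    """Return revised features whose ROUNDED geometry differs from the original.

    original / revised map a key to a list of (x, y) tuples. The returned
    geometry is the full-precision revised geometry, not the rounded copy.
    """
    def rounded(coords):
        return [tuple(round(c, decimals) for c in pt) for pt in coords]

    updates = {}
    for key, coords in revised.items():
        if key in original and rounded(original[key]) != rounded(coords):
            updates[key] = coords  # "restore": emit the unrounded geometry
    return updates


original = {
    "a": [(649969.0008001328, 6203418.002200127)],
    "b": [(658306.7469000816, 6195865.634500027)],
}
revised = {
    "a": [(649969.0, 6203418.0)],                   # precision noise only
    "b": [(658310.1234, 6195865.634500027)],        # hypothetical real move
}

updates = detect_geometry_changes(original, revised)
```

Only "b" is reported, and it carries its original unrounded coordinates, so nothing rounded ends up in the target database.
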
Badge +1

Hej Niels,

Awesome advice! I found the "BUG": I had added a CoordinateRounder to my revised dataset but not to my original, so there were extra decimals that triggered the change. Thanx :-)
