Question

Change Detection w/ No Unique Key

  • 25 July 2019
  • 2 replies
  • 21 views

Badge

I have a need to check for data changes between two datasets (oracle tables). I initially configured the ChangeDetector comparing all of the attributes. The results were, all of the data from the “revised” port were identified as records to be added (some 300K plus records) and the data from the “original” port are identified as records to be deleted (som 300K plus records).

There’s no unique key in this data. If I limit the number of attributes to check to 7 specific attributes, the “added” and “deleted” ports reflect more reasonable results. In other words, from a basic math perspective, the difference between the number of revised and original records calculate correctly (in the sense of finding “changes”).

Essentially, all of the attribute values can change so I need to check for all of the attributes. I don’t understand why in configuring the ChangeDetector to check for all attributes, why the transformer behaves as if all of the records “changed”. I’m fairly new to using this particular transformer.

I use to simply process all of the data which includes geocoding some 300 thousand plus records. I’m trying to build a more efficient process and relieve an anxiety from my GIS folks by using their geocoding web service to process 300 thousand plus records on a nightly basis.

Any thoughts?


2 replies

Badge +22

All attributes will also include hidden fme attributes like the multi_reader_id, fme_dataset etc which will always be different between datasets.

 

 

See https://knowledge.safe.com/content/idea/93639/changedetector-match-all-attributes-should-exclude.html
Badge

All attributes will also include hidden fme attributes like the multi_reader_id, fme_dataset etc which will always be different between datasets.

 

 

See https://knowledge.safe.com/content/idea/93639/changedetector-match-all-attributes-should-exclude.html

Thanks. I'll give it a shot. So - I have to expose the attributes that the two datasets share then (one source is an oracle table the other is an MS SQL table) then feed the sources to the ChangeDetector and ignore those attributes?

Reply