Question

ChangeDetector for duplicate records



I use the ChangeDetector to compare two datasets. Sample datasets are shown below. Will personid 345 be counted as UNCHANGED or ADDED?

dataset 1:

personid  lastname  gender
111       smith     male
345       hardings  female

dataset 2:

personid  firstname  gender
111       smith      male
345       hardings   female
345       travis     female

2 replies

soeren
Contributor
  • July 18, 2016

If you connect dataset 1 to the Original port and dataset 2 to the Revised port, the record where firstname is travis will be counted as ADDED and the one where firstname is hardings as UNCHANGED. Since there are two objects with personid 345 in dataset 2, the ChangeDetector will return both.

If you connect dataset 2 to Original and dataset 1 to Revised, the UNCHANGED output will stay the same and the record with travis will be counted as DELETED.
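To make the two cases concrete, here is a rough Python sketch that mimics the outcome described above. It is not FME code and not how the ChangeDetector is implemented; it simply assumes that records are matched on all attribute values, and that the second column (labelled lastname in dataset 1 and firstname in dataset 2) is meant to be the same attribute. The function name `detect_changes` is made up for the illustration.

```python
from collections import Counter

def detect_changes(original, revised):
    """Classify records by comparing full attribute tuples as multisets.

    Assumption for illustration only: a record is UNCHANGED when an
    identical tuple exists on both sides, ADDED when it exists only in
    revised, and DELETED when it exists only in original.
    """
    orig_counts = Counter(original)
    rev_counts = Counter(revised)
    return {
        "UNCHANGED": list((orig_counts & rev_counts).elements()),
        "ADDED": list((rev_counts - orig_counts).elements()),
        "DELETED": list((orig_counts - rev_counts).elements()),
    }

# Records as (personid, name, gender) tuples, taken from the question.
dataset1 = [(111, "smith", "male"), (345, "hardings", "female")]
dataset2 = [
    (111, "smith", "male"),
    (345, "hardings", "female"),
    (345, "travis", "female"),
]

# Case 1: dataset 1 as original, dataset 2 as revised.
forward = detect_changes(dataset1, dataset2)
# Case 2: dataset 2 as original, dataset 1 as revised.
reverse = detect_changes(dataset2, dataset1)

print(forward)
print(reverse)
```

In this sketch the travis record lands in ADDED in the first case and in DELETED in the second, while the two matching records stay in UNCHANGED either way.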

Depending on what you want to do with your workspace, you might also want to have a look at the DuplicateRemover.


  • Author
  • July 18, 2016
wow. thank you!

