Question

ChangeDetector for duplicate records



I use ChangeDetector to compare two datasets. Sample datasets are shown below. Will personid 345 be counted as UNCHANGED or ADDED?

dataset 1:

personid  lastname  gender
111       smith     male
345       hardings  female

dataset 2:

personid  firstname  gender
111       smith      male
345       hardings   female
345       travis     female

2 replies


If you connect dataset 1 to the original port and dataset 2 to the revised port, the record where firstname is travis will be counted as ADDED and the record where firstname is hardings as UNCHANGED. Since there are two objects with personid 345, the ChangeDetector will return both.

If you connect dataset 2 to original and dataset 1 to revised, the UNCHANGED output will stay the same and the record with travis will be counted as DELETED.
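To make the two cases concrete, here is a rough Python sketch of the matching logic described above. It is not FME's implementation, just a plain multiset comparison on whole records. The function name detect_changes is made up for illustration, and it assumes the name column (lastname in dataset 1, firstname in dataset 2) is the same attribute in both datasets.

```python
from collections import Counter

def detect_changes(original, revised):
    """Hypothetical helper: classify records by comparing all attribute values."""
    orig_counts = Counter(original)
    rev_counts = Counter(revised)
    return {
        # Records present on both sides
        "UNCHANGED": list((orig_counts & rev_counts).elements()),
        # Records only in the revised input
        "ADDED": list((rev_counts - orig_counts).elements()),
        # Records only in the original input
        "DELETED": list((orig_counts - rev_counts).elements()),
    }

# Records as (personid, name, gender); name column assumed identical in both datasets
dataset1 = [(111, "smith", "male"), (345, "hardings", "female")]
dataset2 = [(111, "smith", "male"), (345, "hardings", "female"), (345, "travis", "female")]

forward = detect_changes(dataset1, dataset2)  # dataset 1 = original, dataset 2 = revised
reverse = detect_changes(dataset2, dataset1)  # ports swapped

print(forward["ADDED"])    # [(345, 'travis', 'female')]
print(reverse["DELETED"])  # [(345, 'travis', 'female')]
```

In both directions the UNCHANGED list holds the smith and hardings records, so personid 345 shows up in two outputs because the comparison looks at the whole record, not only the id.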

Depending on what you want to do with your workspace, you might also want to have a look at the DuplicateRemover.


wow. thank you!
