Question

ChangeDetector: how does it handle duplicates? I only want to compare the most recent rows.

  • 16 February 2022
  • 4 replies
  • 0 views

I am building a vehicle location tracking dataset where I want to keep a history of vehicle locations for future analysis.

 

I am using the ChangeDetector to compare the values in two tables based on a field called [device]. I want to compare the [device] in table 1 with the most recent [device] in table 2 and if there has been a change, 'insert' those values from table 1 into table 2. So where [device] is in table 2 multiple times, I want to compare only the most recent record of the [device] details.

 

When running the ChangeDetector, I have noticed that everything in the Deleted output seems to be the older duplicates.

This brings me to my question. Does the ChangeDetector just look for the first match for the value and send the others to the Deleted port? Or does it use my timestamp field to automatically detect the most recent record?

 

I can't seem to find any documentation on this.


4 replies

Userlevel 1
Badge +21

If you only want to compare the most recent record, I think it will be more reliable to sort by timestamp with the most recent first, use a duplicate filter to keep only the most recent record for each vehicle, and then send that into the change detector.
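To make the order of operations concrete, here is the same logic sketched in plain Python rather than as FME transformers. It is only an illustration under assumptions: the `device` and `timestamp` keys follow the question, while `lat`/`lon`, the sample rows, and the helper names `latest_per_device` and `changed_rows` are made up for this example. It does not reproduce how the ChangeDetector works internally.

```python
from operator import itemgetter

def latest_per_device(history):
    """Sort newest first, then keep the first row seen per device."""
    latest = {}
    for row in sorted(history, key=itemgetter("timestamp"), reverse=True):
        latest.setdefault(row["device"], row)  # later (older) rows are ignored
    return latest

def changed_rows(current, history, compare=("lat", "lon")):
    """Return rows from `current` that are new or differ from the latest history row."""
    latest = latest_per_device(history)
    out = []
    for row in current:
        prev = latest.get(row["device"])
        if prev is None or any(row[k] != prev[k] for k in compare):
            out.append(row)
    return out

# Hypothetical sample data: table 2 (history) and table 1 (current).
history = [
    {"device": "A", "timestamp": "2022-02-14T09:00", "lat": 51.0, "lon": -1.0},
    {"device": "A", "timestamp": "2022-02-15T09:00", "lat": 51.5, "lon": -1.2},
    {"device": "B", "timestamp": "2022-02-15T09:00", "lat": 52.0, "lon": -2.0},
]
current = [
    {"device": "A", "timestamp": "2022-02-16T09:00", "lat": 51.5, "lon": -1.2},
    {"device": "B", "timestamp": "2022-02-16T09:00", "lat": 52.3, "lon": -2.0},
]

to_insert = changed_rows(current, history)
```

With this sample data, device A matches its most recent history row and is skipped, while device B has moved and would be inserted. Had A been compared against its older row instead, it would have been flagged as changed, which is the problem the sort and duplicate filter are meant to avoid.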

I think if you only want to compare the most recent record, sorting by timestamp with most recent first, followed by a duplicate filter to keep only the most recent record for each vehicle and then send that into the change detector is going to be more reliable.

I am guessing then the change detector works off the first match for key attributes?

 

I initially thought the easiest method would be to sort by my 'serial' type field, but it looks like @ebygomm is right and timestamp is the better option, as the 'serial' values are not in timestamp order.

The next question is whether it needs to be in ascending or descending order. I will test and report back.

It has to be in descending order (most recent first).
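A small Python sketch shows why the sort direction matters. It assumes the duplicate filter keeps the first row it sees for each key and discards the rest, which matches the behaviour reported in this thread; `keep_first` and the sample rows are hypothetical stand-ins, not FME code.

```python
def keep_first(rows, key="device"):
    """Keep the first row seen for each key value and discard later ones."""
    seen, kept = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    return kept

# Three history rows for one device, deliberately not in timestamp order.
rows = [
    {"device": "A", "timestamp": "2022-02-14T09:00"},
    {"device": "A", "timestamp": "2022-02-16T09:00"},
    {"device": "A", "timestamp": "2022-02-15T09:00"},
]

ascending = sorted(rows, key=lambda r: r["timestamp"])
descending = sorted(rows, key=lambda r: r["timestamp"], reverse=True)

oldest_kept = keep_first(ascending)[0]["timestamp"]    # ascending keeps the oldest row
newest_kept = keep_first(descending)[0]["timestamp"]   # descending keeps the newest row
```

Under that keep-first assumption, an ascending sort would pass the oldest record through, so only a descending sort leaves the most recent record for comparison.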

Userlevel 4
Badge +25

I think our documentation could do with improving, and I'll talk to our tech writers about that. I also covered this issue briefly in my Question-of-the-Week video, which you can find here: https://www.youtube.com/watch?v=xGvG5z447ps&t=2596s
