Question

How does ChangeDetector handle duplicates? I only want to compare the most recent rows.


tiztrain
Contributor

I am building a vehicle location tracking dataset where I want to keep a history of vehicle locations for future analysis.

 

I am using the ChangeDetector to compare the values in two tables based on a field called [device]. I want to compare the [device] in table 1 with the most recent [device] in table 2 and if there has been a change, 'insert' those values from table 1 into table 2. So where [device] is in table 2 multiple times, I want to compare only the most recent record of the [device] details.

 

When running the ChangeDetector, I have noticed that everything in the Deleted output seems to be the older duplicates.

 

This brings me to my question. Does ChangeDetector just take the first match for the key value and send the others to the Deleted port? Or does it use my timestamp field to automatically detect the most recent record?

 

I can't seem to find any documentation on this.

4 replies

ebygomm
Influencer
  • February 16, 2022

I think if you only want to compare the most recent record, sorting by timestamp with most recent first, followed by a duplicate filter to keep only the most recent record for each vehicle and then send that into the change detector is going to be more reliable.
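
For readers who think better in code, here is a minimal sketch of the approach above in plain Python rather than FME. It is only an illustration of the logic: the field names `timestamp`, `lat` and `lon`, and the helper names `latest_per_device` and `rows_to_insert`, are hypothetical and not taken from the original workspace, and it makes no claim about how ChangeDetector works internally.

```python
from datetime import datetime


def latest_per_device(rows):
    """Sort newest first, then keep only the first row seen for each device."""
    newest_first = sorted(rows, key=lambda r: r["timestamp"], reverse=True)
    seen = set()
    latest = []
    for row in newest_first:
        if row["device"] not in seen:
            seen.add(row["device"])
            latest.append(row)
    return latest


def rows_to_insert(incoming, history, compare_fields=("lat", "lon")):
    """Return incoming rows that differ from the latest history row per device."""
    current = {r["device"]: r for r in latest_per_device(history)}
    changed = []
    for row in incoming:
        previous = current.get(row["device"])
        # A device with no history, or with any differing field, counts as a change.
        if previous is None or any(row[f] != previous[f] for f in compare_fields):
            changed.append(row)
    return changed


# Hypothetical sample data: table 2 (history) holds device A twice.
history = [
    {"device": "A", "timestamp": datetime(2022, 2, 15, 9, 0), "lat": 1.0, "lon": 1.0},
    {"device": "A", "timestamp": datetime(2022, 2, 16, 9, 0), "lat": 2.0, "lon": 2.0},
    {"device": "B", "timestamp": datetime(2022, 2, 16, 9, 0), "lat": 5.0, "lon": 5.0},
]

# Hypothetical sample data: table 1 (incoming) holds one row per device.
incoming = [
    {"device": "A", "timestamp": datetime(2022, 2, 17, 9, 0), "lat": 2.0, "lon": 2.0},
    {"device": "B", "timestamp": datetime(2022, 2, 17, 9, 0), "lat": 6.0, "lon": 5.0},
    {"device": "C", "timestamp": datetime(2022, 2, 17, 9, 0), "lat": 9.0, "lon": 9.0},
]

to_insert = rows_to_insert(incoming, history)
print([r["device"] for r in to_insert])
```

Because the history is reduced to one row per device before the comparison, the older duplicate for device A never takes part in the match, which is the point of filtering first.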


tiztrain
Contributor
  • Author
  • February 17, 2022
ebygomm wrote:

I think if you only want to compare the most recent record, sorting by timestamp with most recent first, followed by a duplicate filter to keep only the most recent record for each vehicle and then send that into the change detector is going to be more reliable.

I am guessing, then, that the ChangeDetector works off the first match for the key attributes?

 

I initially thought the easiest method would be to sort by my 'serial' type field, but it looks like @ebygomm is right and timestamp is a better option, as the 'serial' values are not in timestamp order.

 

The next question is whether it needs to be in ascending or descending order. I will test and report back.


tiztrain
Contributor
  • Author
  • February 17, 2022

Tested: the sort has to be in descending order (most recent first).
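
A small Python sketch of why the sort direction matters, assuming the duplicate filter keeps the first row it meets for each key (that keep-first behaviour is the assumption being modelled here, and `keep_first_per_key` is a hypothetical helper, not an FME function):

```python
def keep_first_per_key(rows, key):
    """Keep the first row encountered for each distinct key value."""
    seen = set()
    kept = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    return kept


# Hypothetical history rows for one device, deliberately out of time order.
rows = [
    {"device": "A", "timestamp": "2022-02-15T09:00:00", "lat": 1.0},
    {"device": "A", "timestamp": "2022-02-17T09:00:00", "lat": 3.0},
    {"device": "A", "timestamp": "2022-02-16T09:00:00", "lat": 2.0},
]

# ISO 8601 strings in the same format sort chronologically as plain text.
ascending = sorted(rows, key=lambda r: r["timestamp"])
descending = sorted(rows, key=lambda r: r["timestamp"], reverse=True)

oldest_kept = keep_first_per_key(ascending, "device")
newest_kept = keep_first_per_key(descending, "device")

print(oldest_kept[0]["timestamp"])  # ascending sort keeps the oldest row
print(newest_kept[0]["timestamp"])  # descending sort keeps the newest row
```

With an ascending sort, a keep-first filter hands the oldest record to the comparison, so only the descending sort leaves the most recent record per device.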


mark2atsafe
Safer
  • February 18, 2022

I think our documentation could do with improving and I'll talk to our tech writers about that. But I also covered this issue briefly on my Question-of-the-Week video, which you can find here: https://www.youtube.com/watch?v=xGvG5z447ps&t=2596s

