Question

Change detection on large dataset


boubcher
Contributor

Hello there,

We are using the ChangeDetector to update a large database with millions of records.

Is there a way we could speed up the detection process?

Thanks

 

5 replies

bozewolf
  • July 7, 2020

No. By definition, you need the full set to detect changes. You could of course partition the records with some hash function and parallelize the comparison, but that only throws more computing power at it; it doesn't solve the underlying engineering problem.

 

Just adding more memory so that FME runs smoothly would be my advice.
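For what the partition-and-parallelize idea would look like in practice, here is a minimal sketch outside FME. The record layout (a dict of key → record) and the bucket count are illustrative assumptions; the point is that a stable hash assigns each key to the same bucket in both the old and new sets, so the buckets can be compared independently.

```python
# Sketch: bucket records by a stable hash of their key, then detect
# changes per bucket. Each bucket could then be processed in parallel.
import hashlib

def bucket_of(key, n_buckets):
    # Stable hash so the old and new runs agree on the bucket.
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets

def detect_changes(old, new):
    # old/new: dicts of key -> record; returns (added, deleted, updated) keys.
    added = [k for k in new if k not in old]
    deleted = [k for k in old if k not in new]
    updated = [k for k in new if k in old and new[k] != old[k]]
    return added, deleted, updated

def partitioned_changes(old, new, n_buckets=4):
    results = []
    for b in range(n_buckets):
        old_b = {k: v for k, v in old.items() if bucket_of(k, n_buckets) == b}
        new_b = {k: v for k, v in new.items() if bucket_of(k, n_buckets) == b}
        results.append(detect_changes(old_b, new_b))
    # Merge the per-bucket results back into one answer.
    added = sum((r[0] for r in results), [])
    deleted = sum((r[1] for r in results), [])
    updated = sum((r[2] for r in results), [])
    return added, deleted, updated
```

As the reply notes, this changes where the work happens, not how much work there is: every record is still read and compared once.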


chrisatsafe
Contributor
  • Contributor
  • July 7, 2020
bozewolf wrote:

No. By definition, you need the full set to detect changes. You could of course partition the records with some hash function and parallelize the comparison, but that only throws more computing power at it; it doesn't solve the underlying engineering problem.

Just adding more memory so that FME runs smoothly would be my advice.

And turn off feature caching.

 

 

Run > Enable Feature Caching (should be deselected).

boubcher
Contributor
  • Author
  • Contributor
  • July 7, 2020

Thanks, guys. Appreciated.

 

 


erik_jan
Contributor
  • Contributor
  • July 7, 2020

Can we get some more details on this?

I have sped up change detection by writing the data to a staging table in the database.

If the database is also the destination, writing all records to a staging table and having the database calculate the delta can speed up the process quite a bit.

But it all depends on what type of data you are using.
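A minimal sketch of the staging-table approach, using SQLite as a stand-in for whatever database is actually the destination. The table and column names (`target`, `staging`, `id`, `value`) are hypothetical; the idea is that FME only bulk-loads the new snapshot, and set-based SQL computes the delta inside the database.

```python
# Sketch: load the new snapshot into a staging table, then let the
# database compute the delta with set operations (SQLite stands in here).
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE target  (id INTEGER PRIMARY KEY, value TEXT)")
cur.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, value TEXT)")

# Existing data in the destination table.
cur.executemany("INSERT INTO target VALUES (?, ?)",
                [(1, "a"), (2, "b"), (3, "c")])
# FME bulk-writes the full new snapshot to the staging table.
cur.executemany("INSERT INTO staging VALUES (?, ?)",
                [(2, "b"), (3, "x"), (4, "d")])

# New or changed rows: in staging but not identical in target.
inserts_updates = cur.execute("""
    SELECT id, value FROM staging
    EXCEPT
    SELECT id, value FROM target
""").fetchall()

# Deleted rows: ids in target that are missing from staging.
deletes = cur.execute("""
    SELECT id FROM target
    WHERE id NOT IN (SELECT id FROM staging)
""").fetchall()
```

Bulk loading plus one set-based query usually beats comparing millions of features one by one in the workspace, because the database can use indexes and avoids shipping the full existing dataset back to FME.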


gazza
Contributor
  • Contributor
  • July 8, 2020

One trick is to calculate a CRC for each record using all the fields, and then perform the change detection on just the CRC.
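The CRC trick, sketched outside FME: concatenate all fields of a record into one checksum and diff only the (key, checksum) pairs instead of full records. The field separator and the record shape below are illustrative assumptions; `zlib.crc32` stands in for whatever checksum the workspace computes.

```python
# Sketch: reduce each record to one CRC, then detect changes on the
# small (key, crc) pairs rather than comparing every field of every record.
import zlib

def record_crc(record):
    # Join fields with a separator unlikely to occur in the data,
    # so ("ab", "c") and ("a", "bc") get different checksums.
    joined = "\x1f".join(str(field) for field in record)
    return zlib.crc32(joined.encode("utf-8"))

def crc_changes(old, new):
    # old/new: dicts of key -> record (tuple of fields).
    old_crc = {k: record_crc(v) for k, v in old.items()}
    new_crc = {k: record_crc(v) for k, v in new.items()}
    added = [k for k in new_crc if k not in old_crc]
    deleted = [k for k in old_crc if k not in new_crc]
    updated = [k for k in new_crc
               if k in old_crc and new_crc[k] != old_crc[k]]
    return added, deleted, updated
```

This shrinks the comparison to one integer per record, at the cost of a tiny collision risk (a changed record whose CRC happens to match the old one would be missed); a longer hash reduces that risk if it matters.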

