Question

Change detection on large dataset

  • July 7, 2020
  • 5 replies
  • 115 views

boubcher
Contributor

Hello there

We are using the ChangeDetector to update a large database with millions of records.

Is there a way we could speed up the detection process?

Thanks


5 replies

bozewolf
  • July 7, 2020

No. By definition, you need the full dataset to detect changes. You could, of course, partition the records with some hash function and parallelize the comparison, but that would only increase the computing power required and would not solve the underlying engineering problem.

My advice would be to add more memory so that FME runs smoothly.
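For readers wondering what the hash-partitioning idea mentioned above could look like outside FME, here is a minimal Python sketch. The record layout (dicts with an `id` key), the function names, and the bucket count are all assumptions for illustration, not anything from FME. Each bucket can be compared independently, so the buckets could be handed to separate workers. As noted above, the total work does not shrink.

```python
import zlib
from collections import defaultdict


def partition_by_key(records, key_field, n_buckets):
    """Group records into buckets by a stable hash of their key value."""
    buckets = defaultdict(list)
    for rec in records:
        h = zlib.crc32(str(rec[key_field]).encode("utf-8"))
        buckets[h % n_buckets].append(rec)
    return buckets


def diff_bucket(old, new, key_field):
    """Compare one bucket of old records against the matching bucket of new ones."""
    old_by_key = {r[key_field]: r for r in old}
    new_by_key = {r[key_field]: r for r in new}
    inserted = new_by_key.keys() - old_by_key.keys()
    deleted = old_by_key.keys() - new_by_key.keys()
    updated = {k for k in old_by_key.keys() & new_by_key.keys()
               if old_by_key[k] != new_by_key[k]}
    return inserted, updated, deleted


def detect_changes(old_records, new_records, key_field="id", n_buckets=4):
    """Run the comparison bucket by bucket and merge the results."""
    old_parts = partition_by_key(old_records, key_field, n_buckets)
    new_parts = partition_by_key(new_records, key_field, n_buckets)
    inserted, updated, deleted = set(), set(), set()
    for b in range(n_buckets):
        # The same key always hashes to the same bucket, so buckets are independent.
        i, u, d = diff_bucket(old_parts.get(b, []), new_parts.get(b, []), key_field)
        inserted |= i
        updated |= u
        deleted |= d
    return sorted(inserted), sorted(updated), sorted(deleted)


old_records = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
new_records = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}, {"id": 4, "v": "d"}]
result = detect_changes(old_records, new_records)
```

The loop over buckets is sequential here to keep the sketch short; swapping it for a process pool is where any parallel speed-up would come from.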


chrisatsafe
Contributor
  • Contributor
  • July 7, 2020

Also, turn off feature caching: Run > Enable Feature Caching (should be deselected).

boubcher
Contributor
  • Author
  • Contributor
  • July 7, 2020

Thanks, guys. Appreciated.

 

 


erik_jan
Contributor
  • Contributor
  • July 7, 2020

Can we get some more details on this?

I have sped up change detection by writing data to a staging table in a database.

If the database is the destination, writing all records to a staging table and having the database calculate the delta can speed up the process quite a bit.

But it all depends on what type of data you are using.
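To illustrate the staging-table approach, here is a rough sketch using Python's built-in `sqlite3` module. The table names (`target`, `staging`), the columns, and the sample rows are made up for the example; a real setup would use the destination database and its own SQL dialect. The point is only that the inserts, updates, and deletes fall out of plain joins once both sets are in the database.

```python
import sqlite3


def load_demo(conn):
    """Create a hypothetical target table and a staging table with new data."""
    conn.executescript("""
        CREATE TABLE target  (id INTEGER PRIMARY KEY, name TEXT, value REAL);
        CREATE TABLE staging (id INTEGER PRIMARY KEY, name TEXT, value REAL);
    """)
    conn.executemany("INSERT INTO target VALUES (?, ?, ?)",
                     [(1, "a", 1.0), (2, "b", 2.0), (3, "c", 3.0)])
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?)",
                     [(2, "b", 2.5), (3, "c", 3.0), (4, "d", 4.0)])


def compute_delta(conn):
    """Let the database work out which ids were inserted, updated, or deleted."""
    def ids(sql):
        return [row[0] for row in conn.execute(sql)]

    return {
        # In staging but not in target.
        "inserted": ids("""
            SELECT s.id FROM staging s
            LEFT JOIN target t ON t.id = s.id
            WHERE t.id IS NULL ORDER BY s.id"""),
        # In both, with at least one differing column (IS NOT is NULL-safe in SQLite).
        "updated": ids("""
            SELECT s.id FROM staging s
            JOIN target t ON t.id = s.id
            WHERE s.name IS NOT t.name OR s.value IS NOT t.value
            ORDER BY s.id"""),
        # In target but no longer in staging.
        "deleted": ids("""
            SELECT t.id FROM target t
            LEFT JOIN staging s ON s.id = t.id
            WHERE s.id IS NULL ORDER BY t.id"""),
    }


conn = sqlite3.connect(":memory:")
load_demo(conn)
delta = compute_delta(conn)
```

With an index on the key column in both tables, these joins are usually something a database handles well at scale, which is presumably where the speed-up comes from.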


gazza
Contributor
  • Contributor
  • July 8, 2020

One trick is to calculate a CRC for each record using all the fields and perform the change detection on just the CRC.
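A small Python sketch of the CRC idea, assuming records are dicts with an `id` key. The field list and helper names are hypothetical. It uses `zlib.crc32` from the standard library; since a 32-bit CRC can collide, a longer hash may be the safer choice on very large tables.

```python
import zlib


def record_crc(record, fields):
    """Compute one CRC32 over all listed fields of a record."""
    # A separator keeps ("ab", "c") distinct from ("a", "bc");
    # repr() keeps None, "" and 0 distinct from each other.
    payload = "\x1f".join(repr(record.get(f)) for f in fields)
    return zlib.crc32(payload.encode("utf-8"))


def crc_index(records, key_field, fields):
    """Map each record key to its CRC, so only small integers are compared."""
    return {rec[key_field]: record_crc(rec, fields) for rec in records}


def detect_changes(old_index, new_index):
    """Compare two key-to-CRC maps and report inserted, updated, deleted keys."""
    inserted = sorted(new_index.keys() - old_index.keys())
    deleted = sorted(old_index.keys() - new_index.keys())
    updated = sorted(k for k in old_index.keys() & new_index.keys()
                     if old_index[k] != new_index[k])
    return inserted, updated, deleted


fields = ["name", "value"]
old_rows = [{"id": 1, "name": "a", "value": 1.0},
            {"id": 2, "name": "b", "value": 2.0},
            {"id": 3, "name": "c", "value": 3.0}]
new_rows = [{"id": 2, "name": "b", "value": 2.5},
            {"id": 3, "name": "c", "value": 3.0},
            {"id": 4, "name": "d", "value": 4.0}]

changes = detect_changes(crc_index(old_rows, "id", fields),
                         crc_index(new_rows, "id", fields))
```

If the CRC is stored alongside each record in the destination, only the key and CRC of the existing rows need to be read back for the comparison, rather than every field.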