Skip to main content

 

Having a unique constraint on a table is there a way to avoid FME stoping the bulk copy or insert simply ignoring the duplicate values and keep on with the translation, just reporting the warning ?

 

 

Or would be there any other alternative to UNIQUE constraints to avoid duplicates entering database without stopping the translation ?

 

 

Cheers

 

 

 

 

Hi,

 

 

If your goal is to filter out features having duplicate value for the unique field, the DuplicateRemover transformer might help you.

 

 

Takashi
Hi Takashi,

 

No, the DuplicateRemover will not work in this case.

 

Its a tile process running tile by tile and the duplicates are created at tiles shared borders. (point dups)

 

So tile A runs and creates feature ID1 at east border of tile, then tile B runs and creates the same feature ID1 at the west border of tile. At the end there are 2 ID's 1 at the shared border between tiles.

 

I was searching for some logic to be applied at DB side to prevent duplicate inserts , tried to apply a UNIQUE Constraint by ID but that will stop the translations whenever the constraint is checked.

 

Hope i was clear

 

Jorge

 


Naturally the UNIQUE constraint rejects duplicate insertion and the translation fails. If you need to remove features having duplicate ID, the DuplicateRemover should be a quick way, but it wasn't applicable...

 

Not sure the situation and requirement.
Hi,

 

 

It seems you want to run the workbench without stopping because of writer is having an unique constraint and failing the translation.

 

 

In these cases normally I would suggest to translate to stage user without unique constraint and verify the data and make necessary editing to the data / work bench (based on requirement) and then copy the data back to required user when data is as required.

 

 

This will help us to identify the loop holes / errors of data or work bench.

 

 

Hope this helps you.

 

 

Pratap
Can you use a SQLExecutor and query the database to make sure the ID doesn't already exist? If it does then you deal with it (either drop it, or give it a new ID)

 


Exactly Mark, that was my immediate workaround 🙂 you got the point . Now some other problem arised. Since the tiles are running commanded from a WorkspacerRunner (no wait) i need to be sure that on every batch of 8 childs there are no neighbor tiles running at the same time 🙂 tricky one , i randomized the tiles got SQLExecutor as last in the pipeline but there is always some that tend to get duplicate , i think i'm getting a sort of race condition to solve here if i can call it like that.

 

Thanks
Exactly Mark, that was my immediate workaround 🙂 you got the point . Now some other problem arised. Since the tiles are running commanded from a WorkspacerRunner (no wait) i need to be sure that on every batch of 8 childs there are no neighbor tiles running at the same time 🙂 tricky one , i randomized the tiles got SQLExecutor as last in the pipeline but there is always some that tend to get duplicate , i think i'm getting a sort of race condition to solve here if i can call it like that.

 

Thanks

That's an interesting one. My first thought is that you set a flag in the database when you start to process a tile. Then you have a SQLExecutor in the original workspace that checks if the flag is set for any neighboring tiles. If you put this in a custom transformer you could send it in a loop, using a Decelerator, to check every 10 seconds (for example) and keep looping until all the neighbor flags are unset. The other thought is you use a checkerboard pattern to process data, and do it in two processes (ie you process A1, A3, A5, B2, B4, C1, C3, etc, and once they are done you process A2, A4, B1, B3, B5, C2, C4, etc). That seems the most efficient to ensure no two neighboring sets of tiles are being processed simultaneously.


Reply