Solved

DatabaseJoiner doesn't find PostgreSQL record inserted by SQLExecutor on previous feature


Badge +2

I have a custom transformer that caches API call results to a PostgreSQL table. It tries a DatabaseJoiner on the key, and if it finds a matching record, it passes the joined feature directly to the main Output port. Otherwise, it calls the API, passes the result to a SQLExecutor to INSERT it for joining on same key value on subsequent features passed to the transformer.

Unfortunately, the DatabaseJoiner is unable to find records stored by the SQLExecutor for subsequent features during the same workspace execution. I've tried explicit COMMITs (with appropriate delimiter declaration) in the SQLExecutor (it says no transaction is in progress) and a 5-second-per-feature Decelerator after the SQLExecutor in case there is some kind of race condition between FME Workbench and PostgreSQL. No luck. I still get duplicate key violations because the failure of the DatabaseJoiner causes the same key to be sent to the API and storage attempted again via the SQLExecutor. I've even tried moving the SQLExecutor to another workspace called via WorkspaceRunner, but the errors persist.

There is no parallelism in this workspace, and it appears that the INSERT is committed every time the SQLExecutor is called. So I can sense no race condition that should cause this behavior. Hoping someone else can shed some light.

icon

Best answer by takashi 27 May 2018, 03:05

View original

13 replies

Userlevel 4

What happens if you set a breakpoint just after the SQLExecutor and you look at the contents of the postgresql table using e.g. pgAdmin -- is the new record there?

Badge +2

What happens if you set a breakpoint just after the SQLExecutor and you look at the contents of the postgresql table using e.g. pgAdmin -- is the new record there?

Yes, it is there when I look in Navicat, but the DatabaseJoiner still doesn't find it on subsequent call. And I've confirmed that the keys are identical across the board. Thoroughly checked breakpoints after DatabaseJoiner and both Initiator and output ports from SQLExecutor. :-(

 

Badge +2

And I've just tried replacing the DatabaseJoiner with another SQLExecutor and a FeatureMerger...same problem. So it's not just the DatabaseJoiner that misbehaves.

Userlevel 4

Try something like the following in the SQLExecutor that's inserting the record:

FME_SQL_DELIMITER #
insert into ...
#

This should force FME into commiting any ongoing transactions before continuing.

Badge +2

Try something like the following in the SQLExecutor that's inserting the record:

FME_SQL_DELIMITER #
insert into ...
#

This should force FME into commiting any ongoing transactions before continuing.

Tried that before I posted the original question, but I've since made other changes, so tried again.  Same problem.  Besides, if the new record shows up in Navicat, then it doesn't seem like transaction demarcation is the problem.
Badge +2

BTW, I've also shut off all prefetch options in my DatabaseJoiner, so it should not be prefetching the results before passing the features through. Though I must admit that it looks like it's doing something like that anyway.

Userlevel 2
Badge +17

Hi @kweller, I guess that the DatabaseJoiner would fetch and cache matched records as testing result for a condition when a feature having the condition is entered at first, and attempt to find matched records from the cache when the second or later feature having the same condition is entered. The cache for a condition would be empty if there were no records that match the first feature having the condition, so the second feature having the same condition would not match any record even if a matched record has been inserted into the database after the first feature was processed.

If my guessing above was correct, the DatabaseJoiner might not be a good choice for your purpose.

A possible workaround is: use a SQLExecutor to execute a SQL statement that returns the number of matched records for each feature, then use a Tester to test if the number is 0. The SQL statement looks something like this.

select count(*) as _number_of_matched_records
from my_table where key = @Value(_key)

  • Attributes to Expose: _number_of_matched_records
  • Combine Attributes: Keep Result Attributes if Conflict
  • Combine Geometry: Initiator Geometry Only

Badge +2

Hi @kweller, I guess that the DatabaseJoiner would fetch and cache matched records as testing result for a condition when a feature having the condition is entered at first, and attempt to find matched records from the cache when the second or later feature having the same condition is entered. The cache for a condition would be empty if there were no records that match the first feature having the condition, so the second feature having the same condition would not match any record even if a matched record has been inserted into the database after the first feature was processed.

If my guessing above was correct, the DatabaseJoiner might not be a good choice for your purpose.

A possible workaround is: use a SQLExecutor to execute a SQL statement that returns the number of matched records for each feature, then use a Tester to test if the number is 0. The SQL statement looks something like this.

select count(*) as _number_of_matched_records
from my_table where key = @Value(_key)

  • Attributes to Expose: _number_of_matched_records
  • Combine Attributes: Keep Result Attributes if Conflict
  • Combine Geometry: Initiator Geometry Only

I already tried something similar, but even the SQLExecutor could not find the new record, even though Navicat could.  However, I believe I've identified and solved the problem.  I will post
Badge +2

Hi @kweller, I guess that the DatabaseJoiner would fetch and cache matched records as testing result for a condition when a feature having the condition is entered at first, and attempt to find matched records from the cache when the second or later feature having the same condition is entered. The cache for a condition would be empty if there were no records that match the first feature having the condition, so the second feature having the same condition would not match any record even if a matched record has been inserted into the database after the first feature was processed.

If my guessing above was correct, the DatabaseJoiner might not be a good choice for your purpose.

A possible workaround is: use a SQLExecutor to execute a SQL statement that returns the number of matched records for each feature, then use a Tester to test if the number is 0. The SQL statement looks something like this.

select count(*) as _number_of_matched_records
from my_table where key = @Value(_key)

  • Attributes to Expose: _number_of_matched_records
  • Combine Attributes: Keep Result Attributes if Conflict
  • Combine Geometry: Initiator Geometry Only

I think we can also safely say that the DatabaseJoiner is not caching its data on its own (without explicit prefetching).  That's good to know!
Badge +2

It appears that this was actually caused by single quotes in the key values. As soon as I installed some logic to defend against SQL injection (including quote escapes), the problem went away. I suspect that when I was comparing key values during diagnosis, the differences between what was actually stored and what I was querying for were masked thanks to the escape mechanisms.

Badge +2

It appears that this was actually caused by single quotes in the key values. As soon as I installed some logic to defend against SQL injection (including quote escapes), the problem went away. I suspect that when I was comparing key values during diagnosis, the differences between what was actually stored and what I was querying for were masked thanks to the escape mechanisms.

Well, this has reared its ugly head again. Happens less often, but it happens. :-( Submitted support request, will post ultimate solution here.

 

 

Badge +2
Well, this has reared its ugly head again. Happens less often, but it happens. :-( Submitted support request, will post ultimate solution here.

 

 

Resolved for good. See final response comment under Best Answer above.
Badge +2

Hi @kweller, I guess that the DatabaseJoiner would fetch and cache matched records as testing result for a condition when a feature having the condition is entered at first, and attempt to find matched records from the cache when the second or later feature having the same condition is entered. The cache for a condition would be empty if there were no records that match the first feature having the condition, so the second feature having the same condition would not match any record even if a matched record has been inserted into the database after the first feature was processed.

If my guessing above was correct, the DatabaseJoiner might not be a good choice for your purpose.

A possible workaround is: use a SQLExecutor to execute a SQL statement that returns the number of matched records for each feature, then use a Tester to test if the number is 0. The SQL statement looks something like this.

select count(*) as _number_of_matched_records
from my_table where key = @Value(_key)

  • Attributes to Expose: _number_of_matched_records
  • Combine Attributes: Keep Result Attributes if Conflict
  • Combine Geometry: Initiator Geometry Only

OK, turns out that the quoting issue was masking the ultimate solution, which is to use a SQLExecutor instead of a DatabaseJoiner.  The latter must pre-cache the results regardless of the pre-cache settings.  Confusing and unintuitive, but true.

Reply