Skip to main content
Solved

SQL execution

  • 6 March 2024
  • 4 replies
  • 142 views

Hi Community,

 

I am using some SQLExecutors in my workbench and the this is taking so much time to execute. Is there any way I can execute it faster. Is it because the statement going from fme to outside db and execute it in batches of 1000 and commits the edits making the whole delay?. Is it possible to execute the whole sql actions using a plsql script and bind it to fme?. I may also need to have logs or exception tables for plsql script. Is it possible through fme?

4 replies

Userlevel 4
Badge +17

The SQLExecutor will execute once per feature which is passed into it. I believe the database connection is maintained between statements, but they are independent transactions each committed separately and immediately. So each statement has a time cost. If each SQL statement is for example a simple INSERT INTO table VALUES (), then features will be inserted one at a time and not batched.

You can batch inserts yourself, I’ve done this before with a GroupCounter, and then combination of ListBuilder/ListConcatenator to build up a long insert script, to minimise the number of statements needing to be run. INSERT VALUES has a limit of 1000 rows at a time, which is the reason for the GroupCounter. This will run a little bit slower than a SQL Writer doing a bulk insert, but it’s still good.

Also bear in mind the ; delimiter works like a GO command, not a regular SQL ; (meaning variables etc are dropped after it)

For logs/exceptions, I believe a try..catch flow will work, though I’m only familiar with microsoft SQL

Badge +7

I am using Oracle DB. My sql commands are update commands. I believe it is directly sending to the oracle to execute rather than regular transformer which execute in batches. Please clarify. 

Userlevel 5

The reply from @ctredinnick is excellent and mirrors my own way of dealing with this.

I can add that part of the reason for the slow performance of the SQLExecutor is that it does not currently support prepared statements or bind variables*, meaning that the DB back-end has to redo parsing the entire SQL statement for each invocation (trigger feature), even though only the values change, not the SQL statement. This may cause significant overhead when the number of features increase and/or the SQL statement is non-trivial. Also, the SQLExecutor will create a separate database transaction for each trigger feature, which is fairly costly in itself.

 

*) It seems Safe has done some work on this matter but for Oracle only. Therefore I haven’t been able to test it myself. See the discussion here for details:

If you’re handy with Python, it might even be interesting to try out the route suggested by @vlroyrenn on that page. Those libraries should (hopefully) support prepared statements, although it’s possible that you’ll add some overhead from the fmeobjects module when transferring attributes back and forth.

However, you should be able to get improved performance by grouping the UPDATE statements into transaction blocks of e.g. between 100 to 1000 update statements before passing the statement to the SQLExecutor, as suggested by @ctredinnick. You could populate a text attribute with something like the following, before passing it on to the SQLExecutor:

FME_SQL_DELIMITER @

begin transaction;
update my_table set my_column = 'abc' where my_pk = 123;
update my_table set my_column = 'def' where my_pk = 456;
-- etc...
commit;

Note that you’ll have to set FME_SQL_DELIMITER to a character that doesn’t occur in your statement, to prevent FME from trying to split up the SQL before passing it onto the database.

Ultimately, for “regular” update statements, it might be (much) faster to use the Oracle writer / FeatureWriter with mode set to UPDATE, since those mechanisms use prepared statements and automatic transaction groups.

Userlevel 2
Badge +12

If you’re handy with Python, it might even be interesting to try out the route suggested by @vlroyrenn on that page. Those libraries should (hopefully) support prepared statements, although it’s possible that you’ll add some overhead from the fmeobjects module when transferring attributes back and forth.

 

For the record, the libraries I menionned in my last comment on that FME Idea are all low-level C/C++ native bindings that FME could use as a last resort if it’s not possible for them to properly support SQL prepared statement parameter binding, and even then only two (Postgres and MySQL) would actually support that. If you’re using Python, though, you have some options, like SQL string composition for psycopg, or statement generation for SQLAmchemy Core expressions, on top of my head.

 

Regarding @harish‘s original issue of SQL executor being slow for large batches of features, the approach I prefer to use in these cases is to write all the features as-is with a FeatureWriter to a “temporary table” (FeatureWriter supports writing up to 2 billion-ish features per transaction) and run a single SQL query to move and transform the data to the actual table I’m trying to write to. It’s rather wonky, since FME switches sessions between statements, meaning actual temporary tables can’t be used, and you need to create and delete your own manually.

Reply