Solved

Parallel processing with sql executor

6 years ago
July 19, 2018
1 reply
91 views

+12

tva
Contributor
44 replies

Hi,

I have a workspace that reads some data and, after some processing, writes it to a PostGIS database via a FeatureWriter. Afterwards, more SQL scripts are triggered via SQLExecutors. At the end, all the data is read again for final writing via FME.

The problem is that the run time increases sharply as the amount of processed data grows (from a few hours up to a few days).

In the processing part I can split my data into smaller parts and write them to multiple schemas in PostGIS. The problem is that I can't run the next SQL scripts in parallel on the X schemas that were created. If I trigger every script X times, the runs are executed sequentially and the total time stays the same.

If I put the longest script into a separate workspace called by a WorkspaceRunner, I can only run it in parallel when I tell it not to wait for the job to complete. But then I have an issue with the last step (combining all data from the X schemas into one): the table does not exist yet (or doesn't contain data), because the WorkspaceRunner job hasn't actually finished.

Does anyone have an idea how to trigger these scripts in parallel and still wait until those parallel jobs have finished before continuing with the last part?

The SQLExecutor does not have a Group By option.
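For readers outside FME, the pattern being asked for (start one job per schema, then block until all of them are done before the combine step) can be sketched generically in Python. This is only an illustration under assumptions: `run_script_on_schema` is a hypothetical placeholder for the real SQL call, and the schema names are made up.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_script_on_schema(schema):
    """Hypothetical placeholder: run the long SQL script against one schema."""
    # In a real setup this would open its own database connection
    # and execute the script for the given schema.
    return f"{schema}: done"


def run_all_then_combine(schemas, job=run_script_on_schema, max_workers=4):
    """Run one job per schema in parallel and return only when all have finished."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(job, schema) for schema in schemas]
        wait(futures)  # barrier: block until every job has completed
    # result() re-raises any exception from a failed job, so the
    # combine step never runs on half-finished data.
    return [future.result() for future in futures]


if __name__ == "__main__":
    print(run_all_then_combine(["part_1", "part_2", "part_3"]))
```

The results come back in the same order as the input schemas, so the combine step can rely on that order.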

Best answer by tva

OK, I found the answer in the documentation on parallel processing.

It seems it's possible to create a custom transformer and specify, in its advanced parameters, which attributes to parallelize on and how aggressively to parallelize. I'm now experimenting with the different settings.
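The idea behind "parallelize by attribute" can be illustrated outside FME with a small Python sketch: features are grouped by an attribute value, each group is processed independently, and the number of workers caps how aggressive the parallelism is. This is not FME's implementation; the record layout, the `schema` and `value` keys, and the summing step are all assumptions made for the example, and threads are used here only to keep the sketch short.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by(records, key):
    """Split records into groups that share the same value for `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


def process_group(item):
    """Hypothetical per-group work: sum the 'value' field of one group."""
    name, rows = item
    return name, sum(row["value"] for row in rows)


def process_in_parallel(records, key, max_workers=2):
    """Process each group concurrently; max_workers limits the parallelism."""
    groups = group_by(records, key)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() only yields all results once every group has been processed.
        return dict(pool.map(process_group, groups.items()))


if __name__ == "__main__":
    data = [
        {"schema": "a", "value": 1},
        {"schema": "b", "value": 2},
        {"schema": "a", "value": 3},
    ]
    print(process_in_parallel(data, "schema"))
```

Because each group is handled on its own, the grouping attribute should be chosen so that no group needs data from another group.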

I did have to adjust the settings of those parameters, as it wouldn't read the initial setup.

Just by using moderate parallelization, I was able to cut the run time from 5 hours to 40 minutes (this depends on the number of cores the machine has), which is significantly better.


+12

tva
Author
Contributor
44 replies
Best Answer
6 years ago
July 20, 2018
