Solved

Can FME Server handle a large batch of workspace runs? (around 1,000,000 runs)

2 years ago
November 14, 2022
5 replies
7 views

amooo
2 replies

Hello everyone,

Let's say I need to run the same workspace 1,000,000 times (once for each object in my input). The workspace contains some Tester, filter, and geometry transformers.

I plan to use the FME REST API to launch the jobs, in batches of 10,000 jobs (so 100 batches in total).

Is this a good way to do it?

I would be happy to read your advice!

Best answer by david_r

I'm fairly certain that it's doable, but I'm less certain that it's the optimal solution: there's a bit of overhead each time a workspace is started, and although it's short, if you multiply it by a million it's going to be noticeable. Is it not possible to process all the objects in the same workspace using e.g. Group By processing?
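To illustrate the overhead point, here is a minimal Python sketch of submitting one job per chunk of object ids instead of one job per object. It assumes the FME Server REST API v3 submit endpoint and `fmetoken` authorization header; the host, repository, workspace name, and the `OBJECT_IDS` published parameter are hypothetical, and the workspace would need to be authored to accept a comma-separated list of ids. The sketch only builds the requests and does not send them.

```python
import json
import urllib.request
from itertools import islice


def batched(ids, size):
    """Yield successive lists of at most `size` ids."""
    it = iter(ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_submit_request(base_url, token, repository, workspace, object_ids):
    """Build (but do not send) one job submission covering many objects.

    Assumption: the workspace exposes a published parameter named
    OBJECT_IDS (hypothetical) that takes a comma-separated id list.
    """
    url = f"{base_url}/fmerest/v3/transformations/submit/{repository}/{workspace}"
    body = {
        "publishedParameters": [
            {"name": "OBJECT_IDS", "value": ",".join(map(str, object_ids))}
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"fmetoken token={token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# 1,000,000 objects in chunks of 10,000 gives 100 jobs instead of 1,000,000.
requests_to_send = [
    build_submit_request(
        "https://fme.example.com",  # hypothetical host
        "my-token",                 # hypothetical token
        "MyRepository",             # hypothetical repository
        "check_rectangles.fmw",     # hypothetical workspace
        chunk,
    )
    for chunk in batched(range(1, 1_000_001), 10_000)
]
```

Each request could then be sent with `urllib.request.urlopen(request)`. With this shape the startup overhead is paid 100 times rather than a million times.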


david_r
8342 replies
Best Answer
2 years ago
November 14, 2022

amooo
Author
2 replies
2 years ago
November 14, 2022

Hi, thank you for your response!

You are absolutely right. I noticed a bit of overhead between jobs (varying from ~200 ms to 2 or 3 s), which is why I was wondering about the right way to do it.

I didn't know about Group By processing. I will explore this option and run some tests!

I want to get the most out of the two engines I have.

What is your opinion on launching the workspace with Group By processing twice: the first run with the first half of my groups and the second run with the second half?


tomfriedl
Contributor
175 replies
2 years ago
November 14, 2022

There is a workflow problem:

1,000,000 workspace runs, each taking at least 1 second.

That's about 11.5 days.

Tell us more about your workspace.

https://community.safe.com/general-10/fme-form-starter-20721
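The arithmetic behind that estimate can be checked with a short Python sketch. The per-job times are the figures mentioned in this thread; the assumption that two engines split the work perfectly evenly is an idealization, not a measured result.

```python
SECONDS_PER_DAY = 86_400


def total_days(jobs, seconds_per_job, engines=1):
    """Wall-clock days to run `jobs` jobs, assuming the engines
    share the queue evenly and each job takes `seconds_per_job`."""
    return jobs * seconds_per_job / engines / SECONDS_PER_DAY


for seconds in (0.2, 1.0, 3.0):
    one = total_days(1_000_000, seconds)
    two = total_days(1_000_000, seconds, engines=2)
    print(f"{seconds:>4} s/job: {one:6.2f} days on 1 engine, {two:6.2f} days on 2 engines")
```

At 1 second per job, 1,000,000 / 86,400 is roughly 11.57 days on a single engine, so even halving it with a second engine leaves several days of pure job time.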

amooo
Author
2 replies
2 years ago
November 14, 2022

tomfriedl wrote:

There is a workflow problem:

1,000,000 workspace runs, each taking at least 1 second.

That's about 11.5 days.

Tell us more about your workspace.

Yes, that is more or less the time it takes!

The idea of the workspace is to take 1,000,000 rows from a table in a PostGIS database and, for each row, check whether its geometry is close to a rectangle.

With my team, we tried another approach that relies more on the database (using an SQLCreator transformer) with a GROUP BY instead of reading one row per execution, and we are down to 5 hours (instead of ~14 days).

Thank you so much for your answers!


hkingsbury
Celebrity
1497 replies
2 years ago
November 14, 2022

amooo wrote:

Yes, that is more or less the time it takes!

The idea of the workspace is to take 1,000,000 rows from a table in a PostGIS database and, for each row, check whether its geometry is close to a rectangle.

Thank you so much for your answers!

Based on this, you shouldn't even need Group By. Bear in mind that in a workspace each feature is a row, so when you run a test, you're only considering that single feature in isolation.

Without knowing your data or the wider requirements, you should be able to achieve that with a LineCloser, CircularityCalculator and Tester.
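To show why no grouping is needed, here is a rough Python sketch of a per-feature check where each geometry is scored on its own. It is not what the FME transformers above compute: the score used here (polygon area divided by the area of its axis-aligned bounding box) and the 0.95 threshold are assumptions chosen for illustration, and a rotated rectangle would score low unless a minimum rotated bounding rectangle were used instead.

```python
def polygon_area(coords):
    """Shoelace area of a ring given as (x, y) tuples.
    The ring is closed implicitly, so the first point need not be repeated."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def rectangularity(coords):
    """Ratio of polygon area to axis-aligned bounding-box area (0.0 to 1.0)."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    bbox_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    if bbox_area == 0:
        return 0.0  # degenerate geometry (point or straight line)
    return polygon_area(coords) / bbox_area


def is_near_rectangle(coords, threshold=0.95):
    """Per-feature test: only this one geometry is considered."""
    return rectangularity(coords) >= threshold


# Each row is evaluated in isolation, in a single pass over the data.
rows = {
    "a": [(0, 0), (2, 0), (2, 1), (0, 1)],  # rectangle
    "b": [(0, 0), (2, 0), (0, 2)],          # triangle
}
results = {row_id: is_near_rectangle(ring) for row_id, ring in rows.items()}
```

Because the score depends only on one geometry, all rows can stream through a single run with no Group By and no per-row job.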
