Solved

Does FME Server handle a large batch of workspaces? (around 1 000 000 workspaces)

1 year ago
14 November 2022
5 replies
2 views

amooo
2 replies

Hello everyone,

Let's say that I need to launch the same workspace 1 000 000 times (once for each object I have as input). This workspace would contain some Tester, filter, and geometry transformers.

I plan to use the FME REST API to launch my jobs. I will launch them in batches of 10 000 jobs (so 100 batches in total).

Is this a good way to do it?

I would be happy to read your advice!
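
For illustration, the batching plan described above (100 batches of 10 000 jobs) could be sketched roughly as follows. This is only a sketch under assumptions: the `/fmerest/v3/transformations/submit/{repository}/{workspace}` path and the `fmetoken` header follow the FME Server REST API V3 as I understand it and should be checked against your server's own API documentation, while the host, repository, workspace, and the `OBJECT_ID` published parameter are hypothetical names. The requests are only built here, not sent.

```python
import json
import urllib.request
from itertools import islice


def batched(ids, size):
    """Yield successive lists of at most `size` items from `ids`."""
    it = iter(ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_submit_request(base_url, repository, workspace, token, object_id):
    """Build (but do not send) one job submission request.

    Assumption: endpoint path and auth header follow REST API V3;
    OBJECT_ID is a hypothetical published parameter of the workspace.
    """
    url = f"{base_url}/fmerest/v3/transformations/submit/{repository}/{workspace}"
    body = {"publishedParameters": [{"name": "OBJECT_ID", "value": str(object_id)}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"fmetoken token={token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# 1 000 000 object ids split into batches of 10 000.
batches = list(batched(range(1, 1_000_001), 10_000))
print(len(batches), "batches of", len(batches[0]), "jobs")
```

Sending each request (for example with `urllib.request.urlopen`) still queues one job per object, so the per-job startup overhead is paid every time.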

Best answer by david_r 14 November 2022, 12:53

Userlevel 4

david_r
8177 replies
1 year ago
14 November 2022
Best Answer

I'm fairly certain that it's doable, but I'm less certain that it's the optimal solution: there's a bit of overhead each time a workspace is started, and although it's short, if you multiply it by a million it's going to be noticeable. Is it not possible to process all the objects in the same workspace using e.g. Group By processing?

Hi, thank you for your response!

You are absolutely right. I noticed that there was a bit of overhead (varying from ~200 ms to 2 or 3 s between jobs), which is why I was wondering about the right way to do it.

I didn't know about Group By processing. I will explore this option and run some tests!

I wanted to get the most out of the two engines I have.

What is your opinion on launching the workspace with Group By processing twice: the first time with the first half of my groups and the second time with the second half?
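
As a rough sketch of the "two halves" idea, the key range can be split into one contiguous slice per engine, and each slice passed to its own job as a filter. The `id` column name and the `BETWEEN` filter are hypothetical; this assumes the rows carry a dense integer key.

```python
def split_range(first_id, last_id, parts):
    """Split the inclusive range [first_id, last_id] into `parts` contiguous ranges."""
    total = last_id - first_id + 1
    base, extra = divmod(total, parts)
    ranges = []
    start = first_id
    for i in range(parts):
        # The first `extra` ranges get one additional id each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# One slice per engine (two engines in this thread).
halves = split_range(1, 1_000_000, 2)
# Hypothetical filter each job could receive as a published parameter.
where_clauses = [f"id BETWEEN {lo} AND {hi}" for lo, hi in halves]
print(where_clauses)
```

With two jobs instead of a million, the startup overhead is paid only twice, and both engines stay busy.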

Userlevel 2

+12

tomfriedl
Contributor
152 replies
1 year ago
14 November 2022

There is a workflow problem:

1 000 000 workbench runs, each running for at least 1 second:

that's about 11.5 days.

Tell us more about your workbench.
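
The estimate above is plain arithmetic and can be reproduced for the per-job overheads mentioned earlier in the thread (~200 ms up to 3 s). This sketch assumes that jobs are spread evenly over the engines and that overhead is the only cost; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_days(jobs, seconds_per_job, engines=1):
    """Wall-clock days needed if jobs are spread evenly over `engines`."""
    return jobs * seconds_per_job / engines / SECONDS_PER_DAY


for overhead in (0.2, 1.0, 3.0):
    one = total_days(1_000_000, overhead)
    two = total_days(1_000_000, overhead, engines=2)
    print(f"{overhead:>4} s/job: {one:5.1f} days on 1 engine, {two:5.1f} days on 2 engines")
```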

Yes, that is more or less the time it takes!

The idea of the workbench is to take 1 000 000 rows of a table in a PostGIS database and, for each row, check whether its geometry is close to a rectangle.

With my team, we tried another approach that relies more on the database (using the SQLCreator transformer) with a GROUP BY instead of reading one row per execution, and we are down to 5 hours (instead of ~14 days).

Thank you so much for your answers!

Userlevel 5

+29

hkingsbury
Celebrity
1109 replies
1 year ago
14 November 2022

Based on this, you shouldn't even need Group By. Bear in mind that in a workspace each feature is a row, so when you run a test you're only considering that single feature in isolation.

Without knowing your data or the wider requirements, you should be able to achieve that with a LineCloser, a CircularityCalculator and a Tester.
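
To make the "each feature in isolation" point concrete, here is a small standalone sketch of a per-feature rectangle test. It is not what the FME transformers compute: it swaps in a simple rectangularity ratio (polygon area divided by the area of the smallest bounding box aligned with one of the polygon's own edges), and the 0.95 threshold is an arbitrary assumption.

```python
import math


def polygon_area(points):
    """Unsigned area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def min_edge_aligned_box_area(points):
    """Smallest bounding-box area over boxes aligned with a polygon edge."""
    best = math.inf
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        length = math.hypot(x2 - x1, y2 - y1)
        if length == 0:
            continue  # skip repeated vertices (e.g. a closing point)
        ux, uy = (x2 - x1) / length, (y2 - y1) / length
        # Project every vertex onto the edge direction and onto its normal.
        along = [x * ux + y * uy for x, y in points]
        across = [-x * uy + y * ux for x, y in points]
        area = (max(along) - min(along)) * (max(across) - min(across))
        best = min(best, area)
    return best


def rectangularity(points):
    """Ratio in [0, 1]; 1.0 means the polygon fills its bounding box."""
    box = min_edge_aligned_box_area(points)
    if not math.isfinite(box) or box == 0:
        return 0.0  # degenerate geometry
    return polygon_area(points) / box


def is_close_to_rectangle(points, threshold=0.95):
    """Per-feature test; the threshold is an arbitrary assumption."""
    return rectangularity(points) >= threshold
```

Each call looks at one geometry only, which mirrors how a Tester evaluates one feature at a time, so a single workspace run can stream all rows through the same check.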
