
So I have a big job to process and I’m trying to work out some theoretical numbers for the client (a rough estimate of the processing time range, depending on various factors). I want to leverage the fmeworker limit of 16 parallel threads (assuming a machine with essentially unlimited resources), because the fme.exe limit is 8.

I have a batch job of many tiles with many rasters and features which all need to be processed at once. The easy thing to do is to just throw it at the WorkspaceRunner, but the license only gives me 7-8 parallel threads (which is probably more than enough, to be fair; most likely CPU will not be the bottleneck).

Anyway, for fun I ran a test to see what the limit was on the number of parallel groups you can send through a custom transformer with parallel processing turned on. I pretty quickly got the following error:
 

An error has occurred. Check the logfile above for details
f_6 (TransformFact): Transformer 'ThreadLimitTester': Unable to create worker; try reducing the parallelism level (currently 'EXTREME') or the number of groups (7663 seen so far, 1000 considered borderline, 10000 close to upper limit in best possible scenario)


I thought this was a nice error: “1000” considered borderline. The error occurred whether the transformer was set to process when the group changed or to wait until all features were received.

Has anyone run a job with tens or hundreds of thousands of groups? I assume each group will take 1-2 minutes to process, so the more parallel threads the better here.
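For a rough sense of scale, here is a back-of-envelope estimate using the numbers above (100,000 groups at ~1.5 minutes each, the midpoint of the 1-2 minute guess, with the 8- and 16-worker limits mentioned earlier). This is an ideal-case sketch that assumes perfect load balancing and no per-worker startup cost:

```python
def estimated_hours(n_groups: int, minutes_per_group: float, workers: int) -> float:
    """Ideal-case wall-clock estimate: total work divided evenly across workers."""
    return n_groups * minutes_per_group / workers / 60.0

if __name__ == "__main__":
    for workers in (8, 16):
        h = estimated_hours(100_000, 1.5, workers)
        print(f"{workers} workers: ~{h:,.0f} hours ({h / 24:.1f} days)")
```

Even in the best case that is on the order of a week of wall-clock time at 16 workers, which is why the thread count matters so much here.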

I can definitely design it in a way that uses a mixture of the WorkspaceRunner plus a custom transformer, but I don’t want to make it more complicated than it needs to be.

Unfortunately it seems like FME Server/FME Flow is not an option.

Interesting!

How have you got the custom transformer configured in terms of blocking? Are you waiting for all features to be received before running, or have you got it set to ‘When Groups Change’?



I tried it with both.

It could be that my synthetic setup was the issue, not sure. I just fed a Creator with 100,000 features into a transformer set to wait 5,000 seconds per feature, so it would never get through.

It seems there is never any waiting at all; it just tries to submit, submit, submit!

Most likely, if there were more upstream processing or some rate limiting before the features hit the custom transformer, the issue might go away. If I used a Decelerator to slow the arrival of features to roughly match the processing time, perhaps it would work.

As it is, I think I will try to use the WorkspaceRunner just to be on the safe side. I think I would still use the groups, but batch them so the total number of groups per workspace never exceeds 200 or so.
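The batching step described above is straightforward to sketch: split the full list of group IDs into chunks, with each chunk going to one child workspace launched via the WorkspaceRunner. The 200-per-workspace cap and the helper name are just illustrations of the approach:

```python
def chunk_groups(group_ids, max_per_workspace: int = 200):
    """Split group IDs into batches, one batch per child workspace,
    keeping the per-process group count well under the observed limit."""
    return [group_ids[i:i + max_per_workspace]
            for i in range(0, len(group_ids), max_per_workspace)]
```

With 100,000 groups and a cap of 200, that yields 500 child-workspace runs, each safely below the point where the “Unable to create worker” error appeared.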


I often run WorkspaceRunners above 1,000 and it’s completely fine! Sometimes I need a Decelerator at first if feature caching is set on the feature readers of the child workbench, to avoid clashing. I usually go for “Minimal” (when using the Data Interoperability extension, as it’s limited to 4 parallel workbenches). The WorkspaceRunner is set to “Wait for Job to Complete”, so I wonder if the limit has more to do with that setting being set to “No”.



It’s fine for child workspaces, for sure; the batching works differently and doesn’t require a port for inter-process communication (at least as I understand it). Communication between processes goes over ephemeral ports, and there are only about 15,000 or so of those, I think.
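The “about 15,000” figure is in the right ballpark. The IANA default dynamic/ephemeral range is 49152-65535, i.e. 16,384 ports; many Linux systems instead default to 32768-60999 (28,232 ports). Either way the pool is finite, which would cap how many port-based worker connections can exist at once:

```python
# IANA default dynamic/ephemeral port range (RFC 6335): 49152-65535.
IANA_RANGE = (49152, 65535)
ports_available = IANA_RANGE[1] - IANA_RANGE[0] + 1
print(ports_available)  # 16384
```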

But I do seem to remember running more like 20,000+ groups via custom transformer parallel processing in the past, though I might be wrong there.

