Solved

Use the same set of features repeatedly during parallel processing

9 years ago
November 5, 2015
4 replies
9 views

chiron
6 replies

Simple setup:

A large number of Line features (millions) need to be overlaid with a set of Area features (25,000), so a simple LineOnAreaOverlayer is used. This process is very slow, and I am running into memory problems even with FME x64. I have an attribute on the Line features that can group them into about 250 groups, so I thought this was a case for parallel processing (with which I have limited experience).

My problem is with the Areas. I want to process each group of Line features with the same set of Area features, so I want to re-use the same set of Area features for every batch after the initial input. I cannot think of a good way to accomplish this, even with a Custom Transformer. The Area features do not have the group attribute, and for various reasons each batch of Line features must be intersected with all Area features.

The only (fairly inelegant) solution I could come up with is to cross join the Area features with the list of groups using a FeatureMerger, but that also explodes my Area features into the millions (25,000 areas × about 250 groups is roughly 6.25 million).

Any other ideas?

Regards



ryanatsafe
Safer
32 replies
Best Answer
9 years ago
November 5, 2015

Have you tried using the Clipper transformer instead of the LineOnAreaOverlayer? The Clipper has a `Clipper Type: Clippers First` option. If you can load the areas into the Clipper first (change the order of the readers in the Navigator to have the areas read first; if the lines and areas are both in the same dataset, try using 2 separate readers), then the memory overhead will be lower.
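
To illustrate why reading the clippers first can help, here is a minimal Python sketch, not FME code. It assumes 1D intervals as hypothetical stand-ins for geometry: the small set of areas is held in memory once, and the lines are streamed through one at a time, so memory scales with the areas rather than with the lines. The names `clip_stream`, `line_source`, `areas`, and `pieces` are illustrative only; how the Clipper manages memory internally is not described in this thread.

```python
from typing import Iterable, Iterator, List, Tuple

Interval = Tuple[float, float]  # (start, end), a 1D stand-in for a geometry


def clip_stream(areas: List[Interval], lines: Iterable[Interval]) -> Iterator[Interval]:
    """Yield the part of each line that falls inside each area.

    `areas` is held in memory (the "clippers first" idea); `lines` can be
    any iterable, including a generator, and is never stored in full.
    """
    for line_start, line_end in lines:
        for area_start, area_end in areas:
            lo = max(line_start, area_start)
            hi = min(line_end, area_end)
            if lo < hi:  # non-empty overlap
                yield (lo, hi)


def line_source(n: int) -> Iterator[Interval]:
    """Hypothetical line reader: produces lines lazily, one at a time."""
    for i in range(n):
        yield (float(i), float(i) + 1.5)


areas = [(0.0, 2.0), (3.0, 4.0)]  # small set, loaded once
pieces = list(clip_stream(areas, line_source(4)))
```

Because `line_source` is a generator, only one line exists in memory at any moment, while the areas stay resident for the whole run.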


chiron
Author
6 replies
9 years ago
November 7, 2015

Thanks, I will definitely give this a try and let you know. I did not think of this!

fmelizard
Safer
3725 replies
9 years ago
November 7, 2015

ryanatsafe wrote:

Have you tried using the Clipper transformer instead of the LineOnAreaOverlayer? The Clipper has a `Clipper Type: Clippers First` option. If you can load the areas into the Clipper first (change the order of the readers in the Navigator to have the areas read first; if the lines and areas are both in the same dataset, try using 2 separate readers), then the memory overhead will be lower.

I completely agree with @ryancragg's approach above and think it is a great way to solve the problem.

I wanted, however, to plant the idea that you could make a custom transformer with a FeatureReader in it to read the areas. That way, every time the custom transformer is fired up (in a parallel processing situation), the areas are read first and effectively a copy of them appears in the transformer.

If there were some way to query out your line features by "group" from their source (what format are they in?), then perhaps a second FeatureReader in the same custom transformer could read the right subset. You would then be able to do all the processing in parallel: the main workspace would just send in one feature per "group", which would trigger the custom transformer to read that group's lines.
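
A rough Python sketch of that pattern (not FME code): the main process hands out only group identifiers, and each worker reads the full area set plus its own group's lines. `read_areas`, `read_lines_for_group`, and the in-memory `LINE_STORE` are hypothetical stand-ins for a FeatureReader and the source dataset, 1D intervals stand in for geometry, and a thread pool stands in for FME's parallel processes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Interval = Tuple[float, float]

# Hypothetical source data, keyed by the group attribute on the lines.
LINE_STORE: Dict[str, List[Interval]] = {
    "A": [(0.0, 1.0), (1.5, 3.5)],
    "B": [(2.5, 5.0)],
}


def read_areas() -> List[Interval]:
    """Stand-in for a FeatureReader that reads every area."""
    return [(0.0, 2.0), (3.0, 4.0)]


def read_lines_for_group(group: str) -> List[Interval]:
    """Stand-in for a FeatureReader that queries one group of lines."""
    return LINE_STORE[group]


def process_group(group: str) -> Tuple[str, int]:
    """Each worker loads its own copy of the areas, then its own lines."""
    areas = read_areas()
    overlaps = 0
    for line_start, line_end in read_lines_for_group(group):
        for area_start, area_end in areas:
            if max(line_start, area_start) < min(line_end, area_end):
                overlaps += 1
    return group, overlaps


# The "main workspace" only sends one item per group.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = dict(pool.map(process_group, sorted(LINE_STORE)))
```

The trade-off in this design is that the areas are read once per group rather than once overall, in exchange for never having to attach a group attribute to them.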

Could be an interesting scenario to mock up anyway.


chiron
Author
6 replies
9 years ago
November 8, 2015

Thanks for the help. I never clearly understood the implications of the Clippers First option in conjunction with Group By. But this can of course be set up to clip features in batches with the same set of areas, as you suggested. And by collecting the features from both the Inside and Outside ports I can mimic the LineOnAreaOverlayer I had initially. I ran my workspace, and 24 hours later I am happy. Clipping took about 4 hours instead of bombing out after 12 with the LineOnAreaOverlayer. Thanks 1 000 000.
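
The point about combining the Inside and Outside ports can be sketched in plain Python (not FME code), with 1D intervals as hypothetical stand-ins for lines and areas, and assuming the areas do not overlap each other. `split_line` is an illustrative name: it cuts one line at every area boundary and labels each piece, so the inside and outside pieces together cover the original line, much as an overlay would.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def split_line(line: Interval, areas: List[Interval]) -> Tuple[List[Interval], List[Interval]]:
    """Return (inside, outside) pieces of `line` relative to `areas`.

    Assumes the areas are non-overlapping.
    """
    start, end = line
    inside: List[Interval] = []
    outside: List[Interval] = []
    cursor = start
    for area_start, area_end in sorted(areas):
        lo, hi = max(start, area_start), min(end, area_end)
        if lo >= hi:
            continue  # this area does not overlap the line
        if cursor < lo:
            outside.append((cursor, lo))  # gap before the area
        inside.append((lo, hi))
        cursor = hi
    if cursor < end:
        outside.append((cursor, end))  # remainder after the last area
    return inside, outside


inside, outside = split_line((1.0, 5.0), [(0.0, 2.0), (3.0, 4.0)])
all_pieces = sorted(inside + outside)
```

Sorting the combined pieces gives back a contiguous run of segments from the start of the line to its end, which is the sense in which the two ports together reproduce the overlay output.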
