Optimizing a time-consuming workspace

Dear FME community,

I have created an FME workspace for processing cadastral data that does its job. However, it is rather slow and takes several days to complete. I want to briefly explain what the workspace does. Perhaps somebody here has an idea on how to optimize the workspace further.

The input consists of a total of 5 polygonal datasets. The first one represents cadastral parcels. The others are datasets of types of use, for example living areas, traffic areas and areas of economic use. The datasets all overlap, but the boundaries of the types of use need not coincide with the cadastral parcels, which means they can be smaller or overlap. The goal is to calculate, for each cadastral parcel, the areas of all overlapping types of use. The sum of these areas must be neither larger nor smaller than the official area of the parcel.

For example:

One parcel has an official total area of 1000 square metres (sqm). It has three overlapping types of use with a real (GIS) area of:

A) Traffic: 205 sqm

B) Living: 499 sqm

C) Economic: 303 sqm

The first task is to determine which types of use occur on a specific parcel. For that I use the AreaOnAreaOverlayer (AoA), with a list for the use types. This list is then "exploded" into features, so that for each parcel a new feature is created per type of use.

The second task is to adjust the areas of the types of use. In the example above, the sum of the areas must be exactly 1000 to match the parcel. Since I do not know how many types of use are in one parcel (list), I use several ExpressionEvaluators. Before that, I split the features with a TestFilter depending on the number of types of use per parcel. All this is static, so if, for example, up to ten types of use per parcel are possible, there have to be at least ten groups of ExpressionEvaluator transformers.

If it is of any interest I will upload the workspace. But perhaps somebody has an idea on how to optimize it, especially the time-consuming part with the AreaOnAreaOverlayer.
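The adjustment described above can be illustrated outside FME with a minimal Python sketch. It assumes the areas are scaled proportionally and that any rounding residue is assigned to the largest type of use; the post does not say which rule the workspace actually applies, so both are assumptions, and the function name `adjust_areas` is hypothetical. The point is only that one routine can handle any number of use types, without a separate branch per count.

```python
def adjust_areas(official_area, gis_areas):
    """Scale GIS areas so that they sum exactly to the official parcel area.

    Assumption (not from the original post): proportional scaling, with the
    rounding residue assigned to the largest type of use.
    """
    total = sum(gis_areas.values())
    if total <= 0:
        raise ValueError("parcel has no overlapping use area")
    factor = official_area / total
    # Scale every use type by the same factor and round to whole sqm.
    adjusted = {use: round(area * factor) for use, area in gis_areas.items()}
    # Rounding can leave the sum slightly off; correct it on the largest use type.
    largest = max(adjusted, key=adjusted.get)
    adjusted[largest] += official_area - sum(adjusted.values())
    return adjusted


# The example parcel from the post: GIS areas sum to 1007, official area is 1000.
example = adjust_areas(1000, {"Traffic": 205, "Living": 499, "Economic": 303})
```

The same call works unchanged for a parcel with one use type or with ten.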

Parallel processing wasn't of any help, because it requires a group by parameter which is only present in the cadastral parcels dataset, not in the others.

Kind regards

Thomas

The time consumption depends on how you relate the objects. And of course the size of the files matters... are you doing all of Holland? ;)

For a city the size of Leiden it takes like a couple of minutes to do that.

Yes, show us the workspace.

Hehe, no, it's just a federal state in Germany. But I've looked it up. It's approx. half the size of the Netherlands ;)

I will upload the workspace tomorrow and post the link here.

Kind regards

Thomas

Here's the workspace: https://www.sendspace.com/file/goxtz6

I know it's rather extensive and not self-explanatory. ;)

Kind regards

Thomas

Hi Thomas,

After looking at your workspace I conclude you are doing statistics on the parcels according to the type of land use.

You do not need to explode the objects after the AreaOnAreaOverlayer. It is this that results in a huge number of objects.

Also, Attribtuerenamer and attributerenamer_ do the same thing, so the processing time for one of them is wasted.

Remove the exploder. Combine tester_3 and tester_2 into one tester. Calculate areas and keep >0.2.

Then merge the Kadastralobjects with the area type data to link the Fleache to the Teilflaeche.

Now you can calculate all the statistics using StatisticsCalculator and "group by" keys.

So, you replace the aggregators with StatisticsCalculators grouped by the attributes you use in the aggregators.

This way you have no fan-out and fan-in (in your case 12).

You do not need to test how many types are in each parcel.

You can make the workbench much smaller and of course faster.
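As a rough illustration of why grouping removes the need for one branch per count of use types, here is a minimal Python sketch of a "group by" summation. It is not FME code and does not reproduce the StatisticsCalculator; the tuple layout `(parcel_id, use_type, area)` and the function name are hypothetical.

```python
from collections import defaultdict


def area_by_parcel_and_use(pieces):
    """Sum overlay piece areas per (parcel, use type) in a single pass.

    `pieces` is an iterable of (parcel_id, use_type, area) tuples, a
    hypothetical stand-in for the features leaving the overlay step.
    """
    totals = defaultdict(float)
    for parcel_id, use_type, area in pieces:
        # The group key plays the role of the "group by" attributes.
        totals[(parcel_id, use_type)] += area
    return dict(totals)


pieces = [
    ("P1", "Traffic", 100.0),
    ("P1", "Traffic", 105.0),
    ("P1", "Living", 499.0),
    ("P2", "Economic", 303.0),
]
totals = area_by_parcel_and_use(pieces)
```

The loop never needs to know how many use types a parcel has; each group simply accumulates whatever arrives for its key.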

You do not need to use ListSorter_2.

You can get the maximum Teilflaeche by using ListRangeExtractor.

This is much faster than sorting and picking indexnr{0}.
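The difference between the two approaches can be shown with a small Python analogy. This is only a sketch of the general idea, not a statement about FME internals; the function names and sample values are hypothetical.

```python
def largest_by_sort(areas):
    # Sorts the whole list just to read the first element: O(n log n).
    return sorted(areas, reverse=True)[0]


def largest_by_scan(areas):
    # A single pass that only tracks the maximum: O(n).
    return max(areas)


teilflaechen = [205.0, 499.0, 303.0]
```

Both functions return the same value; the scan simply avoids ordering elements that are never looked at.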

This is because you only need the areas (Fleache of the objects) to do statistics, so you do not need to explode anything.

Another tip: use AttributeRemover or AttributeKeeper. You don't need to drag along all the created attributes or list attributes; dropping them can save quite some processing time as well.

So you do not need the huge fanouts anymore.

Something like this: statistics and calculations all come after the merger.

No need for all the fanouts etc.

Correction: a little explosion is needed after all, for the multi-type landgebruik/landuse/landnutzung.

1:35 (min:sec) for the city of Leiden, kadaster to landuse statistics.

Hi Gio,

many thanks for your time and input! I have not had the time to test your recommendations, but they sound good and I will report back as soon as I have the chance to implement them.

Kind regards

Thomas
