Hello everyone,

I have two datasets:

- One of lake polygons,

- Another containing the postal code polygons,

My goal is to calculate the total area occupied by lakes in each postal code. The workflow itself seems sound, because it runs fine on data samples (when I use an attribute filter on the postal codes to reduce the size of that dataset).

However, when I run it on the full dataset, the workflow always stops at entity number 104. I checked my input file for corrupted data, but that doesn't seem to be the case.

Any ideas?

Thanks!! :)

Théo

How does it stop? Does it just freeze, is there a crash, or is there an error or warning message in the log window?


From the workflow diagram I take it it stops after or at the ListSummer.

Could it be that Postal Code area 105 does not have more than 1 lake in it, causing no list to be created?

Try adding a Tester transformer before the ListSummer to check if the list actually exists.

Hope this helps.


OK, I see you say "without error" but does it freeze up? i.e. which button is active (run or stop) on the toolbar.


The workflow doesn't stop; it continues to perform spatial comparisons, but it takes hours and hours, whereas I can run 10,000 other entities (after the 105th) in less than 1 minute.

I am putting a screenshot of my log below.

Thank you

logWindow.PNG

 


No, FME does not freeze. There is no error in the log and nothing to suggest that the workflow has a problem.

PS: to answer your question, the active button is the stop button.


Hi @erik_jan,

Thank you for your reply! I put a Tester before the ListSummer, and the 105th postal code does not generate any list. However, the postal code in question contains many lakes.


In the case where no list is generated, you can test whether a lake area attribute exists. If it does, that value is the area of lakes in the postal code; otherwise no lake exists and the total area can be set to 0 (using the AttributeCreator).
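Outside FME, the same "sum the areas, default to 0" logic can be sketched in a few lines of Python. This is only an illustration of the idea, not how the ListSummer or AttributeCreator work internally; the function name and the sample data are hypothetical.

```python
def total_lake_area_by_postal_code(postal_codes, overlaps):
    """Sum overlap areas per postal code; codes with no lakes get 0.0."""
    # Start every postal code at 0 so codes without any lake still appear.
    totals = {code: 0.0 for code in postal_codes}
    for code, area in overlaps:
        totals[code] = totals.get(code, 0.0) + area
    return totals


# Hypothetical sample data: (postal_code, lake_overlap_area) pairs.
postal_codes = ["A", "B", "C"]
overlaps = [("A", 1.5), ("A", 2.5), ("C", 4.0)]

totals = total_lake_area_by_postal_code(postal_codes, overlaps)
print(totals)
```

Postal code "B" has no overlapping lake, so it keeps the default total of 0 instead of being dropped.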


I don't see it being a lack of list, because the ListSummer will just pass through features without the list (with a sum of 0).

But with the stop button active it does sound like the workspace is still running. How many features did you run it with in your test runs? The log says it has done 69,481 comparisons out of a total (if my math is correct) of 178 billion. You're also potentially creating that many list entries.
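To put those two numbers in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the figures quoted from the log (69,481 done, roughly 178 billion in total); the total is an estimate, so treat the result as an order of magnitude.

```python
# Figures quoted from the log; the total is an approximate estimate.
comparisons_done = 69_481
comparisons_total = 178_000_000_000

fraction = comparisons_done / comparisons_total
print(f"Fraction completed: {fraction:.2e}")
print(f"Percent completed:  {fraction * 100:.6f}%")
```

In other words, only a tiny fraction of the possible comparisons had been done at that point in the log.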

I mean, I'm always open to being wrong, but it could be that the count stops at 104 because the machine is too busy handling all that data to update it, or the first 104 features might have a tiny number of overlaps and 105 is really big and slows down the operation.

I can make a few suggestions that might help, or at least would help enough to show what is wrong:

  • Absolutely do not run this with feature caching turned on!
  • Keep a watch on system memory and CPU use to see if available resources get dangerously low.
  • Use a Generalizer transformer to clean up the incoming geometry by reducing the number of unnecessary vertices. That can make a huge difference.
  • Look out for log messages about memory usage and reorganizing memory.

Overall, if we can't get this working, it would be worth sending this in to the support team at Safe for evaluation (safe.com/support). Include your data and we can try and run the workspace at Safe creating a performance profile to tell us what is happening.
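For readers curious what "reducing unnecessary vertices" looks like in practice, here is a minimal Python sketch of Douglas-Peucker simplification, one common generalization algorithm. It is an assumption that this is representative of what you would configure in FME; the function names and sample line are hypothetical, and the sketch handles an open polyline only.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: drop vertices closer than tolerance to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical line with two nearly collinear vertices.
line = [(0, 0), (1, 0.01), (2, -0.01), (3, 0), (3, 3)]
simplified = simplify(line, tolerance=0.1)
print(simplified)
```

The two vertices that barely deviate from the straight edge are dropped while the corner is kept. Fewer vertices means fewer segment-to-segment tests in every spatial comparison.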


Another couple of thoughts: maybe try parallel processing (although it would mean creating a custom transformer). Also, you could try a different transformer (say a Clipper) to get the info as it might be faster than the SpatialRelator.

We do have a push for performance improvements for the SpatialRelator planned for FME2021. If you could file a case with support - and include your data - then we can make sure your specific scenario works with optimum performance. It always helps us to have realistic test data and, of course, the data would be kept private and not shared.


Many thanks !!

The problem was the size of the 105th postal code and the number of lakes inside it! However, my RAM and CPU were not maxed out. In the end, I solved the problem by reducing the number of vertices and by segmenting the largest postal codes. Thank you!
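For anyone landing here later, a rough Python sketch of why segmenting does not change the result: splitting a large area into tiles and summing the per-tile overlaps gives the same total as the single large comparison. Axis-aligned rectangles stand in for real polygons here, and the names and coordinates are hypothetical; the post above does not say exactly how the segmentation was done.

```python
def split_bbox(bbox, nx, ny):
    """Split (xmin, ymin, xmax, ymax) into nx * ny equal tiles."""
    xmin, ymin, xmax, ymax = bbox
    width = (xmax - xmin) / nx
    height = (ymax - ymin) / ny
    return [
        (xmin + i * width, ymin + j * height,
         xmin + (i + 1) * width, ymin + (j + 1) * height)
        for i in range(nx)
        for j in range(ny)
    ]


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


# Hypothetical stand-ins for one large postal code and one lake.
postal_bbox = (0.0, 0.0, 4.0, 2.0)
lake = (1.0, 0.5, 3.0, 1.5)

tiles = split_bbox(postal_bbox, 2, 2)
area_whole = overlap_area(postal_bbox, lake)
area_tiled = sum(overlap_area(tile, lake) for tile in tiles)
print(area_whole, area_tiled)
```

The per-tile areas still have to be summed back per postal code afterwards, so each tile needs to keep the postal code attribute of the polygon it came from.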

