See https://gis.stackexchange.com/questions/123289/grouping-village-points-based-on-distance-and-population-size-using-arcgis-deskt for a grouping algorithm.
While it is possible to implement in native FME, I would probably use standard transformers to generate the connected component graph and then do the iteration in a pythonCaller, assigning a "Group ID" to each feature, which you can then Aggregate or Dissolve with a group-by as desired.
@johnwk
Do you have us a sample dataset?
Do the polygon require spatial relation?
But I think you are looking at searching a subset (the solution(s)) out of subsets out of a power set.
Iteration is indeed required (or if you use python or tcl, recursion. Testing elements of a power set during superset creation is relative easy to do for the iterative mode. Not succeeded doing that to a recursive version.)
There may or may not be a total (including all polygons) solution.
Here is a piece of a prior script (its non recursive) I made after some hours of fiddling and reading up on.
It queries within the power set building and only outputs the sets meeting the condition. (don't output them all...this can be huge 2^n)
You need the ID and the population number. Turn your data to a list.
Replace the query as instructed.
SupersetQuery.fmw
This question is more common than you'd think, and there isn't really a great way to do this in FME other than looping. But looping is difficult where you are working on groups of features. One way I've found is to take two adjacent features, merge them if it's OK (their population is OK to do so), then write the data back out. Then re-run the workspace again - and again - until there is a solution (no more polygons can be merged).
The other idea I have is the SpatialSorter transformer from the FME Hub. You could use this to sort data spatially, so that it is pretty much ordered so that each feature is adjacent to the next. This would make the above part easier because you don't have to worry about whether features are spatially adjacent. You just need to take the first two features and assume they are connected.
So once the data is run through a SpatialSorter you could use the Adjacent Feature Attributes setting in the AttributeManager to get the next feature's population, then decide whether to merge it or not.
If you don't care whether the polygons are next to each other or not, then you could simply ignore the SpatialSorter and just go with the Adjacent Feature Attributes.
One other way is to merge ALL of the features together with an Aggregator, creating a list of records. Then you can loop through the list, extracting out values and merging them. It would be an easier method of looping because you're only dealing with one feature, not a group of features.
I hope this helps. I know this is not the easiest task to do in FME, but I do think that it's possible.
So I wanted to check into this a bit more, and found that it's probably what is known as a Bin Packing Problem. You want to fill a bin with features until it reaches a set limit, and then start packing into a new bin.
It turns out that this is quite a challenge. Basically you can check a solution easily but to calculate that solution takes way more time (like the ShortestPathFinder does) because it's an NP problem.
Still, it's possible and there is some code about if you search for Bin Packing Problem Python. For example this page has a piece of code that I ran in FME with success (albeit generating random numbers rather than using incoming FME features).
The other search would be for 2D Bin Packing, which returns a few Github repositories like this. This may or may not help. It depends on whether the shapes being packed are touching each other (which you would want in a spatially based setup).
Anyway, I hope this helps. I'm going to check with our developers and see if we can produce something in the future. It won't help for now, but this issue comes up often enough that I think we need a solution.
So I wanted to check into this a bit more, and found that it's probably what is known as a Bin Packing Problem. You want to fill a bin with features until it reaches a set limit, and then start packing into a new bin.
It turns out that this is quite a challenge. Basically you can check a solution easily but to calculate that solution takes way more time (like the ShortestPathFinder does) because it's an NP problem.
Still, it's possible and there is some code about if you search for Bin Packing Problem Python. For example this page has a piece of code that I ran in FME with success (albeit generating random numbers rather than using incoming FME features).
The other search would be for 2D Bin Packing, which returns a few Github repositories like this. This may or may not help. It depends on whether the shapes being packed are touching each other (which you would want in a spatially based setup).
Anyway, I hope this helps. I'm going to check with our developers and see if we can produce something in the future. It won't help for now, but this issue comes up often enough that I think we need a solution.
Hi @Mark2AtSafe, thanks so much for looking into this in detail for me, it's very much appreciated. Your bin packing approach makes absolute logical sense and is a term i've now used to describe to my client what the problem is! As for the approach itself, I have never used Python before so I guess i'll have to teach myself!
@johnwk
Do you have us a sample dataset?
Do the polygon require spatial relation?
But I think you are looking at searching a subset (the solution(s)) out of subsets out of a power set.
Iteration is indeed required (or if you use python or tcl, recursion. Testing elements of a power set during superset creation is relative easy to do for the iterative mode. Not succeeded doing that to a recursive version.)
There may or may not be a total (including all polygons) solution.
Here is a piece of a prior script (its non recursive) I made after some hours of fiddling and reading up on.
It queries within the power set building and only outputs the sets meeting the condition. (don't output them all...this can be huge 2^n)
You need the ID and the population number. Turn your data to a list.
Replace the query as instructed.
SupersetQuery.fmw
Hi @gio many thanks for your answer. Unfortunately, I cannot access the link you have provided, it has come up with an error. Please can you reupload, as this solution sounds very interesting!
See https://gis.stackexchange.com/questions/123289/grouping-village-points-based-on-distance-and-population-size-using-arcgis-deskt for a grouping algorithm.
While it is possible to implement in native FME, I would probably use standard transformers to generate the connected component graph and then do the iteration in a pythonCaller, assigning a "Group ID" to each feature, which you can then Aggregate or Dissolve with a group-by as desired.
Thanks @jdh for your answer. As I have never used Python before, i will have to spend some time getting to know that.