Solved

Cluster points based on quantity or maximum distance


Badge

Hi Folks,

I have a set of 235 points that I need to cluster based on two parameters. I need to spatially optimize them into clusters of at most 5 points with a maximum distance of 15 km between them. Groups with fewer than 5 points are fine, as long as the maximum distance between them is still 15 km.

I am trying a NeighborFinder approach, but I am struggling with points that could belong to one or more clusters, and because of this I don't know how to optimize them.

Any ideas or tips?

Best answer by markatsafe 7 July 2020, 23:46

6 replies

Badge +2

@rodrigo_ferrao I don't think NeighborFinder will work for you in this case. FME doesn't really have a density clustering algorithm. I think you'll probably have to look for some 'R' or Python tools that would probably do this for you. Here's a nice summary of some of the clustering methods available in 'R' - it looks like the density clustering is what might work for you.

The tutorial RCaller: Ins and outs of using R in FME is a good starting point for working with R in FME.

I think if you included a sample dataset you might find some community members who would like to pick up the challenge...

Badge +22

Have a look at Helmoet's PointClusterer custom transformer. It takes the number of groups as a parameter, but it can probably be adapted to your requirements.

Badge

Thanks for that. I am reading and learning a lot, because R or Python clustering needs to be my next step, but I have not yet found a clustering method that takes a maximum number of points per cluster and/or a maximum distance, only a minimum number of points and/or a maximum distance. I got my result using NeighborFinder, but in a manual way: I cluster the first points, remove them from the analysis, then carry on with the remaining data in the next NeighborFinder. As I said, my dataset is small, so this is possible, but I needed to create more than 50 NeighborFinder transformers. The result is OK and I can continue this way; it's not the best fit, but it is acceptable.
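For readers following along, the manual procedure described above (cluster the first points, remove them, repeat on the rest) can be sketched in plain Python. This is only a minimal greedy sketch, not an FME or library algorithm: it assumes the points are (latitude, longitude) pairs in degrees, that "maximum distance" means every pair inside a cluster must be within 15 km, and the names `haversine_km` and `greedy_clusters` are made up for illustration.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def greedy_clusters(points, max_size=5, max_dist_km=15.0):
    """Seed a cluster, absorb the nearest free points, remove them, repeat."""
    remaining = list(range(len(points)))
    clusters = []
    while remaining:
        seed = remaining.pop(0)
        cluster = [seed]
        # Unassigned candidates, nearest to the seed first
        candidates = sorted(
            remaining, key=lambda i: haversine_km(points[seed], points[i])
        )
        for i in candidates:
            if len(cluster) == max_size:
                break
            # Assumption: a point joins only if it is within max_dist_km
            # of every point already in the cluster (pairwise limit)
            if all(haversine_km(points[i], points[j]) <= max_dist_km
                   for j in cluster):
                cluster.append(i)
        # Clustered points leave the analysis before the next pass
        remaining = [i for i in remaining if i not in cluster]
        clusters.append(cluster)
    return clusters

# Seven points roughly 0.1 km apart, plus one point far away
points = [(0.0, 0.001 * k) for k in range(7)] + [(1.0, 1.0)]
print(greedy_clusters(points))
```

Because the pass is greedy, the result depends on the order of the input points and is not guaranteed to be the best possible grouping; it only guarantees that each cluster respects both limits.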

 

My problem now is that I need to change the parameters, and it's not easy to change the parameters of 50 NeighborFinders. When I change the distance and the number of neighbors to find, the number of iterations changes too: in this case I will reduce the number of neighbors, so I am going to need more NeighborFinder iterations. My goal is to loop through a NeighborFinder until my data runs out, but it is proving nearly impossible. I built it in a way that looks perfect to me, and I got this message: "

f_12 (TransformFact): Custom transformer 'NEIBA' does not have a port named 'Input_2' which is suitable for parameter 'LOOPBACK_INPUT_TAG'

Custom transformer 'NEIBA' does not have a port named 'Input_2' which is suitable for parameter 'LOOPBACK_INPUT_TAG' "

I do have this port, but it's not published (because I need the attributes from the last iteration). If I make the port visible (Published), the process runs, but it never ends.

Because the custom transformer has to be set to Linked Always, I can't see where the problem is. I am very frustrated that I can't make a simple loop with a blocking transformer.

 

Badge +2

@rodrigo_ferrao Creating custom transformers that combine looping with blocking transformers like NeighborFinder is not easy. That is why I would suggest looking at other third-party options for a problem like this, which needs a recursive or iterative algorithm.

For NeighborFinder, you could use Published Parameters to set the values in the different NeighborFinders.
Badge

Yes, I did it with parameters, but if I want fewer points in each group I will need more iterations, which means more NeighborFinders (so I have to add more transformers, not just set values). A loop would solve this problem, but I don't know what is wrong. Can I post my loop .fmx here? Is that possible?
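As a rough check on why a smaller group size needs more passes: assuming each pass (one NeighborFinder in the manual chain) produces one cluster, the number of passes can never be lower than the point count divided by the maximum group size, rounded up. The helper name `min_passes` below is made up for illustration.

```python
import math

def min_passes(n_points, max_size):
    """Lower bound on the number of clusters when each holds at most max_size points."""
    return math.ceil(n_points / max_size)

# 235 points, as in the original question
for size in (5, 4, 3):
    print(size, min_passes(235, size))
```

For 235 points this gives at least 47 passes with groups of 5 and at least 59 with groups of 4. The real count is usually higher, because the 15 km limit leaves some groups only partly filled.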

Badge

I have JUST finished my loop! The problem was that I needed to use a VariableSetter before the loop exit and before looping back to the input. It's clear to me now and I am very happy with it. I love FME, hahaha! Thanks for the help. I will mark your first answer as the right one, because R and Python are the key to achieving a perfect clustering.

 

Thanks a lot for your time!!
