Question

How to match each point to its closest same-name neighbor in a second layer without reusing any id

3 years ago
November 10, 2021
4 replies
18 views

sani
12 replies

Hi, I'm a bit stuck and would need some help.

I have 2 point layers, shown in the image in different colors. All the points have the same name, 'iris', but their ids are different.

I need to match each point to its closest point with the same name in the other layer, but the ids shouldn't repeat.

For example, in the case below, iris id 11 and id 12 both have iris id 2 as their closest point. What I want as the output is this: if iris id 12 gets iris id 2 as its closest match, then iris id 11 should not consider iris id 2 and should instead check the next closest point that hasn't been matched with any other point yet, i.e. iris id 1.

Let me know if there are any possible solutions for it.

Thank you.


virtualcitymatt
Celebrity
1899 replies
3 years ago
November 10, 2021

Hmm, sounds really easy but seems to be pretty complicated when looking at how to do it.

The NeighborFinder will find the closest candidate for a given base feature, but it can match the same candidate to more than one base feature.

If you use the NeighborFinder to match ALL candidates to the base features, then you will have all the possible pairs: 11>1, 11>2, 12>1 and 12>2 in the example above. (You can use the list method to create a list of features and then explode the list to create a feature per pair.)

You can sort by distance (matched distance) and use a Sampler to sample the first feature per candidate. This kind of works, but you can end up with duplicate base features (essentially the reverse of the initial problem). This happens when a base feature is closer to two candidates than any other base is.
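To make the duplicate-base problem concrete, here is a minimal plain-Python sketch of the "all pairs, sort by distance, first feature per candidate" idea. It is not FME code; the function name and the coordinates are made up for illustration, and points are assumed to be simple (x, y) tuples keyed by id.

```python
import math
from itertools import product

def first_pair_per_candidate(bases, candidates):
    """Return {candidate_id: base_id}, keeping the shortest pair per candidate.

    bases and candidates are dicts of id -> (x, y). Hypothetical helper that
    mimics NeighborFinder (all pairs) + Sorter + Sampler (first per candidate).
    """
    # Build every base/candidate pair with its distance, shortest first.
    pairs = sorted(
        (math.dist(b_xy, c_xy), b_id, c_id)
        for (b_id, b_xy), (c_id, c_xy) in product(bases.items(), candidates.items())
    )
    sampled = {}
    for _, b_id, c_id in pairs:
        # Keep only the first (shortest) pair seen for each candidate.
        sampled.setdefault(c_id, b_id)
    return sampled

# Made-up coordinates: base 11 is closer to BOTH candidates than base 12 is.
bases = {11: (2.0, 0.0), 12: (10.0, 0.0)}
candidates = {1: (0.0, 0.0), 2: (4.0, 0.0)}
print(first_pair_per_candidate(bases, candidates))
```

With these coordinates, both candidates end up paired with base 11 and base 12 gets nothing, which is the reverse of the original problem.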

A loop seems to be the best solution here, where in each iteration you match just 1 candidate for each base, keep the shortest distance for each matched candidate (Sorter + Sampler), and then perform another comparison using the unmatched candidates and the duplicated/unsampled base features.

Repeat this until there are no more unsampled base features or no more unmatched candidates.
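The looping logic can be sketched in plain Python roughly as follows. This is only an illustration of the idea, not the FME workspace itself: the function name `match_unique` and the coordinates are assumptions, points are assumed to be (x, y) tuples keyed by id, and in practice you would run it once per 'name' group.

```python
import math

def match_unique(bases, candidates):
    """Greedy round-based matching; returns {base_id: candidate_id}.

    bases and candidates are dicts of id -> (x, y). Hypothetical helper.
    """
    open_bases = dict(bases)       # bases that still need a match
    free = dict(candidates)        # candidates nobody has claimed yet
    result = {}
    while open_bases and free:
        # Round step 1: every open base picks its nearest free candidate.
        best = {}                  # candidate_id -> (distance, base_id)
        for b_id, b_xy in open_bases.items():
            c_id = min(free, key=lambda c: math.dist(b_xy, free[c]))
            d = math.dist(b_xy, free[c_id])
            # Round step 2: per candidate, keep only the closest base.
            if c_id not in best or d < best[c_id][0]:
                best[c_id] = (d, b_id)
        # Round step 3: fix the winners; the losers go into the next round.
        for c_id, (_, b_id) in best.items():
            result[b_id] = c_id
            del open_bases[b_id]
            del free[c_id]
    return result

# Made-up coordinates: 11 and 12 are both closest to candidate 2.
bases = {11: (2.5, 0.0), 12: (4.5, 0.0)}
candidates = {1: (0.0, 0.0), 2: (4.0, 0.0)}
print(match_unique(bases, candidates))
```

Each round fixes at least one pair, so the loop always ends. Note that this is a greedy strategy: it gives unique matches, but not necessarily the smallest total distance over all pairs.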

Here's an example (you would need to use 'name' as the Group By):


virtualcitymatt
Celebrity
1899 replies
3 years ago
November 10, 2021

There could be an easier way, though. This seems overly complicated for such a simple-sounding problem.


danilo_fme
Evangelist
2059 replies
3 years ago
November 10, 2021

Nice job @virtualcitymatt


geomancer
Evangelist
890 replies
3 years ago
November 16, 2021

Unfortunately, this is not as easy as it looks at first sight.

This problem can be solved using the 'Hungarian Algorithm', also known as the 'Munkres Algorithm'.

Some nice implementations of the algorithm in Python as standalone scripts can be found here and here; a Python module can be found here.

So an easy solution may be to let Python do most of the work in FME.

Or you can implement one of the Python solutions in plain FME transformers. Beginning with the NeighborFinder, as @virtualcitymatt suggests, looks to be a promising start.
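For anyone who wants to try the Python route, below is a rough, self-contained sketch of the Hungarian algorithm applied to this kind of point matching. It is not one of the linked implementations; the function names and coordinates are made up, points are assumed to be (x, y) tuples keyed by id, and the sketch assumes there are at least as many candidates as base points. If SciPy is available, `scipy.optimize.linear_sum_assignment` solves the same assignment problem.

```python
import math

def min_cost_assignment(cost):
    """Hungarian algorithm; returns a list where entry i is the column for row i.

    Sketch only: requires len(rows) <= len(columns).
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("needs rows <= columns; swap the two layers")
    inf = float("inf")
    u = [0.0] * (n + 1)    # row potentials (1-based)
    v = [0.0] * (m + 1)    # column potentials (1-based)
    p = [0] * (m + 1)      # p[j] = row currently matched to column j, 0 if none
    way = [0] * (m + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:          # reached a free column: path is complete
                break
        while j0:                   # flip the matches along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    return assignment

def match_points(bases, candidates):
    """Return {base_id: candidate_id} with the smallest total distance."""
    b_ids, c_ids = list(bases), list(candidates)
    cost = [[math.dist(bases[b], candidates[c]) for c in c_ids] for b in b_ids]
    return {b_ids[i]: c_ids[j] for i, j in enumerate(min_cost_assignment(cost))}

# Made-up coordinates: 11 and 12 are both closest to candidate 2.
print(match_points({11: (2.5, 0.0), 12: (4.5, 0.0)}, {1: (0.0, 0.0), 2: (4.0, 0.0)}))
```

Unlike a greedy loop, this minimizes the sum of all matched distances, so a point may be given a slightly farther partner if that lowers the overall total. Inside FME, something like this would presumably live in a PythonCaller, run per 'name' group.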