Question

Finding nearest neighbour along a network

  • 4 February 2015
  • 7 replies
  • 14 views

Hi all

 

 

I have roughly 2000 base sites (water temperature monitoring sites, from Excel with an x and y coord) and I would like to identify the nearest candidate site (from several thousand flow guaging stations, from a Shape file but will also need to do with Excel with x/y coords) that it is connected to via a network (rivers, Shape file) where the path along the network between the two is less than 3km. To complicate matters, neither the base or candidate sites sit nicely on the river network, but need to be 'snapped' to it in some way.

 

 

I've had a look at the ShortestPathFinder but didn't really understand how to get it to generate the shortest path between every base and every candidate, and it wouldn't do exactly what I want anyway.

 

 

Does anyone have any suggestions?

 

 

Thanks a lot, Alice

7 replies

Badge +3
To find shortest path between every base and every candidate ( a cartesian product), you must make a "from to" set from these.

 

 

To use it on the network(s), you must use neighborfinder with insert vertex to create their "on network" counterparts.

 

 

Create cartesian product "from to" from these created vertices by unconditionaly merging them , then use it as input "From-to" for ShortestPathFinder.

 

(either create a common attribute like LinkAll= 1 and use FeatureMerger with process duplicates and explode duplicate list or use the UnconditionalMerger from the store)

 

 

 
Userlevel 2
Badge +17
Hi, Alice,

 

 

As Gio mentioned, you can use the FeatureMerger to create From-To lines for every combination of base and candidate.

 

But I'm worried that it consumes very long time and huge memory in your case, since there are 2000 bases and several thousands candidates.

 

2000 x 5000 = 10,000,000 !

 

 

If you can restrict maximum straight distance between base and candidate that should be found the shortest path, the NeighborFinder could be used to increase efficiency.

 

(1) Create points of all bases and candidates with the VertexCreator. (alternatively, you can also generate points from x and y values directly by the Excel reader parameter setting)

 

(2) Send base points to the Base port; send candidate points to the Candidate port of the NeighborFinder.

 

Maximum Distance: <maximum straight distance restriction>

 

Close Candidate List Name: <list attribute name. e.g. "_list">

 

(3) Use the ListExploder to explode Matched features (i.e. base features) on the list.

 

(4) Add a neighbor candidate point to each matched base feature with the VertexCreator, to transform them into From-To lines for the ShortestPathFinder input.

 

Mode: Add Point

 

X Value: closest_candidate_x

 

Y Value: closest_candidate_y

 

 

However, the river network should have node points that match with every base and candidate point (i.e. end nodes of From-To lines) when finding shortest paths using the ShortestPathFinder.

 

One possible way to make such a network is:

 

a. Before the step (2), snap all the base/candidate points to lines or existing nodes of the network. You can use another combination of NeighborFinder and VertexCreator to do that.

 

- Send all the points to the Base port; send river lines to the Candidate port.

 

- Set a certain maximum distance.

 

Every Matched feature (point) will have coordinate on the closest line as its attributes, named "_closest_candidate_x" and "_closest_candidate_y".

 

Then, move the point to the coordinate using the VertexCreator.

 

Mode: Replace with Point

 

X Value: _closest_candidate_x

 

Y Value: _closest_candidate_y

 

b. Send the points and the river lines to a PointOnLineOverlayer. The set of resulting lines will be a network that has the snapped base/candidate points as its nodes.

 

 

Hope this helps.

 

Takashi
Badge +3
several thousands can also be 2000 wich then would yeild 4.000.000 from-to connections....;)

 

 

 
Userlevel 2
Badge +17
hmm. I remember, when I learned about the word "several" in junior high, the English teacher said it usually indicates from 5 to 7, 8...

 

Natural language is more difficult than computer program language!

 

Anyway, it's better to avoid generating such many features as much as possible.

 

Badge +3
send the teacher back to school:

 

 

all dictionaries say:

 

 

sev·er·al(s?v??r-?l, s?v?r?l)

 

adj.

 

1. Being of a number more than two or three but not many

 

 

Number of points to proces being "many" depends on the proces to be dane as well as the power one has, ASkinner may have like...a stack of TESLA's in her system an huge memory..:

 

 
Thanks very much for all your help.  I have managed to solve the problem using various bits of what you said and some other things.  I've attached my workspace in case anyone is interested.  I've attached it as an image so don't know whether you'll be able to zoom in and view the detail.

 

 

Process as follows:

 

 

(1)  Use Densifier to create extra nodes on my river network to increase the precision of snapping

 

(2)  Use Anchored Snapper to snap both sets of monitoring sites to the river network

 

(3)  Use Point on Line Overlayer to cut river network up into lots of lines with points at ends/beginnings so works in Shortest Path Finder

 

(4)  Cut down number of potential site pairs by using Neighbor Finder and List Exploder.

 

(5) Create From-To paths for Shortest Path Finder using Vertex Creator on Closest-x and Closest-Y

 

(6) Shortest Path Calculator to identify sites that are connected via river network, Lenght Calculator to measure how long the paths are, and then a tester to identify those lines less than 1.5km.

 

Userlevel 2
Badge +17
Thanks for sharing your excellent solution.

 

 

Additional comment.

 

If every gap between monitoring point and river line is within a certain tolerance that can be resolved by the AnchoredSnapper, the snapping process might not be essential. Because both the PointOnLineOverlayer and the ShortestPathFinder have a parameter to control tolerance.

 

Have a look at these parameters.

 

PointOnLineOverlayer: Point Tolerance

 

ShortestPathFinder: SnappingTolerance

Reply