Solved

Calculating a distance matrix


Badge

Hi all,

So, I've got this small data set of 12 locations in the Netherlands.

 

 

I would like to create a distance matrix that tells me what the distance (as the crow flies) is between each point.

 

I figured that I could use FME for this.

 

I think I managed to get quite far, but now I'm stuck and in need of some assistance :-(.

So what did I do so far?

 

 

I first created a workspace that reads the data set and then uses the FromToBuilder transformer in combination with an AttributeManager transformer to determine distance from one manually selected point to all other points. A csv-writer writes the resulting table to a csv-file.

 

 

Secondly, I added a new published parameter in combination with the Workspace Runner transformer to run through the 12 points. So that I don't have to run the workspace 12 times and manually assign a point to the to-port of the from-to transformer.

 

 

This generates a csv-file of 12x12=144 features.

 

 

 

I'm no stuck figuring out how to convert the 1x144 table to a 12x12 matrix. With attributenames corresponding to the ID's of the 12 points in the dataset.

 

 

Any ideas how I should proceed? Or should I start over and follow a different path?

 

 

Help will be greatly appreciated

 

Many thanks in advance

 

 

Rolf

 

 

P.S. coord. syst is 'Netherlands-RDNew-2008'

 

 

icon

Best answer by redgeographics 20 May 2019, 16:23

View original

11 replies

Userlevel 4
Badge +25

I think you're thinking waaay too complex here. If you set up a NeighborFinder and feed it your points as both base and candidate features you can have it generate a list of all points and distances. You can then either manually create the attributes for the distances and fill them with the list values (which is a bit of a hassle with 12, but not too tricky) or for example concatenate the list

Badge +10

I agree with @redgeographics - the neighbour finder is the way to go here

matrix.fmwt

Badge

I think you're thinking waaay too complex here. If you set up a NeighborFinder and feed it your points as both base and candidate features you can have it generate a list of all points and distances. You can then either manually create the attributes for the distances and fill them with the list values (which is a bit of a hassle with 12, but not too tricky) or for example concatenate the list

Oh sh*t, I didn't think about NeighborFinder combined with lists. That makes life indeed a lot simpler. Thanks

Badge

I agree with @redgeographics - the neighbour finder is the way to go here

matrix.fmwt

YESSS, this really helps. It even allows me to expand to large data sets. Many thanks @egomm

Badge +10

YESSS, this really helps. It even allows me to expand to large data sets. Many thanks @egomm

The tricky bit really is knowing what attributes to expose, I've had some success using python to create a schema to write out dynamically where the ID's aren't known at run time

Userlevel 2
Badge +17

FYI, here is another approach with cross join: distance-matrix.fmwt (FME 2019.0.1)

Badge +3

Using a neighborfinder in this case seems a waste of processor time.

It's just a Cartesian product. ( as @rhcb and @takashi both remark)

So a unconditional merger would do. Attributecreator to calc the distance between points.(basic calculus)

(zero distance etc., yes discard, its the diagonal)To fascilitate your matrix view , id the objects with a counter and use that as mergkey)

Badge

FYI, here is another approach with cross join: distance-matrix.fmwt (FME 2019.0.1)

Hi Takashi,

 

That's also a very interesting and clever approach. With a much larger point dataset I'm testing both approaches, to see which one is fastest.
Badge +10

A python solution just to add to the list :-)

distance_matrix_python.fmwt

 

Badge

A python solution just to add to the list :-)

distance_matrix_python.fmwt

 

This answer helps me but I have a question, can you help me with this code to get distance matrix when my input are id,x,y, length and I don't need to calculate distance, just write length to the right place based on ID.

Badge +3

FYI, here is another approach with cross join: distance-matrix.fmwt (FME 2019.0.1)

In principle the basic approach outlined by @takashi and @gio is what is generally followed by using Cartesians/Cross Joins. To reword the question, it is asking "What is the Isochrone Chart of these N Nodes if each Node connects to each of the (N-1) others?". There are lots of examples online on how to do this for Isochrones using Network/Graphing techniques, but in SQL it is just:

SELECT A.NodeID AS OriginNode, B.NodeID AS DestinationNode, (B.XCoord-A.XCoord)*(B.XCoord-A.XCoord)+(B.YCoord-A.YCoord)*(B.YCoord-A.YCoord) AS DistanceSqrd FROM Nodes A CROSS JOIN Nodes B WHERE A.NodeID<>B.NodeID

Feeding this as the SQL into InlineQuerier would return 12 x 11 rows of all OriginNodeID vs (DestinationNodeID + DistanceSquared) combinations (Not 12x12 because the "WHERE A.NodeID<>B.NodeID" cancels out a match of the Node to itself. If instead want the self-matches left in the result, then delete the WHERE clause component).

Whilst the table is not formatted as a "12 rows x 12 columns" generally in data handling you don't want to do this. Trying to run downstream analyses or queries on tables that have been transposed/pivoted like this are usually that much harder to carry out since you will have a high, and variable number of columns (N columns). It is generally instead better to keep the separate pairs of Nodes as just extra rows of data of Node A vs Node B combinations and then you only ever have 2 Node ID columns/fields to deal with in the resultant table (Node A ID vs Node B ID), no matter how many Nodes N there are. But there may be legitimate reasons for formatting them into row x column combinations.

As SQLite used by InlineQuerier has no native Sqrt function, instead the returned table can have "DistanceSquared" field returned converted to Distance by using the FME functions exposed in AttributeCreator to take the square root of the value returned by InlineQuerier.

Reply