Question

Accessing new nodes created by TopologyBuilder on data that is not clean


The TopologyBuilder produces this useful piece of information when generating new nodes on data that is not clean:

Completed intersection processing, phase #2. 3 new nodes were generated among 294792 intermediate lines

Is there any way to access this information without resorting to further processing? If not, two questions:

  • Can this be raised as an enhancement request?
  • What would be the most efficient way of finding these new nodes on a large dataset (4.5 million line features)?

I am using FME Desktop 2017.0.1.1

Thanks!


11 replies

Hi @john_newland,

The existing nodes will retain any attributes from before the TopologyBuilder. If you use a Tester to test for the existence of any of the attributes, the newly created nodes will be output through the Failed port.

Hi @DaveAtSafe,

Thanks for the response! Comparing the output nodes, there are no special or differing attributes between the existing nodes and the nodes created by the TopologyBuilder.

The method I'm using at the moment is to pass the output from the Edge port through a Matcher with the original line network, but this is an expensive operation. I'm happy to keep using this method, as it isn't a process I will be running often. But since the "x new nodes were generated" message is already produced, it would be great if the TopologyBuilder could be enhanced to output those new nodes through a new port.

If the existing nodes have no attributes that can be used to differentiate them from new nodes, consider adding an attribute with the AttributeCreator before inputting to the TopologyBuilder.

I must reiterate that I am feeding the TopologyBuilder a line network, not nodes.

Well, why not create the 'existing' nodes with another TopologyBuilder (Assume Clean Data: Yes)?

@takashi I like it, but generating Topology on 4.5 million features is a hugely expensive operation taking hours on a pretty powerful Azure virtual machine.

This workflow would also be possible, but I don't know which one performs better under the actual conditions. In any case, if the performance is still not sufficient for your requirements, I think it would be worth posting an enhancement idea, as you mentioned at first.

Hi @john_newland,

Another solution that may perform better would be to enable 'Propagate All Attributes From Input' in the TopologyBuilder. This will add the edge attributes to the _node_angle list on the output nodes.

Use a ListHistogrammer on any of the sublists from the edge features that is unique by edge. If you don't have one, you can create one with a Counter before the TopologyBuilder.

Next, add a ListSearcher to search the histogram{}.count list for values greater than 1. Any node with more than one copy of the unique edge id will be newly created.

The list transformers work on one feature at a time, and so do not need to cache all features as the Matcher does. This should improve the performance.
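
To make the logic of the histogram test concrete, here is a minimal Python sketch of the same idea outside FME. It is only an illustration: each node is modelled as a plain list of dicts (one per incident edge), and the attribute name orig_id is a hypothetical stand-in for whatever unique id the Counter assigned to each input line before the TopologyBuilder. It does not use any FME API.

```python
from collections import Counter


def duplicated_original_ids(incident_edges):
    """Return the original line ids that occur more than once at one node.

    incident_edges: list of dicts, one per edge touching the node. The
    key "orig_id" is a hypothetical name for the id assigned to each
    input line before topology was built.
    """
    counts = Counter(edge["orig_id"] for edge in incident_edges)
    return sorted(oid for oid, n in counts.items() if n > 1)


# Existing node: three different original lines meet here.
existing_node = [{"orig_id": 10}, {"orig_id": 11}, {"orig_id": 12}]

# New node: lines 20 and 21 crossed and were both split here, so each
# original id shows up on two of the incident edges.
new_node = [{"orig_id": 20}, {"orig_id": 20}, {"orig_id": 21}, {"orig_id": 21}]

print(duplicated_original_ids(existing_node))
print(duplicated_original_ids(new_node))
```

A node with a non-empty result corresponds to a hit in the ListSearcher. Because each node is examined on its own, nothing has to be held in memory across features.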

 


Hi @DaveAtSafe,

Thank you - nearly brilliant! However, it does throw up a lot of false positives in the source data (red links that are legitimate loops).

loops.png

Hi @john_newland,

We were looking for any edges on a node whose original ids were duplicated, indicating a break (or a loop). Now we first need to eliminate any edges on a node whose new id is duplicated (indicating a loop).

Please add another ListHistogrammer to histogram _node_angle{}._edge_id, then add another ListSearcher to test for a count greater than 1. Add these before the first set, then run the first test only on the NotFound output from this new ListSearcher. This should eliminate the loops from the new node search.
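
The two-stage test can be sketched in plain Python as follows. This is an illustration of the logic only, not FME code: nodes are modelled as lists of dicts, edge_id stands in for _node_angle{}._edge_id, and orig_id is a hypothetical name for the unique id given to each input line before the TopologyBuilder.

```python
from collections import Counter


def has_repeat(values):
    """True if any value occurs more than once."""
    return any(n > 1 for n in Counter(values).values())


def classify_node(incident_edges):
    """Classify one node as "loop", "new", or "existing".

    Stage 1: a repeated new edge id means the same edge both starts
    and ends at this node, i.e. a loop. Such nodes are set aside.
    Stage 2: among the remaining nodes, a repeated original id means
    an input line was split here, i.e. the node is new.
    """
    if has_repeat(e["edge_id"] for e in incident_edges):
        return "loop"
    if has_repeat(e["orig_id"] for e in incident_edges):
        return "new"
    return "existing"


# Edge 5 starts and ends at this node (a legitimate loop).
loop_node = [
    {"edge_id": 5, "orig_id": 30},
    {"edge_id": 5, "orig_id": 30},
    {"edge_id": 6, "orig_id": 31},
]

# Original lines 20 and 21 were each split into two edges here.
split_node = [
    {"edge_id": 7, "orig_id": 20},
    {"edge_id": 8, "orig_id": 20},
    {"edge_id": 9, "orig_id": 21},
    {"edge_id": 10, "orig_id": 21},
]

# Two different original lines simply meet end to end.
plain_node = [{"edge_id": 1, "orig_id": 10}, {"edge_id": 2, "orig_id": 11}]

for node in (loop_node, split_node, plain_node):
    print(classify_node(node))
```

One consequence of this ordering is that a node which carries a loop and is also a newly generated intersection would be reported as a loop, so such cases may need a separate check.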

@DaveAtSafe - sorry for the delay. I tried this out today and it works great. Thanks!
