
Hello!

I'm having a near meltdown trying to figure out how to rework a process that I have, and I hope the community can help a bit. I'm working on a project that performs a suitability analysis to determine which curbs are best suited for EV charging installation. Using the SpatialRelator, I am currently processing millions of line features (the curbs) and comparing each line to 20-30 different datasets (polygon areas that have been prepped with preconfigured parameters) to see which of them overlap. I then use a ListExploder to explode those relationships, and then a FeatureMerger that merges in the individual "scores" for each of those 20-30 datasets. Then I run it through an Aggregator to generate a total "score" from all of the datasets that intersect a particular line feature, grouped by the line feature. For a smaller dataset, it works fine!

However, a run limited to 500,000 features with a WHERE clause had been going for 18+ hours when I last checked, and then I think my computer restarted while I slept, so I have no idea how long it would actually have taken!

If anyone has any suggestions on alternatives, I'm all ears! I've attempted the FeatureReader way, but I need to maintain the intersecting relationships in order to generate "scores" for each dataset that intersects with a line. Thank you in advance for any insight.
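
For reference, the relate, explode, merge and aggregate chain described above boils down to the logic below. This is only a minimal Python sketch of that logic, not the FME workspace itself; the function name, the pair layout and the layer names are hypothetical stand-ins.

```python
from collections import defaultdict

def total_scores(relations, layer_scores):
    """Sum the score of every dataset that intersects each curb.

    relations    -- iterable of (curb_id, layer_name) pairs, i.e. the
                    exploded output of the spatial relate (hypothetical layout)
    layer_scores -- dict mapping layer_name to its preconfigured score
    """
    totals = defaultdict(float)
    for curb_id, layer_name in relations:
        # Look up the score (the "merge" step) and add it to the curb's
        # running total (the "aggregate" step). Unknown layers add nothing.
        totals[curb_id] += layer_scores.get(layer_name, 0.0)
    return dict(totals)
```

Note that nothing in this step needs geometry or any attribute other than the curb identifier and the layer name.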

Looking over it:

  1. Get rid of the FeatureMerger. It is a slow transformer. Use a FeatureJoiner instead.
  2. Strip all attributes from the features passed to the SpatialRelator and ListExploder, keeping only CurbID on the Requestor port and SupplierLayerName and SupplierFeatureID on the Supplier port. Carrying fully attributed features through these transformers is computationally expensive, because the attributes and list attributes have to be built out feature by feature even though the spatial comparison doesn't need them. Instead, after the ListExploder, use a FeatureJoiner to join the lean, exploded list of CurbID, SupplierLayerName and SupplierFeatureID back to the source attributes that were removed. FeatureJoiner is great for these situations because it uses attribute indices and Bulk Mode, so it merges the removed attributes back in very quickly at the end of the workflow.
  3. An alternative that is faster than using the SpatialRelator for in-workspace spatial relationships is to send the data to a temporary SpatiaLite database with a FeatureWriter, perform the spatial relate there with an SQLExecutor run against the SpatiaLite DB file, and return the results through the output port of the SQLExecutor. When dealing with lists and larger datasets, I find this outperforms the SpatialRelator by a significant margin. The downside is that you need to know how to activate the spatial index inside a SpatiaLite SQL statement (SpatiaLite's author, Alessandro Furieri, posts examples online showing how to do this with the SpatialIndex virtual table inside a SQL statement).
  4. I can't see what the source data format and location are, but if the data is in a spatially enabled DB, then there are similar options for doing the spatial comparison there, e.g. via SQLCreator/SQLExecutor.
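
As a rough illustration of point 3, the statement below shows the usual SpatiaLite pattern of filtering through the SpatialIndex virtual table before running the exact intersection test. It is a sketch under assumptions: the table names `curbs` and `zones` and the geometry column name `geometry` are hypothetical, and a spatial index is assumed to already exist on `zones.geometry`. Inside FME only the SQL text would go into the SQLExecutor; the Python wrapper is just a way to try the query outside a workspace and needs the `mod_spatialite` extension installed.

```python
import sqlite3

# Hypothetical table/column names; adjust to whatever the FeatureWriter created.
# Assumes a spatial index already exists on zones.geometry.
RELATE_SQL = """
SELECT c.CurbID,
       z.SupplierLayerName,
       z.SupplierFeatureID
FROM curbs AS c
JOIN zones AS z
  ON z.ROWID IN (
       -- Coarse filter: index lookup using the curb's bounding box.
       SELECT ROWID
       FROM SpatialIndex
       WHERE f_table_name = 'zones'
         AND f_geometry_column = 'geometry'
         AND search_frame = c.geometry
     )
 -- Exact test, evaluated only for the candidates the index returned.
 AND ST_Intersects(c.geometry, z.geometry) = 1
"""

def run_relate(db_path):
    """Run the relate against a SpatiaLite file and return lean ID rows."""
    conn = sqlite3.connect(db_path)
    try:
        # Loading extensions must be enabled before mod_spatialite can load.
        conn.enable_load_extension(True)
        conn.load_extension("mod_spatialite")
        return conn.execute(RELATE_SQL).fetchall()
    finally:
        conn.close()
```

The query returns only the three identifier columns, which matches the lean-attribute advice in point 2: the remaining attributes can be joined back afterwards.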

@spizam SpatialRelator can be slow. We're working on adding a spatial index, among other improvements, so down the road this will hopefully be a better experience for you. For now:

- As @bwn says, replace the FeatureMerger. It looks like it is being used as a look-up table, so if that table is relatively small (>100 records say) then a DatabaseJoiner can sometimes be faster, as it will cache the join table.

- Your data is very localized, so try to split your source data into neighborhoods or regions.
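
To show why splitting by region helps, here is a minimal Python sketch that buckets features into a regular grid by bounding box, so each curb is only compared against polygons in the cells it touches. It is an illustration of the idea rather than an FME recipe: the function names, the `(xmin, ymin, xmax, ymax)` bounding-box layout and the cell size are all assumptions, and the pairs it returns are only candidates that would still need an exact geometry test.

```python
from collections import defaultdict

def cells_for_bbox(bbox, cell_size):
    """Yield every (col, row) grid cell that a bounding box touches."""
    xmin, ymin, xmax, ymax = bbox
    for col in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for row in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            yield (col, row)

def candidate_pairs(curbs, zones, cell_size):
    """Return (curb_id, zone_id) pairs whose bounding boxes overlap.

    curbs, zones -- dicts mapping an ID to (xmin, ymin, xmax, ymax)
    """
    # Bucket every zone under each grid cell its bounding box touches.
    zones_by_cell = defaultdict(list)
    for zone_id, zbox in zones.items():
        for cell in cells_for_bbox(zbox, cell_size):
            zones_by_cell[cell].append(zone_id)

    pairs = set()
    for curb_id, cbox in curbs.items():
        for cell in cells_for_bbox(cbox, cell_size):
            # Only zones sharing a cell with the curb are ever examined.
            for zone_id in zones_by_cell.get(cell, ()):
                zbox = zones[zone_id]
                if (cbox[0] <= zbox[2] and zbox[0] <= cbox[2]
                        and cbox[1] <= zbox[3] and zbox[1] <= cbox[3]):
                    pairs.add((curb_id, zone_id))
    return pairs
```

With localized data most cells hold only a handful of polygons, so the number of comparisons per curb stays small no matter how large the full dataset is.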


My organisation recently upgraded to FME 2021.0.1.0 build 21313. We also experienced some serious performance problems with the SpatialRelator transformer with that particular version of FME. The good news is that the problem appears to be fixed with a more recent dot release, FME 2021.1.0 build 21607. The fix is documented in the change log at build 21547 here:

https://downloads.safe.com/fme/2021/whatsnew_2021_1.txt

