Question

Cluster sampling of line data


jerrylim735
Contributor

I would like to create clusters of line data (geometry) for sampling, where the lines within each cluster are connected to one another. So far I’ve thought of:

  • Create a bounding polygon > split it into X polygons using Tiler or some other transformer > SpatialFilter. This would probably be the easiest, but it would create dead ends where lines cross the polygon boundaries. Ideally I would like a closed, self-contained ‘route’, though that is not strictly necessary.
  • SpatialSorter > group every X rows. But I realize this doesn’t give a nice closed-off route either, and I would have to work out manually how many rows to group to get a decent cluster, so I’m probably better off with the first option.

Any better ideas? It would also be good to be able to define criteria for sampling, such as having X km of main roads and Y km of small roads in each cluster, but I suppose that can come afterwards by calculating the totals per cluster and choosing clusters to hit the quota.
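To make the quota idea concrete, here is a rough sketch in plain Python (not an FME transformer). It assumes each line already carries a cluster ID, a road class and a length in km. The function names, the tuple layout and the greedy "take whichever cluster covers the most remaining quota" rule are all assumptions made for illustration, not an established method.

```python
from collections import defaultdict


def length_per_cluster(segments):
    """Sum segment length (km) per cluster and road class.

    segments: iterable of (cluster_id, road_class, length_km) tuples
    (a hypothetical layout chosen for this sketch).
    """
    totals = defaultdict(lambda: defaultdict(float))
    for cluster_id, road_class, length_km in segments:
        totals[cluster_id][road_class] += length_km
    return {cid: dict(by_class) for cid, by_class in totals.items()}


def pick_clusters(totals, quota):
    """Greedily pick clusters until every road-class quota (km) is met.

    At each step, take the cluster that covers the most remaining quota.
    Returns (chosen cluster ids, unmet quota per road class).
    """
    remaining = dict(quota)
    available = dict(totals)
    chosen = []

    def gain(cid):
        # Km of still-needed road this cluster would contribute.
        return sum(min(available[cid].get(rc, 0.0), need)
                   for rc, need in remaining.items() if need > 0)

    while available and any(need > 0 for need in remaining.values()):
        best = max(available, key=gain)
        if gain(best) <= 0:
            break  # no remaining cluster helps with the unmet quota
        chosen.append(best)
        for rc in remaining:
            remaining[rc] = max(0.0, remaining[rc] - available[best].get(rc, 0.0))
        del available[best]
    return chosen, remaining
```

Whole clusters are selected, so the result can overshoot a quota. If the sample needs to be random rather than greedy, the candidate clusters could be shuffled first and accepted in that order instead.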

2 replies

crutledge
Enthusiast
  • March 28, 2025

Hi @jerrylim735, this is interesting. Can you share a picture of what that would look like? I was going to suggest using the Sampler, but that’s not the issue. It’s how the lines are connected after selecting the sample, right?


jerrylim735
Contributor
  • Author
  • April 1, 2025

Hey @crutledge, it would be something like this (I did this manually previously, although now that I zoom in, it wasn’t the best clustering either in terms of ‘closing off’ the cluster).

Ideally, I would like the sampling and the ‘line connection’ to be done either at the same time or one after the other, but the priority is the line connection, since sampling can be done afterwards (assuming there are enough clusters).
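To illustrate what I mean by putting the line connection first, here is a minimal sketch in plain Python (not an FME transformer). It grows each cluster outward from a seed line through shared endpoints, so every cluster is connected by construction. It assumes the lines are already split at intersections and that touching lines share exactly the same endpoint key. The function name, the input layout and the `target_km` stopping rule are hypothetical.

```python
from collections import defaultdict, deque


def grow_clusters(segments, target_km):
    """Group segments into connected clusters of roughly target_km each.

    segments: dict of segment_id -> (start_node, end_node, length_km),
    where nodes are hashable endpoint keys (e.g. rounded coordinates).
    Returns a dict of segment_id -> cluster number.
    """
    # Index which segments touch each endpoint.
    at_node = defaultdict(list)
    for seg_id, (start, end, _) in segments.items():
        at_node[start].append(seg_id)
        at_node[end].append(seg_id)

    cluster_of = {}
    cluster_no = 0
    for seed in segments:
        if seed in cluster_of:
            continue  # already absorbed by an earlier cluster
        total = 0.0
        queue = deque([seed])
        seen = {seed}
        # Breadth-first growth: stop once the cluster is long enough.
        while queue and total < target_km:
            seg_id = queue.popleft()
            cluster_of[seg_id] = cluster_no
            start, end, length_km = segments[seg_id]
            total += length_km
            for node in (start, end):
                for neighbour in at_node[node]:
                    if neighbour not in cluster_of and neighbour not in seen:
                        seen.add(neighbour)
                        queue.append(neighbour)
        cluster_no += 1
    return cluster_of
```

Lines that were queued but not reached when a cluster hits its target stay unassigned and seed later clusters, so small leftover clusters are possible and might need merging into a neighbour. This does not produce closed loops, only connected groups.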

 

