Question

Creating a sample based on a length attribute



Hi there,

 

Is there a way to generate a sample of my data based on a measurement instead of a count?

The Sampler transformer seems to work on feature counts. What I need is different: for example, if I have 1000 kilometers of road geometry, I want a random sample of 50 km.

Is this possible?

 

Thanks in advance,

8 replies

takashi
Evangelist
  • March 1, 2019

Hi @robbie_botha, if I understood your requirement correctly, the RandomNumberGenerator and the Snipper might help you.

Create a random number between 0 and 950 (= 1000 - 50), then snip out a 50 km part of the line, starting at that random position.
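
As a rough illustration of the arithmetic behind this suggestion, here is a plain Python sketch that runs outside FME. It assumes a single line of known length; the helper name random_window is made up for the example, and the two returned values would correspond to the start and end measures of the snip.

```python
import random

def random_window(total_length, sample_length, rng=random):
    """Return (start, end) of a randomly placed window on a line.

    Hypothetical helper: the start is drawn uniformly from
    [0, total_length - sample_length], so the window always fits.
    """
    if sample_length > total_length:
        raise ValueError("sample is longer than the line")
    start = rng.uniform(0, total_length - sample_length)
    return start, start + sample_length

# 1000 km line, 50 km sample: the start falls between 0 and 950
start, end = random_window(1000, 50, random.Random(42))
print(f"snip from {start:.1f} km to {end:.1f} km")
```

Passing a seeded random.Random instance makes the draw repeatable, which can help when comparing runs.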


david_r
Celebrity
  • March 1, 2019

If you don't want to alter (snip) the geometry, you can use an AttributeCreator to keep a running total of segment length and a Tester to let only the first 50 km of road segments pass through.

You can then test for, e.g., _total_length < 50000 in the Tester.


robbie_botha

Hi @takashi

I think what I failed to mention is that it is not a single line feature. It is an entire network of several roads, totalling 1000 km. I want to sample a few random roads that have a combined length of 50 km.

My apologies, I should have added it to the original post. It is important that I keep the geometry in the state it is now, so I'm not sure snipping will work.


david_r
Celebrity
  • March 1, 2019
david_r wrote:

If you don't want to alter (snip) the geometry, you can use an AttributeCreator to keep a running total of segment length and a Tester to let only the first 50 km of road segments pass through.

You can then test for, e.g., _total_length < 50000 in the Tester.

If you want to randomize the data, you could insert a Sampler before the AttributeCreator, set to randomize sampling.
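
Outside FME, the shuffle-then-accumulate idea looks roughly like the sketch below. It is plain Python, not the transformer setup itself: sample_by_length and the (id, length) tuples are made up for illustration, and the limit mirrors the _total_length < 50000 test, with lengths assumed to be in metres.

```python
import random

def sample_by_length(roads, limit, rng=random):
    """Pick whole roads in random order while the running total stays below limit.

    roads: list of (road_id, length) tuples (hypothetical input format).
    Rough analogue of: Sampler (randomized) -> running total -> Tester.
    """
    shuffled = list(roads)
    rng.shuffle(shuffled)            # random order, standing in for the Sampler
    picked, total = [], 0.0
    for road_id, length in shuffled:
        total += length              # running total, like _total_length
        if total >= limit:           # the Tester condition fails from here on
            break
        picked.append((road_id, length))
    return picked

roads = [("A", 12000.0), ("B", 30000.0), ("C", 9000.0), ("D", 25000.0)]
sample = sample_by_length(roads, 50000, random.Random(0))
print(sample, sum(length for _, length in sample))
```

Because whole roads are kept, the sample usually ends up somewhat short of the limit: the road that would push the total to 50 km or beyond is dropped, along with everything after it.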


gio
Contributor
  • March 1, 2019

@robbie_botha

Count the total road features and use a random number generator to create sample selections:

@round(@rand*total_featureCount, 0)

Then use this to select features, adding up their lengths until 50 is reached.

If you need a sample of some minimal size, create sets that sum to 50. (Query the superset for your minimal-count set. You will probably need Tcl or some lesser snake to do it. And don't output the superset.)

 

Common misunderstanding: if you read a dataset and mutilate it in some FME process, the original dataset is still where it was, in its original state. Until you overwrite it!

 


gio
Contributor
  • March 1, 2019

Here is a Tcl powerset version I made (with the help of math sites, of course).

This one is iterative. (I failed to get the recursive version with inner summing/counting to work and did not pursue it any further.)

Put it in a TclCaller.

This one counts objects. You can easily adapt it to sum lengths:

change {[ladd $pset_cnt] <= 100} to do that.

@Takashi can probably do that in a sec.

 

proc test {} {
    # Read the FME list attributes into Tcl lists
    set items_id {}
    set items_cnt {}
    for {set i 0} {[FME_AttributeExists Object_list{$i}.OBJECTID]} {incr i} {
        lappend items_id [FME_GetAttribute Object_list{$i}.OBJECTID]
    }
    for {set i 0} {[FME_AttributeExists Object_list{$i}.count]} {incr i} {
        lappend items_cnt [FME_GetAttribute Object_list{$i}.count]
    }

    # Build the subsets once and unpack the three result lists
    lassign [powersetb $items_id $items_cnt] Subsets_out Subsets_count Subsets_itemcount

    # Write the subsets that meet the criterion to an FME list
    for {set i 0} {$i < [llength $Subsets_out]} {incr i} {
        FME_SetAttribute Subset_list{$i}.Subset [lindex $Subsets_out $i]
        FME_SetAttribute Subset_list{$i}.count [lindex $Subsets_count $i]
        FME_SetAttribute Subset_list{$i}.itemcount [lindex $Subsets_itemcount $i]
    }

    # Put the subsets in separate lists??
}

# Powerset: collect every subset of $set whose summed counts are <= 100
proc powersetb {set count} {
    set res {}
    set res_cnt {}
    set res_itemcount {}
    for {set i 0} {$i < 2**[llength $set]} {incr i} {
        set pos -1
        set pset {}
        set pset_cnt {}
        foreach el $set {
            # Bit $pos of $i decides whether element $pos is in this subset
            if {$i & 1<<[incr pos]} {
                lappend pset $el
                lappend pset_cnt [lindex $count $pos]
            }
        }
        if {[ladd $pset_cnt] <= 100} {
            lappend res $pset
            lappend res_cnt $pset_cnt
            # Count the elements per subset
            lappend res_itemcount [llength $pset]
        }
    }
    # Return all output lists as one combined list
    return [list $res $res_cnt $res_itemcount]
}

# Sum the numbers in a list (expr also handles non-integer values such as lengths)
proc ladd {l} {
    set total 0
    foreach nxt $l {
        set total [expr {$total + $nxt}]
    }
    return $total
}
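
For comparison, here is a rough Python sketch of the same iterative bitmask idea, adapted to sum lengths instead of counts. It leaves out the FME-specific parts: subsets_within, the (id, length) pairs and the limit of 50 are illustrative assumptions. Unlike the Tcl version it skips the empty subset. It still enumerates all 2^n subsets, so it is only practical for a small number of features.

```python
def subsets_within(items, limit):
    """Return every non-empty subset whose summed length is <= limit.

    items: list of (item_id, length) pairs (hypothetical input format).
    Each result is a tuple (ids, total_length, item_count).
    """
    results = []
    n = len(items)
    for mask in range(1, 2 ** n):            # mask 0 would be the empty subset
        ids, total = [], 0.0
        for pos, (item_id, length) in enumerate(items):
            if mask & (1 << pos):            # bit 'pos' set: item is in this subset
                ids.append(item_id)
                total += length
        if total <= limit:
            results.append((ids, total, len(ids)))
    return results

roads = [("A", 20.0), ("B", 30.0), ("C", 15.0)]
for ids, total, count in subsets_within(roads, 50.0):
    print(ids, total, count)
```

To get one random sample close to the target, one option is to keep only the subsets whose total is near the limit and pick one of those at random.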

 


jdh
Contributor
  • March 1, 2019
robbie_botha wrote:

Hi @takashi

I think what I failed to mention is that it is not a single line feature. It is an entire network of several roads, totalling 1000 km. I want to sample a few random roads that have a combined length of 50 km.

My apologies, I should have added it to the original post. It is important that I keep the geometry in the state it is now, so I'm not sure snipping will work.

Do you want truly random roads, which will most likely give you several road segments that are completely unconnected to one another, or do you want a random 'region of connected roads'?


  • January 13, 2020
david_r wrote:

If you don't want to alter (snip) the geometry, you can use an AttributeCreator to keep a running total of segment length and a Tester to let only the first 50 km of road segments pass through.

You can then test for, e.g., _total_length < 50000 in the Tester.

Hi there! I am looking for something similar, but with parcels and acres. See my question here:

https://knowledge.safe.com/questions/105643/add-until-a-certain-value-is-reached.html

I need to know if there is a way to randomly select parcels that together have a combined area of 13 000 ha

Thank you!

