Question

Generate Same Random number for sample of data

5 years ago
June 3, 2020
3 replies
10 views

nedwaterman
Contributor
69 replies

I have an odd request.

My user has a data set of 758 records. They want to sample 450 of these randomly.

I've done this using a Sampler with random sampling. However, the client now wants to add further columns of data to the existing random sample. If I rerun this workspace, I'll get a different random set of records than before.

How do I add a random number to each row and keep that number, so that when I run the workspace again I get the same sample as before?

Thank you!

nedwaterman
Author
Contributor
69 replies
5 years ago
June 3, 2020

I should add that the only thing I could think of was running my workspace once, adding a ListBuilder / ListConcatenator to the output, and storing the ID of each sampled row somewhere for reuse.
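
Outside FME, the idea of storing the sampled row IDs for reuse might look like the sketch below. It is plain Python rather than a transformer setup, and the function name `load_or_create_sample` and the JSON file it writes are assumptions made for illustration.

```python
import json
import random
from pathlib import Path


def load_or_create_sample(record_ids, sample_size, store_path):
    """Return the stored sample of IDs, drawing and saving one on the first run."""
    store = Path(store_path)
    if store.exists():
        # A previous run already chose the sample: reuse it unchanged.
        return json.loads(store.read_text())
    # First run: draw without replacement and persist the chosen IDs.
    chosen = sorted(random.sample(list(record_ids), sample_size))
    store.write_text(json.dumps(chosen))
    return chosen
```

For the case described here, a call such as `load_or_create_sample(range(1, 759), 450, "sample_ids.json")` would return the same 450 IDs on every run after the first, as long as the file is kept.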

jlbaker2779
194 replies
5 years ago
June 3, 2020

Add one column per run to each row, so that every column shows which run its random numbers came from. That way you always have the values for every run in one file.

Example:

RecordID  Run1  Run2  Run3
1         1231  9912  4563
2         4412  3343  2342
3         1228  6627  2412
4         5684  3344  4886
5         9765  6675  5791
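
A rough Python sketch of that layout, assuming the records are held as dictionaries keyed by column name. The helpers `add_run_column` and `sample_for_run` are hypothetical and not part of FME, and taking the rows with the smallest stored numbers as the sample is one possible way to use the columns, not something stated above.

```python
import random


def add_run_column(records, run_name, low=1000, high=9999):
    """Add one random-number column for this run; existing columns are kept."""
    for record in records:
        if run_name not in record:  # never overwrite a stored run
            record[run_name] = random.randint(low, high)
    return records


def sample_for_run(records, run_name, sample_size):
    """Pick the records with the smallest stored numbers for the given run."""
    # Ties on the random number are broken by RecordID (an assumption).
    ranked = sorted(records, key=lambda r: (r[run_name], r["RecordID"]))
    return ranked[:sample_size]
```

Because the numbers in `Run1` are stored rather than regenerated, `sample_for_run(records, "Run1", 450)` returns the same records each time it is called on the saved data. A wider number range would reduce ties.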


jdh
Contributor
1982 replies
5 years ago
June 3, 2020

What about creating an attribute called Sampled?

The workspace checks whether the attribute exists. If it does, test for Sampled = yes to get your random sample.

If it doesn't (i.e. the first time the workspace is run on the data), do a randomized sample and add the attribute to the selected features.
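
The check-then-sample logic can be sketched in Python as follows. This is only an illustration under assumptions: records are plain dictionaries, `ensure_sampled` is a hypothetical helper, and the flagged records would still need to be written back to the data set for the flag to survive between runs.

```python
import random


def ensure_sampled(records, sample_size):
    """Flag a random sample once; later runs reuse the existing flags."""
    already = [r for r in records if r.get("Sampled") == "yes"]
    if already:
        # The attribute exists from an earlier run: keep that sample.
        return already
    # First run on this data: draw the sample and flag the selected records.
    for record in random.sample(records, sample_size):
        record["Sampled"] = "yes"
    return [r for r in records if r.get("Sampled") == "yes"]
```

New columns can then be added to the data freely, since the sample is identified by the stored Sampled value and not by a fresh random draw.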
