Question

Generate Same Random number for sample of data

  • June 3, 2020

nedwaterman
Contributor

I have an odd request.

 

My user has a data set of 758 records. They want to sample 450 of these randomly.

I've done this using a Sampler set to random sampling. However, the client wants to add further columns of data to the existing random sample. If I rerun this workspace, I'll generate a different random set of data than before.

 

How do I add a random number to a row, but keep this created number so that when I run the workspace again I get the same sample as before?

 

Thank you!

N


3 replies

nedwaterman
Contributor
  • Author
  • Contributor
  • June 3, 2020

I should add that the only thing I could think of was running my workspace once, adding a ListBuilder / ListConcatenator to the output, and storing the ID of each sampled row somewhere for re-use.
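For what it's worth, the "store the IDs and re-use them" idea can be sketched outside the workspace in a few lines of plain Python. This is only an illustration, not an FME transformer setup: the function name `load_or_create_sample`, the JSON file, and the assumption that every record has a stable ID are all hypothetical.

```python
import json
import random
from pathlib import Path


def load_or_create_sample(record_ids, sample_size, id_file):
    """Return the stored sample IDs if id_file exists; otherwise draw and store them."""
    path = Path(id_file)
    if path.exists():
        # A previous run already chose the sample, so reuse it unchanged.
        return json.loads(path.read_text())
    # First run: draw the random sample once and persist the chosen IDs.
    sampled = sorted(random.sample(list(record_ids), sample_size))
    path.write_text(json.dumps(sampled))
    return sampled
```

Calling `load_or_create_sample(range(1, 759), 450, "sample_ids.json")` twice should give the same 450 IDs both times, because the second call reads the file instead of sampling again. New columns can then be joined onto the data by ID.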


Forum|alt.badge.img+2

Create a separate column for each run, so that every row records which run each number came from. That way you always have the values for every run in one file.

 

Example:

RecordID  Run1  Run2  Run3
1         1231  9912  4563
2         4412  3343  2342
3         1228  6627  2412
4         5684  3344  4886
5         9765  6675  5791
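A rough Python sketch of how per-run columns could work, under my own assumptions: each row is a dict with a `RecordID` key, a run column is only filled in when it is missing, and the sample for a run is taken as the rows with the smallest stored numbers. The helper names and the four-digit range are made up for illustration.

```python
import random


def add_run_column(rows, run_name, rng=None):
    """Add a random number under run_name to every row that does not have one yet."""
    if rng is None:
        rng = random.Random()
    for row in rows:
        # setdefault keeps any number stored by an earlier run.
        row.setdefault(run_name, rng.randint(1000, 9999))
    return rows


def sample_for_run(rows, run_name, sample_size):
    """Pick the sample_size rows with the smallest stored numbers for run_name."""
    # RecordID breaks ties so the ordering is fully deterministic.
    ranked = sorted(rows, key=lambda r: (r[run_name], r["RecordID"]))
    return ranked[:sample_size]
```

Because the numbers are stored with the rows, asking for the `Run1` sample again returns the same records, even after a `Run2` column has been added.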

 


jdh
Contributor
  • Contributor
  • June 3, 2020

What about creating an attribute called Sampled?

The workspace checks whether the attribute exists. If it does, test Sampled = yes to get your random sample.

If it doesn't (i.e. the first time the workspace is run on the data), do a randomized sample and add the attribute to the selected features.
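In plain Python the logic would look roughly like this. It is a sketch only: rows are assumed to be dicts that get written back to the source data after the first run, and `ensure_sampled` is a made-up name, not an FME function.

```python
import random


def ensure_sampled(rows, sample_size, flag="Sampled"):
    """Flag a random sample on the first run; on later runs return the flagged rows."""
    if not any(flag in row for row in rows):
        # First run: no row carries the attribute yet, so draw the sample now.
        for row in random.sample(rows, sample_size):
            row[flag] = "yes"
    # Every run: the sample is whatever is flagged Sampled = yes.
    return [row for row in rows if row.get(flag) == "yes"]
```

As long as the Sampled attribute is saved with the data, rerunning gives back the same 450 records, and extra columns can be added without disturbing the selection.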