Question

Is there a way in FME to extract a sample from a table in which each distinct value of a field is represented x times?


me.aelmo
Contributor

I wish to extract a sample from a table in which each distinct value of certain fields is represented x times. For example, see below, where x is 1 and the fields I want to sample from are House, Priority and Injury.

 

Input:

ID  Name    House       Priority  Injury
1   Draco   Slytherin   Medium    Head
2   Harry   Gryffindor  Low       Hand
3   Ron     Gryffindor  Low       Head
4   Cedric  Hufflepuff  High      Chest

 

Output:

ID  Name    House       Priority  Injury
1   Draco   Slytherin   Medium    Head
2   Harry   Gryffindor  Low       Hand
4   Cedric  Hufflepuff  High      Chest

3 replies

j.botterill
Influencer
  • April 15, 2024

Hey there, perhaps try the Sampler with “Group By” set to the fields you want and the sampling rate set to x.

 

 


me.aelmo
Contributor
  • Author
  • April 15, 2024
j.botterill wrote:

Hey there, perhaps try the Sampler with “Group By” set to the fields you want and the sampling rate set to x.

 

 

This tends to generate an excessive number of samples, particularly for datasets with many fields. Ideally, the number of samples should be n * 10, where n is the number of distinct values in the field that has the most distinct values.
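To make that target concrete, here is a rough sketch of the n * 10 calculation in plain Python. It assumes the table has already been read into a list of dictionaries; the function name `target_sample_size` and the `per_value` parameter are made up for illustration and are not part of FME.

```python
def target_sample_size(rows, fields, per_value=10):
    """Return n * per_value, where n is the largest distinct-value
    count found among the given fields (hypothetical helper)."""
    # Count distinct values per field.
    distinct = {f: len({row[f] for row in rows}) for f in fields}
    # n is the count for the field with the most distinct values.
    n = max(distinct.values())
    return n * per_value


# The example table from the question.
rows = [
    {"ID": 1, "Name": "Draco", "House": "Slytherin", "Priority": "Medium", "Injury": "Head"},
    {"ID": 2, "Name": "Harry", "House": "Gryffindor", "Priority": "Low", "Injury": "Hand"},
    {"ID": 3, "Name": "Ron", "House": "Gryffindor", "Priority": "Low", "Injury": "Head"},
    {"ID": 4, "Name": "Cedric", "House": "Hufflepuff", "Priority": "High", "Injury": "Chest"},
]

size = target_sample_size(rows, ["House", "Priority", "Injury"])
print(size)
```

In the example table each of the three fields has three distinct values, so n is 3 and the target works out to 30, which is of course more rows than this tiny table holds.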


evieatsafe
Safer
  • April 18, 2024

Hi @me.aelmo, it looks like you’re on the right track. If the issue is that your solution generates too many samples, then sampling a second time might produce a smaller sample. You could do this with an additional Sampler, or use a PythonCaller with a script that handles the whole scenario in one transformer. I won’t be able to advise you on how to write this script, sorry to not be of more help. Let us know if you are able to solve this :)
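For anyone looking for a starting point, the logic such a script might implement can be sketched in plain Python as a single greedy pass: keep a row only if at least one of its field values has been kept fewer than x times so far. This is only a sketch under assumptions. It works on a list of dictionaries rather than FME features, the name `sample_covering` is invented here, and wiring it into a PythonCaller is left out. It guarantees each value appears at least x times where the input allows it, not exactly x times, and the result depends on row order.

```python
from collections import Counter


def sample_covering(rows, fields, x=1):
    """Greedy single pass (hypothetical helper): keep a row if any of
    its (field, value) pairs has been kept fewer than x times so far."""
    counts = Counter()  # how often each (field, value) pair was kept
    sample = []
    for row in rows:
        keys = [(f, row[f]) for f in fields]
        if any(counts[k] < x for k in keys):
            sample.append(row)
            counts.update(keys)
    return sample


# The example table from the question.
rows = [
    {"ID": 1, "Name": "Draco", "House": "Slytherin", "Priority": "Medium", "Injury": "Head"},
    {"ID": 2, "Name": "Harry", "House": "Gryffindor", "Priority": "Low", "Injury": "Hand"},
    {"ID": 3, "Name": "Ron", "House": "Gryffindor", "Priority": "Low", "Injury": "Head"},
    {"ID": 4, "Name": "Cedric", "House": "Hufflepuff", "Priority": "High", "Injury": "Chest"},
]

picked = sample_covering(rows, ["House", "Priority", "Injury"], x=1)
print([r["ID"] for r in picked])  # [1, 2, 4]
```

With x = 1 this reproduces the output in the question: Ron is skipped because Gryffindor, Low and Head are all already covered by earlier rows. Because the pass is greedy, it does not promise the smallest possible sample, so shuffling the rows first may be worth trying if a random selection matters.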

