Solved

max features to read on a folder

  • November 30, 2017
  • 4 replies
  • 45 views

barrett_h
Contributor

Hi all,

I'm sure I'm missing something simple here, but I can't see a way to limit the number of features that are read when the Reader is pointing to a folder of files, e.g. c:\\folder\\**\\*.jpg

As far as I can tell, the Max Features to Read options are only applied within a dataset, not across a number of datasets. I'm currently using a Sampler to throttle the number of features that are processed at any one time, but it still has to read every file into the workspace first. As the folder in question has thousands of image files, I would like an easy way to ensure that the process doesn't collapse!

Cheers,

Barrett

Best answer by jlutherthomas


4 replies

jlutherthomas
  • Best Answer
  • December 1, 2017

Hi @barrett_h

 

Can you use the Directory and File Pathnames reader first, to read in folders and then pass the first X through to a FeatureReader?

 

You set it to only read in folder names, and set the maximum number of features to read (Folders).

 

If you want to read only X number of features per folder, then in the FeatureReader you can also set the number of features to read, which it would read for each input folder coming from the PATH reader.
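The two limits described above (first X folders, then at most N files from each folder) can be sketched outside FME in plain Python. This is only an illustration of the logic, not FME's API: the function names, the `*.jpg` pattern, and the default limits are all assumptions made for the example.

```python
from itertools import islice
from pathlib import Path


def first_folders(root, max_folders):
    """Return at most max_folders subfolders of root, in name order."""
    folders = sorted(p for p in Path(root).iterdir() if p.is_dir())
    return folders[:max_folders]


def first_files(folder, pattern, per_folder):
    """Return at most per_folder files in folder that match pattern."""
    return list(islice(sorted(Path(folder).glob(pattern)), per_folder))


def select_files(root, pattern="*.jpg", max_folders=2, per_folder=3):
    """Cap the folder count first, then cap the files taken from each folder."""
    selected = []
    for folder in first_folders(root, max_folders):
        selected.extend(first_files(folder, pattern, per_folder))
    return selected
```

Only folder and file names are listed here; no file contents are opened until the selected paths are passed on to whatever does the actual reading.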

 

 


danilo_fme
Celebrity
  • December 1, 2017

Great suggestion @jlutherthomas

 


danilo_fme
Celebrity
  • December 1, 2017

Hi @barrett_h,

Just to illustrate what @jlutherthomas wrote:

1) Reader: Directory and File Pathnames

2) Counter transformer to generate an id for each file in your folder

3) Tester transformer to keep features where the counter attribute <= max_feature (a published parameter)

4) FeatureReader transformer to read the files using the attribute path_windows

The workspace is attached: workspace-max-features-to-read.fmw

Thanks,

Danilo
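For readers without the attached workspace, the four steps can be mirrored in plain Python as a rough sketch. It is not FME code: the function names are hypothetical, the counter is assumed to start at 1, and `max_features` stands in for the published parameter.

```python
from itertools import count
from pathlib import Path


def list_paths(root, pattern="*.jpg"):
    """Step 1: yield file paths only; no file contents are opened."""
    yield from sorted(Path(root).rglob(pattern))


def number_paths(paths, start=1):
    """Step 2: attach a running counter to each path (assumed to start at 1)."""
    return zip(count(start), paths)


def keep_first(numbered, max_features):
    """Step 3: pass only the paths whose counter is <= max_features."""
    return [path for n, path in numbered if n <= max_features]


def read_files(paths):
    """Step 4: open only the selected files and return their bytes by path."""
    return {str(path): path.read_bytes() for path in paths}


def run(root, max_features):
    """Chain the four steps: list, number, filter, read."""
    return read_files(keep_first(number_paths(list_paths(root)), max_features))
```

Every path is still listed and numbered, but only the first `max_features` files are actually opened, which is where the cost lies for large images.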


barrett_h
Contributor
  • Author
  • December 1, 2017

Thank you both, @jlutherthomas and @danilo_inovacao. It's not exactly the solution I had in mind, but it works great nonetheless! I was intent on making the workspace read only a set number of features. With the Directory and File Pathnames reader it still reads all the features, but with little impact on the performance of the workspace. I ended up with the following setup (the Directory and File Pathnames reader has Allowed Path Types set to FILE):

I found that the Max Features to Read parameter in the FeatureReader does not work the way I wanted, i.e. it counts within a dataset, not across all the datasets.
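The difference between a per-dataset limit and a limit across all datasets can be shown with a small Python sketch. It assumes, purely for illustration, that each image file is its own dataset holding one feature; the function names and file names are made up.

```python
from itertools import chain, islice


def cap_per_dataset(datasets, limit):
    """Apply the limit inside each dataset separately."""
    return [item for ds in datasets for item in islice(ds, limit)]


def cap_across_datasets(datasets, limit):
    """Apply one limit to the combined stream of all datasets."""
    return list(islice(chain.from_iterable(datasets), limit))


# Four single-feature datasets, one per image file (hypothetical names).
datasets = [["a1.jpg"], ["b1.jpg"], ["c1.jpg"], ["d1.jpg"]]

per_dataset = cap_per_dataset(datasets, 2)   # the limit is never reached in any dataset
across = cap_across_datasets(datasets, 2)    # the limit stops the combined stream
```

With one feature per dataset, a per-dataset limit of 2 lets all four items through, while the same limit applied across datasets stops after two.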

Thanks very much for the suggestions.