Solved

Batch process 2 readers

  • December 3, 2015
  • 5 replies
  • 74 views

stuiew
Participant

I have two very large sets of files from two different years that I need to compare with each other. I already have a workspace set up that performs the comparison and gives me the desired results. Reader_1 in the workspace takes a file from the year_2013 set and Reader_2 takes the corresponding file from the year_2014 set:

Reader_1 - file_ABC_year_2013.tif

Reader_2 - file_ABC_year_2014.tif

What I need to do now is run the process as a batch job on all of the data so that each file from year_2013 is processed with the corresponding file from year_2014. The two sets of data contain exactly the same number of files and share a naming convention with the year appended to the end of the file name.

Do I create a CSV file that lists all of the files from year_2013 and the corresponding files from year_2014? If so, how are these files fed to the workspace? Via the WorkspaceRunner?

Batch processing using the WorkspaceRunner seems reasonably easy when only one reader is required; however, setting up two readers that require files to be matched up does not appear to be as straightforward.

5 replies

pratap
Contributor
  • Contributor
  • Best Answer
  • December 3, 2015

Hi,

Use published parameters and supply their values to the WorkspaceRunner from an Excel or CSV file. Each record in the file supplies the input and output paths for one run of the workspace. If you want to run it for 38 files, the file needs 38 rows, each holding the paths for the readers and writers.

Pratap
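
As a rough illustration of the one-row-per-run idea, the Python sketch below builds such a CSV from a list of 2013 files. The column names (`source_2013`, `source_2014`, `output`), the `_compared` output suffix, and the folder names are hypothetical; only the `file_ABC_year_2013.tif` naming pattern comes from the question.

```python
import csv
from pathlib import PurePosixPath

SUFFIX_2013 = "_year_2013"  # naming convention described in the question
FIELDS = ["source_2013", "source_2014", "output"]  # hypothetical column names


def manifest_rows(files_2013, dir_2014, out_dir):
    """Return one dict per pair: 2013 path, matching 2014 path, output path."""
    rows = []
    for name in files_2013:
        src = PurePosixPath(name)  # forward-slash paths, for illustration only
        if not src.stem.endswith(SUFFIX_2013):
            continue  # skip anything that does not follow the convention
        base = src.stem[: -len(SUFFIX_2013)]
        rows.append({
            "source_2013": str(src),
            "source_2014": str(PurePosixPath(dir_2014) / f"{base}_year_2014{src.suffix}"),
            "output": str(PurePosixPath(out_dir) / f"{base}_compared{src.suffix}"),
        })
    return rows


def write_manifest(rows, handle):
    """Write the rows, with a header line, to an open text file handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

Each row of the resulting file would then correspond to one run of the comparison workspace, with the columns mapped to its published parameters.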


fmelizard
Safer
  • Safer
  • December 3, 2015

This is a good approach. Another way to drive things that might be easier is to again read from a CSV file, one record per pair you want to read, and then feed that into 2 separate FeatureReader transformers. In modern times (for sure FME 2016, I think also FME 2015) you can set the dataset in the FeatureReader to be an attribute or a string-edited concatenation of attributes (just check the pull-down menu by the dataset in there). From there you'll have the data emerging for you to test etc.


gio
Contributor
  • Contributor
  • December 3, 2015

Or you can use a file/directory reader.

Use a StringSearcher (regexp = (.*)(\d{4})) and a Tester on

_matched_parts{1} = 2013 (or vice versa) to separate the years.

etc.

Or use an Aggregator grouped by _matched_parts{0}, which will give you pairs (assuming filenames are unique within a year...).

Then you don't first have to create a csv/txt file.
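
To see how that regular expression splits the names and how grouping on the first part yields pairs, here is a small Python sketch of the same logic outside FME. It assumes the names are passed without their file extension, so that the four-digit year is the last thing in the string; the function name `pair_by_base` is made up for this example.

```python
import re
from collections import defaultdict

# Group 1: everything before the year; group 2: the four-digit year.
NAME_RE = re.compile(r"(.*)(\d{4})")


def pair_by_base(basenames):
    """Group names (without extension) by their year-less prefix.

    Only prefixes that have both a 2013 and a 2014 member are returned.
    """
    groups = defaultdict(dict)
    for name in basenames:
        match = NAME_RE.fullmatch(name)
        if match is None:
            continue  # name does not end in a four-digit year
        base, year = match.groups()
        groups[base][year] = name
    return {
        base: years
        for base, years in groups.items()
        if all(year in years for year in ("2013", "2014"))
    }
```

For `file_ABC_year_2013` the first group is `file_ABC_year_` and the second is `2013`, so both years of a file land under the same key.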


erik_jan
Contributor
  • Contributor
  • December 3, 2015

You can use the Directory and File Path reader to scan the existing files in the directory.

This will return the filenames.

Then use the WorkspaceRunner to run your comparison workspace with the two selected files.

This will enable you to loop through a complete directory without having to list the filenames yourself.
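
The same scan-then-run idea can be sketched in Python for anyone driving FME from a script instead of a WorkspaceRunner. This is only an outline under assumptions: the two years are taken to live in separate folders, `SOURCE_2013` and `SOURCE_2014` are hypothetical published parameter names, and the command is merely built here, not executed.

```python
from pathlib import Path


def find_pairs(dir_2013, dir_2014):
    """Yield (path_2013, path_2014) for each 2013 file whose 2014 partner exists."""
    for src in sorted(Path(dir_2013).glob("*_year_2013.tif")):
        partner = Path(dir_2014) / src.name.replace("_year_2013", "_year_2014")
        if partner.is_file():
            yield src, partner


def fme_command(workspace, src_2013, src_2014):
    """Build an FME command line that passes one pair as published parameters.

    SOURCE_2013 and SOURCE_2014 are hypothetical parameter names; use the
    names published by your own workspace.
    """
    return [
        "fme", str(workspace),
        "--SOURCE_2013", str(src_2013),
        "--SOURCE_2014", str(src_2014),
    ]
```

Files without a partner in the other year are skipped silently here; logging them instead may be preferable on real data.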


stuiew
Participant
  • Author
  • Participant
  • December 4, 2015

Thanks for the help...

I ended up using two FeatureReader transformers to bring the file information in via *.csv files. I broke the data down so that each CSV file contains only one pair of files (I can play around with this later). Once the CSV files are fed in via a WorkspaceRunner, 8 processes were able to run at once, making things very quick.

Had a few issues fanning out the data at the end as I initially had nothing to work with for the fanout attribute. Eventually overcame this by using the AttributeExposer to get hold of the fme_basename.
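
For readers who launch the runs from a script instead, the "8 processes at once" pattern mentioned above can be approximated in Python as follows. This is a generic sketch, not what was used in this thread: `run_all` is a made-up helper, and each entry in `commands` is assumed to be a complete command line given as a list of strings.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, max_workers=8):
    """Run each command as its own process, at most max_workers at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run(cmd):
        # check=False: a failed run is reported by its exit code, not raised.
        return subprocess.run(cmd, check=False).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))
```

Threads are enough here because each one only waits on an external process; the heavy work happens in the child processes.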