
Hi all,

I have a set of 8000+ GML files that I wish to batch process into a geodatabase. I have found that putting them through a WorkspaceRunner is effective, but it slows down massively past 50 GML files. Therefore, I want a way to write a new geodatabase every 50 files, reducing the 8000 GML files to 160 geodatabases that I can then combine afterwards.

Has anyone any ideas? Apart from a huge conditional value in an AttributeManager!

Cheers

Jack

Lots of possible solutions, but the easiest is perhaps to use the Grouper transformer from the FME Hub and use the _group_index as part of the dataset fanout.
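To illustrate the idea outside FME, here is a minimal Python sketch of how a group index can drive the output dataset name. It assumes a zero-based index (I have not checked where _group_index actually starts), and the file names and the `batch_NNN.gdb` pattern are made up for the example.

```python
from collections import Counter

GROUP_SIZE = 50  # files per output geodatabase, as in the question


def fanout_name(group_index: int) -> str:
    """Hypothetical naming scheme; any pattern that contains the index works."""
    return f"batch_{group_index:03d}.gdb"


# Stand-in for the 8000 GML files; these names are invented for the sketch.
files = [f"tile_{i:04d}.gml" for i in range(8000)]

# Zero-based group index: positions 0-49 -> 0, 50-99 -> 1, and so on.
targets = Counter(fanout_name(i // GROUP_SIZE) for i in range(len(files)))

print(len(targets), "output geodatabases")
```

Each distinct name stands for one fanned-out dataset, so 8000 inputs in groups of 50 end up in 160 outputs.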


I would suggest you take a look at the ModuloCounter combined with a path reader. You should be able to use the ModuloCounter to help create groups (https://www.safe.com/blog/2016/12/parallel-processing-tips-evangelist159/). Combine it with the Aggregator to build an attribute containing the 50 filenames going into the WorkspaceRunner.

The result should be a WorkspaceRunner which runs 160 times, each with 50 GML files as the source.
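As a rough illustration of the batching step, here is a small Python sketch that splits a list of paths into runs of 50 and joins each run into one string. The paths are invented, and the quoting and the space separator are only assumptions; check what the WorkspaceRunner source parameter actually expects before relying on them.

```python
def batches(paths, size=50):
    """Yield consecutive slices of at most `size` paths."""
    for start in range(0, len(paths), size):
        yield paths[start:start + size]


def join_batch(batch, separator=" ", quote='"'):
    """Join one batch into a single string (separator and quoting are assumptions)."""
    return separator.join(f"{quote}{p}{quote}" for p in batch)


# Stand-in for the path reader output; these paths are made up.
paths = [f"C:/data/gml/tile_{i:04d}.gml" for i in range(8000)]

# One string per run, each listing the source files for that run.
runs = [join_batch(batch) for batch in batches(paths)]

print(len(runs), "runs")
```

Keeping the separator and quote character as parameters makes it easy to try the other formats if the first attempt only produces warnings.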

Did you know that...

The Grouper works great, but the GroupCounter works faster and does the same job?


I did not! Thanks Lars :-)


Now you do!

I created it once and added it to the FME Hub. I found out half a year later that Hans had made a similar custom transformer, so I had to put them to the test. If Hans's had outperformed mine, I would have removed mine from the Hub, but that wasn't the case.


Haha, sounds like a challenge to Hans 😉 @redgeographics


It's not like I'm short on challenges lately ;)


Oh yeah, the GroupCounter works a treat. Works in BulkMode too. Nice one, Lars.


You're welcome!


The ModuloCounter and Aggregator approach seems like a good solution, but does the WorkspaceRunner need quotation marks and comma separators between filenames? I've tried with and without, and I just get warnings and the workspace doesn't run.


Hi David, the Grouper with _group_index in the dataset fanout works perfectly, thanks for the solution!


It's as simple as:

@ceil(@div(@Value(some_value),@Value(some_modulus)),0)

Or use floor, or combine div and fmod.

All of these can create groups based on some modulus.

And they require only one line in a Tester, an AttributeCreator, or similar.
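For anyone who wants to see how the variants differ, here is a plain Python sketch of the same arithmetic (not FME expression syntax). It assumes a group size of 50 and shows that the ceil form suits a counter starting at 1, while the floor form and the div/fmod combination suit a counter starting at 0.

```python
import math

SIZE = 50  # group size, i.e. the modulus in the expression above


def group_ceil(n: int) -> int:
    """Counter starting at 1: values 1-50 -> group 1, 51-100 -> group 2."""
    return math.ceil(n / SIZE)


def group_floor(n: int) -> int:
    """Counter starting at 0: values 0-49 -> group 0, 50-99 -> group 1."""
    return math.floor(n / SIZE)


def group_div_fmod(n: int) -> int:
    """Same as group_floor for n >= 0: subtract the remainder, then divide."""
    return int((n - math.fmod(n, SIZE)) / SIZE)


print(group_ceil(8000), group_floor(7999), group_div_fmod(7999))
```

Which one to pick mostly depends on whether the counter feeding it starts at 0 or 1, and whether the group numbers should too.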

 

 

 

