Solved

Best method to resample, mosaic, and then tile large amount of raster files?

4 years ago
July 1, 2020
6 replies
203 views

lindsay
Contributor
16 replies

Hi,

The problem I'm trying to solve is as follows:

I have about 2,000 aerial imagery ECW files, averaging in size around 30MB (so around 60GB in total).
Each file covers 1km x 1km on the ground at a resolution of 10cm, so each is 10,000px x 10,000px.
I have been asked to downsample each file to 2,048px x 2,048px and output the results as a series of 4km x 4km tiles (each tile being 8,192px x 8,192px).
I have an index feature class that lists each 'FileName' and its corresponding 'Tile' attribute.
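
As a quick sanity check on those numbers, the arithmetic can be sketched in Python. The variable names are illustrative only, and the tile count assumes the source files fill the 4km grid exactly, which may not hold at the edges of the survey area.

```python
import math

SOURCE_EXTENT_M = 1_000   # each ECW covers 1km x 1km
SOURCE_PX = 10_000        # 10,000px per side at 10cm
TARGET_PX = 2_048         # requested size per file after downsampling
TILE_EXTENT_M = 4_000     # each output tile covers 4km x 4km
FILE_COUNT = 2_000        # "about 2,000" files, so only approximate

source_res_m = SOURCE_EXTENT_M / SOURCE_PX           # 0.1 m per pixel
target_res_m = SOURCE_EXTENT_M / TARGET_PX           # 0.48828125 m per pixel
files_per_side = TILE_EXTENT_M // SOURCE_EXTENT_M    # 4 files along each edge
files_per_tile = files_per_side ** 2                 # 16 files per output tile
tile_px = files_per_side * TARGET_PX                 # 8192 px per side
min_tiles = math.ceil(FILE_COUNT / files_per_tile)   # lower bound on tile count

print(f"output resolution: {target_res_m:.3f} m/px")
print(f"{files_per_tile} files per tile, {tile_px}px per side")
print(f"at least {min_tiles} output tiles")
```

So each output tile is a 4 x 4 block of downsampled files at roughly 49cm per pixel, which matches the 8,192px figure above.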

I've attached an image that shows this spatially, below.

Large black squares and number labels represent the tiles that I want to output. The coloured squares with red outlines represent the original files.

Previously, when working with a much smaller amount of data, I've been able to use my index feature class, in conjunction with a StringConcatenator, to provide the full path of each file to an 'ER Mapper ECW' FeatureReader.

From there, I used RasterResampler and RasterMosaicker before using a FeatureWriter to output the new data as a single JPEG file.

However, this approach is very slow and likely to fall over when dealing with the large amount of data above, and also didn't account for tiling of the output.

I believe I need to use a smarter method to process one tile at a time and to name my output files with their corresponding 'Tile' name.

What is the best method to do this?

I'm not sure whether I should be using a custom transformer, WorkspaceRunner or PythonCaller to group by 'Tile' and process one at a time. And, if it is one of these, I'm not sure exactly how to implement it...

I feel like I should be able to group my raster processing into a custom transformer and pass one tile in at a time, but can't figure out how.

My work in progress screenshot, below:

Any advice would be most appreciated!

Many thanks,

Lindsay.

Best answer by helmoet

lindsay wrote:

Hi @helmoet,

Thanks very much for your quick and thoughtful response!

I had a look at your attached workbench and I think I understand the logic.

But I'm not sure that I need to use the bounding box clipping, as the boundaries and attributes of both my initial image files and my intended output tiles are captured in my index feature class, meaning I can use that to input a list of features and group by 'Tile'.

I do like the look of the option to Group By in the RasterMosaicker but, again, I think if I can just feed in one tile at a time, even that shouldn't be necessary, as the grouping would have happened ahead of time and the RasterMosaicker should only ever operate on a discrete tile.

For example, in the attached screenshot, I've used a ListBuilder to group by 'Tile' and create a list of the individual 'FileName' attributes:

If I can isolate the 'Raster Processing' green bookmark and feed one tile at a time, I would expect a single mosaic to be output. You can see that on either side of that bookmark, the input and output match the number of tiles, rather than individual files.

I've also attached an FMWT, which I hope contains a sample of the data I'm working with.

The imagery has been downsampled considerably and only covers 4 tiles, but the logic is what I am trying to achieve, except for processing one tile at a time.

I'm also now joining the index feature class back to the ECW files in order to maintain the 'Tile' attribute for naming the output.
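
The join described here can be pictured with a small Python sketch. This is only an illustration of the idea, not the workspace itself: the record layout, the assumption that 'FileName' holds the file name including its extension, and the `.jpg` output naming are all hypothetical.

```python
from pathlib import PurePath


def attach_tile(raster_paths, index_rows):
    """Join each raster path to its 'Tile' value via the index's 'FileName'.

    Returns (joined, unmatched) so that files missing from the index are
    reported rather than silently dropped.
    """
    tile_by_name = {row["FileName"]: row["Tile"] for row in index_rows}
    joined, unmatched = [], []
    for path in raster_paths:
        tile = tile_by_name.get(PurePath(path).name)
        if tile is None:
            unmatched.append(path)
        else:
            joined.append({"path": path, "Tile": tile})
    return joined, unmatched


def output_name(tile, extension=".jpg"):
    """Name an output file after its tile (the extension is an assumption)."""
    return f"{tile}{extension}"


# Hypothetical sample records.
index_rows = [
    {"FileName": "a.ecw", "Tile": "T01"},
    {"FileName": "b.ecw", "Tile": "T01"},
]
joined, unmatched = attach_tile(
    ["imagery/a.ecw", "imagery/b.ecw", "imagery/x.ecw"], index_rows
)
names = sorted({output_name(item["Tile"]) for item in joined})
```

Checking the unmatched list before a long run is a cheap way to catch index entries that do not line up with the files on disk.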

Any further suggestions would be most appreciated!

tileaerialsample.fmwt

In that case you could just select the bookmark and turn it into a custom transformer. After that, in the custom transformer's Transformer parameters, set Parallel Processing to at least "Minimal", which will give your custom transformer a Group By parameter. You can use that parameter to process tile by tile. Be sure to sort your data by tile and set the Group By Mode to "Process When Group Changes".
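
The reason sorting matters for "Process When Group Changes" can be shown with a rough Python analogy. This is not how FME implements it; it is a sketch using `itertools.groupby`, which likewise closes a group as soon as the key changes, and the sample records are made up.

```python
from itertools import groupby
from operator import itemgetter


def process_when_group_changes(features, key="Tile"):
    """Yield (group value, features) each time the key changes.

    Only one group is held in memory at a time, so the input must
    already be sorted by the key, or a tile will be split into pieces.
    """
    for value, members in groupby(features, key=itemgetter(key)):
        yield value, list(members)


# Hypothetical index records standing in for the index feature class.
index = [
    {"Tile": "T02", "FileName": "c.ecw"},
    {"Tile": "T01", "FileName": "a.ecw"},
    {"Tile": "T01", "FileName": "b.ecw"},
    {"Tile": "T02", "FileName": "d.ecw"},
]

# Unsorted input: tile T02 is emitted twice, as two partial groups.
unsorted_groups = [tile for tile, _ in process_when_group_changes(index)]

# Sorted input: each tile is emitted exactly once with all of its files.
sorted_index = sorted(index, key=itemgetter("Tile"))
sorted_groups = [
    (tile, [f["FileName"] for f in files])
    for tile, files in process_when_group_changes(sorted_index)
]
```

With unsorted input the same tile would be mosaicked more than once from partial sets of files, which is why the sort comes first.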

Your transformer in the main canvas will look like this:


helmoet
195 replies
4 years ago
July 1, 2020

reshuffleresizeraster.fmw

Hi @lindsay, would it be an idea to first create the tiles you want to output as a feature class with rectangular geometry? You can then read all the raster files and replace them with their bounding boxes. Use a Clipper transformer to clip the bounding boxes, and sort the bounding boxes by the row and column of the tiles. Now you can read the raster files back in, resample them down, and this time clip the rasters using the tile features, grouped by row and column with the Group By Mode "Process When Group Changes". After that you won't need the RasterTiler any more, and the RasterMosaicker can also be used with Group By and "Process When Group Changes". You might be doing some more raster reading, but you won't run into memory trouble.

I attached a mockup workspace that uses FME Training sample datasets to mimic this behaviour.
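
To make the row and column idea concrete, here is a minimal Python sketch of how a 1km file could be assigned to a 4km tile and positioned inside it. It is not taken from the attached workspace. It assumes both grids are aligned to a common origin and that each file has already been resampled to 2,048px; the function and field names are hypothetical.

```python
from typing import NamedTuple

FILE_EXTENT_M = 1_000   # ground size of one source file
TILE_EXTENT_M = 4_000   # ground size of one output tile
FILE_PX = 2_048         # file size in pixels after resampling


class Placement(NamedTuple):
    tile_col: int
    tile_row: int
    x_offset_px: int
    y_offset_px: int


def place_file(min_x, min_y, origin_x=0.0, origin_y=0.0):
    """Locate a file by its lower-left corner (in metres).

    Returns the tile column/row it falls in and the pixel offset of its
    top-left corner inside that tile.
    """
    col_1km = int((min_x - origin_x) // FILE_EXTENT_M)
    row_1km = int((min_y - origin_y) // FILE_EXTENT_M)
    per_side = TILE_EXTENT_M // FILE_EXTENT_M
    tile_col, sub_col = divmod(col_1km, per_side)
    tile_row, sub_row = divmod(row_1km, per_side)
    # Raster rows count down from the top edge, so flip the northing.
    y_offset = (per_side - 1 - sub_row) * FILE_PX
    return Placement(tile_col, tile_row, sub_col * FILE_PX, y_offset)


corner = place_file(0, 0)        # lower-left file of the first tile
other = place_file(5000, 2000)   # a file in the next tile to the east
```

Sorting files by `(tile_row, tile_col)` then gives the ordering that a group-by on row and column relies on.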

lindsay
Author
Contributor
16 replies
4 years ago
July 2, 2020

helmoet wrote:

I attached a mockup workspace that uses FME Training sample datasets to mimic this behaviour.

Hi @helmoet,

Thanks very much for your quick and thoughtful response!

I had a look at your attached workbench and I think I understand the logic.

For example, in the attached screenshot, I've used a ListBuilder to group by 'Tile' and create a list of the individual 'FileName' attributes:

I've also attached an FMWT, which I hope contains a sample of the data I'm working with.

The imagery has been downsampled considerably and only covers 4 tiles, but the logic is what I am trying to achieve, except for processing one tile at a time.

I'm also now joining the index feature class back to the ECW files in order to maintain the 'Tile' attribute for naming the output.

Any further suggestions would be most appreciated!

tileaerialsample.fmwt

helmoet
195 replies
Best Answer
4 years ago
July 2, 2020


Your transformer in the main canvas will look like this:

lindsay
Author
Contributor
16 replies
4 years ago
July 2, 2020

helmoet wrote:

Your transformer in the main canvas will look like this:

@helmoet, that's perfect!

I just found a similar method, posted a couple of years ago, and it looks like it is even easier now in FME 2020 to use 'Group By' on a custom transformer, provided you know the trick.

My workbench is running right now and seems to be doing exactly what I wanted. It will be some time before it completes, but I'm pretty confident it will get there without falling over!

Thanks again for your help.

sbp
2 replies
3 years ago
November 13, 2021

Hi Lindsay

I am struggling with the same issue - any chance you would share your setup?

lindsay
Author
Contributor
16 replies
3 years ago
November 17, 2021

sbp wrote:

Hi Lindsay

I am struggling with the same issue - any chance you would share your setup?

Hi @sbp ,

It's been a little while since I developed it, but I've attached a version of my workbench, which I hope points you in the right direction.

I may have made some mistakes with the custom transformer settings, but it seems to do the job...

Let me know how you go.
