Question

Processing a large number of raster tiles (mosaicking/retiling) crashes the FME engine


Hi,

 

 

We are currently processing a large amount of raster data and are running into some problems.

 

 

Here is what we intend to do:

 

We have a large number of aerial photography raster tiles in GeoTIFF format. To integrate this raster data into our web mapping server, we need to create lower-resolution overviews. This way we can achieve comparable performance at all zoom levels.

 

 

Here is what we have done so far:
  • Resample the original tiles (4000x4000 pixels) to one eighth of the original resolution.
  • Mosaic and retile the resampled tiles. This step reduces the number of tiles.

 

The total number of original tiles is around 68,000. We experimented with two different solutions: a) resampling, mosaicking and retiling in one workbench, and b) splitting the process into two workbenches, one for resampling and the second for mosaicking and retiling.
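To give a feel for the data volume, here is a rough back-of-the-envelope calculation in Python. The tile size, the factor of eight, and the tile count are the figures given above. Everything else is an assumption for illustration: that "one eighth" applies per axis, and that the imagery has 3 bands at 8 bits per sample and is held uncompressed in memory.

```python
# Rough size estimate for the resampled tiles.
# From the post: 4000x4000 px tiles, one eighth resolution, ~68,000 tiles.
# ASSUMPTIONS: factor applies per axis; 3 bands (RGB); 1 byte per sample;
# uncompressed pixel data.

TILE_SIZE_PX = 4000
RESAMPLE_FACTOR = 8
TILE_COUNT = 68_000
BANDS = 3             # assumption
BYTES_PER_SAMPLE = 1  # assumption


def tile_bytes(size_px: int) -> int:
    """Uncompressed size in bytes of one square tile with the assumed layout."""
    return size_px * size_px * BANDS * BYTES_PER_SAMPLE


resampled_size_px = TILE_SIZE_PX // RESAMPLE_FACTOR
original_tile = tile_bytes(TILE_SIZE_PX)
resampled_tile = tile_bytes(resampled_size_px)
total_resampled = resampled_tile * TILE_COUNT

print(f"resampled tile edge: {resampled_size_px} px")
print(f"one original tile:   {original_tile / 1e6:.1f} MB")
print(f"one resampled tile:  {resampled_tile / 1e6:.2f} MB")
print(f"all resampled tiles: {total_resampled / 1e9:.1f} GB")
```

Under these assumptions, all resampled tiles together come to roughly 51 GB of raw pixel data, which is the amount a single mosaic over the whole dataset would have to represent.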

 

 

In both cases the FME Server engine crashed at some point, which led to a resubmission of the job. I suppose this happens once all tiles have been read and the mosaicking starts (but we cannot really tell, because FME overwrites the original job log file after the automatic resubmission of the job). With both approaches it takes around 50 hours until the engine crashes.

 

 

According to our server logs, we ran out of neither RAM, CPU capacity, nor hard disk space. I don't know whether there are any parameters in FME Server that we need to adjust so that the processes won't crash.

 

 

With regard to modifying the processes:

 

We could split up the tiles by gridcell and process each gridcell separately. One gridcell consists of a maximum of 100x100 tiles, i.e. 10,000 tiles. We could also split the gridcells further, e.g. into quarters, so that we would end up with a maximum of 2,500 tiles per gridcell sector.

 

 

In summary, the modified process would be the following:
  • Extract gridcell (or gridcell sector) from the tiles into an attribute
  • Perform mosaicking via "group by" so that only tiles that belong to the same gridcell/gridcell sector are mosaicked
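The grouping logic of these two steps can be sketched in Python, outside of FME, just to show the idea. This is an illustration under assumptions: each tile is identified by a non-negative integer (column, row) index in the overall tile grid (how that index would be derived from the file names depends on the naming scheme and is not shown), and `group_key` and `group_tiles` are hypothetical helper names. The 100 tiles per gridcell side is the figure from above; `sectors_per_side=2` gives quarters.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

TILES_PER_CELL_SIDE = 100  # from the post: a gridcell is at most 100x100 tiles

TileIndex = Tuple[int, int]
GroupKey = Tuple[int, int, int, int]


def group_key(col: int, row: int, sectors_per_side: int = 1) -> GroupKey:
    """Return (cell_col, cell_row, sector_col, sector_row) for a tile index.

    Assumes non-negative indices and that sectors_per_side divides
    TILES_PER_CELL_SIDE evenly.
    """
    sector_side = TILES_PER_CELL_SIDE // sectors_per_side
    cell_col, col_in_cell = divmod(col, TILES_PER_CELL_SIDE)
    cell_row, row_in_cell = divmod(row, TILES_PER_CELL_SIDE)
    return (cell_col, cell_row,
            col_in_cell // sector_side, row_in_cell // sector_side)


def group_tiles(tiles: Iterable[TileIndex],
                sectors_per_side: int = 1) -> Dict[GroupKey, List[TileIndex]]:
    """Collect tiles per gridcell (or gridcell sector), like a group-by."""
    groups: Dict[GroupKey, List[TileIndex]] = defaultdict(list)
    for col, row in tiles:
        groups[group_key(col, row, sectors_per_side)].append((col, row))
    return dict(groups)


# One full gridcell of 100x100 tiles, split into quarters.
full_cell = [(c, r) for c in range(100) for r in range(100)]
quarters = group_tiles(full_cell, sectors_per_side=2)
print(len(quarters), {len(v) for v in quarters.values()})
```

In the workbench, the key computed here would correspond to the attribute used in the "group by" setting, so that each mosaic only ever sees the tiles of one group.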

 

Do you think changing the process like that would solve the problem of crashing?

 

Is there any way to increase the speed at which FME reads the GeoTIFFs? We have the feeling that reading gets slower and slower the more tiles FME reads.

 

Are there any other settings we should change to make the process more stable/performant?

 

 

Best regards,

 

Max

2 replies

Hi,

 

This article (http://fmepedia.safe.com/articles/Error_Unexpected_Behavior/Error-reading-large-number-of-Raster-files) may be helpful. Making use of parallel processing is also an option to consider. Further, I would check whether the RasterPyramider or WebMapTiler can be of use.
Badge +3
Hi,

 

 

Using grids to tile your jobs is a good idea.

 

If you consistently use the Group By clause on the tile IDs, you can benefit from the parallel processing option.

 

 

Also, 32-bit has a limited memory access capacity, I believe, 4 GB or so. Maybe you can enlarge the swap?

 

Having 16 GB doesn't prevent crashes due to memory limits.

 

It can also crash if your user/appdata folder, where it stores the FFS files, reaches its limit. I have asked our ICT department to enlarge it, because laser data made it crash all the time. (I previously bypassed this with tiling strategies.)

 

 

By splitting the process into a main process that calls, for instance, a number of grids through a workspace caller (e.g. the WorkspaceRunner), you can have it finish sections before the next batch runs.
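The "finish one batch before starting the next" idea can be sketched in plain Python. This is not FME API: `run_one` is a hypothetical callable that would run the child process for one gridcell (for example by launching the workspace from the command line) and return True on success. The sketch only shows the control flow of the main process.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def chunked(items: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most batch_size."""
    return [list(items[i:i + batch_size])
            for i in range(0, len(items), batch_size)]


def run_in_batches(cells: Sequence[str],
                   batch_size: int,
                   run_one: Callable[[str], bool]) -> List[str]:
    """Run gridcells batch by batch and return the ids that failed.

    Cells within a batch run concurrently; the next batch only starts
    once every cell of the current batch has finished.
    """
    failed: List[str] = []
    for batch in chunked(cells, batch_size):
        # Leaving the with-block waits for all workers of this batch.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            results = list(pool.map(run_one, batch))
        failed.extend(cell for cell, ok in zip(batch, results) if not ok)
    return failed


# Usage with a stand-in for the real child job (hypothetical cell ids).
def fake_job(cell: str) -> bool:
    return cell != "cell_3"  # pretend one gridcell fails


print(run_in_batches(["cell_1", "cell_2", "cell_3", "cell_4", "cell_5"],
                     batch_size=2, run_one=fake_job))
```

Collecting the failed ids instead of stopping at the first failure means a single bad gridcell can be rerun later without repeating the sections that already finished.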
