We are currently processing a large amount of raster data and are running into some problems.
Here is what we intend to do:
We have a large number of aerial photography raster tiles in GeoTIFF format. To integrate this raster data into our web mapping server, we need to create lower-resolution overviews so that performance is comparable at all zoom levels.
Our approach so far:
- Resample the original tiles (4000x4000 pixels) to one eighth of the original resolution
- Mosaic and retile the resampled tiles. This step serves to reduce the number of tiles.
The overall number of original tiles is around 68,000. We experimented with two different solutions: a) resampling, mosaicking and retiling in one workbench, and b) splitting the process into two workbenches, one for resampling and one for mosaicking and retiling.
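For a rough sense of scale, the numbers above can be turned into a size estimate for the resampled data. This is only a back-of-the-envelope sketch: it assumes that "one eighth of the original resolution" means each tile side is divided by eight, and that the imagery has 3 bands with 8 bits per sample. The band count and bit depth are assumptions, not values we have verified.

```python
# Rough size estimate for the resampled data set.
# Figures taken from the post: 4000x4000 px tiles, factor 8, ~68,000 tiles.
# BANDS and BYTES_PER_SAMPLE are assumptions (3-band, 8-bit imagery).

TILE_SIDE_PX = 4000
REDUCTION = 8
TILE_COUNT = 68_000
BANDS = 3              # assumption
BYTES_PER_SAMPLE = 1   # assumption


def resampled_side(side_px: int, reduction: int) -> int:
    """Side length of a tile after dividing each side by `reduction`."""
    return side_px // reduction


def total_pixels(side_px: int, reduction: int, tile_count: int) -> int:
    """Total number of pixels across all resampled tiles."""
    side = resampled_side(side_px, reduction)
    return side * side * tile_count


def uncompressed_bytes(pixels: int, bands: int, bytes_per_sample: int) -> int:
    """Uncompressed size if all pixels were held in memory at once."""
    return pixels * bands * bytes_per_sample


if __name__ == "__main__":
    pixels = total_pixels(TILE_SIDE_PX, REDUCTION, TILE_COUNT)
    size = uncompressed_bytes(pixels, BANDS, BYTES_PER_SAMPLE)
    print(f"resampled tile side: {resampled_side(TILE_SIDE_PX, REDUCTION)} px")
    print(f"total pixels: {pixels:,}")
    print(f"uncompressed size: {size / 1e9:.1f} GB")
```

Under these assumptions a single mosaic of all resampled tiles would be on the order of tens of gigabytes uncompressed.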
In both cases the FME Server engine crashed at some point, which led to the job being resubmitted. I suspect this happens once all tiles have been read and the mosaicking starts, but we cannot really tell because FME overwrites the original job log file after the automatic resubmission. With both approaches it takes around 50 hours until the engine crashes.
According to our server logs, we did not run out of RAM, CPU capacity, or hard disk space. I don't know whether there are parameters in FME Server that we need to adjust to keep the processes from crashing.
With regard to modifying the processes:
We could split up the tiles by gridcell and process each gridcell separately. One gridcell consists of a maximum of 100x100 tiles, i.e. 10,000 tiles. We could also split the gridcells further, e.g. into quarters of 50x50 tiles, so that we would end up with a maximum of 2,500 tiles per gridcell sector.
In summary, the modified process would be the following:
- Extract gridcell (or gridcell sector) from the tiles into an attribute
- Perform mosaicking via "group by" so that only tiles that belong to the same gridcell/gridcell sector are mosaicked
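The grouping attribute for the two steps above could be derived roughly as in the following sketch. It assumes that each tile's integer column and row index in the overall tile grid is known (for example, parsed from the file name); `group_key` and `group_tiles` are hypothetical helper names, not FME functions. In a workspace the same logic would populate the attribute that the "group by" refers to.

```python
from collections import defaultdict

CELL_SIZE = 100        # tiles per gridcell side (100x100 tiles per gridcell)
SECTORS_PER_SIDE = 2   # 2x2 quarters per gridcell, i.e. at most 50x50 tiles each


def group_key(col, row, cell_size=CELL_SIZE, sectors_per_side=SECTORS_PER_SIDE):
    """Return a 'cellcol_cellrow_sN' key for a tile at grid position (col, row).

    Assumes col and row are zero-based integer tile indices (hypothetical input).
    """
    cell_col, in_col = divmod(col, cell_size)   # gridcell index and offset inside it
    cell_row, in_row = divmod(row, cell_size)
    sector_size = cell_size // sectors_per_side
    sec_col = in_col // sector_size
    sec_row = in_row // sector_size
    sector = sec_row * sectors_per_side + sec_col
    return f"{cell_col}_{cell_row}_s{sector}"


def group_tiles(tiles):
    """Group (name, col, row) tuples by their gridcell sector key."""
    groups = defaultdict(list)
    for name, col, row in tiles:
        groups[group_key(col, row)].append(name)
    return dict(groups)


if __name__ == "__main__":
    sample = [("a.tif", 0, 0), ("b.tif", 49, 49), ("c.tif", 50, 0), ("d.tif", 150, 250)]
    for key, names in group_tiles(sample).items():
        print(key, names)
```

With these settings no group can contain more than 2,500 tiles, so each mosaic only has to hold one sector at a time.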
Do you think changing the process like that would solve the problem of crashing?
Is there any way to increase the speed at which FME reads the GeoTIFFs? We have the impression that reading gets slower and slower the more tiles FME has read.
Are there any other settings we should change to make the process more stable/performant?
Best regards,
Max