Question

Fill raster nodata values with min/max/average value of nearest pixels

  • 3 November 2014
  • 8 replies
  • 68 views

Hi,

 

 

i have raster file where i need to fill some nodata pixels with value calculated from neighborhood pixels. Idea is to get average9, min or max value of surrounding pixels. I have tried several different offsetter-clipper-rasterexpressionevaluator loops. Unfortunately workspace is now very slow and result is not either perfect. Have anyone already solved this kind of problem? or any good ideas how this can be solved with FME?

 

 

-Arto

8 replies

Userlevel 2
Badge +17
Hi Arto,

 

 

A long story.

 

Step 1

 

1) Extract these raster properties. The properties will be used to re-create a raster finally.

 

_num_rows, _num_columns

 

_min_x, _min_y, _max_x, _max_y

 

 

2) Transform every cell into point geometry with the RasterCellCoercer.

 

Output Cell Geometry: Points

 

Extract Band Values As: Attributes

 

Extract Nodata Values: Yes

 

Preserve Attributes: Yes

 

Column / Row Attribute: _col_index / _row_index

 

 

3) Rename cell value attribute. e.g. "_cell_value"

 

 

 

Step 2

 

4) With the Tester, divide the data flow into 2 streams - "Points with Data" and "Nodata Points".

 

 

5) Duplicate the "Points with Data" into 8 streams, and increment/decrement "_col_index" and "_row_index" of them to create 8 neighbors.

 

Use 8 AttributeCreators  (the screenshot shows 4 for sample):

 

1. _col_index = _col_index + 1

 

2. _col_index = _col_index + 1, _row_index = _row_index + 1

 

3. _row_index = _row_index + 1

 

4. _col_index = _col_index - 1, _row_index = _row_index + 1

 

5. _col_index = _col_index - 1

 

6. _col_index = _col_index - 1, _row_index = _row_idnex - 1

 

7. _row_index = _row_idnex - 1

 

8. _col_idnex = _col_index + 1, _row_index = _row_index - 1

 

 

6) Merge the 8 neighbors to "Nodata Points" with the FeatureMerger.

 

Join On: _col_index => _col_index, _row_index => _row_index

 

Process Duplicate Supplier: Yes

 

Duplicate Suppliers List: _neighbors

 

 

7) For the Merged points, extract maximum and minimum "_cell_value" in the "_neighbors" list with the ListRangeExtractor, and calculate the average of them. Set the result to "_cell_value" attribute.

 

The NotMerged points weren't adjoining a point having data, they can be discarded.

 

 

 

Step 3

 

8) Send "Points with Data" from the 1st Tester and the modified "Nodata Points" (_cell_value has been changed) to the 3DForcer to transform them into 3D points (_cell_value => z).

 

 

9) Send the 3D points to the NumericRasterizer to re-create a raster with the original raster properties.

 

Size Specification: RowsColumns

 

Number of Columns (cells): _num_columns

 

Number of Rows (cells): _num_rows

 

Background Value: <Nodata value>

 

Fill Background with Nodata: Yes

 

Minimum X: _min_x

 

Minimum Y: _min_y

 

Maximum X: _max_x

 

Maximum Y: _max_y

 

 

 

The RasterCellCoercer is not so efficient. There may be more efficient and elegant way...

 

 

Takashi
Userlevel 4
Hi,

 

 

this is pretty far out of the box and not really within the realm of FME, but I've had good luck using GDAL Python (https://pypi.python.org/pypi/GDAL/) bindings to manipulate raster pixel values directly. Sadly, the fmeobjects API is (as of FME 2014) pretty lacking in the raster department, so your options there are very limited.

 

 

But if you've got the necessary Python skills, you will find the GDAL library very fast and, combined with e.g. numpy, very powerful for stuff like linear regression analysis, etc. You can also easily integrate it into a workspace using the PythonCreator, etc.

 

 

David

 

 
Userlevel 2
Badge +17
I agree that it's worth to consider introducing GDAL.

 

 

Another thought.

 

Although it probably cannot reach the GDAL efficiency, a Python script without GDAL can also be effective to increase efficiency in certain degree.

 

 

-----

 

# Python Script Example

 

class NodataCellValueModifier(object):

 

    def __init__(self):

 

        self.nodata = []

 

        self.neighbors = {}

 

        

 

    def input(self,feature):

 

        if feature.getAttribute('isNodata'):

 

            self.nodata.append(feature)

 

        else:

 

            col = feature.getAttribute('_col_index')

 

            row = feature.getAttribute('_row_index')

 

            self.neighbors['%s_%s' % (col, row)] = float(feature.getAttribute('_cell_value'))

 

        

 

    def close(self):

 

        for feature in self.nodata:

 

            col = int(feature.getAttribute('_col_index'))

 

            row = int(feature.getAttribute('_row_index'))

 

            # Collect neighbor values.

 

            values = []

 

            for (x, y) in [(1,0), (1,1), (0,1), (-1,1), (-1,0), (-1,-1), (0,-1), (1,-1)]:

 

                key = '%d_%d' % (col + x, row + y)

 

                if key in self.neighbors:

 

                    values.append(self.neighbors[key])

 

            # Overwrite _cell_value by the average of min and max of neighbor values.

 

            # This is just an example. You can modify the calculation rule here.

 

            if 0 < len(values):

 

                values.sort()

 

                feature.setAttribute('_cell_value', (values[0] + values[-1]) * 0.5)

 

                self.pyoutput(feature) 

 

?-----

 

I believe that it's much faster than the several AttributeCreators and the FeatureMerger. 

 

In addition, inserting a PointCloudCombiner before the NumericRasterizer could also be effective. It's an interesting usage of the point cloud technology in FME.
Userlevel 2
Badge +17
Added an improvement.

 

 

-----

 

# Python Script Example: Version 2

 

class NodataCellValueModifier(object):

 

    def __init__(self):

 

        self.nodata = []

 

        self.neighbors = {}

 

        

 

    def input(self,feature):

 

        if feature.getAttribute('isNodata'):

 

            self.nodata.append(feature)

 

        else:

 

            col = feature.getAttribute('_col_index')

 

            row = feature.getAttribute('_row_index')

 

            self.neighbors['%s_%s' % (col, row)] = float(feature.getAttribute('_cell_value'))

 

            self.pyoutput(feature) # Added in Version 2

 

        

 

    def close(self):

 

        for feature in self.nodata:

 

            col = int(feature.getAttribute('_col_index'))

 

            row = int(feature.getAttribute('_row_index'))

 

            # Collect neighbor values.

 

            values = []

 

            for (x, y) in [(1,0), (1,1), (0,1), (-1,1), (-1,0), (-1,-1), (0,-1), (1,-1)]:

 

                key = '%d_%d' % (col + x, row + y)

 

                if key in self.neighbors:

 

                    values.append(self.neighbors[key])

 

            # Overwrite _cell_value by the average of min and max of neighbor values.

 

            # This is just an example. You can modify the calculation rule here.

 

            if 0 < len(values):

 

                values.sort()

 

                feature.setAttribute('_cell_value', (values[0] + values[-1]) * 0.5)

 

                self.pyoutput(feature) 

 

-----
Hi,

 

 

Thank you for GDAL tip and special thanks to Takashi. I have now succeeded to get workspace much faster by using Takashi example workspace with some minor changes to get it work with my raster file. Running time is still some hours, but its not a problem anymore. It works and result is perfect.

 

 

-Arto
Badge +2
Hi,

 

 

this is pretty far out of the box and not really within the realm of FME, but I've had good luck using GDAL Python (https://pypi.python.org/pypi/GDAL/) bindings to manipulate raster pixel values directly. Sadly, the fmeobjects API is (as of FME 2014) pretty lacking in the raster department, so your options there are very limited.

 

 

But if you've got the necessary Python skills, you will find the GDAL library very fast and, combined with e.g. numpy, very powerful for stuff like linear regression analysis, etc. You can also easily integrate it into a workspace using the PythonCreator, etc.

 

 

David

 

 

Hi @david_r : how is FME on this four years later? I cannot find an adequate tool to do some interpolation on no data based on surrounding rastervalues. Is that correct?

Userlevel 4

Hi @david_r : how is FME on this four years later? I cannot find an adequate tool to do some interpolation on no data based on surrounding rastervalues. Is that correct?

Four years later we've gotten our long-awaited raster support in the fmeobjects Python API, it makes it a lot easier to manipulate raster contents, although you'll probably need some Python experience to fully profit from it.

@takashi posted som very nice examples on how to use the raster API here:

https://knowledge.safe.com/questions/38000/python-fme-objects-api-for-raster-manipulation.html

Badge +9

An FYI for this... the new RasterConvolver transformer (FME2018.1) may be helpful. Hypothetically, you could take a copy of the original raster and use the convolver to calculate the new values for nodata. Then merge them using the RasterExpressionEvaluator (Mode: Two Rasters). Hope this gives everyone some new ideas for this issue.

Reply