Question

Transform NetCDF data to Feature Classes - how to reduce the processing time?


uba_kp

Dear FME users,

I’m creating feature classes (polygons) from NetCDF data. Among other things, I use the “RasterCellCoercer” transformer, and the transformation takes a very long time.

I have 504 input files and 504 output files. The geometry is always the same; only the values of the input data differ.

Do you have any idea how to reduce the processing time? Maybe with a predefined grid?

Thank you so much and best regards.

Konrad

17 replies

david_r
Evangelist
  • May 15, 2017

Unfortunately, the RasterCellCoercer is really slow when dealing with large datasets.

Apparently you can also consider using the PointCloudCoercer which supposedly is a lot faster, but I've never used it myself.


takashi
Influencer
  • May 16, 2017

Hi @UBA_KP, generally, creating many features consumes a lot of resources and can take a long time. The RasterCellCoercer is a typical case: since it creates a large number of new features (number of columns x number of rows), it can take a long time to complete, as @david_r mentioned.

Although a Python API for raster manipulation was introduced in FME 2017, I don't think a large improvement can be expected even if you use the API, as long as that large number of features has to be created as the translation result.

However, if you could significantly reduce the number of features to be created by setting some conditions, Python scripting could be a possible way to improve the performance, and I think it would be worth trying.
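To illustrate the point about feature counts, here is a minimal plain-Python sketch. It does not use the FME Python API; the grid, the nodata marker `-9999` and the `value > 0` condition are all assumptions made up for the example. It only shows how a condition applied before feature creation shrinks the number of features compared with columns x rows.

```python
def count_features(rows, cols):
    """Number of features a cell-by-cell conversion would create."""
    return rows * cols


def cells_to_keep(grid, nodata=None, predicate=None):
    """Yield (row, col, value) only for cells worth turning into features."""
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value == nodata:
                continue  # skip nodata cells entirely
            if predicate is not None and not predicate(value):
                continue  # skip cells that fail the user condition
            yield r, c, value


# Hypothetical 3 x 4 raster; -9999 is an assumed nodata marker.
grid = [
    [-9999, 1.5, 2.0, -9999],
    [0.0, 3.2, 4.1, 0.0],
    [-9999, 0.0, 5.7, -9999],
]

total = count_features(len(grid), len(grid[0]))
kept = list(cells_to_keep(grid, nodata=-9999, predicate=lambda v: v > 0))
print(total, len(kept))
```

Whether this helps in practice depends on how many cells the condition actually excludes; if every cell is needed, the feature count stays the same.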


itay
Supporter
  • May 16, 2017

The RasterToPolygonCoercer can be an alternative, but as mentioned before, this does not necessarily mean that it will be faster.

If the resolution of the rasters is not very important, consider resampling before converting to polygons.
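As a rough illustration of why resampling helps, the sketch below averages non-overlapping blocks of cells in plain Python. Inside a workspace this step would normally be done with a raster transformer (for example a RasterResampler) rather than in script; the function name `downsample_mean` and the sample grid are assumptions for the example, and nodata handling is ignored.

```python
def downsample_mean(grid, factor):
    """Average non-overlapping factor x factor blocks.

    Edge cells that do not fill a complete block are dropped.
    """
    rows = len(grid) // factor
    cols = len(grid[0]) // factor
    out = []
    for br in range(rows):
        out_row = []
        for bc in range(cols):
            block = [
                grid[br * factor + dr][bc * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical 4 x 4 raster (16 cells).
grid = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 8, 8],
    [2, 2, 8, 8],
]

coarse = downsample_mean(grid, 2)  # 2 x 2 raster (4 cells)
print(coarse)
```

With a factor of 2 the number of cells, and therefore the number of polygons created afterwards, drops to a quarter.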


uba_kp
  • Author
  • May 16, 2017

Thank you so much for answering my question, @david_r, @takashi and @itay.

Maybe it is possible to factor out the spatial information and operations?

That is, the attribute values of the NetCDF data would be written into a list, each with an assigned ID. Afterwards, those IDs can be matched against the IDs of a predefined grid. The processing time might be reduced this way.

What do you think about that? Maybe you have an idea. :)

You’re right, @itay, the “RasterToPolygonCoercer” works much faster.

Thank you and best regards. :)

Konrad
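The ID-based idea in this post (write the cell values to a list keyed by an ID, then match them against a predefined grid) could be sketched roughly as follows. This is plain Python with hypothetical names (`build_grid`, `values_by_id`, `join`) and made-up origin and cell size; it assumes every file has the same dimensions and uses row-major cell IDs. In a workspace the join step would roughly correspond to merging on the ID attribute, for example with a FeatureMerger.

```python
def build_grid(rows, cols, x0, y0, cell):
    """Build the cell polygons once: {cell_id: [(x, y), ...]} with row-major ids."""
    polygons = {}
    for r in range(rows):
        for c in range(cols):
            left = x0 + c * cell
            top = y0 - r * cell
            polygons[r * cols + c] = [
                (left, top),
                (left + cell, top),
                (left + cell, top - cell),
                (left, top - cell),
                (left, top),  # close the ring
            ]
    return polygons


def values_by_id(grid):
    """Flatten one raster into {cell_id: value} using the same row-major ids."""
    cols = len(grid[0])
    return {r * cols + c: v for r, row in enumerate(grid) for c, v in enumerate(row)}


def join(polygons, values):
    """Attach values to the prebuilt polygons by id; no geometry is recreated."""
    return [(cell_id, polygons[cell_id], value) for cell_id, value in values.items()]


# The geometry is built once...
polygons = build_grid(rows=2, cols=3, x0=0.0, y0=2.0, cell=1.0)

# ...and reused for every input file (two hypothetical files here).
file_a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
file_b = [[9.0, 8.0, 7.0], [6.0, 5.0, 4.0]]
features_a = join(polygons, values_by_id(file_a))
features_b = join(polygons, values_by_id(file_b))
print(len(features_a), len(features_b))
```

The potential saving comes from creating the polygons a single time instead of once per file; the per-file work is reduced to a keyed lookup.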

uba_kp
  • Author
  • May 16, 2017

It seems that my question was not clear enough. Is it possible to write NetCDF data into a table without spatial information?


itay
Supporter
  • May 16, 2017

Yes, remove the geometry with the GeometryRemover after extracting the data into attributes.


david_r
Evangelist
  • May 16, 2017
uba_kp wrote:

It seems that my question was not clear enough. Is it possible to write NetCDF data into a table without spatial information?

Which part of the NetCDF? Only the layer infos or also all the cell values? If you need the cell values, what's the database schema like?

uba_kp
  • Author
  • May 16, 2017
david_r wrote:
Which part of the NetCDF? Only the layer infos or also all the cell values? If you need the cell values, what's the database schema like?
I only need the cell values for every column/line.

 

 


uba_kp
  • Author
  • May 16, 2017
itay wrote:

Yes, remove the geometry with the GeometryRemover after extracting the data into attributes.

I've done this, but the table is still empty. I'd like to have all cell values, and for this I have to use the "RasterToPolygonCoercer" or "RasterToPointCoercer". This takes a lot of time, and I'm looking for alternatives.

 


david_r
Evangelist
  • May 16, 2017
uba_kp wrote:
I only need the cell values for every column/line.

 

 

Not sure I understand. You need all the cell values then?

takashi
Influencer
  • May 16, 2017
uba_kp wrote:

It seems that my question was not clear enough. Is it possible to write NetCDF data into a table without spatial information?

What kind of table is required? For example, does a CSV table (comma separated cell values x rows) satisfy the requirement?

 


scyphers
Participant
  • May 16, 2017

Hi Konrad,

You have already seen how quick the list representation of NetCDF data is. If you can ignore the rasterness of the data in what you are trying to accomplish, I would do so. For slicing/chunking NetCDF, there might be an RCaller recipe out there that you could use, which would be more dynamic.

But if your NetCDF is a known size every time, have you tried using a RasterTiler? You could force the row/column layout of the tiles to be 1 row by n columns, where n is the number of columns in your data. This would let you limit the amount of dataset I/O you read (_tile_column = 1), and you could extract the list values into your features at that point, and then drop the geometry altogether to get simple table records.
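The row-tile idea can be pictured with the following plain-Python sketch. It is not RasterTiler output; `row_tiles` and `tile_to_records` are hypothetical helpers, the grid is made up, and CSV is used only as a convenient stand-in for "simple table records" without geometry.

```python
import csv
import io


def row_tiles(grid):
    """Yield (row_index, values) for each 1-row x n-column tile."""
    for r, row in enumerate(grid):
        yield r, list(row)


def tile_to_records(row_index, values):
    """Expand one row tile into geometry-free (row, col, value) records."""
    return [(row_index, c, v) for c, v in enumerate(values)]


# Hypothetical 2 x 3 raster.
grid = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["row", "col", "value"])
for r, values in row_tiles(grid):
    # One tile is processed at a time, so only one row is held in memory.
    writer.writerows(tile_to_records(r, values))

table = buffer.getvalue()
print(table)
```

Each record keeps its row and column index, so the values can still be matched back to a grid cell later if needed.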


uba_kp
  • Author
  • May 19, 2017

Thank you very much for your input @scyphers. I'll try this very soon.

 


uba_kp
  • Author
  • May 19, 2017
david_r wrote:
Not sure I understand. You need all the cell values then?
Thank you @david_r. I'll try what @scyphers has recommended and I'll give you feedback afterwards.

uba_kp
  • Author
  • May 19, 2017
takashi wrote:
What kind of table is required? For example, does a CSV table (comma separated cell values x rows) satisfy the requirement?

 

Thank you @takashi, I don't need a specific table format such as CSV. I'm interested in creating a table-based process without geometry, because I think it would be faster. At the end of my process I'd like to write all the table values into a predefined grid. I'll try what @scyphers has recommended and give you feedback afterwards.

 


fmelizard
Contributor
  • May 21, 2017

I'm thinking that some of our new technology for handling large numbers of identical-schema features would really, really shine here (someday). In the meantime, about the best I can recommend is to run FME in parallel across all of these, using either a WorkspaceRunner or a custom transformer that does the work and runs in parallel...
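Outside of Workbench, the same "one run per file, several at a time" pattern could be sketched with a small Python driver that launches the FME command line for each input. This is only an illustration under assumptions: the workspace name `netcdf_to_fc.fmw`, the file names, and the published parameter names `SourceDataset_NETCDF` and `DestDataset_GEODATABASE_FILE` are hypothetical and depend on how the workspace was authored. The example call uses a stand-in runner so that it works without FME installed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(workspace, source, dest):
    """Build one FME command line; the parameter names are assumptions."""
    return [
        "fme",
        workspace,
        "--SourceDataset_NETCDF", source,
        "--DestDataset_GEODATABASE_FILE", dest,
    ]


def _run_fme(cmd):
    """Launch one FME process and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def run_all(jobs, workspace, max_workers=4, runner=None):
    """Run one workspace per (source, dest) pair, several at a time."""
    runner = runner or _run_fme
    commands = [build_command(workspace, src, dst) for src, dst in jobs]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with jobs.
        return list(pool.map(runner, commands))


# Dry run with a stand-in runner (always "succeeds") instead of real FME.
jobs = [(f"in_{i:03d}.nc", f"out_{i:03d}.gdb") for i in range(1, 4)]
codes = run_all(jobs, "netcdf_to_fc.fmw", max_workers=2, runner=lambda cmd: 0)
print(codes)
```

Threads are sufficient here because each job is a separate external process; a sensible `max_workers` depends on the available cores, memory and FME licensing.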


uba_kp
  • Author
  • May 22, 2017

Hi @daleatsafe, I'm very curious to see how large numbers of identical-schema features will be handled in the future. Thank you very much and best regards.

 

