Question

Transform NetCDF data to Feature Classes - how to reduce the processing time?

7 years ago
May 15, 2017
17 replies
30 views

uba_kp
62 replies

Dear FME-users,

I’m creating feature classes (polygons) from NetCDF data. Among other things, I use the “RasterCellCoercer” transformer, and the transformation takes a very long time.

I have 504 input files and 504 output files. The geometry is always the same; only the values of the input data differ.

Do you have an idea how to reduce the processing time? Maybe with a predefined grid?

Thank you so much and best regards.

Konrad

david_r
8321 replies
7 years ago
May 15, 2017

Unfortunately, the RasterCellCoercer is really slow when dealing with large datasets.

Apparently you can also consider using the PointCloudCoercer which supposedly is a lot faster, but I've never used it myself.

takashi
7597 replies
7 years ago
May 16, 2017

Hi @UBA_KP, generally, creating many features consumes a lot of resources and can take a long time. The RasterCellCoercer is a typical case: since it creates a large number of new features (number of columns x number of rows), processing can take a long time, as @david_r mentioned.

Although a Python API for raster manipulation was introduced in FME 2017, I don't think it would help much, as long as the same large number of features has to be created as the translation result.

However, if you could significantly reduce the number of features that need to be created by setting some conditions, Python scripting could be a way to improve performance, and I think it would be worth trying.
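As an editorial illustration of the idea above (not from the original post, and not FME API code): if a condition such as "skip nodata cells" applies, filtering the cell values before any feature is built keeps the feature count proportional to the cells that actually matter. The grid, the `NODATA` value, and the function name are assumptions made for this sketch; in a workspace, similar logic could sit in a PythonCaller.

```python
# Hypothetical nodata marker; the real value depends on the NetCDF variable.
NODATA = -9999.0


def cells_to_records(grid, keep=lambda v: v != NODATA):
    """Yield (row, col, value) only for cells that pass the condition."""
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if keep(value):
                yield (r, c, value)


# Stand-in for one raster band read from a NetCDF file.
grid = [
    [NODATA, 1.5, NODATA],
    [2.0, NODATA, NODATA],
]

records = list(cells_to_records(grid))
total = sum(len(row) for row in grid)
print(len(records), "of", total, "cells would become features")
```

Whether this helps depends entirely on how many cells the condition removes; if every cell is needed, the feature count stays the same.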

itay
Supporter
1441 replies
7 years ago
May 16, 2017

The RasterToPolygonCoercer can be an alternative, but as mentioned before, this does not necessarily mean that it will be faster.

If the resolution of the rasters is not very important, consider resampling before converting to polygons.
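To make the effect of resampling concrete, here is a small editorial sketch (not from the original post) that averages non-overlapping blocks of cells. In FME the resampling itself would normally be done with a raster transformer such as the RasterResampler; the function name and the sample grid below are assumptions for illustration only.

```python
def block_mean(grid, factor):
    """Downsample a 2D list by averaging non-overlapping factor x factor blocks.

    Assumes the row and column counts are both divisible by factor.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor)
                     for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


grid = [
    [1.0, 3.0, 5.0, 7.0],
    [1.0, 3.0, 5.0, 7.0],
    [2.0, 2.0, 4.0, 4.0],
    [2.0, 2.0, 4.0, 4.0],
]

coarse = block_mean(grid, 2)
cells_before = len(grid) * len(grid[0])
cells_after = len(coarse) * len(coarse[0])
print(cells_before, "cells ->", cells_after, "cells")
```

A resampling factor of 2 cuts the number of cells, and therefore the number of polygons created afterwards, to a quarter.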

uba_kp
Author
62 replies
7 years ago
May 16, 2017

Thank you so much for answering my question @david_r, @takashi and @itay.

Maybe it’s possible to factor out the spatial information and operations?

That is, the attribute values of the NetCDF data would be written into a list with an assigned ID. Afterwards, these IDs can be combined with the IDs of a predefined grid. The processing time might be reduced this way.
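A minimal editorial sketch of this idea, assuming a row-major cell ID and a regular grid (both are assumptions, as are all function and file names): the polygons are built once, each file contributes only an ID-to-value table, and the two are joined by ID. In a workspace the join step might correspond to something like a FeatureMerger.

```python
def build_grid(n_rows, n_cols, cell_size=1.0):
    """Build the cell polygons once, keyed by a row-major cell ID."""
    grid = {}
    for r in range(n_rows):
        for c in range(n_cols):
            x0, y0 = c * cell_size, r * cell_size
            x1, y1 = x0 + cell_size, y0 + cell_size
            # Closed ring of corner coordinates for this cell.
            grid[r * n_cols + c] = [(x0, y0), (x1, y0), (x1, y1), (x0, y1), (x0, y0)]
    return grid


def values_by_id(values):
    """Flatten a 2D list of cell values into {cell_id: value}, row-major."""
    n_cols = len(values[0])
    return {r * n_cols + c: v
            for r, row in enumerate(values)
            for c, v in enumerate(row)}


def join(grid, table):
    """Attach values to the shared geometry by ID; geometry is not rebuilt."""
    return [{"id": i, "geometry": grid[i], "value": v} for i, v in table.items()]


grid = build_grid(2, 3)  # built once, reused for every file

# Stand-in cell values for two hypothetical input files.
files = {
    "a.nc": [[1, 2, 3], [4, 5, 6]],
    "b.nc": [[7, 8, 9], [10, 11, 12]],
}

for name, values in files.items():
    features = join(grid, values_by_id(values))
    print(name, len(features), "features, first value:", features[0]["value"])
```

The sketch ignores raster orientation (whether row 0 is the top or the bottom row), which would have to match between the grid and the NetCDF data.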

What do you think about that? Maybe you have an idea. :)

You’re right @itay, the “RasterToPolygonCoercer” works much faster.

Thank you and best regards. :)

Konrad

uba_kp
Author
62 replies
7 years ago
May 16, 2017

It seems that my question was not clear enough. Is it possible to write NetCDF data into a table without spatial information?

itay
Supporter
1441 replies
7 years ago
May 16, 2017

Yes, remove the geometry with the GeometryRemover after extracting the data into attributes.

david_r
8321 replies
7 years ago
May 16, 2017

uba_kp wrote:

It seems that my question was not clear enough. Is it possible to write NetCDF data into a table without spatial information?

Which part of the NetCDF? Only the layer infos or also all the cell values? If you need the cell values, what's the database schema like?

uba_kp
Author
62 replies
7 years ago
May 16, 2017

david_r wrote:

Which part of the NetCDF? Only the layer infos or also all the cell values? If you need the cell values, what's the database schema like?

I only need the cell values for every column/line.

uba_kp
Author
62 replies
7 years ago
May 16, 2017

itay wrote:

Yes, remove the geometry with the GeometryRemover after extracting the data into attributes.

I've done this, but the table is still empty. I'd like to have all cell values, and for this I have to use the "RasterToPolygonCoercer" or "RasterToPointCoercer". This takes a lot of time, so I'm looking for alternatives.

david_r
8321 replies
7 years ago
May 16, 2017

uba_kp wrote:

I only need the cell values for every column/line.

Not sure I understand. You need all the cell values then?

takashi
7597 replies
7 years ago
May 16, 2017

uba_kp wrote:

It seems that my question was not clear enough. Is it possible to write NetCDF data into a table without spatial information?

What kind of table is required? For example, does a CSV table (comma separated cell values x rows) satisfy the requirement?

scyphers
Participant
22 replies
7 years ago
May 16, 2017

Hi Konrad,

You have already seen how quick the list representation of NetCDF data is. If you can ignore the rasterness of the data in what you are trying to accomplish, I would do so. But for slicing/chunking NetCDF, there might be an RCaller recipe out there you could use, which would be more dynamic.

But if your NetCDF is a known size every time.

Have you tried using a RasterTiler? You could force the row/column layout of the tiles to be 1 row and n columns, where n is the number of columns in your data. This would allow you to limit the amount of dataset I/O you read (_tile_column = 1), and you could express the list values in your features at that point.

Then drop the geometry altogether, leaving simple table records.

uba_kp
Author
62 replies
7 years ago
May 19, 2017

scyphers wrote:

Hi Konrad,

But if your NetCDF is a known size every time.

and then drop the geometry altogether as simple table records.

Thank you very much for your input @scyphers. I'll try this very soon.

uba_kp
Author
62 replies
7 years ago
May 19, 2017

david_r wrote:

Not sure I understand. You need all the cell values then?

Thank you @david_r. I'll try what @scyphers has recommended and I'll give you feedback afterwards.

uba_kp
Author
62 replies
7 years ago
May 19, 2017

takashi wrote:

What kind of table is required? For example, does a CSV table (comma separated cell values x rows) satisfy the requirement?

Thank you @takashi, I don't need a specific table format such as CSV. I'm interested in creating a table-based process without geometry, because I think it would be faster. At the end of my process, I'd like to write all table values into a predefined grid. I'll try what @scyphers has recommended and I'll give you feedback afterwards.

fmelizard
Contributor
3725 replies
7 years ago
May 21, 2017

I'm thinking that some of our new technology for handling large numbers of identical-schema features would really, really shine here (someday). In the meantime, about the best I can recommend is to run FME in parallel across all these, using either the WorkspaceRunner or a custom transformer that does the work and runs in parallel...
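Outside of a workspace, the same "run in parallel" idea can be sketched with a small script that launches one FME command line per input file. This is an editorial sketch, not from the original post: the workspace name, the file names, and the published parameter names `SourceDataset` and `DestDataset` are hypothetical and would have to match the real workspace.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(workspace, source, dest):
    """Build one FME command line; the parameter names are hypothetical."""
    return ["fme", workspace, "--SourceDataset", source, "--DestDataset", dest]


def run_all(commands, max_workers=4, runner=subprocess.run):
    """Run the commands concurrently and return their results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))


# Hypothetical naming scheme for the 504 input/output pairs.
pairs = [(f"input_{i:03d}.nc", f"output_{i:03d}.gdb") for i in range(1, 505)]
commands = [build_command("netcdf_to_fc.fmw", src, dst) for src, dst in pairs]
print(len(commands), "commands prepared; first:", " ".join(commands[0]))
```

Calling `run_all(commands)` would then start up to `max_workers` FME processes at a time. Threads are sufficient here because each one only waits for its external process; a sensible worker count depends on available cores, memory, and licensing.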

uba_kp
Author
62 replies
7 years ago
May 22, 2017

fmelizard wrote:

Hi @daleatsafe, I'm very curious to see how large numbers of identical-schema features will be handled in the future. Thank you very much and best regards.
