So I learned that *.fts files are part of the ongoing tuning efforts in FME and are written to the FMETable subfolder of the FME_TEMP directory.

I was literally shocked yesterday when my FME_TEMP drive (a 256 GB SSD) suddenly ran full because FMETable seemingly "exploded" without limit while reading and processing 5 GB of FFS files.

My impression is that this is new in FME 2020 (maybe 2019 too?), as the FME process active at the time was doing little more than reading 5 GB (190 different FFS files) and writing them out as a single set of FFS files. Until now this process was never in any way critical, but it has become a blocking point in my environment.

Is there any explanation for why 60 million features consume more than 250 GB as *.fts files when they only need 5 GB as FFS files?

I expect no more than 40-50 GB in the File Geodatabase that will receive this whole dataset!

BTW: memory use was next to nothing during this whole adventure ...
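
For anyone wanting to put numbers on this, a minimal Python sketch along these lines is enough to watch the FMETable folder grow during a run (the path is only an example -- point it at your own FME_TEMP):

```python
import os
import time

# Example path only -- adjust to wherever your FME_TEMP points.
FMETABLE_DIR = r"D:\FME_TEMP\FMETable"

def folder_size_gb(path):
    """Sum the size of all files under 'path' (recursively), in GB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # a temp file may vanish while we are walking
    return total / (1024 ** 3)

# Print the folder size once a minute while the translation runs.
while True:
    print(f"{time.strftime('%H:%M:%S')}  FMETable: {folder_size_gb(FMETABLE_DIR):.1f} GB")
    time.sleep(60)
```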

I have also come across excessive disk space use when FME writes raster data to disk in the internal FFS (+raster) format.

With all patience and understanding, this behaviour is blowing the roof off my 16 years of FME experience ...

Anybody with similar experiences out there?

Hi @mhab Was Feature Caching enabled?

Not that I know of. It is running in batch mode from a DOS script ...

85 million features in 55 files in FMETable at the moment, 340 GB in total ...
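
Since the run is driven from a script anyway, the batch job can also be pointed at a bigger temp drive through the FME_TEMP environment variable. A minimal Python sketch of such a call (paths and workspace name are placeholders):

```python
import os
import subprocess

# Placeholders -- adjust to your installation and workspace.
FME_EXE = r"C:\Program Files\FME\fme.exe"
WORKSPACE = r"D:\workspaces\merge_ffs.fmw"

env = os.environ.copy()
# Redirect FME's temporary files (including FMETable) to a drive with more room.
env["FME_TEMP"] = r"E:\fme_temp_big"

# Run the workspace just as the DOS script would, but with the larger temp location.
result = subprocess.run([FME_EXE, WORKSPACE], env=env)
print("FME exited with return code", result.returncode)
```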


Please file a case at https://www.safe.com/support/report-a-problem/ Thanks!


I've only seen this when I accidentally left feature caching on and ran a parent workspace that ran a child workspace a few hundred times :-o


So after some testing, and while I was about to prepare the repro case, I think I found the reason for the extraordinary space consumption.

It is a FeatureJoiner where all data enters on the Left port, leaves through the UnjoinedLeft port and then passes through a GeometryFilter that tests for Raster and PointCloud geometries, so everything exits through the <Unfiltered> port.

This combination causes ALL data to be converted to feature tables, which are enormous in size and written to FME_TEMP/FMETable. It also takes very, very long to get through the GeometryFilter.
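
For context, the geometry test on its own is trivial -- conceptually something like this PythonCaller-style sketch (the class name and the _is_bulky attribute are made up for illustration); presumably the cost comes from the feature table conversion rather than from the test itself:

```python
import fmeobjects

class GeometryTypeFlagger(object):
    """Flag features whose geometry is a raster or a point cloud.

    Illustration only -- the '_is_bulky' attribute name is made up.
    """

    def input(self, feature):
        geom = feature.getGeometry()
        is_bulky = isinstance(geom, (fmeobjects.FMERaster, fmeobjects.FMEPointCloud))
        feature.setAttribute("_is_bulky", "yes" if is_bulky else "no")
        self.pyoutput(feature)

    def close(self):
        pass
```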

For comparison, I ran all the data through the workspace without the FeatureJoiner + GeometryFilter (which are of no use in this case), and the difference was: 17 h down to 2 h for 85 million features, and FME_TEMP/FMETable dropped from 334 GB to almost nothing.

So I think this is just bad luck and not a support case.

BTW, I wonder how much FME_TEMP space I will need once everything is tuned and uses feature tables ...

Hopefully by then, space will be cheaper than now ;-)

Michael

So I also checked the behaviour in FME 2018 and found that it needs even more FME_TEMP space (roughly 30% more) with the same dataset.


In conclusion, I ended up removing the FeatureJoiner from the workflow, as it was needed only for one very special case. I now handle the special-case detection with an external scripting hack that sets a parameter. This saves about 30% in runtime and allows for a much smaller FME_TEMP.
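
A sketch of what such an external hack could look like (all names, paths, and the detection rule are placeholders -- the real check depends on the data):

```python
import glob
import subprocess

# Placeholders for illustration only.
FME_EXE = r"C:\Program Files\FME\fme.exe"
WORKSPACE = r"D:\workspaces\merge_ffs.fmw"
INPUT_DIR = r"D:\data\incoming"

# "Special case" detection: here, simply whether any raster tiles were delivered.
# The real rule is whatever distinguishes the special case in the actual data.
has_special_case = bool(glob.glob(INPUT_DIR + r"\*.tif"))

# Hand the result to the workspace as a published parameter (--NAME value syntax).
subprocess.run([
    FME_EXE, WORKSPACE,
    "--HANDLE_SPECIAL_CASE", "yes" if has_special_case else "no",
])
```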

Lesson learned!


Hi Michael, we definitely want to look into this -- we wouldn't expect that at all. If you can send us a repro, we'll be pleased to dig in; we'll be particularly interested in the data types of your input data.


Data is on the way, along with some background info on what is being done.


Just wanted to update that we are currently experimenting with adding compression to temporary feature tables. Early results are looking positive for some very good improvements in FME 2021. Seriously good.

