
In large and complex FME workspaces, memory usage can become a major performance bottleneck, especially when feature caching is enabled for debugging or when large feature streams are processed sequentially.

It would be incredibly useful to have a dedicated transformer (e.g., DropFeatureCache, ClearMemory, or similar) that can be placed mid-flow to explicitly clear cached data or release memory from earlier processing paths.

This would be especially helpful in long chains of transformations, loops, or branching logic where intermediate data is no longer needed but still retained in memory.


Status: Gathering Interest

Hi ​@ronnie.utter, thanks for raising this pain point. We’re working on some exciting changes to how Feature Caching works. While it may not take the form of a dedicated transformer like the ones you’ve suggested, I believe it will help alleviate some of the concerns around heavy memory usage. 



Thank you for your answer. Looking forward to reviewing the development ☀️


@JennaKAtSafe, one more thought I have: is it true that unused output ports, such as on the FeatureMerger, still store cache or use memory? If so, would a "Clear data from unused ports" option in the transformer help? Could this be a good setting to use?


The following idea has been merged into this idea:

All the votes have been transferred into this idea.


@ronnie.utter Yes, I can confirm that currently, feature caching occurs for disconnected ports and will take time during workspace execution, as it writes files to disk. However, this doesn’t necessarily increase the process’s memory usage, since each feature cache acts as an on-disk terminus — meaning the features don’t need to be held in memory beyond that point.


I’m so glad this is already out there as an idea. Was about to suggest it otherwise.

I rely on caching a lot when troubleshooting or developing transformation pipelines. But once you get into the millions of records and those get cached in FME_TEMP, the caches fill up a lot of disk space. Switching off caching for portions of my workspace that have already been tested, or that I'm currently not interested in, would be a huge enhancement.
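To get a feel for how much disk space those caches are consuming, you can sum the cache files in your temp folder. A minimal sketch, assuming caches are written as `.ffs` files under the folder pointed to by FME_TEMP (the location and extension are assumptions based on typical installs, not documented guarantees):

```python
import os

def cache_usage_bytes(temp_dir, extensions=(".ffs",)):
    """Return the total size in bytes of cache files under temp_dir.

    Walks the directory tree and sums files whose names end with one of
    the given extensions (case-insensitive).
    """
    total = 0
    for root, _dirs, files in os.walk(temp_dir):
        for name in files:
            if name.lower().endswith(extensions):
                total += os.path.getsize(os.path.join(root, name))
    return total

# Example usage: point it at your FME_TEMP location.
# print(cache_usage_bytes(os.environ.get("FME_TEMP", "/tmp")))
```

Running this before and after a translation with feature caching enabled gives a rough measure of how much a given workspace run adds, which helps when deciding where disabling caching would pay off most.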