Solved

I have two input files, one in MapInfo TAB format and the other a CSV. Each file has around 180 million rows. I'm using FeatureMerger to combine the two files, but it's taking more than 20 days to process them.

  • December 19, 2022
  • 2 replies
  • 22 views

Can someone please suggest a way to improve the performance?

2 replies

david_r
Celebrity
  • 8391 replies
  • Best Answer
  • December 19, 2022

I would try using the InlineQuerier and perhaps also the FeatureJoiner for this. Also make sure that you're using a recent version of FME.

Finally, make sure to switch off both breakpoints and feature caching in FME Workbench. This can have a huge influence on performance when working with such volumes.

The best performance may well come from first loading everything into a proper database with plenty of RAM, indexing all the relevant columns, and doing the join in SQL, e.g. using an SQLExecutor or SQLCreator.
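To illustrate the database approach outside of FME, here is a minimal sketch in Python using the standard-library `sqlite3` module. It loads both inputs into staging tables, indexes the join column, and performs the join in SQL. The table names (`geom`, `attrs`), the key column (`parcel_id`), and the sample rows are all made up for illustration; the real schema and key would come from your TAB and CSV files, and for 180 million rows you would point `db_path` at a file on fast disk rather than `:memory:`.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for the two large inputs.
TAB_ROWS = [(1, "POINT(0 0)"), (2, "POINT(1 1)"), (3, "POINT(2 2)")]
CSV_TEXT = "parcel_id,owner\n1,Smith\n2,Jones\n4,Brown\n"


def join_in_sqlite(tab_rows, csv_text, db_path=":memory:"):
    """Stage both inputs in SQLite, index the key, and join in SQL."""
    conn = sqlite3.connect(db_path)

    # Staging tables (assumed schema, for illustration only).
    conn.execute("CREATE TABLE geom (parcel_id INTEGER, wkt TEXT)")
    conn.execute("CREATE TABLE attrs (parcel_id INTEGER, owner TEXT)")

    # Bulk-load both sides; executemany avoids one Python call per row.
    conn.executemany("INSERT INTO geom VALUES (?, ?)", tab_rows)
    reader = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO attrs VALUES (?, ?)",
        ((int(r["parcel_id"]), r["owner"]) for r in reader),
    )

    # Create the index after loading so the inserts stay fast.
    conn.execute("CREATE INDEX idx_attrs_id ON attrs (parcel_id)")
    conn.commit()

    # Inner join on the key column; only matching rows are returned.
    rows = conn.execute(
        "SELECT g.parcel_id, g.wkt, a.owner "
        "FROM geom g JOIN attrs a ON a.parcel_id = g.parcel_id "
        "ORDER BY g.parcel_id"
    ).fetchall()
    conn.close()
    return rows


result = join_in_sqlite(TAB_ROWS, CSV_TEXT)
print(result)
```

In a workspace, the equivalent would be to write both datasets to database tables first and then run the `SELECT ... JOIN` through an SQLExecutor or SQLCreator, so that the database engine does the matching rather than FME.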

  • 1891 replies
  • December 19, 2022

@csj5483 SpatiaLite is a good option as a staging database; it's pretty much the same approach as the InlineQuerier. But try FeatureJoiner first.