Solved

I have two input files, one in MapInfo TAB format and the other a CSV. Each file has around 180 million rows. I'm using a FeatureMerger to combine them, but processing takes more than 20 days.

  • 19 December 2022

Can someone please suggest a way to improve the performance?

Best answer by david_r 19 December 2022, 14:23

2 replies

I would try using the InlineQuerier and perhaps also the FeatureJoiner for this. Also make sure that you're using a recent version of FME.

Finally, make sure to switch off both breakpoints and feature caching in FME Workbench; this can have a huge influence on performance when working with such volumes.

The best performance may well come from first loading everything into a proper database on a machine with lots of RAM, indexing all the relevant join columns and doing the join in SQL, e.g. using an SQLExecutor or SQLCreator.
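
To illustrate the "index, then join in SQL" idea, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the staging database. The table names (tab_features, csv_rows), the column names (join_id, wkt, attr) and the helper function are all made up for the example, and the in-memory database is only there to keep it self-contained. With real data you would load into a file-based or server database and run the same kind of SELECT from an SQLExecutor or SQLCreator.

```python
import sqlite3


def join_on_indexed_key(attribute_rows, geometry_rows):
    """Stage both inputs in SQLite, index the join key, then join in SQL.

    attribute_rows: iterable of (join_id, attr) tuples (stand-in for the CSV)
    geometry_rows:  iterable of (join_id, wkt) tuples (stand-in for the TAB file)
    All table and column names here are hypothetical.
    """
    # In-memory database for the sketch; use a file path for real volumes.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab_features (join_id INTEGER, wkt TEXT)")
    conn.execute("CREATE TABLE csv_rows (join_id INTEGER, attr TEXT)")

    # Bulk-load first, index afterwards: building the index once after the
    # load is usually cheaper than maintaining it during every insert.
    conn.executemany("INSERT INTO tab_features VALUES (?, ?)", geometry_rows)
    conn.executemany("INSERT INTO csv_rows VALUES (?, ?)", attribute_rows)
    conn.execute("CREATE INDEX idx_csv_join ON csv_rows (join_id)")
    conn.commit()

    # Inner join on the indexed key; unmatched rows on either side are dropped.
    rows = conn.execute(
        "SELECT t.join_id, t.wkt, c.attr "
        "FROM tab_features AS t "
        "JOIN csv_rows AS c ON c.join_id = t.join_id "
        "ORDER BY t.join_id"
    ).fetchall()
    conn.close()
    return rows
```

The point of the index is that each feature from one table can be matched with a keyed lookup instead of a scan of the other table, which is what makes the join scale to hundreds of millions of rows.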

@csj5483 SpatiaLite is a good option as a staging database; it's pretty much the same as the InlineQuerier. But try the FeatureJoiner first.