Question

Multicore processing in Python

  • 11 October 2023
  • 4 replies
  • 28 views

Is there a way to do parallel processing across multiple CPU cores inside a PythonCaller?

I don't want to use a custom transformer for that, because I cannot easily group my features using the group processing option.

I have already tried concurrent.futures.ProcessPoolExecutor, but it fails in FME. The code works fine outside of FME, but inside a PythonCaller it raises a PicklingError.


4 replies

Userlevel 5

I have answered here: https://community.safe.com/s/question/0D5Dm00001875NyKAI/pythoncaller-is-not-working

Thanks for your example, but unfortunately that's just multithreading. It doesn't run on multiple CPU cores.

The problem with my solution is that it relies on pickling for inter-process communication, and pickling doesn't seem to work inside a PythonCaller. So I need to either get the pickle module working in FME or find another approach to multicore processing in Python that works in FME. But I'm not a Python expert.
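
For context, here is a minimal sketch of why ProcessPoolExecutor depends on pickling. It only shows standard pickle behaviour: a function is serialized by reference (its module and name), so a worker process has to be able to import it again. Whether this is what actually goes wrong inside a PythonCaller is an assumption and is not confirmed in this thread. The name square is purely illustrative.

```python
import pickle


def square(x):
    """Module-level function: pickle stores only a reference to it."""
    return x * x


# A top-level function is pickled by reference, so the payload carries
# its name rather than its code.
payload = pickle.dumps(square)
print(b"square" in payload)

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x * x)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError) as exc:
    lambda_pickled = False
    print(type(exc).__name__, exc)
```

If the worker function lives somewhere the child process cannot import it from, the same kind of error would be expected.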

Userlevel 5

If you really need to spread processing over multiple CPU cores, my recommendation would be to use an established framework such as Dask, Ray or Dispy. That would also let you distribute the work across multiple servers, computing clusters, etc. You most probably won't be able to pass native FMEFeature objects to the workers, so you'll want to create your own business classes and map the FMEFeature objects to them inside the PythonCaller before passing them along.
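
The mapping step described above could look roughly like the sketch below. It uses the standard library's ProcessPoolExecutor rather than Dask, Ray or Dispy, just to keep the example self-contained. FeatureRecord, to_record, heavy_work and process_in_parallel are hypothetical names, and the only FME call assumed is FMEFeature.getAttribute. Treat it as a sketch under those assumptions, not a tested FME recipe.

```python
from concurrent.futures import ProcessPoolExecutor
from dataclasses import dataclass, field


@dataclass
class FeatureRecord:
    """Hypothetical plain-data stand-in for an FMEFeature (built-in types only)."""
    feature_id: int
    attributes: dict = field(default_factory=dict)


def to_record(feature_id, feature, attr_names):
    """Copy the selected attributes off a feature into a picklable record.

    Relies only on feature.getAttribute(name).
    """
    values = {name: feature.getAttribute(name) for name in attr_names}
    return FeatureRecord(feature_id, values)


def heavy_work(record):
    """Placeholder for the CPU-bound step; this runs in a worker process."""
    numbers = [v for v in record.attributes.values() if isinstance(v, (int, float))]
    return record.feature_id, sum(numbers)


def process_in_parallel(records, max_workers=None):
    """Run heavy_work over the records on several cores, keeping input order."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(heavy_work, records))
```

In a standalone script, process_in_parallel should be called from under an if __name__ == "__main__": guard. In a PythonCaller, the idea would be to build the records in input(), run the pool in close(), and write the results back onto the features by feature_id. Whether worker processes start correctly under FME's embedded interpreter is not confirmed here.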

Userlevel 5

I'm curious as to what you're trying to do. I'd be surprised if there isn't a solution in native FME. Whilst it might not be as efficient as doing it purely in parallel Python, it would be better than not being able to do it at all.
