Hi @geochoice2017,
There is a blog post you could check out, although it might not fit your use case: https://www.safe.com/blog/2019/01/fme-does-computer-vision/
Alternatively, if you have a third-party tool or a Python script, you can leverage it from FME. The PythonCaller lets you run Python inside your workflow, and you can use arcpy there too. If Esri already has a tool for this, that might be the easiest approach.
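To give a rough idea of the PythonCaller side, here is a minimal sketch. It assumes the class-based interface where FME calls input() once per feature and supplies pyoutput() for sending features on. The "score" and "class" attribute names and the classify() helper are made up for illustration; your own model or arcpy call would go there instead.

```python
# Sketch of a PythonCaller class. Inside FME, one instance is created,
# input() is called per feature, and self.pyoutput() is provided by FME.


def classify(value):
    """Hypothetical stand-in for the real model or arcpy call."""
    if value is None:
        return "low"
    return "high" if float(value) > 0.5 else "low"


class FeatureProcessor(object):
    def __init__(self):
        self.count = 0  # number of features seen so far

    def input(self, feature):
        # "score" is an assumed attribute name, not something FME defines.
        score = feature.getAttribute("score")
        feature.setAttribute("class", classify(score))
        self.count += 1
        # Pass the modified feature back into the workspace.
        self.pyoutput(feature)

    def close(self):
        # Called once after the last feature; nothing to clean up here.
        pass
```

Newer FME versions generate a slightly different class template, so treat the one FME puts in the PythonCaller dialog as the starting point.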
In addition to the PythonCaller, you might find the SystemCaller helpful. It can run any command-line program; for example, it is used with LAStools to perform point cloud classification via a command-line interface. Other tools in FME help with this workflow: the FeatureReader, FeatureWriter and TempPathnameCreator can be used with the SystemCaller to dump data to disk, process it, and then bring it back into FME.
While the SystemCaller is helpful for this, it may not always be a performant approach, since the data has to be written out and read back in.
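Outside FME, the dump, process, read-back pattern looks roughly like the sketch below. This is plain Python rather than FME transformers, and the "tool" is just a child Python process that upper-cases a text file, standing in for something like a command-line classifier. The run_external name and the {src}/{dst} placeholders are my own invention.

```python
import os
import subprocess
import sys
import tempfile


def run_external(lines, command_template):
    """Write lines to a temp file, run a command on it, read the result back."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "in.txt")
        dst = os.path.join(workdir, "out.txt")

        # 1. Dump the data to disk (the FeatureWriter step).
        with open(src, "w", encoding="utf-8") as fh:
            fh.write("\n".join(lines))

        # 2. Run the external program (the SystemCaller step).
        command = [
            part.replace("{src}", src).replace("{dst}", dst)
            for part in command_template
        ]
        subprocess.run(command, check=True)

        # 3. Bring the result back in (the FeatureReader step).
        with open(dst, encoding="utf-8") as fh:
            return fh.read().splitlines()


# Stand-in tool: a child Python process that upper-cases the input file.
TOOL = [
    sys.executable,
    "-c",
    "import sys; open(sys.argv[2], 'w').write(open(sys.argv[1]).read().upper())",
    "{src}",
    "{dst}",
]

if __name__ == "__main__":
    print(run_external(["ground", "building"], TOOL))
```

Every call pays for a process launch plus a write and a read, which is where the performance cost comes from on large datasets.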
FME also supports looping inside custom transformers. Here you can specify the number of iterations, or set it to infinite, or until all the data comes out of the loop. This is good for iterative processes but comes with its own set of headaches.
In addition, if you have a Web API that processes the data server-side, you can use the HTTPCaller to move data around and make requests via HTTP.
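For a sense of what such a request involves, here is a small Python sketch of posting a JSON payload to a processing service. The URL and payload fields are placeholders I made up, and the HTTPCaller would do the equivalent through its own parameters rather than code.

```python
import json
import urllib.request


def build_request(url, payload):
    """Build a POST request carrying the payload as a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def send(request, timeout=30):
    """Send the request and decode a JSON response (needs a real endpoint)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical endpoint and payload; replace with your own service.
request = build_request(
    "https://example.com/api/classify",
    {"image_id": 1, "format": "geojson"},
)
```

Calling send(request) would actually perform the request, so it is left out here until you have a real endpoint to point it at.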
It would be really interesting to hear if you get a workflow like this going in FME. Good luck!