We had a phone call with the guys at Safe and discussed this point, among other items. I subsequently received an email with the information below. It turns out there is an undocumented REST API call that performs a quick "health check" to see whether the services are running, but it still does not test full job processing capability. Anyway, their response is interesting and provides further context on which calls are available and what each one checks...
As discussed in our call today, here is the REST endpoint
for the FME Server health check:
<FMEServer_URL>:<PORT>/fmerest/v3/healthcheck
This is the endpoint that we use to check whether our FME
Cloud instances are up and running. However, I am no longer 100% sure this is
what you were looking for. I had a discussion with a developer and it turns out
that this call just checks whether the REST API webapp is running. It does not
tell you:
1. whether the FME Core process is running
2. whether the FME Engine process is running
So it is not exactly an indication of whether a job can be
processed.
The /fmerest/v3/healthcheck endpoint will either
return an HTTP 200 response code or not respond at all.
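
For anyone wanting to script this, here is a minimal sketch of a probe for the healthcheck endpoint using only the Python standard library. The function names, the example base URL, and the 5-second timeout are my own assumptions, not anything Safe provided. It treats "no response at all" as `None` and only a 200 as "webapp up", which matches the behaviour described above.

```python
import urllib.error
import urllib.request
from typing import Optional


def healthcheck_url(base_url: str) -> str:
    """Join a base URL such as 'http://host:8080' with the endpoint path."""
    return base_url.rstrip("/") + "/fmerest/v3/healthcheck"


def fetch_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code, or None if the server did not respond.

    The timeout value is an assumption; tune it for your network.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, etc.


def webapp_is_up(status: Optional[int]) -> bool:
    """Per the email, 200 means the REST webapp is running."""
    return status == 200
```

Usage would look something like `webapp_is_up(fetch_status(healthcheck_url("http://myserver:8080")))`, where the host and port are placeholders for your own install.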
The call
<FMEServer_URL>:<PORT>/fmerest/v3/info
will give an indication of whether the FME Core process is
up, because it calls this process. That said, we would not recommend hitting
this endpoint too often, as it is single-threaded and can put a lot of load on
the process. This is the actual reason why we use the healthcheck endpoint.
The /fmerest/v3/info call can return 2 different HTTP status codes depending on the state
of the core:
- 200: Core is up and running
- 503: web app can't connect to Core
Pinging this endpoint about every 10s should be
fine.
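
A rough sketch of how the two status codes and the "about every 10s" guidance could be handled in a polling script. The `Throttle` class and the state strings are hypothetical helpers of mine; only the 200/503 meanings and the 10-second figure come from the email.

```python
import time
from typing import Optional

MIN_INTERVAL_SECONDS = 10.0  # "about every 10s", per the email


def core_state(status: Optional[int]) -> str:
    """Map an /fmerest/v3/info status code to a readable state."""
    if status == 200:
        return "core up"
    if status == 503:
        return "core unreachable"
    return "unknown"  # any other code, or no response at all


class Throttle:
    """Allow a call only if min_interval seconds have passed since the last one."""

    def __init__(self, min_interval: float = MIN_INTERVAL_SECONDS) -> None:
        self.min_interval = min_interval
        self._last: Optional[float] = None

    def allow(self, now: Optional[float] = None) -> bool:
        # 'now' can be injected for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        if self._last is not None and now - self._last < self.min_interval:
            return False
        self._last = now
        return True
```

The idea is to call `allow()` before each request to the info endpoint and skip the request when it returns `False`, so a misconfigured scheduler cannot hammer the single-threaded Core.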
Another option is the call
<FMEServer_URL>:<PORT>/fmerest/v3/transformations/engines
This will tell you whether engines are registered and
running, and it is therefore the best indicator of whether a job can be
processed or not. However, this call will be slower and probably also
shouldn't be hit too frequently.
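
The email does not say what the engines response body looks like, so the sketch below is built on an assumption: that the body is a JSON object with an "items" list holding one entry per registered engine. Check the real response on your server before relying on it. The function names are mine. Note also that this call will probably need authentication, which is not covered here.

```python
import json
from typing import Any, Dict


def engines_registered(payload: Dict[str, Any]) -> int:
    """Count engines in a parsed response body.

    ASSUMPTION: the body has an "items" list, one entry per engine.
    This shape is not confirmed by the email.
    """
    items = payload.get("items", [])
    return len(items) if isinstance(items, list) else 0


def can_process_jobs(body: str) -> bool:
    """Return True only if the body parses and lists at least one engine."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # not JSON, so treat as "cannot confirm"
    if not isinstance(payload, dict):
        return False
    return engines_registered(payload) > 0
```

Anything that cannot be parsed is treated as "cannot process jobs", which seems the safer default for a monitoring check.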
In conclusion, I'd say the healthcheck gives you an idea of
whether Tomcat and the machine are up, and it can be hit at short intervals.
If the FME Server Core is running on a different machine, this check won't
detect if the Core is down. The info and engines endpoints will give you more
information about the overall status but should also not be hit too often.
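
Putting the summary together, one way to combine the three checks is a tiered decision: the cheap healthcheck result is always required, while the info and engines results are optional because they are polled less often. This is just a sketch of that logic; the function name and the returned strings are my own.

```python
from typing import Optional


def overall_status(healthcheck: Optional[int],
                   info: Optional[int] = None,
                   engines_ok: Optional[bool] = None) -> str:
    """Combine probe results into one status string.

    healthcheck / info are HTTP status codes (None = no response or not polled).
    engines_ok is True/False if the engines endpoint was checked, else None.
    """
    if healthcheck != 200:
        return "webapp down"  # Tomcat or the machine is not answering
    if info == 503:
        return "core down"  # web app can't connect to the Core
    if engines_ok is False:
        return "no engines"  # Core may be up, but nothing can run a job
    if info == 200 and engines_ok:
        return "ready"
    return "webapp up, job capacity unconfirmed"
```

With only the healthcheck available, the result is deliberately non-committal, since a 200 there says nothing about the Core or the engines.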