Solved

FME 2017.1 Engine(s) shuts down/stops unexpectedly



I've submitted a critical ticket for this, but I'm wondering whether anyone else has seen engines shut down and then either stay down for a long time or not come back at all without manually restarting the service or the FME Engine service. We saw the message below once yesterday on our test box engine, and this morning, after we upgraded, I noticed it on 4 of our 6 production engines.

I'm seeing the following error in our engine logs:

2017-08-16 10:27:08| 142.5| 0.0|INFORM|Handling FME Engine admin request: FME_ShutdownService

2017-08-16 10:27:08| 142.5| 0.0|INFORM|FME Engine is shutting down

Best answer by gerhardatsafe 17 August 2017, 22:53

18 replies


It's not an error, as it shuts down nicely. But the question is: what causes this behaviour? Seems as if some process (or a user script running with admin privileges perhaps?) is interfering and tells the engine to shut down.


And why is it causing that behaviour on 4 of the 6 engines, but not the other 2?

Not an answer, but setting the engine count to 0 and then back to 6 will bring the engines back. This is a tough one to crack; I haven't seen it before.


Plus, it will drop from 6 engines to 0. If you set the engine count to 0 and then back to 6, 2 or 3 engines may come back.

Yes, I'm experiencing the same. I just set up 2017.1 on a test system and I find the engines stop:

2017-08-17 18:48:01|   6.2|  0.0|INFORM|Translation Finished (service '7070'). Return message is '0:Translation Successful|NumFeaturesOutput=18|LogFileName=job_3.log'
2017-08-17 18:48:01|   6.2|  0.0|INFORM|Finished post-translation commands
2017-08-17 18:49:35|   6.2|  0.0|INFORM|Handling FME Engine admin request: FME_ShutdownService
2017-08-17 18:49:35|   6.2|  0.0|INFORM|FME Engine is shutting down

All engines are restarted once they reach their maximum number of successful translation results (100 by default). This means the shutdown is usually expected, and the log message shown here is not, by itself, an indicator of an issue. That's why it is logged as INFORMATIONAL and not as an error or warning (as @sander_s correctly pointed out).

We are currently investigating and actively working with @runneals to resolve the issue he's experiencing.

UPDATE:

If you are experiencing similar symptoms on FME Server 2017.1 build 17539, please contact us via support@safe.com or Live Chat and we will provide instructions & resources on how to apply a possible fix. Please also share your engine & log files for the time you were experiencing this issue. Our current solution is still part of the investigation and will be available in the next official release if it is confirmed that it resolves the issue and we have had a chance to stress test it.

Thank you for your support & help in this matter!
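
One way to check whether a given shutdown lines up with the restart threshold described above is to count how many translations an engine finished before each `FME_ShutdownService` request. The following is only a rough sketch: it assumes the pipe-delimited engine log layout shown in this thread (timestamp, two timing columns, level, message) and the two message strings quoted here. The function name is made up for illustration, and failed translations are not counted because their log message does not appear in this thread.

```python
from typing import Iterable, List

# Message fragments taken from the engine log lines quoted in this thread.
FINISHED_MARKER = "Translation Finished"
SHUTDOWN_MARKER = "FME_ShutdownService"


def jobs_before_each_shutdown(lines: Iterable[str]) -> List[int]:
    """For each shutdown request, count translations finished since the previous one."""
    counts: List[int] = []
    finished = 0
    for line in lines:
        # Assumed layout: timestamp|elapsed|delta|LEVEL|message
        # maxsplit=4 keeps any '|' characters inside the message intact.
        parts = line.rstrip("\n").split("|", 4)
        if len(parts) < 5:
            continue  # not a log line in the expected layout
        message = parts[4]
        if FINISHED_MARKER in message:
            finished += 1
        elif SHUTDOWN_MARKER in message:
            counts.append(finished)
            finished = 0
    return counts


sample = [
    "2017-08-17 18:48:01|   6.2|  0.0|INFORM|Translation Finished (service '7070'). "
    "Return message is '0:Translation Successful|NumFeaturesOutput=18|LogFileName=job_3.log'",
    "2017-08-17 18:48:01|   6.2|  0.0|INFORM|Finished post-translation commands",
    "2017-08-17 18:49:35|   6.2|  0.0|INFORM|Handling FME Engine admin request: FME_ShutdownService",
    "2017-08-17 18:49:35|   6.2|  0.0|INFORM|FME Engine is shutting down",
]
print(jobs_before_each_shutdown(sample))  # prints [1]
```

A count far below the configured maximum, as in the sample, suggests the shutdown was not the routine restart.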


Thanks a lot @GerhardAtSafe, looking forward to the solution.

The engines stop and they don't restart. Also, it is a new install and it hasn't run 100 jobs yet. I'm at job no. 16 and running on a single engine again.

We also ran into some more serious problems than this, such as not being able to re-add engines, even after a system reboot.

The easiest way to work around this temporarily is to increase the MAX_TRANSACTION_RESULT_* values in your fmeServerConfig.txt file. We are using the following parameters, which prevent an engine from restarting until after 10,000 successful jobs or 1,000 failed jobs:

MAX_TRANSACTION_RESULT_SUCCESSES=10000
MAX_TRANSACTION_RESULT_FAILURES=1000

The worst-case scenario is that some jobs may hang, but that is easily mitigated by going into the web UI and canceling them (which is a LOT better than having engines that don't come back up).
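
To confirm which limits a server is actually using, something like the sketch below could read them back out of the config text. It assumes the file uses plain `KEY=VALUE` lines, as in the two parameters above, and that lines starting with `#` are comments. The function name is made up for illustration and is not part of FME.

```python
from typing import Dict

# Prefix shared by the two parameters mentioned above.
PREFIX = "MAX_TRANSACTION_RESULT_"


def read_restart_limits(config_text: str) -> Dict[str, int]:
    """Collect MAX_TRANSACTION_RESULT_* settings from KEY=VALUE config text."""
    limits: Dict[str, int] = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        # Assumption: '#' starts a comment line in this file.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not key.startswith(PREFIX):
            continue
        try:
            limits[key] = int(value.strip())
        except ValueError:
            continue  # skip values that are not plain integers
    return limits


example = """
MAX_TRANSACTION_RESULT_SUCCESSES=10000
MAX_TRANSACTION_RESULT_FAILURES=1000
"""
print(read_restart_limits(example))
```

In practice the text would come from reading fmeServerConfig.txt in your own install; the path depends on the installation and is not given in this thread.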

 

@revesz Contact the FME nerds in support for the patch. :D

Looks like the patch worked over the weekend. All engines remained alive and functioned properly. @runneals will take a closer look when he comes in and report back.

 

I think one of my customers is running into the same issue. Any chance the patch can be made publicly available, @GerhardAtSafe?


 

Looks like it worked! :) Thanks again to the awesome SAFE'ers that threw together a patch in <8 hours!

 

I've had exactly the same issue since I installed 2017.1 a few days ago. Only after manually restarting the FME Engine service does the engine work through the active queue, but it disappears again afterwards.

 

 

Edit: I had an explanation here of how to fix it, but it turned out not to be the solution. I spoke to the helpdesk in the meantime, and the problem is now fixed with the patch!


It would be great if I can receive the patch too!

 

 


Sweet, quick and spot on patch. Engines are running happily with it. :)

Hi @redgeographics. Apologies that your customer is encountering this issue. For the patch, please contact Safe support, and they will be happy to help you out! Please share the engine & log files for the time the issue was experienced.

 

If setting the engine count to 0 and then back to 6 does not work for you, check the running processes and see if there are any "orphan" FMEEngine.exe processes. They may need to be terminated manually before any additional FME Server engines will come back online.
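
On a Windows host, the check for leftover processes could be scripted along these lines. This is a minimal sketch, assuming the standard `tasklist` command is available and that the process image is named FMEEngine.exe as mentioned above. The function names are made up for illustration, and the script only lists processes. It does not decide which ones are orphans and does not terminate anything.

```python
import csv
import io
import subprocess
from typing import List, Tuple


def parse_tasklist_csv(output: str, image_name: str = "FMEEngine.exe") -> List[Tuple[str, int]]:
    """Extract (image name, PID) pairs from `tasklist /FO CSV /NH` output."""
    found: List[Tuple[str, int]] = []
    for row in csv.reader(io.StringIO(output)):
        # Expected columns: image name, PID, session name, session number, memory usage.
        if len(row) < 2 or row[0].lower() != image_name.lower():
            continue  # also skips the "INFO: No tasks are running..." message
        try:
            found.append((row[0], int(row[1])))
        except ValueError:
            continue
    return found


def list_engine_processes(image_name: str = "FMEEngine.exe") -> List[Tuple[str, int]]:
    """Run tasklist (Windows only) and return the matching processes."""
    result = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_tasklist_csv(result.stdout, image_name)


# Made-up sample output, used only to demonstrate the parser without running tasklist.
sample_output = (
    '"FMEEngine.exe","4321","Services","0","85,120 K"\n'
    '"FMEEngine.exe","5544","Services","0","90,004 K"\n'
)
print(parse_tasklist_csv(sample_output))
```

Comparing the PIDs returned by `list_engine_processes()` with the engines FME Server reports as running would show which processes are left over.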
