Question

How to smartly monitor a large number of jobs on a production FME Server

  • 4 January 2022
  • 5 replies
  • 5 views

Userlevel 1
Badge +22

I'm having a hard time monitoring our production FME Server due to the high number of jobs it runs. It typically runs 3000+ jobs each day.

We have a system of workspaces that automatically monitors and processes relevant requests from a national database for multiple customers. The processes run every 5 minutes for half of every business day, which accounts for most of those jobs.

The jobs that process the requests are identical, and a string of modular jobs is launched for each request. This makes it difficult to monitor the progress of each request just by using the built-in web interface.

Are there any smarter ways to use the standard interface for monitoring, or must I create my own procedures and my own logging (to a database)?

Cheers.

We're still using 2020.2.5.


5 replies

Userlevel 4

One of the FME Server instances that I'm managing sees around 10k jobs per 24 hours, all year round, and as you say, at a certain volume you quickly realize that the web GUI isn't really built with that in mind.

 

I assume you're using a schedule to trigger the process every 5 minutes, am I right? If so, you could convert the schedule to an automation, which is much more powerful, and use the success/failure output ports of the workspace action to react accordingly.

 

My personal preference is to send an email only when something fails (with automations it's also easy to include the log file, which is really helpful), and then configure a rule in my inbox that automatically moves those mails to a dedicated folder where they don't pollute the inbox but are still visible. You could also use the "Log a message" action, which adds a custom message to the automation log. That way you can also log less critical errors or warnings that don't need immediate attention.

 

If you haven't used it yet, I'd also recommend looking into using the FME Server Automations Writer in the workspace, so that you can e.g. send custom messages to the "Log a message" action or the outgoing email action.
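
If you ever want the same "only tell me when something fails" view outside of your inbox, the idea can be sketched in a few lines of Python. This is only a minimal sketch, not an FME Server API: it assumes you already have the job records as a list of dicts from somewhere, and the keys (`id`, `workspace`, `status`) as well as the status names are hypothetical placeholders that you would adapt to whatever your server actually reports.

```python
from collections import Counter

# Statuses treated as "needs attention". These names are assumptions
# for illustration; replace them with the values your server reports.
FAILURE_STATUSES = {"FME_FAILURE", "JOB_FAILURE", "ABORTED"}


def failed_jobs(jobs):
    """Return only the job records whose status counts as a failure."""
    return [job for job in jobs if job.get("status") in FAILURE_STATUSES]


def failures_per_workspace(jobs):
    """Count failed jobs per workspace name."""
    return Counter(job.get("workspace", "<unknown>") for job in failed_jobs(jobs))


if __name__ == "__main__":
    # Hypothetical sample records; real ones would come from your own export.
    sample = [
        {"id": 1, "workspace": "fetch_requests.fmw", "status": "SUCCESS"},
        {"id": 2, "workspace": "process_request.fmw", "status": "FME_FAILURE"},
        {"id": 3, "workspace": "process_request.fmw", "status": "ABORTED"},
    ]
    for workspace, count in failures_per_workspace(sample).most_common():
        print(f"{workspace}: {count} failed")
```

With thousands of identical jobs per day, a per-workspace failure count like this is usually easier to read than the raw job list.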

Userlevel 1
Badge +22

Thanks David.

I know of automations, of course, but haven't looked deeper into using them for production purposes (yet).

I just ran a small test to see whether the automation itself occupies an engine, which wouldn't be good as we have a single-engine server (for testing), but luckily it doesn't.

It seems I should take a deep dive into using automations to connect my modular workspaces into a single logical workflow instead of using the FMEServerJobSubmitter.

Can I use the automation to locate the jobs it created, so I can easily find and review the relevant job log files?

Userlevel 1
Badge +22

Another question about automations:

Will Automations writers in the workspaces have any impact if the workspaces aren't run via automations? I.e., can I do a "soft" port from the old logic to the new?

Userlevel 4

As you discovered, the automation itself doesn't require an engine, which I think is a huge part of the attraction when chaining workspaces.

The automation itself has its own log file, and it can also point you to the jobs that the automation triggered, which is very useful:

[Screenshot: automations_log]
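
If you do end up wanting your own per-request view on top of that, the "own logging to a db" idea from the question can stay quite small. The following is a minimal sketch under stated assumptions, not something FME Server provides: the table name `job_log`, its columns, and the notion of a `request_id` that each workspace in the chain reports alongside its job id are all hypothetical, and SQLite merely stands in for whatever database you would really use.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the log database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS job_log (
               request_id TEXT    NOT NULL,  -- hypothetical id of the incoming request
               job_id     INTEGER NOT NULL,  -- job id reported by the workspace run
               workspace  TEXT    NOT NULL,
               status     TEXT    NOT NULL,
               logged_at  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def record_job(conn, request_id, job_id, workspace, status):
    """Append one row per finished job; the 'with' block commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO job_log (request_id, job_id, workspace, status) "
            "VALUES (?, ?, ?, ?)",
            (request_id, job_id, workspace, status),
        )


def jobs_for_request(conn, request_id):
    """Return (job_id, workspace, status) for one request, in insertion order."""
    rows = conn.execute(
        "SELECT job_id, workspace, status FROM job_log "
        "WHERE request_id = ? ORDER BY rowid",
        (request_id,),
    )
    return rows.fetchall()
```

Looking up a single request then gives you the whole chain of modular jobs in one place, including the job ids you need to find the matching log files.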

Userlevel 4

When a workspace runs outside of an automation, the Automations writer either does nothing or, if you prefer, you can redirect its output to a local directory.
