Question

What FME Server Event Notifications would be useful to you?

  • 25 September 2018
  • 16 replies
  • 2 views

Badge +11

We have noticed that many users have been asking for improved system health monitoring and reporting for FME Server.

Many of the requests are driven by the need to respond to events in a timely manner. These events can range from Job and Core / Engine Failure to Publishing a Workspace to Tracking Logins.

What FME Server event notifications are you or your organization most interested in?

We would love to hear feedback! Please add your comments and use-cases or scenarios below, and consider using the "upvote" system or sub-comments to show support for other posts that you might be interested in.

To get your ideas flowing, here are some examples I have gathered from the Knowledge Center and from a few technical support cases:

Notifications when...

- FME Server Starts
- Job Starts
- Job Exceeds Time Limit
- FME Server Job Queue is Empty
- Repository Changes (Publish, Update, Delete)
- License Expiry

16 replies

Userlevel 4

Excellent topic! Here's my wishlist of where I'd love to have notifications:

- Job queue contains >n items
  Use case: Track peak usage and proactively react to high saturation.
- Job hangs, i.e. no activity in job log for >n time units
  Use case: Detect jobs that hang completely (failing timeouts, table locks, faulty engine, race conditions, etc.)
- Less than n time units left of license
  Use case: Notify server admin well in advance of license expiry, which is particularly important for failover licenses that expire every year.
- Any failing job in a user defined repository
  Use case: Server admin can detect failed jobs regardless of how the job was started (API, jobsubmitter, datastreaming, etc) and regardless of how the job notification was configured.
- Jobs that are automatically cancelled for exceeding queued job expiry time
  Use case: Failure notifications are currently not triggered if job is cancelled due to exceeding max queued time. We really need to know when this happens.
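
To make the first two wishes concrete, here is a minimal sketch of the threshold checks they imply. The function names and parameters are hypothetical, and how the queue length or the time of the last log entry would be obtained from FME Server is deliberately left out; this only shows the comparison logic.

```python
from datetime import datetime, timedelta


def queue_saturated(queued_jobs: int, max_queued: int) -> bool:
    """Return True when the number of queued jobs exceeds the threshold n."""
    return queued_jobs > max_queued


def job_hung(last_log_activity: datetime, now: datetime, max_idle: timedelta) -> bool:
    """Return True when the job log has been silent for longer than max_idle."""
    return now - last_log_activity > max_idle


# Example values only; a real monitor would read these from the server.
now = datetime(2018, 9, 25, 12, 0)
print(queue_saturated(12, max_queued=10))  # True
print(job_hung(datetime(2018, 9, 25, 11, 0), now, timedelta(minutes=30)))  # True
```

Both checks are strict "greater than" comparisons, matching the ">n" wording above, so a queue of exactly n items would not trigger a notification.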
Userlevel 1
Badge +17
Resources changes (Adds, Updates and Deletes)

Userlevel 1
Badge +17
Security changes

Userlevel 1
Badge +17
System Cleanup

Userlevel 4
Badge +25

Engine starts / shuts down

Badge +8

Additionally, more detailed notifications (for example, variables holding all errors, warnings, info messages, etc. from the log)

Badge +11

@runneals - Are there any particular logs or sets of logs that you would find most useful to have monitored by default?

And, considering how many log files FME Server generates, how would you imagine FME Server presenting the options for choosing which ones are important to you?

 

Userlevel 2
Badge +12

Database table watch: trigger a workspace to run after an Insert, Update or Delete has taken place in a database table.

Can be done from the database, but would be great if it could be modeled in FME Server.
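
Until something like this is modeled in FME Server, one way to picture a table watch is a poller that fingerprints the table and reports when the fingerprint changes. This is only a sketch under assumptions: it uses Python's built-in sqlite3 module as a stand-in for the real database, the `parcels` table is made up, and the step that would actually submit a workspace is left out.

```python
import hashlib
import sqlite3


def table_fingerprint(conn: sqlite3.Connection, table: str) -> str:
    """Hash every row so that any insert, update or delete changes the result."""
    # The table name is interpolated directly, so pass trusted identifiers only.
    digest = hashlib.sha256()
    for row in conn.execute(f'SELECT * FROM "{table}" ORDER BY rowid'):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def table_changed(conn: sqlite3.Connection, table: str, previous: str):
    """Return (changed, new_fingerprint) compared with the previous poll."""
    current = table_fingerprint(conn, table)
    return current != previous, current


# Demonstration with a hypothetical in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, owner TEXT)")
conn.execute("INSERT INTO parcels (owner) VALUES ('A')")
baseline = table_fingerprint(conn, "parcels")

conn.execute("UPDATE parcels SET owner = 'B' WHERE id = 1")
changed, baseline = table_changed(conn, "parcels", baseline)
print(changed)  # True
```

Hashing the whole table on every poll gets expensive for large tables; a change log maintained by database triggers would scale better, which is roughly the "can be done from the database" route.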

Userlevel 2
Badge +12

I realize this most likely does not belong in this topic, though.

Badge +11

Thank you all for your contributions to this thread @david_r, @stalknecht, @redgeographics, @runneals, and @erik_jan!

Our development team will be starting to implement (what we are currently internally referring to as) 'System Event Notifications' for FME Server 2019. Not all of the suggestions here will be available in the first iteration, but as the feature grows we will definitely be looking towards our FME Community to help us understand what is most useful to our users!

You may notice some features rolling out into the beta stream, but I will make a second announcement here when the pieces are connected and to invite further feedback.

We are in the early stages so I cannot accurately comment on "what is in, and what is not".

And please feel free to continue to use this thread to share your ideas.

Userlevel 4

This is fantastic news, thanks!
Badge +11

@david_r - Re: "Less than n time units left of license"

If you had configured a notification for when there are (for example) 5 days remaining until an FME Server license expires, would you expect or prefer to receive ONE notification only, or MULTIPLE notifications?

*For the "multiple" option, the assumption is that the check is made once daily and that the license has still not been refreshed in the days following the initial notification.
Userlevel 4
Hi Rylan, excellent question! I haven't thought much about it, but I think I could live with both alternatives. The notification would probably usually trigger a mail subscriber, so one mail should (in theory) be enough. On the other hand, if the time frame is short, getting one mail every day might not be such a bad idea ;-)
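
The two alternatives can be sketched as a single daily check with a switch between them. The function and parameter names below are hypothetical and are not taken from FME Server; the dates are example values.

```python
from datetime import date


def should_notify(today: date, expiry: date, threshold_days: int,
                  already_notified: bool, repeat_daily: bool) -> bool:
    """Decide whether today's daily check should send a license expiry mail."""
    days_left = (expiry - today).days
    if days_left > threshold_days:
        return False  # threshold not reached yet
    # Inside the window: send every day, or only the first time.
    return repeat_daily or not already_notified


expiry = date(2018, 10, 1)
# First day inside a 5-day window: both policies send a mail.
print(should_notify(date(2018, 9, 26), expiry, 5, False, repeat_daily=False))  # True
# Next day: only the "multiple notifications" policy sends again.
print(should_notify(date(2018, 9, 27), expiry, 5, True, repeat_daily=False))   # False
print(should_notify(date(2018, 9, 27), expiry, 5, True, repeat_daily=True))    # True
```
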
Badge +3

Additionally more detailed notifications (like have variables for all errors, warnings, info, etc from log)

Yes, some more details would be useful. I miss these parameters in the failed job notification:

  • All error (red) lines: "Fatal error occurred" is not very helpful, but "File Copy Writer: Error attempting to copy from..." may help.
  • Rejected feature parameter, e.g. for spotting null or wrongly formatted dates
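
As a rough illustration of the first point, error lines could be pulled out of a job log like this. The pipe-delimited layout (timestamp, elapsed time, delta, severity, message) and the sample lines are assumptions made for the sketch, not a guaranteed log format, and `error_lines` is a made-up helper.

```python
def error_lines(log_text: str, severities=("ERROR", "FATAL")) -> list:
    """Collect the message part of every log line with a matching severity."""
    messages = []
    for line in log_text.splitlines():
        # Assumed layout: timestamp|elapsed|delta|severity|message
        parts = line.split("|", 4)
        if len(parts) == 5 and parts[3].strip() in severities:
            messages.append(parts[4].strip())
    return messages


# Made-up sample log for demonstration.
sample = (
    "2018-09-25 10:00:00|   0.1|  0.0|INFORM|Starting translation\n"
    "2018-09-25 10:00:02|   2.1|  2.0|WARN  |Date attribute is null\n"
    "2018-09-25 10:00:03|   3.1|  1.0|ERROR |File Copy Writer: Error attempting to copy from...\n"
    "2018-09-25 10:00:03|   3.1|  0.0|ERROR |Fatal error occurred\n"
)

for message in error_lines(sample):
    print(message)
```

A notification could then carry the whole list rather than only the final "Fatal error occurred" line.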

 

Badge +22

Security token about to expire.

Badge +8

Regarding "job queue contains >n items": that would be SOOOOO nice, because when services like ArcGIS Online go down, our queue fills up and kills our server productivity.
