Question

Workbenches Failing in FME Server (Workbenches Using AGOL Transformer)



Hello,

Something strange is going on in FME Server. Sometimes the web connection works and sometimes it fails.

Products used:

FME Server 2018.0.1.1 - Build 18312 - win64

ArcGIS Online

Steps I used to set up communication between FME Server and AGOL:

1. I created an app in ArcGIS Online, took its Client ID and Client Secret, and registered my FME Server redirect URI there.

2. Created my own web service from Esri ArcGIS Online and updated the Client ID, Client Secret, and Redirect URI ( ).

3. Created a new web connection using the above web service, then tested and authorized it.

4. Created a new workbench that uses the AGOL feature reader with the above web connection (after checking it in Desktop, I uploaded it to FME Server with my authorized web connection).

5. Re-authorized the web connection in FME Server.

6. Scheduled the workbench to run every 30 minutes. Here is the problem: the workbench sometimes runs fine and sometimes fails (error: Python Exception <HTTPError>: 401 Client Error: Unauthorized for url: https://www.arcgis.com/sharing/rest/portals/self?f=json&token=UOq3X6bKO6NJwMxlG30RoLv7fqHsmGKfJuL4PEejxlm8hjQHsa6iKlolmH5VBrlSCjbUDI0q-HRggK4mQbnIf7s2LWMno5lpAiN6ER28vV9VQexdHL7ikyfMl80lV7j7XNVcnHa8ZnkoOXCzwRn20fzL5VvK6_uPAi9C9mCnG_k419FAbbpwL7o4wlI8V3ukig18h16-8rjIx5HD-l-Qi1YjIaA6I1j4oYr3BolhhmZsne9HtOcUU6NilmUfyVTJ)

7. A snapshot of the FME Server jobs log gives an idea of how often it fails.

From the error, it seems that the refresh token or the authorization code itself is expiring, but under OAuth 2.0 the connection should refresh itself without any problem. Once my workbench starts failing in FME Server, I re-authorize manually in FME Server to get it running again, but after some time the same problem comes back.
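
For readers unfamiliar with the refresh behaviour mentioned above, here is a minimal Python sketch of how an OAuth 2.0 client commonly reacts to a 401: it exchanges the refresh token for a new access token and retries once. This is not FME Server's implementation. `TokenStore`, `call_with_refresh`, `fake_request`, and `fake_refresh` are hypothetical names, and the service is simulated in memory rather than calling ArcGIS Online.

```python
from dataclasses import dataclass
from typing import Callable


class Unauthorized(Exception):
    """Stands in for an HTTP 401 response from the service."""


@dataclass
class TokenStore:
    access_token: str
    refresh_token: str


def call_with_refresh(
    store: TokenStore,
    request: Callable[[str], str],
    refresh: Callable[[str], str],
) -> str:
    """Call request(access_token); on a 401, refresh once and retry once."""
    try:
        return request(store.access_token)
    except Unauthorized:
        # Exchange the refresh token for a new access token, then retry.
        store.access_token = refresh(store.refresh_token)
        return request(store.access_token)


# Tiny in-memory stand-in for the remote service (hypothetical, no network).
VALID_TOKENS = {"fresh-token"}


def fake_request(token: str) -> str:
    if token not in VALID_TOKENS:
        raise Unauthorized("401 Client Error: Unauthorized")
    return "ok"


def fake_refresh(refresh_token: str) -> str:
    if refresh_token != "good-refresh":
        raise Unauthorized("refresh token rejected")
    return "fresh-token"


store = TokenStore(access_token="expired-token", refresh_token="good-refresh")
result = call_with_refresh(store, fake_request, fake_refresh)
print(result, store.access_token)
```

If the refresh token itself is rejected, the exception propagates to the caller, which corresponds to the case where manual re-authorization is the only way forward.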

It would be great if somebody could give me some clarity on this, or any suggestions.

Thanks In Advance


17 replies


Hi @gangwarmanoj, thanks for your post.

From @philippeb's comment here it looks like between step 4 and step 5 you'll need to manually authorize the web connection within FME Server after uploading it for it to work indefinitely and properly refresh. Could you give this a try and see if it helps?

 

 

Best,

 

Nathan
Hi @NathanAtSafe,

 

Thanks for the reply.

 

 

Actually, I forgot :) to mention this step (manually authorizing the web connection in FME Server before running the workspace). I have edited my question and added that step, thanks.

I tried both approaches (authorizing manually in FME Server, and not authorizing).

If we don't authorize manually, the workspace does not run even once. Once it is re-authorized in FME Server, it sometimes works and sometimes doesn't (refer to the serverjob-log.png file attached to my original question).

I tried many things, such as using a feature reader instead of the AGOL reader. In the feature reader I used the same web connection and, interestingly, its success rate was higher than with the AGOL reader, but not 100% :(.

 

 

Thanks,

 

Manoj

 

 

Adding another snapshot of an error that also occurs, apart from the unauthorized URL error. I guess this error is again related to the token: anothererror.png

 

 


 

 

Hi @gangwarmanoj,

 

Thanks for your follow-up. I created a test here and I think I am seeing similar behaviour. I'll investigate a bit more and try to find the cause, or determine whether this is in fact unexpected.

 

Thanks,

 

Nathan

 


 

@NathanAtSafe thanks :)

 


 

@gangwarmanoj,

 

From my testing it seems like the issue occurs when more than one client uses the same web connection to AGOL (or perhaps uses the same AGOL app to authenticate). Does this sound like your situation?

 

Steps I took to test:

 

I published the same workspace, with the same AGOL web service and connection, to two separate FME Server instances and ran it every ten minutes on a schedule. One FME Server instance produced the errors you're seeing, while the other didn't. When I turned off one FME Server instance, the one that was failing stopped failing.

 

Best,

 

Nathan

 


 

Thanks, Nathan, for testing this on your side.

Below is the current scenario in my organization.

We have only one FME Server in production. We created one app in ArcGIS Online and, using its client ID and secret, created one AGOL web service.

We have multiple workbenches that use the same ArcGIS Online user, so we created one standard connection in our FME Server that is shared by multiple workbenches.

We created the same connection in our Desktop instances for development purposes, so that we can easily publish our workbenches without changing web connections.

Based on your test, are you suggesting that we should create a different web connection in Desktop (not the same as the one in FME Server), and different web connections for different workbenches in FME Server (although they use the same AGOL user)?

Or should we create one more AGOL app, using one for Desktop connections and the other exclusively for FME Server, and create separate connections?

 

 

Thanks

 

Manoj

 

 


Hi @gangwarmanoj,

 

 

I tried to recreate this issue using FME Server and FME Desktop with the same AGOL app and web connection, but was not able to. So, I've only been able to reproduce using two FME Servers using the same app and web connection.

 

 

Nevertheless, I would suggest trying to create one more AGOL app for FME Desktop use only, and seeing if that eliminates the error you're seeing in FME Server. If it does not, there is some other issue at play here.

 

 

Thanks,

 

Nathan
Thanks, Nathan.

I will give it a try and let you know what I find.

 

 

Thanks for all your help :)

 

 


 

 

Hi @gangwarmanoj,

 

One quick question: does your FME Server have SSL configured?

 

Nathan

 


 

Nope. One more thing I would like to add: a proxy is configured in FME Server.

 

 


Hi @NathanAtSafe,

I tried using the same client ID for both my Desktop and FME Server, but that doesn't seem to have any impact. I strongly suspect the problem is something else. I also used separate client IDs, but the FME Server workbenches still fail. There is no pattern: one day it works without any problem, and the next day the failure rate is 90%.

 

Regards

Manoj

 

 


 

 

Thanks for your reply @gangwarmanoj,

I'll check with the developer to see if we can do anything to improve this. Stay tuned.

Best,

Nathan


 

 

Hi @gangwarmanoj,

We're still investigating the root cause of this issue. However, I wanted to let you know of a possible workaround for your case.

 

Workaround:

Have the schedule post to a topic on failure. Create a workspace subscription that runs the AGOL job and that is triggered by the failure topic. You can also have this workspace subscription post to the same topic on failure, so that every time the AGOL job fails, FME Server will try to re-run it. *Usually* the failure only happens once, and then the job resumes successfully, so this workaround should at least ensure that every time the job should run, FME Server will re-run it until it's a success.
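
To illustrate the flow described above, here is a small in-memory Python simulation of a failure topic whose subscription re-runs the job. It is only a sketch of the idea, not FME Server's notification API. The `Notifier` class, the `AGOL_JOB_FAILED` topic name, and the flaky job are all hypothetical stand-ins.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

FAILURE_TOPIC = "AGOL_JOB_FAILED"  # hypothetical topic name


class Notifier:
    """Minimal in-memory stand-in for topics and subscriptions."""

    def __init__(self) -> None:
        self.subscribers: Dict[str, List[Callable[[], None]]] = {}
        self.pending: Deque[str] = deque()

    def subscribe(self, topic: str, handler: Callable[[], None]) -> None:
        self.subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str) -> None:
        self.pending.append(topic)

    def drain(self) -> None:
        # Deliver queued notifications until none are left.
        while self.pending:
            topic = self.pending.popleft()
            for handler in self.subscribers.get(topic, []):
                handler()


def make_flaky_job(failures_before_success: int) -> Callable[[], bool]:
    """Return a job that fails a fixed number of times, then succeeds."""
    remaining = failures_before_success

    def job() -> bool:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return False
        return True

    return job


notifier = Notifier()
job = make_flaky_job(failures_before_success=1)
state = {"attempts": 0}


def run_job() -> None:
    """Run the job; on failure, post to the failure topic."""
    state["attempts"] += 1
    if not job():
        notifier.publish(FAILURE_TOPIC)


# The subscription re-runs the same job whenever the failure topic fires.
notifier.subscribe(FAILURE_TOPIC, run_job)

run_job()         # the scheduled run (fails once in this simulation)
notifier.drain()  # the subscription re-runs it until it succeeds
print(state["attempts"])
```

Note that in this simulation, as in the setup it models, a job that never succeeds would keep being re-run for as long as it keeps posting to the same topic.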

 

Hope this helps for now.

Best,

Nathan


Thanks @NathanAtSafe for sharing the workaround. I guess this will make my life a little bit easier :)


Hi @NathanAtSafe,

I set up a subscription and it is working fine, but I have one question about it.

What if there is a network outage or some other problem for a longer period? In that case FME Server will go into a loop and crash.

Is it possible to set the number of times it should retry before it quits?

 

Thanks in Advance

Manoj

 


 

Hi @gangwarmanoj,

Unfortunately we don't have built-in functionality for limiting retries... yet (this has been asked for before). If you want to avoid the possibility of a loop, have the workspace subscription post to a separate topic on failure. This guarantees one retry if the scheduled job fails. In theory you would need to add one workspace subscription and one failure topic for each retry you want. As the commenter in this linked Q&A above states, a retry counter that persists would add robustness to this kind of fail-over setup.
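
The chained-topic idea can be sketched in a few lines of Python. This is a rough model under stated assumptions, not FME Server configuration: the `JOB_FAILED_n` topic names and the helper functions are hypothetical, and each topic in the list stands for one failure topic plus the subscription that re-runs the job.

```python
from typing import Callable, List


def build_retry_chain(max_retries: int) -> List[str]:
    """One failure topic per allowed retry, e.g. JOB_FAILED_1, JOB_FAILED_2."""
    return [f"JOB_FAILED_{i}" for i in range(1, max_retries + 1)]


def run_with_chain(job: Callable[[], bool], topics: List[str]) -> List[str]:
    """Run the job once, then once more per topic in the chain while it fails.

    Returns a log of what happened, ending in 'success' or 'gave up'.
    """
    log: List[str] = []
    if job():
        return log + ["success"]
    for topic in topics:
        # The failed run posts to this topic; its subscription re-runs the job.
        log.append(f"posted to {topic}")
        if job():
            return log + ["success"]
    # The last subscription posts nowhere on failure, so the chain stops.
    return log + ["gave up"]


def always_fails() -> bool:
    return False


def make_job(outcomes: List[bool]) -> Callable[[], bool]:
    """Return a job that yields the given outcomes in order."""
    it = iter(outcomes)
    return lambda: next(it)


topics = build_retry_chain(max_retries=2)
print(run_with_chain(make_job([False, True]), topics))
print(run_with_chain(always_fails, topics))
```

With a chain of two topics, the job runs at most three times (the scheduled run plus two retries), so a prolonged outage ends in "gave up" rather than an endless loop.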

 

Hope this helps. Please let me know if you have any further questions.

 

Best,

 

Nathan
