
In a related topic I asked about APIs and JWT authorization:

 

Currently the only way to do this is via Python. I’ve got a script that I call via a SystemCaller, which runs the Python script in a conda environment (I’m not keen on installing extra Python packages in the FME locations). This works: I can fetch my bearer token to do API calls.
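
For illustration, a minimal sketch of such a script (the token endpoint, the environment-variable names, and the response shape are assumptions here; adapt them to your identity provider):

```python
# Hypothetical token-fetching script, run from a SystemCaller so FME can
# capture the bearer token from stdout.
import os
import sys

import requests

TOKEN_URL = "https://idp.example.com/oauth/token"  # assumed endpoint

def fetch_token() -> str:
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "client_credentials",
            "client_id": os.environ["CLIENT_ID"],          # assumed env vars
            "client_secret": os.environ["CLIENT_SECRET"],
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]  # assumed OAuth2-style response

if __name__ == "__main__":
    sys.stdout.write(fetch_token())
```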

However, the bearer token has an expiration time (1 hour).

So now I’m wondering how best to tackle this:

I want to do several HTTP calls in parallel (default 25), but once I’ve been using the token for 3500 seconds, I want to call my custom transformer (CT) to get a new bearer token. Any calls not yet processed (I need to make 3000+) should then use the new token. If I hit the 3500-second mark again, another refresh follows, and so on.
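
In plain Python, that requirement maps to a small thread-safe token cache that all parallel workers share and that refetches after a configurable age. A rough sketch of the idea, outside FME (the fetch callable would be whatever your token script exposes):

```python
import threading
import time
from typing import Callable

class TokenCache:
    """Serve a cached token; refetch once it is older than max_age_s."""

    def __init__(self, fetch: Callable[[], str], max_age_s: float = 3500.0):
        self._fetch = fetch
        self._max_age_s = max_age_s
        self._lock = threading.Lock()
        self._token: str | None = None
        self._fetched_at = 0.0

    def get(self) -> str:
        with self._lock:
            expired = time.monotonic() - self._fetched_at > self._max_age_s
            if self._token is None or expired:
                self._token = self._fetch()
                self._fetched_at = time.monotonic()
            return self._token

    def invalidate(self, bad_token: str) -> None:
        # Drop the cached token if it is the one that just got a 401,
        # so the next get() fetches a fresh one.
        with self._lock:
            if self._token == bad_token:
                self._token = None
```

The lock means only one worker actually refreshes; the other threads simply pick up the new token on their next get().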

So what would be a good way to achieve this kind of looping? The HTTPCaller has a limitation that if there is a loop port, multithreading must be set to 1, which defeats my purpose.

I also don’t want to duplicate transformers, because I don’t know how many iterations I would need; it might be 1 or 5 …

 

So currently I do one API call to find out how many features I need to get, e.g. 373,702. The limit per call is 100, so 3738 calls need to be made. I therefore clone my single record 3738 times and do the HTTP calls in parallel.
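
That count is just a ceiling division over the page limit, with one offset per cloned record:

```python
import math

total = 373_702   # from the initial "count" call
limit = 100       # features per call
calls = math.ceil(total / limit)             # 3738
offsets = [i * limit for i in range(calls)]  # one offset per clone
```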

Ideally, everything that is rejected with a 401 error (as in the screenshot above) should be looped through the BearerTokenGenerator again and reprocessed in the HTTPCaller. However, no multithreading is possible then.
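
Outside FME, that reject-and-retry idea is only a few lines on top of the TokenCache sketched earlier; on a 401 the token is invalidated and the call retried once with a fresh one:

```python
import requests

def call_with_retry(url: str, cache: "TokenCache") -> requests.Response:
    # Safe to run from 25 threads at once: the cache serialises the refresh.
    for _ in range(2):
        token = cache.get()
        resp = requests.get(
            url,
            headers={"Authorization": f"Bearer {token}"},
            timeout=60,
        )
        if resp.status_code != 401:
            break
        cache.invalidate(token)  # force a refresh, then retry once
    return resp
```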

 

 

Does anyone have some ideas about how to handle this?

One thought I had was to create groups of 500 or 1000 calls and put those in a CT as well, with a group-by clause. Per group (of 1000, so 4 groups in my example) I would get a token and would be able to run parallel threads (and probably even parallelize the CT itself, if the other server can handle that).
But it still seems dirty, as I don’t know whether 500 or 1000 per group would be OK or not.
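
As a sketch of that grouping idea (call_page and fetch_token are hypothetical stand-ins for the HTTP call and the token script):

```python
from concurrent.futures import ThreadPoolExecutor

def run_in_groups(offsets, call_page, fetch_token, group_size=1000, threads=25):
    for start in range(0, len(offsets), group_size):
        group = offsets[start:start + group_size]
        token = fetch_token()  # one fresh token per group
        with ThreadPoolExecutor(max_workers=threads) as pool:
            # list() forces completion and surfaces any exceptions
            list(pool.map(lambda off: call_page(off, token), group))
```

The group size only has to guarantee that one group finishes within the token lifetime; at 25 parallel calls, 1000 requests per group should leave plenty of headroom against a one-hour expiry, unless individual calls are very slow.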

So, does anyone have any other ideas for this?

Interesting question!

I’ve come up with the attached approach (2024.1). Note:

  • It’s high-level and conceptual; you will need to adjust it to suit
  • It will need some further testing with an HTTPCaller in parallel (I’ve done my best to replicate one)
  • It needs to be a linked custom transformer, as the solution includes looping and a blocking transformer (FeatureMerger)

 

There are two files attached: a workspace and a custom transformer.

 

The workspace

Pretty basic: it just emulates a bunch of features coming in and breaks them out of bulk mode.

Custom Transformer

This is where the spice is.

First loop

The features come in and one is sampled off to generate the token; the token is then merged back onto the rest of the features. In one branch we remove the ‘_expiry’ attribute, and in the other we expose it. This does nothing on the first run through.

All features are then passed through a VariableRetriever. Currently there is no value on that variable, so the resulting attribute has nothing on it.

Features then run through a Tester, checking whether the token we retrieved earlier is different from the value in the variable; for the first bunch of features this is true.

These features then call the API. Eventually, one of the calls fails because the token has expired (Tester).

This failed feature then goes to a VariableSetter and sets the variable to contain the failed token.

 

 

Features are still flowing through the upstream VariableRetriever, but now the variable contains the failed token, so when Tester_2 checks, all features now fail.

These skip calling the API (because we now know the token has expired) and get looped back to the start.

Subsequent loops

All the failed features from the first run now loop back to the start and run through the process again. However, this time they have the token attribute (_expiry) on them, so we remove it on one branch and expose it on the branch where we generate the token. The reason for this is that the Sampler is set to group by the token, so we only generate one new token when the previous one has expired and the unprocessed features loop back around to the start.
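
For anyone who finds it easier to read as code, the control flow above boils down to this plain-Python analogue (not the FME implementation, just the same logic; call_api is a hypothetical stand-in that returns False on a 401):

```python
from queue import Queue

expired_token = None  # plays the role of the VariableSetter/VariableRetriever pair

def process(feature, token, call_api, requeue: Queue):
    global expired_token
    if token == expired_token:        # Tester_2: token already known bad,
        requeue.put(feature)          # skip the API and loop back to the start
        return
    if not call_api(feature, token):  # Tester: the 401 branch
        expired_token = token         # VariableSetter: record the failed token
        requeue.put(feature)          # this feature loops back as well
```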


Is there much overhead in getting a new token? If it’s a pretty quick process and the system is not restrictive (I presume it won’t be), I would use something similar to that “group” approach you talk about. I’m also suspecting that your token service is not providing a token expiration in its response that you could use to identify when a token needs to be renewed?
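
If the service does provide it, a standard OAuth2 response carries “expires_in” in seconds, and the renewal check becomes trivial; a sketch assuming that response shape:

```python
import time

def token_deadline(token_response: dict, headroom_s: float = 60.0) -> float:
    # Renew a little early to leave headroom for calls already in flight.
    expires_in = float(token_response.get("expires_in", 3600))
    return time.monotonic() + expires_in - headroom_s
```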

Sometimes I will refresh limited-life tokens throughout a process, just because I can’t guarantee that they won’t exceed their lifespan. I do it on a schedule that I know will stay well under the token expiry. And yeah, it’s not as clean as we might like, but it’s hard to do anything else if the token service doesn’t provide good details of when a token will expire.

