Solved

HTTPCaller does not work with a simple GET URL


Hi,

Could someone explain why the HTTPCaller does not return any data for a simple GET request to this URL:

 

https://www.puntopack.es/buscar-el-punto-pack-mas-cercano/?codePays=ES&codePostal=32005

 

I'm not sure whether I need to add any parameters.

The REJECTED ERROR value: "HTTP/2 503"

_fme_rejection_code: "NO_RESULT"

 

I've uploaded a screenshot of the HTTPCaller configuration.

Many thanks in advance!

J.

 

Best answer by hkingsbury 22 June 2022, 00:20

7 replies

Userlevel 2
Badge +13

Hello @juanmahere, thanks for providing the URL! If I make the same GET request in Postman, I get a very similar result. It looks like this website doesn't like being scraped. After a quick search, it looks like they have an API you can use (see page 8; I think this is the most up-to-date doc, but it's worth double-checking). Would you mind trying the API? Best, Kailin.

Userlevel 5
Badge +29

Further to that, a 503 (Service Unavailable) status means the server is unable to handle the request, which normally suggests a problem on the server side. In this case, it looks like they are deliberately blocking scraping. You could possibly get around it by sending a different User-Agent or Referer header.
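
For anyone who wants to test the header idea outside FME first, here is a minimal Python sketch using only the standard library. It builds the same GET request with explicit User-Agent and Referer headers and reports the status code, so a 503 shows up as a value rather than an exception. The User-Agent string, the Referer value, and the helper names `build_request` and `fetch_status` are placeholders chosen for illustration, and there is no guarantee the site will accept the request.

```python
import urllib.error
import urllib.request

URL = (
    "https://www.puntopack.es/buscar-el-punto-pack-mas-cercano/"
    "?codePays=ES&codePostal=32005"
)


def build_request(url, user_agent, referer):
    """Build a GET request that carries explicit User-Agent and Referer headers."""
    headers = {"User-Agent": user_agent, "Referer": referer}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_status(request, timeout=30):
    """Send the request and return (status_code, body_text)."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses, such as the 503 in this thread, are raised as
        # HTTPError; the status code and body are still available on it.
        return err.code, err.read().decode("utf-8", errors="replace")


# Placeholder header values (assumptions): substitute whatever a real
# browser session sends.
request = build_request(
    URL,
    user_agent="Mozilla/5.0 (placeholder browser string)",
    referer="https://www.puntopack.es/",
)

# To actually send it (requires network access):
# status, body = fetch_status(request)
# print(status)
```

If a header combination works here, the same names and values can be entered in the Headers section of the HTTPCaller.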

Badge +11

Thanks @kailinatsafe and @hkingsbury for your tips.

I made some progress: from Postman, I can make the request using this structure (screenshot), but I can't reproduce it in FME. Any ideas?

[Screenshot: get_function2_postman]

Userlevel 5
Badge +29

Looking at the above, the only header you should need is requestverificationtoken. It goes in the Headers section of the HTTPCaller:

[Screenshot]

I would hope you don't need the Cookie header; if you do, that's going to need some additional calls and configuration.
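
As a rough illustration of what those additional calls might involve, here is a Python sketch of the token-handling step: pull a verification token out of a previously fetched page, then assemble the headers for the follow-up request. It assumes the token is rendered as a hidden form field named `__RequestVerificationToken` (the usual ASP.NET anti-forgery convention), which has not been confirmed for this site. The helper names and the sample HTML are made up for the example.

```python
import re

# Assumption: the token appears as a hidden <input> with the conventional
# ASP.NET anti-forgery name, and the name attribute comes before value.
# The real page may expose the token differently.
TOKEN_PATTERN = re.compile(
    r'<input[^>]*name="__RequestVerificationToken"[^>]*value="([^"]*)"',
    re.IGNORECASE,
)


def extract_token(html):
    """Return the token value found in the HTML, or None if it is absent."""
    match = TOKEN_PATTERN.search(html)
    return match.group(1) if match else None


def build_headers(token, cookie=None):
    """Assemble headers for the follow-up request; the cookie is optional."""
    headers = {"requestverificationtoken": token}
    if cookie:
        headers["Cookie"] = cookie
    return headers


# Made-up sample standing in for the body returned by a first GET request.
SAMPLE_HTML = (
    '<form><input name="__RequestVerificationToken" '
    'type="hidden" value="abc123" /></form>'
)

token = extract_token(SAMPLE_HTML)
headers = build_headers(token)
print(headers)
```

In a workspace, the equivalent would probably be one HTTPCaller to fetch the page, a StringSearcher or PythonCaller to extract the token into an attribute, and a second HTTPCaller that sends that attribute as a header.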

 

Badge +11

Hi @hkingsbury,

Thanks for replying. Unfortunately, I'm sending the request you suggested and I can't get it to go through.

Here is an example, passing the values as published parameters and using "requestverificationtoken" as a header:

[Screenshot: get_function_header]

Userlevel 5
Badge +29

Hmm, having a look at the site, it's sitting behind Cloudflare. The response I'm getting says that Cloudflare has blocked the request. I can get around it by setting a Referer, a Cookie, a User-Agent and the token. However, I imagine that both the token and the cookie are time-based and will eventually expire.

 

Basically, it's been set up to stop what you're doing, i.e. scraping. If you need the data, your best bet is to contact the company directly and ask whether they can supply it as a file or have a developer portal you can sign up to.

Badge +11

@hkingsbury many thanks for the investigation. As you suggested, it will probably be less painful to contact the company directly. That said, I was honestly more interested in the FME HTTPCaller setup than in the data.

Thanks everybody for your tips!

Juanma,