Solved

HTTPCaller does not work with a simple GET url


juanmahere
Supporter

Hi,

Could someone explain why the HTTPCaller does not return any data for a simple GET request to this website?

 

https://www.puntopack.es/buscar-el-punto-pack-mas-cercano/?codePays=ES&codePostal=32005

 

Not sure if there are any parameters that I'd need to add.

The REJECTED ERROR value: "HTTP/2 503"

_fme_rejection_code: "NO_RESULT"

 

Uploaded a screenshot of the HttpCaller.

Many thanks in advance!

J.

 


7 replies

kailinatsafe
Safer

Hello @juanmahere​ , thanks for providing the URL! If I try to make the same GET request in Postman, I get a pretty similar result. It looks like this website doesn't like people scraping. After a quick search, it looks like they have an API you can leverage (see Page 8, I think this is the most up to date doc, but it may be worth double checking). Would you mind trying to use the API? Best, Kailin.


hkingsbury
Celebrity
  • June 16, 2022

Further to that, a 503 (Service Unavailable) status indicates that the server is not able to handle the request, which would normally suggest a problem on the server side. In this case, though, it seems they are blocking scraping. You could possibly get around it by sending a different User-Agent or Referer header.
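
To make the header suggestion concrete, here is a minimal Python sketch that builds (but does not send) a GET request with custom User-Agent and Referer headers. The header values are placeholders chosen for illustration, not values known to satisfy this site, and `build_request` is a hypothetical helper name. In FME, the same name/value pairs would be entered in the headers section of the HTTPCaller.

```python
from urllib.request import Request

# URL taken from the original question.
URL = ("https://www.puntopack.es/buscar-el-punto-pack-mas-cercano/"
       "?codePays=ES&codePostal=32005")


def build_request(url, user_agent, referer):
    """Return a GET request carrying User-Agent and Referer headers."""
    headers = {"User-Agent": user_agent, "Referer": referer}
    return Request(url, headers=headers, method="GET")


# Placeholder values (assumptions): substitute whatever a real browser sends.
request = build_request(
    URL,
    user_agent="Mozilla/5.0 (placeholder)",
    referer="https://www.puntopack.es/",
)

# urllib stores header names via str.capitalize(), hence "User-agent".
print(request.get_method(), request.header_items())
```

Passing `request` to `urllib.request.urlopen` would actually send it; the site may still answer with a 503 if it checks more than these two headers.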


juanmahere
Supporter
  • Author
  • June 17, 2022

Thanks @kailinatsafe​  and @hkingsbury​  for your tips.

I made some progress: in Postman, the request succeeds with the structure shown in the screenshot, but I cannot reproduce it in FME. Any ideas?

[Screenshot: get_function2_postman]


hkingsbury
Celebrity
  • June 19, 2022

Looking at the above, the only header you should need is requestverificationtoken. This goes in the headers section of the HTTPCaller.

[image]

I would hope you don't need the cookie header. If you do, that's going to need some additional calls and configuration.
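
As a rough illustration of what those additional calls might involve, the Python sketch below pulls a token out of a previously fetched HTML page and assembles the headers for a follow-up request. It rests on assumptions not confirmed in this thread: that the token is exposed in a hidden form field named `__RequestVerificationToken` (a common ASP.NET convention) and that the cookie comes from the first response. `TokenFinder`, `extract_token`, `build_headers` and the sample HTML are hypothetical; only the `requestverificationtoken` header name comes from the posts above.

```python
from html.parser import HTMLParser


class TokenFinder(HTMLParser):
    """Record the value of the first <input> whose name equals field_name."""

    def __init__(self, field_name):
        super().__init__()
        self.field_name = field_name
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (self.token is None and tag == "input"
                and attributes.get("name") == self.field_name):
            self.token = attributes.get("value")


def extract_token(html, field_name="__RequestVerificationToken"):
    """Return the token found in html, or None if the field is absent."""
    finder = TokenFinder(field_name)
    finder.feed(html)
    return finder.token


def build_headers(token, cookie):
    """Headers for the second call: the token plus the session cookie."""
    return {"requestverificationtoken": token, "Cookie": cookie}


# Stand-in for the body of a first GET request (invented for illustration).
SAMPLE_HTML = (
    '<form><input type="hidden" '
    'name="__RequestVerificationToken" value="abc123"></form>'
)

token = extract_token(SAMPLE_HTML)
headers = build_headers(token, cookie="session=placeholder")
print(headers)
```

In an FME workspace, the equivalent would presumably be a first HTTPCaller to fetch the page, a step that extracts the token into an attribute, and a second HTTPCaller that sends it as a header.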

 


juanmahere
Supporter
  • Author
  • June 20, 2022

Hi @hkingsbury​ ,

Thanks for replying to this issue. Unfortunately, I sent the request as you suggested, and it still does not get through.

Here is an example, passing the values as published parameters and using "requestverificationtoken" as a header:

[Screenshot: get_function_header]

 

 


hkingsbury
Celebrity
  • Best Answer
  • June 21, 2022

Hmm, having a look at the site, it's sitting behind Cloudflare, and the response I'm getting says that Cloudflare has blocked the request. I can get around it by setting a referer, cookie, user-agent and the token. However, I imagine that both the token and the cookie are time-based and will expire eventually.

Basically, it has been set up to stop exactly what you're doing, i.e. scraping. If you need the data, your best bet is to contact the company directly and see whether they can supply it as a file or have a developer portal you can sign up to.


juanmahere
Supporter
  • Author
  • June 22, 2022

@hkingsbury, many thanks for the investigation. As you suggested, it will probably be less painful to contact the company directly. That said, I was honestly more interested in the FME HTTPCaller setup than in the data itself.

Thanks everybody for your tips!

Juanma

