All,

Need some help. I have an HTTPCaller hitting a webpage, but the response it brings back contains only part of the page. The first attached file is the response body from the HTTPCaller, and the second is the HTML copied from the Chrome developer tools. Do I just have a setting wrong?

 

Using FME 2021.1 with the default HTTPCaller settings for 'Get'

@Takashi Iijima​ 

Note that HTML can also contain JavaScript code that dynamically loads and modifies parts of the page, e.g. based on your browser type (phone vs. desktop browser), cookies (whether you have previously interacted with the page), your login, etc. The HTTPCaller only retrieves the "initial" state of the HTML; it won't execute any of the JavaScript the page references. The HTML you copy from the Chrome developer tools, on the other hand, is the page after those scripts have run, which is why it contains more.
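To make the difference concrete, here is a minimal Python sketch using only the standard library. The page content, the `results` id and the `/api/stations` path are all made up for illustration; the point is only that the served HTML can contain an empty placeholder that a script fills in later, and a plain HTTP GET sees just the placeholder.

```python
from html.parser import HTMLParser

# Hypothetical page (an assumption, not the poster's actual site): the <div>
# is empty in the served HTML and is only filled in once a browser runs the script.
INITIAL_HTML = """
<html><body>
  <h1>Station list</h1>
  <div id="results"></div>
  <script>
    fetch("/api/stations").then(r => r.json()).then(rows => {
      document.getElementById("results").innerText = rows.length + " stations";
    });
  </script>
</body></html>
"""


class ElementText(HTMLParser):
    """Collect the text found inside the element with a given id.

    Simplified: assumes the target element contains no unclosed void
    tags such as <br>.
    """

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0  # > 0 while the parser is inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_of(html, element_id):
    """Return the stripped text inside the element with the given id."""
    parser = ElementText(element_id)
    parser.feed(html)
    return "".join(parser.chunks).strip()


# Nothing executes the <script>, so the placeholder stays empty.
print(repr(text_of(INITIAL_HTML, "results")))
```

A parser such as BeautifulSoup would see the same empty element, because it also works on the HTML as served and does not run scripts.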


Thanks David! So the issue (I had no idea) could be the JavaScript. Is there another way in FME to grab it all? Or should I punt and try to learn and use BeautifulSoup? I'm pretty new to this, so thank you so much for your advice!


It's very likely that the data you want is coming from an API, which you can read directly with FME. To find that URL, open the developer tools in your browser (F12), switch to the Network tab, and refresh the page. Then look through the listed requests (the Fetch/XHR filter helps) for one whose response contains the data you're after.
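If you want to check such a URL outside FME first, a rough Python sketch could look like the following. The URL is a placeholder (an assumption), and it also assumes the endpoint returns JSON and needs no login, cookies or extra headers, which may not hold for your site.

```python
import json
import urllib.request

# Hypothetical endpoint copied from the browser's Network tab (an assumption).
API_URL = "https://example.com/api/stations?region=north"


def build_request(url):
    """Build a GET request that asks the server for JSON."""
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def parse_rows(body):
    """Decode a JSON response body into Python objects."""
    return json.loads(body)


def fetch_rows(url, timeout=30):
    """Call the endpoint and return the decoded JSON payload."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return parse_rows(resp.read().decode("utf-8"))


# Example call (needs network access and a real URL):
# rows = fetch_rows(API_URL)
```

In FME the equivalent would be to put that same URL into the HTTPCaller and then break the response apart with something like a JSONFragmenter or JSONFlattener, rather than parsing HTML at all.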

