Question

HTTPCaller not grabbing all the HTML code


All,

Need some help. I have an HTTPCaller hitting a webpage, but what it brings back is only a partial response body. The first file attached is the response body from the HTTPCaller and the second is from copying the HTML using the Chrome developer tools. Do I just have a setting wrong?

 

Using FME 2021.1 with the default HTTPCaller settings for 'Get'

@Takashi Iijima​ 

3 replies

david_r
Celebrity
  • July 14, 2023

Note that HTML can also contain JavaScript code that dynamically loads and modifies parts of the page itself, e.g. based on your browser type (phone vs desktop browser), cookies (have you previously interacted with this page), your login, etc. The HTTPCaller will only retrieve the "initial" state of the HTML; it won't execute any JavaScript code that the page references.
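One rough way to check whether this applies to your page is to look for script tags in the body the HTTPCaller returned. The sketch below uses only the Python standard library; the sample HTML, the `/static/app.js` path and the `loadResults()` call are made up for illustration and are not taken from the page in question. If the element you expect to hold the data is empty and scripts are present, the content is probably filled in by JavaScript after the initial load.

```python
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Collect <script> tags to hint at content that is loaded by JavaScript."""

    def __init__(self):
        super().__init__()
        self.external_scripts = []  # values of <script src="...">
        self.inline_scripts = 0     # <script> tags without a src attribute

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            self.external_scripts.append(src)
        else:
            self.inline_scripts += 1


def find_scripts(html_text):
    """Return (external script URLs, number of inline scripts) for an HTML string."""
    finder = ScriptFinder()
    finder.feed(html_text)
    return finder.external_scripts, finder.inline_scripts


# Hypothetical response body: the results container is empty until app.js fills it in.
sample_body = """
<html><body>
  <div id="results"></div>
  <script src="/static/app.js"></script>
  <script>loadResults();</script>
</body></html>
"""

external, inline = find_scripts(sample_body)
print(external, inline)
```

This only reports that scripts exist; it does not run them, which is the same limitation the HTTPCaller has.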


Thanks David! So the issue (had no idea) could be the Javascript. Is there another way in FME to grab it all? Or should I punt and try to learn and use BeautifulSoup? Pretty new to this so thank you so much for your advice!


hkingsbury
Celebrity
  • July 16, 2023
natehewes13 wrote:

Thanks David! So the issue (had no idea) could be the Javascript. Is there another way in FME to grab it all? Or should I punt and try to learn and use BeautifulSoup? Pretty new to this so thank you so much for your advice!

It's very likely that the data you want to read is coming from an API, which you can read with FME. To find that URL, open the developer tools in your browser (F12), go to the Network tab, refresh the page, and then look through the requests to see if you can find what you're looking for in there.
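Once you have found the request in the Network tab, calling it directly usually returns structured data (often JSON) rather than HTML. Here is a minimal Python sketch of that idea, assuming the endpoint returns a JSON object with a list under some key. The URL `https://example.com/api/items`, the `page`/`size` parameters and the `items` key are placeholders, not the real API behind your page; substitute whatever the Network tab shows.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_url(base_url, params):
    """Append query parameters to a base URL copied from the Network tab."""
    return f"{base_url}?{urlencode(params)}" if params else base_url


def parse_records(body_text, key):
    """Decode a JSON response body and return the list stored under `key`."""
    payload = json.loads(body_text)
    return payload.get(key, [])


def fetch_records(url, key, timeout=30):
    """GET the URL and return the records found under `key`."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_records(response.read().decode("utf-8"), key)


# Hypothetical endpoint and parameters; replace with the request you found.
url = build_url("https://example.com/api/items", {"page": 1, "size": 50})

# Offline check with a made-up response body in the assumed shape.
sample = '{"items": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}'
records = parse_records(sample, "items")
print(url, len(records))
```

With a real endpoint you would call `fetch_records(url, "items")`. In FME the same URL can be given to the HTTPCaller instead of the page URL, and the response body then parsed as JSON.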

