Question

Read data from a secured website

  • 3 May 2023

Hello,

I have several websites from which I need to read data, mostly stored in tables.

For the public websites, there is no issue, as I am using the 'HTML Reader'.

For the secured website, the sequence would need to:

  1. navigate first to the login page (https://members.embuild.be/fr/user/login),
  2. enter credentials,
  3. navigate to the secured page containing the desired table
  4. and read the html table.

After reading several posts, I designed the following workflow:

  1. HTTPCaller, URL: https://members.embuild.be/fr/user/login, method: POST, with body 'Multipart / Form Data' and two uploads (Name - Type - Value): name - String Upload - my@email.com, and pass - String Upload - myPassword.
  2. HTTPCaller (in series after the first one), URL: https://members.embuild.be/fr/chiffres-et-donnees/indices/cout-des-materiaux/indices-des-materiaux, method: GET, with the response body saved to an attribute.
  3. ...

Unfortunately, I cannot get past the first step. The HTTP status code is 200 OK, but the response body of the second HTTPCaller is still the HTML code of the login page.

Any advice or help on the sequence of readers/transformers and their configuration would be appreciated.

 


1 reply

Hi @thierrymeessen​,

In the first HTTPCaller, please try setting the parameter HTTP Client Options - Save Cookies to Yes. Subsequent HTTPCallers will use any cookies set by the first in their requests. This should mimic the way the authentication is passed from the initial web page.
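To show the cookie idea outside FME, here is a minimal Python sketch using only the standard library. It is an illustration, not how HTTPCaller is implemented. The function names `make_session` and `login_and_fetch` are made up for this example. The field names `name` and `pass` come from the question, and the sketch assumes the login form accepts a URL-encoded body with just those two fields (the workflow above uses multipart form data, and the real form may expect more fields).

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Build an opener that stores cookies and resends them on later requests."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login_and_fetch(opener, login_url, page_url, username, password):
    """POST the credentials, then GET the secured page with the same opener."""
    # Assumption: the form only needs 'name' and 'pass', sent URL-encoded.
    form = urllib.parse.urlencode({"name": username, "pass": password})
    with opener.open(login_url, data=form.encode("utf-8")) as response:
        response.read()  # any cookies set by the server are now in the jar
    # The same opener sends the stored cookies along with this GET.
    with opener.open(page_url) as response:
        return response.read().decode("utf-8", errors="replace")
```

The key point is that both requests go through the same opener, so the session cookie from the login response is sent with the second request. Two independent requests without a shared cookie store would typically get the login page again, which matches the symptom described in the question.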

If this works, save the response of the second HTTPCaller to a temp file with an AttributeFileWriter, then use a FeatureReader with the HTML Table reader to read the temp file.
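If you want to check what the saved HTML contains before wiring up the FeatureReader, a rough Python sketch like the one below can pull the rows out of a table. It is not what the HTML Table reader does internally. `TableParser` and `read_table` are names made up for this example, and it assumes simple, non-nested tables.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> in the document as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def read_table(html):
    """Return all table rows found in the HTML text as lists of cell strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

If the result holds the rows you expect, the second HTTPCaller returned the secured page. If it is empty, the response is probably still the login page.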
