Hi,

I have been asked whether FME can retrieve signal data from a site with no API. I have credentials to get into the site, and once logged in I can pass a list of site IDs to a URL to retrieve a CSV file of the data.

I'm having issues passing my credentials in the header of the HTTPCaller as UserName and Password to get past the login page.

Any idea if this is possible? This is the site in question:

https://www.detecdatapro2.com/login

Having no credentials to access the site, I can only make assumptions here.

 

I imagine that when you successfully log in, the site either returns a token of some sort that needs to be included on subsequent calls, or sets a cookie that proves you have access.

 

I also wonder whether you could try accessing the data URL directly with Basic authentication, specifying your username and password in FME.
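For reference, Basic authentication is nothing more than one request header built from the credentials. The Python sketch below shows how that header is formed; the function name is made up for illustration, and whether this particular site accepts Basic authentication at all is an open question.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header used by HTTP Basic authentication."""
    # The scheme is "username:password", base64-encoded, prefixed with "Basic ".
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Placeholder credentials, not real ones.
print(basic_auth_header("myuser", "mypassword"))
```

If the site only offers a form-based login page, it will most likely ignore this header and answer with the login page again.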

 

An even longer shot: have you tried putting UserName and Password in the header when you access the data URL?


Hi @nedwaterman​,

It may be possible. You can use Chrome's developer tools or Fiddler to track what the form on the page is POSTing back to the server when you enter the correct password, then reproduce that with an HTTPCaller transformer. If you set the HTTPCaller parameter Advanced - HTTP Client Options - Save Cookies to Yes, the authentication cookie received by the first HTTPCaller will be added to any subsequent HTTPCallers in the workspace.

You will likely need to do the same thing for the page where you enter the list of site IDs.

One thing that concerns me is that the initial login page has a '__RequestVerificationToken' entry that needs to be passed back, but it is pre-populated with a different value for each copy of the page. You may need to add an HTTPCaller at the very beginning to read the login page and extract that token, in order to submit it with your login.

The existence of the token seems designed to block an automated approach to obtaining the data from this site, so I would check that you are conforming to the service's terms and conditions.
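Outside FME, the sequence described above (read the login page, pull out the token, post the form back, then request the data with the same cookies) can be sketched in Python with only the standard library. This is a rough sketch, not a tested client for this site: the field names UserName, Password and __RequestVerificationToken come from this thread, while the helper names and the `data_url` argument are placeholders.

```python
import urllib.parse
import urllib.request
from html.parser import HTMLParser
from http.cookiejar import CookieJar

LOGIN_URL = "https://www.detecdatapro2.com/login"  # URL given in the question
TOKEN_FIELD = "__RequestVerificationToken"


class TokenFinder(HTMLParser):
    """Remember the value of the hidden token <input> if one is seen."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        fields = dict(attrs)
        if tag == "input" and fields.get("name") == TOKEN_FIELD:
            self.token = fields.get("value")


def extract_token(html):
    finder = TokenFinder()
    finder.feed(html)
    return finder.token


def make_opener():
    """An opener whose cookie jar carries cookies from one call to the next."""
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def fetch(opener, url, form=None):
    """GET when form is None, otherwise POST the form URL-encoded."""
    data = None if form is None else urllib.parse.urlencode(form).encode("ascii")
    with opener.open(url, data=data, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def login_and_download(opener, username, password, data_url):
    # 1. Read the login page and pull out the per-page token.
    token = extract_token(fetch(opener, LOGIN_URL))
    if token is None:
        raise ValueError("no token field found on the login page")
    # 2. Post the form back; cookies from step 1 travel with the same opener.
    fetch(opener, LOGIN_URL, {"UserName": username,
                              "Password": password,
                              TOKEN_FIELD: token})
    # 3. Request the data with the cookies set by the login.
    return fetch(opener, data_url)
```

A call would look like `login_and_download(make_opener(), "myuser", "mypassword", data_url)`, where `data_url` is whatever address the browser requests for the CSV. The single shared opener plays the same role as Save Cookies in the HTTPCaller.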


Thanks Dave & hkingsbury. I'll try the latter approach. I'm concerned that we want to use FME to bypass site security to extract data so will feed this back to my line management before I proceed.


I wouldn't say you're bypassing security (that would imply you can get in without credentials). You're just logging in through FME instead of a browser.


That's true - thanks for your help! I may be back for more advice.


I'm using this and am not getting past the login page:

[image]

From examination, the username field is UserName and the password field is Password, which I have also passed in the header, but no joy. This is a bit beyond my current skills, but any idea what I am doing wrong?

I can extract the __RequestVerificationToken, but only by grabbing the whole attribute:

[image]

I can split it out, but then I have no idea how to return it to the login page.
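As an illustration of the splitting step, a regular expression can isolate just the value once the whole `<input>` element is held as a string. The sketch below is in Python; the sample element and its token are invented, and the real markup on the login page may order or quote its attributes differently.

```python
import html
import re


def token_value(input_tag):
    """Return the contents of value="..." from one <input> element, or None."""
    match = re.search(r'\bvalue\s*=\s*"([^"]*)"', input_tag)
    # Unescape HTML entities in case the value contains any.
    return html.unescape(match.group(1)) if match else None


# Invented example of what the extracted attribute might hold.
tag = '<input name="__RequestVerificationToken" type="hidden" value="EXAMPLE-TOKEN" />'
print(token_value(tag))
```

The isolated value is then sent back as the `__RequestVerificationToken` field in the body of the login POST, alongside UserName and Password.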

 


When you pass the credentials through the header (not as Basic auth), what does the response say? There may be some hints in there.

 

I tried looking for platform documentation on detectronic/detecdatapro2 but didn't come up with much. Have you reached out to the provider and told them you want to pull data automatically? They may have ways to achieve that which are not publicly documented.


You need to duplicate the login page call, which is a POST with the body containing the username, password and token:

UserName=myuser&Password=mypassword&__RequestVerificationToken=2qPatvEjFf4RnGOaw8Sraazq1k5F62b9VB6VFanopOxeJobVXsDB8H6oLRzUJxU8vh_OHUTUrmBtJKmeMLr3OvCpeSw1cRqeIHI9wt9ReI41

Do not use the Authentication settings.
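To illustrate the body format, here is a small Python sketch that builds the same kind of string with proper URL-encoding. The function name and the sample values are placeholders. The Content-Type shown is the standard one for HTML form posts; I am assuming, not confirming, that this site expects it.

```python
from urllib.parse import parse_qs, urlencode

# Standard content type for an HTML form POST (assumed to apply here).
HEADERS = {"Content-Type": "application/x-www-form-urlencoded"}


def build_login_body(username, password, token):
    """URL-encode the three form fields into a single POST body string."""
    return urlencode({
        "UserName": username,
        "Password": password,
        "__RequestVerificationToken": token,
    })


# Placeholder values; characters such as space, & and + get percent-encoded.
body = build_login_body("myuser", "my pass&word", "abc+/=")
print(body)
print(parse_qs(body))  # decoding the body gives the original values back
```

Encoding matters mostly when the password or token contains characters such as `&`, `+` or `=`, which would otherwise be misread as field separators.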

[Screen Shot 2023-11-02 at 12.49.58 PM]

I am attaching an initial workspace to play with.


Thanks Dave. I've tried that and have passed all the credentials to the page (and subsequent pages) but all I get back in the _response_body is a copy of the login page. Guessing the security is even more knotty than I imagined!
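Getting the login page back usually means the server did not accept the login. One possible cause, and this is only an assumption about how this site works, is that the token is paired with a cookie set when the login page is first read. In that case the POST has to carry the cookies from that same GET, together with the token taken from that same response. A quick way to check each response is to test whether the login form is still present; the Python helper below is a hypothetical heuristic based on the field names mentioned in this thread.

```python
def looks_like_login_page(body):
    """Heuristic: True if the response still contains the login form."""
    lowered = body.lower()
    # Both markers come from the field names discussed in this thread.
    has_token_field = "__requestverificationtoken" in lowered
    has_password_field = 'name="password"' in lowered
    return has_token_field and has_password_field


# Invented sample responses for illustration.
login_html = ('<form><input name="__RequestVerificationToken" type="hidden" value="x">'
              '<input name="Password" type="password"></form>')
print(looks_like_login_page(login_html))
print(looks_like_login_page("site,value\n1,2\n"))
```

It may also be worth comparing the headers and cookies the browser sends on a successful login with those sent by the HTTPCaller, since any difference there is a likely suspect.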

