Question

XML and HTTPCaller and %EF%BB%BF


Hello,

 

 

I'm using HTTPCaller to call a Web Feature Service URL as a single feature, then using XMLFragmenter to pull out a further series of web feature services and generate a list of URLs. This approach has worked a few other times and works well in this instance, until I use a second HTTPCaller to call the URLs using the attribute generated from the XMLFragmenter (plus a bit of string replacement and concatenation).

 

 

At first, there is no apparent problem. Even in the Data Inspector attached to the HTTPCaller rejection port, the URL looks straightforward. However, pasting the URL directly into a browser fails, and copying and pasting the URL into Word reveals the characters %EF%BB%BF as a hidden part of the URL. It also shows up as a square (control character?) and the characters ï»¿ inside the URL when the _error attribute is viewed in the Data Inspector table.

 

 

 

As you'll have guessed by now, I am not a web developer and I'm fairly new to FME. But I understand this is a UTF-8 Byte Order Mark (the bytes EF BB BF), which shows up as ï»¿ when those bytes are read as ISO-8859-1. This is interesting, and I've learned something new, but how do I get rid of it? My aim is to generate a feature for each URL called from the second HTTPCaller transformer and do more XML digging from the _response_body attribute.
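
For readers unfamiliar with the BOM, the relationship between the three forms seen here (%EF%BB%BF, the bytes EF BB BF, and ï»¿) can be checked with a few lines of Python. This is only an illustrative sketch using the standard library; the variable names are made up for the example.

```python
from urllib.parse import quote

BOM = "\ufeff"  # U+FEFF, the Unicode byte order mark

utf8_bytes = BOM.encode("utf-8")            # the three bytes EF BB BF
percent_encoded = quote(BOM)                # how those bytes look inside a URL
misread = utf8_bytes.decode("iso-8859-1")   # the same bytes read as ISO-8859-1

print(utf8_bytes.hex(" ").upper())  # EF BB BF
print(percent_encoded)              # %EF%BB%BF
print(misread == "ï»¿")             # True
```

So the hidden character, the percent-encoded text and the odd-looking ï»¿ are all the same three bytes viewed in different ways.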

I've tried StringReplacer and basic regular expressions to no avail.

 

 

Grateful for any suggestions!

 

 

Regards,

 

 

Iain

7 replies

Userlevel 2
Badge +17
Hi Iain,

 

 

What kind of error messages have been logged by FME when the second HTTPCaller failed?

 

 

Takashi
Hello Takashi,

 


 

It's a 502 Bad Gateway error in the XML response body. The _error attribute from the rejection port contains the URL with the characters ï»¿.

 

 

Many thanks,

 

 

Iain

 

 
Userlevel 2
Badge +17
A 5xx error code indicates a server-side error. Usually a 4xx error would occur if the request itself was bad.

 

Can you access the server through the HTTPCaller if you enter the URL manually without BOM?

Yes... works fine in that case. 

(1) copy the URL from the attribute; (2) paste it into the web browser (it fails first time); (3) copy and paste into Word and remove the BOM; (4) paste into HTTPCaller

 

 

The BOM only appears to be exposed (and removable) once the website has been called.

Hmm... I wonder if I could put in a follow-on HTTPCaller, remove the exposed BOM characters and ping the website again. It seems a bit inelegant, and I'm not sure what is going on upstream.

 

 
Adding 2 StringReplacers to the rejection port (to take out the leading error message and the BOM characters), then using the copied and renamed error attribute for an HTTPCaller downstream, works, but it is very slow...
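
As an alternative to cleaning up after a failed request, the BOM could in principle be removed from the URL string before the request is sent. The following is a minimal Python sketch of that idea, not FME-specific code: `strip_bom`, `BOM_PATTERN` and the example URL are hypothetical names chosen for illustration, and how the function would be wired into a workspace is left open.

```python
import re

# The BOM in the three forms mentioned in this thread:
#   U+FEFF itself, its percent-encoded form (any letter case), and the
#   three characters produced when the UTF-8 bytes are read as ISO-8859-1.
BOM_PATTERN = re.compile("\ufeff|(?i:%EF%BB%BF)|\u00ef\u00bb\u00bf")


def strip_bom(url: str) -> str:
    """Return url with every BOM occurrence removed and outer whitespace trimmed."""
    return BOM_PATTERN.sub("", url).strip()


# Hypothetical URL, used only for illustration.
dirty = "\ufeffhttp://example.com/wfs?service=WFS&request=GetCapabilities"
print(strip_bom(dirty))
```

Matching all three forms is a deliberate choice: depending on where the string is inspected, the same mark may be present as an invisible character, as percent-encoded text, or as the mis-decoded ï»¿.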
Userlevel 2
Badge +17
Good to hear you got a workaround to remove the BOM. But I'm still not sure why and where the BOM has been added to the URL...
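
One possible origin, offered only as an assumption since nothing in this thread confirms it, is that an upstream document (for example the body returned by the first request) begins with a UTF-8 BOM and the mark is carried along into the text that later becomes the URL. The Python sketch below uses a made-up response body to show how a leading BOM survives plain UTF-8 decoding but is dropped by the `utf-8-sig` codec.

```python
import codecs

# Hypothetical response body that starts with a UTF-8 BOM.
raw = codecs.BOM_UTF8 + b"<url>http://example.com/wfs</url>"

kept = raw.decode("utf-8")         # BOM stays in the text as U+FEFF
dropped = raw.decode("utf-8-sig")  # a leading BOM is removed while decoding

print(raw.startswith(codecs.BOM_UTF8))  # True: the bytes begin with EF BB BF
print(kept.startswith("\ufeff"))        # True
print(dropped.startswith("\ufeff"))     # False
```

A check like the first `print` on the raw bytes at each stage would show where the mark first appears.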
Yes... it works now. Thanks for the suggestion about manually typing the URL into the HTTPCaller... this prompted the workaround idea. I'll try and work out what is going on upstream, but at least the process now functions. 
