Hi all, I am reading data from a web service (Creator - XMLTemplater - HTTPCaller) and the data I receive back is not in the format I expected. Do I need to play with the encoding, or is it something else?

I have truncated the XML to just a small section, included below.

I receive:

  <DocumentElement>
  <Results_x0020_for_x0020__x0027_Index_x003D_survey_x0026_StreetName_x003D_KASHMIR_x0020_RD_x0026_StreetNo_x003D_31_x0027_>
  <Global_x0020_Council_x0020_ID>20060926T154516_3_Rodney</Global_x0020_Council_x0020_ID>
  <Asset_x0020_No.>36848</Asset_x0020_No.>
  <Contract_x0020_No.>3853</Contract_x0020_No.>
  <Contractor>Project Max</Contractor>
  <Date_x0020_Inspected>26/09/2006</Date_x0020_Inspected>
  <Completion_x0020_Status>IC</Completion_x0020_Status>

I was expecting something more like the below, i.e. without the "_x0020_" etc. values:

  <DocumentElement>
  <Results for 'Index=survey&StreetName=KASHMIR RD&StreetNo=31'>
  <Global Council ID>20060926T154516_3_Rodney</Global Council ID>
  <Asset No.>36848</Asset No.>
  <Contract No.>3853</Contract No.>
  <Contractor>Project Max</Contractor>
  <Date Inspected>26/09/2006</Date Inspected>
  <Completion Status>IC</Completion Status>

Any hints or suggestions on how to remedy this?

Thanks, Steve

Hi @goatboy, how about using the StringPairReplacer?

Replacement Pairs: _x0020_ " " _x0026_ & _x0027_ ' _x003D_ =

Note: The resulting text will no longer be a valid XML document, i.e. XML transformers cannot be used to parse it.


Thanks @takashi. Any idea why the feed would be coming through with these codes? I am hoping to avoid replacing strings. As you mentioned, I am still keen to use the XML tools to parse the data later in the workspace.
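
For what it is worth, the `_xHHHH_` form matches what .NET's `XmlConvert.EncodeName` produces for characters that are not legal in XML element names, which would fit an `.asmx` service; that is an inference, not something confirmed in this thread. One way to keep the XML tools usable is to parse the document first and decode the names afterwards. The following is a minimal Python sketch of that idea, outside FME. The helper name `decode_name`, the four-hex-digit assumption, and the shortened sample (closing tag added, `Results...` wrapper left out) are illustrative choices, not part of the original feed.

```python
import re
import xml.etree.ElementTree as ET

# Matches escapes such as _x0020_ (exactly four hex digits is an assumption
# based on the sample in the question).
ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def decode_name(name: str) -> str:
    """Replace each _xHHHH_ escape with the character it stands for."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)


# Shortened fragment from the question; the closing tag is added here so
# that the sample is well-formed and can be parsed.
SAMPLE = """<DocumentElement>
<Asset_x0020_No.>36848</Asset_x0020_No.>
<Contractor>Project Max</Contractor>
<Date_x0020_Inspected>26/09/2006</Date_x0020_Inspected>
</DocumentElement>"""

# Parse while the names are still legal XML, then decode them as plain keys.
root = ET.fromstring(SAMPLE)
record = {decode_name(child.tag): child.text for child in root}
print(record)
```

Because the decoding happens after parsing, the document never has to contain element names with spaces, which XML does not allow.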


That looks like a variation of hex encoding.

Does your original data come from SharePoint?

It comes from an Amazon AWS service, I believe:

http://ec2-54-252-37-255.ap-southeast-2.compute.amazonaws.com/ImportService.asmx

Steve


Try a StringReplacer

Text to Match: _x00([0-9a-z]*)_

Replacement Text: %\\1

Followed by a TextDecoder with the Encoding Type set to URL (Percent Encoding)
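
For readers who want to check the idea outside FME, here is a rough Python equivalent of the two steps above: a regex replace that turns `_x00HH_` into `%HH`, followed by percent-decoding. The function name `decode_x00_escapes` is hypothetical, and the case-insensitive flag is an assumption added because the sample data uses uppercase hex digits (`_x003D_`).

```python
import re
from urllib.parse import unquote


def decode_x00_escapes(text: str) -> str:
    """Mimic the StringReplacer + TextDecoder (URL) steps in plain Python."""
    # Step 1: _x0020_ -> %20, _x003D_ -> %3D, and so on.
    # IGNORECASE is an assumption so that uppercase hex digits also match.
    percent_encoded = re.sub(
        r"_x00([0-9a-z]*)_", r"%\1", text, flags=re.IGNORECASE
    )
    # Step 2: percent-decode, as the URL (Percent Encoding) setting would.
    return unquote(percent_encoded)


print(decode_x00_escapes("Date_x0020_Inspected"))
print(decode_x00_escapes("Index_x003D_survey_x0026_StreetNo_x003D_31"))
```

Note that the second step also decodes any percent sequences that were already present in the text, and that this pattern only covers code points whose escape starts with `00`.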


Many thanks JDH, I will give that a try and report back.

Steve


Many thanks JDH, that seemed to work. I am still wondering why it came through like that, but that's a question for another day.

Thanks again

Steve


I actually changed my mind. I prefer:

Text to Match: _x([0-9a-z]*)_

Replacement Text: U+\\1

with the TextDecoder set to Unicode Code Point.

It covers the non-Latin characters better.
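
To illustrate why the code point route handles non-Latin characters, here is a small Python sketch that treats the hex digits between `_x` and `_` as a Unicode code point. It is not FME's TextDecoder, just the same idea under stated assumptions: the function name `decode_code_points` is hypothetical, and the pattern is narrowed to hex digits with at least one digit required.

```python
import re

# Hex digits between "_x" and "_"; case-insensitive because the feed in this
# thread uses uppercase digits (an assumption about feeds in general).
CODE_POINT = re.compile(r"_x([0-9a-f]+)_", re.IGNORECASE)


def decode_code_points(text: str) -> str:
    """Replace each _xHHHH_ escape with the character at that code point."""
    return CODE_POINT.sub(lambda m: chr(int(m.group(1), 16)), text)


print(decode_code_points("Completion_x0020_Status"))
print(decode_code_points("Stra_x00DF_e"))
print(decode_code_points("_x4E2D__x6587_"))
```

The last call uses escapes that do not begin with `00`, so a pattern anchored on `_x00` would leave them untouched, whereas the code point version decodes them.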
