
Hi there. I have an HTTPCaller making a SOAP request for me, because a moderately complex security header was needed and I couldn't figure out how to drive the SOAPSender to do the same, but that's not my problem. The response, when saved to a file, can be opened in 7z, and the properties show it as a zip file containing an XML file (the honey pot I'm after). The problem is that when I look at the actual file in Notepad++, it appears to be a multipart MIME message. The first few lines are:

------=_Part_11980_1370403347.1468922126650
Content-Type: application/xop+xml; charset=utf-8; type="text/xml"


<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Header>
<wsse:Security xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd" SOAP-ENV:mustUnderstand="1"><wsu:Timestamp xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd" wsu:Id="XWSSGID-1467538852463-1879353698"><wsu:Created>2016-07-19T09:55:26.650Z</wsu:Created><wsu:Expires>2016-07-19T10:00:26.650Z</wsu:Expires></wsu:Timestamp></wsse:Security></SOAP-ENV:Header><SOAP-ENV:Body><ns2:GetExtractResponse xmlns:ns2="http://ws.edubase.texunatech.com"><ns2:Extract><xop:Include xmlns:xop="http://www.w3.org/2004/08/xop/include" href="cid:b0cdd12c-cb91-4bcd-973a-086bf5de4139%40ws.edubase.texunatech.com"/></ns2:Extract></ns2:GetExtractResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>
------=_Part_11980_1370403347.1468922126650
Content-Type: application/octet-stream
Content-ID: <b0cdd12c-cb91-4bcd-973a-086bf5de4139@ws.edubase.texunatech.com>
Content-Transfer-Encoding: binary


PK  0óH        ............


............

------=_Part_11980_1370403347.1468922126650--

So my first problem is trying to figure out how to read this in with a feature reader. Anyone with a potential lead on how to solve this?

Looks like the binary part of the MIME block is a zip file (header "PK" is a giveaway). If you can extract that part you can probably save it using the Data File Writer and then read it back using the XML reader pointing at the XML file inside the zip.
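
If the zip bytes can be isolated, the save-and-read-back step could in principle also be done in memory, for example from a PythonCaller. A minimal sketch using Python's standard zipfile module; the function name read_xml_from_zip is made up for illustration, and it assumes the archive contains at least one member ending in .xml:

```python
import io
import zipfile


def read_xml_from_zip(zip_bytes):
    """Return (member_name, xml_bytes) for the first .xml member in the archive.

    Hypothetical helper: assumes zip_bytes is the isolated binary MIME part.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xml"):
                return name, archive.read(name)
    raise ValueError("no .xml member found in the archive")
```

The returned bytes could then be written out with whatever writer you prefer, or parsed directly.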

If you feel up to it, an option would be to use a PythonCaller and the MIME functionality of the email parsing module, it would give you the most control over the outcome I think.
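
A rough sketch of what the PythonCaller route could look like with the standard email package. It assumes you have the raw response body as bytes and the boundary string from the response's Content-Type header (the boundary is not stored in the body itself, so a small MIME header is prepended before parsing). The function name extract_attachments is made up for illustration:

```python
import email


def extract_attachments(body, boundary):
    """Return {content_id: payload_bytes} for each application/octet-stream part.

    body     -- raw HTTP response body as bytes (assumed multipart MIME)
    boundary -- boundary string taken from the HTTP Content-Type header
    """
    # Prepend a minimal header so the parser knows the boundary.
    header = (
        "MIME-Version: 1.0\r\n"
        'Content-Type: multipart/related; boundary="%s"\r\n\r\n' % boundary
    ).encode("ascii")
    message = email.message_from_bytes(header + body)

    attachments = {}
    for part in message.walk():
        if part.get_content_type() == "application/octet-stream":
            content_id = part.get("Content-ID", "").strip("<>")
            # decode=True returns the payload as bytes
            attachments[content_id] = part.get_payload(decode=True)
    return attachments
```

With the response shown above, the boundary would be the "----=_Part_..." string without the two leading dashes that mark each delimiter line.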

Another option might be to use a separate command-line tool to unpack the MIME message into its separate files; there are some options described here. You can use the SystemCaller transformer for this. This is probably the easiest solution.


I tried extracting only the MIME block containing the PK header with a RegxAttributeSplitter, since the response headers define the MIME separator. Something seems to go wrong with the encoding somewhere in that custom transformer (the value shown for the variable in debug changes), and the output is invalid, even though I stripped everything before the PK. I'll look into the RegxAttributeSplitter and see whether an encoding is being forced anywhere.
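
For what it's worth, that symptom would be consistent with the binary part being decoded as text somewhere along the way, though I can't confirm that is what happens inside the custom transformer. A sketch of the same split done purely on bytes in Python, so no character encoding is ever applied to the payload. The function name split_multipart is made up, and it takes the shortcut of assuming the delimiter string never occurs inside a payload:

```python
import re


def split_multipart(body, boundary):
    """Split a multipart body into a list of (headers, payload_bytes) tuples.

    body and boundary are both bytes; boundary is given without the
    leading "--". Simplified sketch, not a full MIME parser.
    """
    delimiter = b"--" + boundary
    parts = []
    for chunk in body.split(delimiter)[1:]:  # element 0 is the preamble
        if chunk.startswith(b"--"):  # closing delimiter "--boundary--"
            break
        chunk = re.sub(rb"\A\r?\n", b"", chunk)  # line break after the delimiter
        chunk = re.sub(rb"\r?\n\Z", b"", chunk)  # line break before the next one
        pieces = re.split(rb"\r?\n\r?\n", chunk, maxsplit=1)
        head = pieces[0]
        payload = pieces[1] if len(pieces) > 1 else b""
        headers = {}
        # latin-1 maps every byte to a character, so header decoding cannot fail
        for line in head.decode("latin-1").splitlines():
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        parts.append((headers, payload))
    return parts
```

The payload of the part whose content-type is application/octet-stream should then be the untouched zip bytes.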


Thanks for the help. Not the answer I was looking for, but considering my deadlines, it will have to do for now. Used the SystemCaller, against the response from the HTTPCaller saved as a file, to call 7zip to extract it to a directory in the project folder, so I can read in the XML via a feature reader. Very clunky solution but it will do for now (I hope).

