Hi,

How can I best read a generic (arbitrary) XML file without specifying any root tags etc. up front?

I.e. return the whole XML document as a single fragment.

Do I need to create some special XRS or xfMap templates, or can it be accomplished with the simpler Feature Paths?

Cheers

If you just want a single fragment you might as well use a TextLine reader and set it to read the whole file at once. You'll then get a single feature with one attribute containing your entire XML file. You wouldn't be able to do much with it though, so are you sure that's what you want?


One idea is to first analyze the XML for its tags and pass those to a second workspace or a FeatureReader to read the XML.
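
The "analyze the XML for its tags" step could be prototyped outside FME along these lines. This is only a sketch using Python's standard library; `count_tags` and the sample document are hypothetical names made up for illustration, and how the result is handed to a second workspace is left open.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(source):
    """Count how often each element tag occurs in an XML source.

    `source` may be a file path or a binary file-like object.
    Namespaced tags are reported in ElementTree's "{uri}local" form.
    """
    counts = Counter()
    # A "start" event fires once for every element, in document order,
    # so the first tag recorded is the root element.
    for _event, elem in ET.iterparse(source, events=("start",)):
        counts[elem.tag] += 1
    return counts


# Hypothetical sample document, used only to demonstrate the helper.
sample = b"<root><item id='1'><name>a</name></item><item id='2'/></root>"
tags = count_tags(io.BytesIO(sample))
root_tag = next(iter(tags))  # first tag seen is the root element
```

The root tag and the distinct tag names found this way are the kind of information a second reading step would need.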


Hi @lifalin2016, the XML reader can also be used.

When adding the reader, select Single Merged Feature Type under Workflow Options in the Add Reader dialog, and set the reader parameters as below.

  • Configuration Type: Feature Paths

  • Feature Paths Configuration/Elements to Match: //*

  • Flatten Options: Uncheck the Enable Flattening checkbox

Here, //* (two slashes and an asterisk) matches the document root element, whatever its name.

If you read an XML document with this setting, the reader feature type will output a single feature with an "xml_fragment" attribute, which stores the entire XML document.

[Addition] /* (a single slash and an asterisk) seems to work too.
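
To get a feel for what such a single "whole document" fragment looks like outside FME, here is a rough Python sketch using only the standard library. It does not reproduce how the XML reader works internally; `read_as_single_fragment` and the sample document are hypothetical names for illustration. Note that ElementTree serializes the root element only, so the XML declaration and any comments outside the root are not part of the result.

```python
import io
import xml.etree.ElementTree as ET


def read_as_single_fragment(source):
    """Parse an XML source without knowing its root tag in advance.

    Returns (root_tag, fragment), where fragment is the root element
    and everything below it serialized as one string.
    `source` may be a file path or a binary file-like object.
    """
    root = ET.parse(source).getroot()
    return root.tag, ET.tostring(root, encoding="unicode")


# Hypothetical sample document, used only to demonstrate the helper.
demo_tag, demo_fragment = read_as_single_fragment(
    io.BytesIO(b"<a><b>x</b></a>")
)
```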


Thanks Takashi, it worked.

I needed to read multiple XML files in a ZIP package to make sure they don't contain any errors before attempting a translation. Each XML file has its own schema, hence the need to read them "generically".

Unfortunately, the XML files in question are generated by an external tool outside our control that clearly doesn't validate its own output, so we have to make sure we don't waste hours trying to import them needlessly.

This approach may come in handy in other cases too :-)

Cheers
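
For comparison, the same kind of pre-check (is every XML file in a ZIP package at least well-formed?) can be sketched in plain Python. This is an illustration under assumptions, not the workspace used above: `check_zip_xml` and the member names are hypothetical, and the check covers well-formedness only, not validation against a schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_zip_xml(zip_source):
    """Check every .xml member of a ZIP archive for well-formedness.

    Returns a dict mapping member name to None if the member parsed
    cleanly, or to the parser's error message if it did not.
    `zip_source` may be a file path or a seekable binary file object.
    """
    results = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue  # ignore non-XML members
            with archive.open(name) as member:
                try:
                    # Consume the whole member; only success or failure matters.
                    for _ in ET.iterparse(member):
                        pass
                    results[name] = None
                except ET.ParseError as exc:
                    results[name] = str(exc)
    return results


# Build a small in-memory archive (hypothetical contents) to try it out.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("good.xml", "<a><b/></a>")
    archive.writestr("bad.xml", "<a><b></a>")  # mismatched closing tag
    archive.writestr("readme.txt", "not XML")
report = check_zip_xml(buffer)
```

Any member with a non-None entry in `report` can be rejected before a long import is started.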

 


Note to self: read carefully!