Skip to main content
Solved

XML Newbie - field names and how to parse


Forum|alt.badge.img

Howdy y’all!

XML newbie here and I have to build a workspace to parse a xml file. I’ve been messing around with the xml reader a bit this morning and cannot seem to get any results. One of the things I noticed immediately is that the field names seem to be only contained in the metadata. Can someone out there maybe show me how to parse these data and write them to something more favorable (excel, shapefile, geodatabase etc)?

Here’s a link to the xml file I’m trying to translate - https://drive.google.com/open?id=0B52_Im1hHfSbOWM2cnA0NDAxdGc

Thanks in advance for any/all help.

Best answer by deanatsafe

Hi @bbbutler ,

The XML tutorial Trent posted is a good place to start. Unfortunately the source XML you need to read from has a rather highly normalized, complex structure, making it a challenge to work with. Basically it has one metadata element with field names and then a series of row elements with value series that correspond to the field names in the metadata element. So what you end up with is a field name list and then a series of value lists. The problem then isn't so much reading the XML as it is transforming the lists that are read into flat FME features.

incidentxml-to-csv.zip

The attached workspace demonstrates 2 approaches. You can just read the row values into one list per feature, then use a ListExploder, AttributeCreator and Aggregator to pivot the list into col0 to col21. Then its a matter of manually mapping this to your destination schema. Or you can use a dynamic schema approach that reads the metadata and value lists, merges them into item{}.name and item{}.value - name / value pairs, and then uses the FeatureBuilder custom transformer from the Hub to generate records from these lists. Unfortunately, at the moment there is some python involved here as FME has limitations when trying to convert between unstructured and structured lists.

View original
Did this help you find an answer to your question?

trentatsafe
Safer
Forum|alt.badge.img+6

Hello @bbbutler,

 

Here is an XML tutorial that may help you get started with reading XML and transforming it:

 

https://knowledge.safe.com/articles/29553/tutorial-getting-started-with-xml.html

deanatsafe
Safer
Forum|alt.badge.img+6

Hi @bbbutler ,

The XML tutorial Trent posted is a good place to start. Unfortunately the source XML you need to read from has a rather highly normalized, complex structure, making it a challenge to work with. Basically it has one metadata element with field names and then a series of row elements with value series that correspond to the field names in the metadata element. So what you end up with is a field name list and then a series of value lists. The problem then isn't so much reading the XML as it is transforming the lists that are read into flat FME features.

incidentxml-to-csv.zip

The attached workspace demonstrates 2 approaches. You can just read the row values into one list per feature, then use a ListExploder, AttributeCreator and Aggregator to pivot the list into col0 to col21. Then its a matter of manually mapping this to your destination schema. Or you can use a dynamic schema approach that reads the metadata and value lists, merges them into item{}.name and item{}.value - name / value pairs, and then uses the FeatureBuilder custom transformer from the Hub to generate records from these lists. Unfortunately, at the moment there is some python involved here as FME has limitations when trying to convert between unstructured and structured lists.


takashi
Contributor
Forum|alt.badge.img+17
  • Contributor
  • August 9, 2017

If you renamed each <value> element to a proper item name, you could create your desired features easily by XML fragmentation and flattening - XMLFragmenter. Based on this idea, I created this workspace example:

xmlxqueryudpdater-xmlfragmenter-bulkattributerenam.fmwt (FME 2017.0.1.1)


Forum|alt.badge.img
takashi wrote:

If you renamed each <value> element to a proper item name, you could create your desired features easily by XML fragmentation and flattening - XMLFragmenter. Based on this idea, I created this workspace example:

xmlxqueryudpdater-xmlfragmenter-bulkattributerenam.fmwt (FME 2017.0.1.1)

Thank you @takashi this worked perfectly! You are amazing!

 


Forum|alt.badge.img
deanatsafe wrote:

Hi @bbbutler ,

The XML tutorial Trent posted is a good place to start. Unfortunately the source XML you need to read from has a rather highly normalized, complex structure, making it a challenge to work with. Basically it has one metadata element with field names and then a series of row elements with value series that correspond to the field names in the metadata element. So what you end up with is a field name list and then a series of value lists. The problem then isn't so much reading the XML as it is transforming the lists that are read into flat FME features.

incidentxml-to-csv.zip

The attached workspace demonstrates 2 approaches. You can just read the row values into one list per feature, then use a ListExploder, AttributeCreator and Aggregator to pivot the list into col0 to col21. Then its a matter of manually mapping this to your destination schema. Or you can use a dynamic schema approach that reads the metadata and value lists, merges them into item{}.name and item{}.value - name / value pairs, and then uses the FeatureBuilder custom transformer from the Hub to generate records from these lists. Unfortunately, at the moment there is some python involved here as FME has limitations when trying to convert between unstructured and structured lists.

Thanks @DeanAtSafe this is also awesome!

 


takashi
Contributor
Forum|alt.badge.img+17
  • Contributor
  • August 10, 2017
deanatsafe wrote:

Hi @bbbutler ,

The XML tutorial Trent posted is a good place to start. Unfortunately the source XML you need to read from has a rather highly normalized, complex structure, making it a challenge to work with. Basically it has one metadata element with field names and then a series of row elements with value series that correspond to the field names in the metadata element. So what you end up with is a field name list and then a series of value lists. The problem then isn't so much reading the XML as it is transforming the lists that are read into flat FME features.

incidentxml-to-csv.zip

The attached workspace demonstrates 2 approaches. You can just read the row values into one list per feature, then use a ListExploder, AttributeCreator and Aggregator to pivot the list into col0 to col21. Then its a matter of manually mapping this to your destination schema. Or you can use a dynamic schema approach that reads the metadata and value lists, merges them into item{}.name and item{}.value - name / value pairs, and then uses the FeatureBuilder custom transformer from the Hub to generate records from these lists. Unfortunately, at the moment there is some python involved here as FME has limitations when trying to convert between unstructured and structured lists.

Hi @DeanAtSafe,

 

 

> to convert between unstructured and structured lists.

 

 

I sometimes use a BulkAttributeRenamer to do that.

 

 

In the above example, it would be ideal if "item{}" could be hidden after renaming. However, unfortunately, currently there is no transformers that have the "Attributes/Lists to Hide" option, except the PythonCaller and the TclCaller. I hope the option parameter will be added to some other transformers too. See also these Ideas.

 


takashi
Contributor
Forum|alt.badge.img+17
  • Contributor
  • August 10, 2017
takashi wrote:

If you renamed each <value> element to a proper item name, you could create your desired features easily by XML fragmentation and flattening - XMLFragmenter. Based on this idea, I created this workspace example:

xmlxqueryudpdater-xmlfragmenter-bulkattributerenam.fmwt (FME 2017.0.1.1)

Another approach. JSON processing can be applied to convert two lists separately storing names and values to a set of regular attributes. See this workspace example to learn more.

 

jsontemplater-jsonflattener.fmwt (FME 2017.0.1.1)

 

 


takashi
Contributor
Forum|alt.badge.img+17
  • Contributor
  • August 10, 2017
takashi wrote:

If you renamed each <value> element to a proper item name, you could create your desired features easily by XML fragmentation and flattening - XMLFragmenter. Based on this idea, I created this workspace example:

xmlxqueryudpdater-xmlfragmenter-bulkattributerenam.fmwt (FME 2017.0.1.1)

One more. XQuery expression can convert XML to JSON directly.

 

xmlxqueryupdater-jsonfragmenter.fmwt (FME 2017.0.1.1)

 

 


Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings