Question

Converting repetitive XML attributes into tabular structure

  • 3 December 2018
  • 6 replies
  • 4 views

Badge +1

Hi,

I am trying to flatten information stored in XML into a tabular structure. Some attributes can be repetitive (for instance a water area name can be provided in different languages or different alphabets). My guess is that the XMLXQueryExploder transformer should be able to play a role there or maybe simply a ListExploder. I guess using a XQuery would be more appropriate but I didn't figure out the appropriate syntax yet. Basically I would need to extract the name information from the attached XML and populate these into a tabular structure such as shp or gdb table.

The desired layout would look like:

FeatureIdNameTypeISOLanguageCodeLanguageCodePrimaryNameIsoScriptCodeNameTextNotationAlphabetGeoActionTypeGeoView93e05831-25b5-4df2-b264-179c9b7914b2StandardNameENGEnglishTRUELatnPersian GulfLatinRemoveZAL93e05831-25b5-4df2-b264-179c9b7914b2StandardNameENGEnglishTRUELatnArabian GulfLatinAddZAL93e05831-25b5-4df2-b264-179c9b7914b2StandardNameARAArabicFALSEArabاÙ?Ø®Ù?Ù?ج اÙ?عربÙ?ArabicAddZAL00004153-3200-3c00-0000-000041dcf9acStandardNameUNDTransliterationTRUELatnIndian OceanLatin

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameZHOChineseFALSEHans�度�HanSimplified

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameZHOChineseFALSEHant�度�HanTraditional

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameENGEnglishFALSELatnIndian OceanLatin

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameFREFrenchFALSELatnOcéan IndienLatin

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameDEUGermanFALSELatnIndischer OzeanLatin

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameITAItalianFALSELatnOceano IndianoLatin

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameJPNJapaneseFALSEJpanã?¤ã?³ã??æ´?Japanese

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameKORKoreanFALSEKoreì�¸ë�?ì??Korean

 

 

 

 

 

 

 

 

 

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameRUSRussianFALSECyrlÐ?ндийÑ�кий Ð?кеанCyrillic

 

 

00004153-3200-3c00-0000-000041dcf9acStandardNameRUSRussianFALSELatnIndiyskiy OkeanLatin

 

 

 

 

 

 

 

 

 

 

 

00004436-3800-3c00-0000-0000434b2ce8 StandardNameDEUGermanTRUELatnJadebusenLatin

 

 

00004436-3800-3c00-0000-0000434b2ce8 AlternateNameDEUGermanFALSELatnOldoogrinneLatin

 

 

Each water area has its own feature id and this one is repeated for each name combination.

Could you suggest which approach is the most appropriate for doing this? Actually I would just need to explode each list (if there is one) and also the child lists in iterative way until all lists have been exploded.

Thank you!

Olivier


6 replies

Badge +6

Hello @olivier,

 

 

I would recommend trying to use the XMLFragmenter over the XMLQueryExtractor, it is both easier and less cryptic to use than the Extractor. Within the XMLFragmenter, you have the ability to exclude elements('elements to ignore'). This will require Flattening to be turned on, this is off by default.

The XMLFragmenter functions like an XML reader, an example of which can be found here: https://knowledge.safe.com/articles/28309/tutorial-getting-started-with-xml-reading.html

 

 

The trickiest part will be getting the Feature path correct for the XMLFragmenter.

 

 

I hope that helps!
Badge +1

Hello @olivier,

 

 

I would recommend trying to use the XMLFragmenter over the XMLQueryExtractor, it is both easier and less cryptic to use than the Extractor. Within the XMLFragmenter, you have the ability to exclude elements('elements to ignore'). This will require Flattening to be turned on, this is off by default.

The XMLFragmenter functions like an XML reader, an example of which can be found here: https://knowledge.safe.com/articles/28309/tutorial-getting-started-with-xml-reading.html

 

 

The trickiest part will be getting the Feature path correct for the XMLFragmenter.

 

 

I hope that helps!

Thank you @trentatsafe. I’ll look into the XMLFragmentor as soon as I can.

Userlevel 2
Badge +17

I guess: an element called "complexAttribute" having "TTOM-Name:NameSet.NameTransliterationSet" [Edited] as its "attributeType" attribute should be mapped to a record in the destination table. However, some element/attribute values from its parent node and sibling nodes should also be saved into corresponding fields.

If my guessing above was correct, combination of XMLFragmenter(s) and XMLXQueryExtractor(s) might help you. Also, if you could clarify mapping rules between each field in the destination table and element/attribute in the source XML document, we could suggest a more specific solution.

Userlevel 2
Badge +17

I guess: an element called "complexAttribute" having "TTOM-Name:NameSet.NameTransliterationSet" [Edited] as its "attributeType" attribute should be mapped to a record in the destination table. However, some element/attribute values from its parent node and sibling nodes should also be saved into corresponding fields.

If my guessing above was correct, combination of XMLFragmenter(s) and XMLXQueryExtractor(s) might help you. Also, if you could clarify mapping rules between each field in the destination table and element/attribute in the source XML document, we could suggest a more specific solution.

This is an example using pairs of XMLFragmenter and XMLXQueryExtracctor: xmlfragmenter-xmlxqueryextractor-example.fmwt (FME 2018.1.1.0)

Hope this helps.

Userlevel 2
Badge +17

I guess: an element called "complexAttribute" having "TTOM-Name:NameSet.NameTransliterationSet" [Edited] as its "attributeType" attribute should be mapped to a record in the destination table. However, some element/attribute values from its parent node and sibling nodes should also be saved into corresponding fields.

If my guessing above was correct, combination of XMLFragmenter(s) and XMLXQueryExtractor(s) might help you. Also, if you could clarify mapping rules between each field in the destination table and element/attribute in the source XML document, we could suggest a more specific solution.

Upgraded: xmlfragmenter-xmlxqueryextractor-example-2.fmwt (FME 2018.1.1.0)

Badge +1

Thank you so much @takashi ! This is exactly the solution I was looking for. Your workspace leads to the structure I needed. So the clue to this is using pairs of XMLFragmenter and XMLXQueryExtractor transformers. Your fmw template also allowed me to get a better understanding of the XQuery syntac in order to extract specific info.

Reply