Question

Converting repetitive XML attributes into tabular structure

5 years ago
3 December 2018
6 replies
4 views

olivier
Contributor
59 replies

Hi,

I am trying to flatten information stored in XML into a tabular structure. Some attributes can be repetitive (for instance a water area name can be provided in different languages or different alphabets). My guess is that the XMLXQueryExploder transformer should be able to play a role there or maybe simply a ListExploder. I guess using a XQuery would be more appropriate but I didn't figure out the appropriate syntax yet. Basically I would need to extract the name information from the attached XML and populate these into a tabular structure such as shp or gdb table.

The desired layout would look like:

FeatureIdNameTypeISOLanguageCodeLanguageCodePrimaryNameIsoScriptCodeNameTextNotationAlphabetGeoActionTypeGeoView93e05831-25b5-4df2-b264-179c9b7914b2StandardNameENGEnglishTRUELatnPersian GulfLatinRemoveZAL93e05831-25b5-4df2-b264-179c9b7914b2StandardNameENGEnglishTRUELatnArabian GulfLatinAddZAL93e05831-25b5-4df2-b264-179c9b7914b2StandardNameARAArabicFALSEArabØ§Ù?Ø®Ù?Ù?Ø¬ Ø§Ù?Ø¹Ø±Ø¨Ù?ArabicAddZAL00004153-3200-3c00-0000-000041dcf9acStandardNameUNDTransliterationTRUELatnIndian OceanLatin

00004153-3200-3c00-0000-000041dcf9acStandardNameZHOChineseFALSEHanså�°åº¦æ´?HanSimplified

00004153-3200-3c00-0000-000041dcf9acStandardNameZHOChineseFALSEHantå�°åº¦æ´?HanTraditional

00004153-3200-3c00-0000-000041dcf9acStandardNameENGEnglishFALSELatnIndian OceanLatin

00004153-3200-3c00-0000-000041dcf9acStandardNameFREFrenchFALSELatnOcéan IndienLatin

00004153-3200-3c00-0000-000041dcf9acStandardNameDEUGermanFALSELatnIndischer OzeanLatin

00004153-3200-3c00-0000-000041dcf9acStandardNameITAItalianFALSELatnOceano IndianoLatin

00004153-3200-3c00-0000-000041dcf9acStandardNameJPNJapaneseFALSEJpanã?¤ã?³ã??æ´?Japanese

00004153-3200-3c00-0000-000041dcf9acStandardNameKORKoreanFALSEKoreì�¸ë�?ì??Korean

…

00004153-3200-3c00-0000-000041dcf9acStandardNameRUSRussianFALSECyrlÐ?Ð½Ð´Ð¸Ð¹Ñ�ÐºÐ¸Ð¹ Ð?ÐºÐµÐ°Ð½Cyrillic

00004153-3200-3c00-0000-000041dcf9acStandardNameRUSRussianFALSELatnIndiyskiy OkeanLatin

…

00004436-3800-3c00-0000-0000434b2ce8 StandardNameDEUGermanTRUELatnJadebusenLatin

00004436-3800-3c00-0000-0000434b2ce8 AlternateNameDEUGermanFALSELatnOldoogrinneLatin

Each water area has its own feature id and this one is repeated for each name combination.

Could you suggest which approach is the most appropriate for doing this? Actually I would just need to explode each list (if there is one) and also the child lists in iterative way until all lists have been exploded.

Thank you!

Olivier

6 replies

Hello @olivier,

I would recommend trying to use the XMLFragmenter over the XMLQueryExtractor, it is both easier and less cryptic to use than the Extractor. Within the XMLFragmenter, you have the ability to exclude elements('elements to ignore'). This will require Flattening to be turned on, this is off by default.

The XMLFragmenter functions like an XML reader, an example of which can be found here: https://knowledge.safe.com/articles/28309/tutorial-getting-started-with-xml-reading.html

The trickiest part will be getting the Feature path correct for the XMLFragmenter.

I hope that helps!

olivier
Author
Contributor
59 replies
5 years ago
10 December 2018

Hello @olivier,

The XMLFragmenter functions like an XML reader, an example of which can be found here: https://knowledge.safe.com/articles/28309/tutorial-getting-started-with-xml-reading.html

The trickiest part will be getting the Feature path correct for the XMLFragmenter.

I hope that helps!

Thank you @trentatsafe. I’ll look into the XMLFragmentor as soon as I can.

Userlevel 2

+17

takashi
Contributor
7538 replies
5 years ago
11 December 2018

I guess: an element called "complexAttribute" having "TTOM-Name:NameSet.NameTransliterationSet" [Edited] as its "attributeType" attribute should be mapped to a record in the destination table. However, some element/attribute values from its parent node and sibling nodes should also be saved into corresponding fields.

If my guessing above was correct, combination of XMLFragmenter(s) and XMLXQueryExtractor(s) might help you. Also, if you could clarify mapping rules between each field in the destination table and element/attribute in the source XML document, we could suggest a more specific solution.

Userlevel 2

+17

takashi
Contributor
7538 replies
5 years ago
11 December 2018

This is an example using pairs of XMLFragmenter and XMLXQueryExtracctor: xmlfragmenter-xmlxqueryextractor-example.fmwt (FME 2018.1.1.0)

Hope this helps.

Userlevel 2

+17

takashi
Contributor
7538 replies
5 years ago
11 December 2018

Upgraded: xmlfragmenter-xmlxqueryextractor-example-2.fmwt (FME 2018.1.1.0)

olivier
Author
Contributor
59 replies
5 years ago
17 December 2018

Upgraded: xmlfragmenter-xmlxqueryextractor-example-2.fmwt (FME 2018.1.1.0)

Thank you so much @takashi ! This is exactly the solution I was looking for. Your workspace leads to the structure I needed. So the clue to this is using pairs of XMLFragmenter and XMLXQueryExtractor transformers. Your fmw template also allowed me to get a better understanding of the XQuery syntac in order to extract specific info.

Converting repetitive XML attributes into tabular structure

1 Attachment

6 replies

Reply

Community Stats

1 Attachment

Reply

Community Stats

Sign up

An FME Account is required to contribute

Login to the community

An FME Account is required to contribute

Scanning file for viruses.

This file cannot be downloaded