Solved

Getting attributes from a XML description extracted from a KML

  • February 7, 2019
  • 3 replies
  • 267 views


Hello everyone.

I know there are several questions posted about this same topic but I can't seem to get things working for my particular KML/XML description. I used this tutorial to get most of the way there: https://knowledge.safe.com/articles/19918/how-to-expose-feature-attributes-from-kml-tag.html

I posted there in a comment but I figured I'd start a new question since this is more specific.

I'm stuck on getting either the XQuery or the XML flattener to work. When I convert the description to XML, this is how it is formatted:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<b>V18279201018990_000.jpg</b><br />
<i>Latitude:</i> 37.424416°<br />
<i>Longitude:</i> -82.930091°<br />
<i>Roll:</i> -15.832788°<br />
<i>Pitch:</i> 1.726352°<br />
<i>Heading:</i> 161.467052;<br />
<i>Height:</i> 230.000<br />
<i>GPS Week sec:</i> 591967.285977;0.000
</body>
</html>

I need to extract the name of the image, Lat/Long, Pitch/Roll, and Height as attributes. I am just not sure how to pull values out of this particular format, as it is different from other examples I have seen.

Using the XML flattener, I can set "Elements to Match" to "body" and the attributes to expose to "b", and that gives me the name of the image. What I can't figure out is how to set it up to get the other attributes, since their values are not wrapped in tags of their own like in the tutorial example.

Thank You,

Justin

 

 



3 replies

takashi
Celebrity
  • February 7, 2019

There are a few possible ways. One is to use two StringSearchers to extract the attribute names and the values, and populate them into two separate lists.

This workflow saves attribute names and values into two lists called "_label{}.part" and "_value{}.part", for example.

Regular Expressions:

1st StringSearcher: <i>(.+?):</i>\s*.+?<|$
2nd StringSearcher: <i>.+?:</i>\s*(.+?)<|$


You can then create non-list attributes from the two lists. See this thread.

Dynamically create attributes using two lists: one with labels and one with values
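
For readers who want to check the two-list idea outside FME, here is a minimal Python sketch. It is not FME code: the sample string is shortened from the question, the names LABEL_RE, VALUE_RE and attributes are made up for illustration, and the expressions are an adaptation of the two above (the trailing "|$" alternative is dropped and re.DOTALL is assumed so that "." can cross line breaks).

```python
import re

# Shortened sample of the description body from the question.
description = (
    "<b>V18279201018990_000.jpg</b><br />\n"
    "<i>Latitude:</i> 37.424416°<br />\n"
    "<i>Longitude:</i> -82.930091°<br />\n"
    "<i>Height:</i> 230.000<br />\n"
    "<i>GPS Week sec:</i> 591967.285977;0.000\n"
    "</body>"
)

# Adapted from the two StringSearcher expressions: same capture groups,
# without the "|$" alternative. DOTALL is an assumption made here so the
# last value can be matched across the line break before </body>.
LABEL_RE = re.compile(r"<i>(.+?):</i>\s*.+?<", re.DOTALL)
VALUE_RE = re.compile(r"<i>.+?:</i>\s*(.+?)<", re.DOTALL)

# Two parallel lists, playing the role of _label{}.part and _value{}.part.
labels = LABEL_RE.findall(description)
values = [v.strip() for v in VALUE_RE.findall(description)]

# Pair the lists up by position to get one attribute per label.
attributes = dict(zip(labels, values))

for name, value in attributes.items():
    print(f"{name} = {value}")
```

Both expressions consume the same stretch of text on every match, so the two lists stay aligned by index; that alignment is what makes the later "labels plus values" pairing step safe.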


takashi
Celebrity
  • Best Answer
  • February 7, 2019

This is an advanced use of the XMLXQueryExtractor. It assumes that the HTML document is stored in an attribute called "_html" and that "http://www.w3.org/1999/xhtml" is declared as the default namespace in the document.

XQuery Expression:

declare default element namespace "http://www.w3.org/1999/xhtml";
fme:set-attribute('Image', data(//body/b)),
for $i in //body/i
let $name := replace(data($i), '^(.+?):?$', '$1')
let $value := normalize-space($i/following-sibling::text()[1])
return fme:set-attribute($name, $value)
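
To see what the query does step by step, the same logic can be sketched in plain Python with the standard library. This is only an illustration under assumptions, not FME code: the function name extract_attributes and the shortened sample document are hypothetical, the DOCTYPE is left out of the sample, and an element's .tail is used as a stand-in for following-sibling::text()[1].

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def extract_attributes(xhtml: str) -> dict:
    """Return {label: value} pairs from the <body> of the XHTML description."""
    ns = {"x": XHTML_NS}  # prefix "x" stands in for the default namespace
    root = ET.fromstring(xhtml)
    body = root.find("x:body", ns)

    # Equivalent of fme:set-attribute('Image', data(//body/b)).
    attributes = {"Image": body.findtext("x:b", default="", namespaces=ns)}

    for label in body.findall("x:i", ns):
        # Drop the trailing colon from the label, e.g. "Latitude:" -> "Latitude".
        name = (label.text or "").strip().rstrip(":")
        # .tail is the text right after </i>; collapsing whitespace
        # approximates normalize-space().
        value = " ".join((label.tail or "").split())
        attributes[name] = value
    return attributes


# Shortened sample based on the document in the question (DOCTYPE omitted).
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title></title></head><body>\n'
    "<b>V18279201018990_000.jpg</b><br />\n"
    "<i>Latitude:</i> 37.424416°<br />\n"
    "<i>Roll:</i> -15.832788°<br />\n"
    "<i>Height:</i> 230.000<br />\n"
    "<i>GPS Week sec:</i> 591967.285977;0.000\n"
    "</body></html>"
)

result = extract_attributes(sample)
for key, val in result.items():
    print(f"{key} = {val}")
```

The point to notice is that the values are not elements at all: each one is loose text sitting between an <i> label and the next <br />, which is why matching on element names alone only finds the image name.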

 



  • Author
  • February 8, 2019


Thank you, Takashi, that did the trick. I really appreciate your help with that query.