Question

How to expose KML tag with XQuery



I am trying to extract the KML tags from the KML below (structure) using the XMLXQueryExtractor transformer:

declare default element namespace "http://www.w3.org/1999/xhtml";

for $x in /html/body/table/tr/td

return fme:set-attribute($x/td[1]/text(),$x/td[2]/text())

The result is coming out empty. Any suggestions?

kml_description (encoded: UTF-16LE):

<html xmlns:fo="http://www.w3.org/1999/XSL/Format">
<body>
<table border="0" width="370" cellpadding="0" cellspacing="0">
<tr bgcolor="ffffff"> <th width="370" align="left"></th> <th width="370" align="left"></th> </tr>
<tr> <td bgcolor="#ffffff">Manhole ID</td> <td>50283792</td> </tr>
<tr> <td bgcolor="#ffffff">Life Cycle</td> <td>In-Service</td> </tr>
<tr> <td bgcolor="#ffffff">Manhole Size</td> <td></td> </tr>
<tr> <td bgcolor="#ffffff">Installation Date</td> <td>1/1/1111</td> </tr>
<tr> <td bgcolor="#ffffff">Drawing Number</td> <td>MHE2-009-552</td> </tr>
<tr> <td bgcolor="#ffffff">Grid</td> <td>4030420</td> </tr>
</table><br>
</body>
</html>

8 replies

itay
Supporter
  • March 22, 2018

Looks like HTML to me, so why not use the HTMLExtractor, or simply read it as XML and use the XML reader to extract the data?


takashi
Influencer
  • March 22, 2018
  • First, make sure that the default namespace declaration in your expression is correct. I can't tell whether it is, since the XML fragment you have posted is incomplete (I cannot see the root html tag and its default namespace declaration).
  • In your expression, the variable $x refers to the "td" elements selected by the XPath "/html/body/table/tr/td", and you then try to retrieve the values of its child elements called "td". Naturally the expression returns an empty result, since a "td" element doesn't have a child called "td". I think $x should be the "tr" element here.
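
The second point can be checked outside FME with a small Python sketch. It uses only the standard library, and the sample below is a cut-down, well-formed stand-in for the posted table (an assumption for illustration, not the real attribute value). It contrasts selecting "td" children of each "td" with reading the first and second "td" of each "tr".

```python
import xml.etree.ElementTree as ET

# Cut-down, well-formed stand-in for the posted table (assumed for illustration).
SAMPLE = """<html><body><table>
<tr><td>Manhole ID</td><td>50283792</td></tr>
<tr><td>Life Cycle</td><td>In-Service</td></tr>
</table></body></html>"""

root = ET.fromstring(SAMPLE)  # root is the <html> element

# Original approach: select every td, then look for td children of that td.
nested = [
    child
    for td in root.findall("./body/table/tr/td")
    for child in td.findall("./td")
]

# Suggested approach: select every tr, then read its first and second td.
pairs = [
    (tr.find("./td[1]").text, tr.find("./td[2]").text)
    for tr in root.findall("./body/table/tr")
]

print(nested)  # empty: no td has a td child
print(pairs)   # one (name, value) pair per row
```

The same reasoning applies to the XQuery: with $x bound to "tr", $x/td[1] and $x/td[2] have something to select.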


takashi
Influencer
  • March 22, 2018

I noticed that this <br> element in your HTML fragment is not closed.

</table><br></body>

That is, the HTML fragment is not valid XML, so the XMLXQueryExtractor cannot parse it. You will have to modify it into valid XML, or consider another approach, e.g. the HTMLExtractor.
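
To see why the unclosed <br> matters, here is a minimal Python sketch using the standard library XML parser as a stand-in for any strict XML parser (it is not how FME parses internally). The fragment is shortened from the posted one, and the helper name is_well_formed is hypothetical. A plain string replacement of "<br>" with "<br/>" is assumed to be enough for this particular fragment; inside FME, a comparable text replacement before the XMLXQueryExtractor would presumably serve the same purpose.

```python
import xml.etree.ElementTree as ET

# Shortened version of the posted fragment, keeping the unclosed <br>.
broken = (
    "<html><body><table>"
    "<tr><td>Grid</td><td>4030420</td></tr>"
    "</table><br></body></html>"
)

# Assumed repair: turn the unclosed <br> into a self-closing element.
repaired = broken.replace("<br>", "<br/>")


def is_well_formed(text: str) -> bool:
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(broken))    # the </body> tag does not match the open <br>
print(is_well_formed(repaired))
```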

Original poster, in reply to takashi:

This is the updated version of the XQuery, and it also comes out empty.

This data is coming from an attribute called kml_description (encoded: UTF-16LE).

declare default element namespace "http://www.w3.org/1999/xhtml";
for $x in /html/body/table/tr/td/table/tr
return fme:set-attribute($x/td[1]/text(),$x/td[2]/text())

takashi
Influencer
  • March 22, 2018

The HTML fragment you have posted is incomplete (the starting html tag is missing). Also, as I mentioned in the previous comment, the <br> element is not closed.

The XMLXQueryExtractor cannot parse it anyway.

If you modify the HTML fragment into a valid XML fragment, and if the default namespace is correct, this expression should work as expected.

declare default element namespace "http://www.w3.org/1999/xhtml";
for $x in /html/body/table/tr
where 1 lt fn:count($x/td)
return fme:set-attribute($x/td[1]/text(),$x/td[2]/text())
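
As a rough cross-check outside FME, the for/where/return logic can be mimicked in Python with the standard library. This is only a sketch under assumptions: the sample is a shortened, repaired stand-in for the posted table, the helper rows_to_attributes is hypothetical, and mapping an empty cell to an empty string is a choice made here, not something the XQuery does. The html tag shown in the question carries only an xmlns:fo declaration and no default namespace; if the real data looks the same, a query that declares the XHTML default element namespace would match nothing, which the second call illustrates.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Shortened, repaired stand-in for the posted table. Like the posted html tag,
# it declares xmlns:fo but no default namespace (an assumption about the data).
SAMPLE = """<html xmlns:fo="http://www.w3.org/1999/XSL/Format"><body><table>
<tr><th></th><th></th></tr>
<tr><td>Manhole ID</td><td>50283792</td></tr>
<tr><td>Manhole Size</td><td></td></tr>
</table><br/></body></html>"""


def rows_to_attributes(root, ns=""):
    """Collect one name/value pair per tr that has at least two td cells."""
    prefix = "{%s}" % ns if ns else ""
    attributes = {}
    for tr in root.findall(f"./{prefix}body/{prefix}table/{prefix}tr"):
        cells = tr.findall(f"./{prefix}td")
        if len(cells) > 1:  # same test as: where 1 lt fn:count($x/td)
            # Empty cells have no text; store "" for them in this sketch.
            attributes[cells[0].text] = cells[1].text or ""
    return attributes


root = ET.fromstring(SAMPLE)
no_namespace = rows_to_attributes(root)            # elements matched by plain names
xhtml_namespace = rows_to_attributes(root, XHTML)  # names looked up in the XHTML namespace

print(no_namespace)
print(xhtml_namespace)
```

The header row is skipped because it contains only "th" cells, which is what the where clause is for.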

 


Original poster, in reply to takashi:

The HTMLExtractor is working, but I am going to try your suggestion.

Thank you.

Original poster, in reply to takashi:

It is not working. The _result field is empty.

takashi
Influencer
  • March 22, 2018

That's fine. The expression sets new attributes internally and doesn't return any value.

Connect a Logger and/or an Inspector, and check the result in the Translation Log or in the Feature Information window of FME Data Inspector. If the expression works, you will see new attributes: "Manhole ID", "Life Cycle", and so on.
