Solved

How can I extract the html table from kml_description?



I am trying to extract the attributes of a point layer in a KMZ file. They are stored in an HTML table within kml_description.

I have tried the HTMLToXHTMLConverter, but the features always fail. I am currently using FME 2013 SP3. FME Data Inspector is able to read and list them under "Attributes".

I followed the instructions from:

https://knowledge.safe.com/articles/19918/how-to-expose-feature-attributes-from-kml-tag.html

Here is an example of the HTML code that comes between <description> and </description>:

<html lang="en">
<head><meta charset="utf8"/><style> * { margin: 0; } html, body { height: 100%; margin: 0; padding: 0; } h2 { text-align: center; background: #e5eCf9; } #summaryData { font-family: 'Helvetica Neue', Arial, Helvetica, sans-serif; font-size: 12px; width: 350px; } #summaryData th { padding-right: 20px; width: 40%; } .fire_title { background: #ffffff; margin-bottom: 15px; } .box { text-align: left; padding: 1.5em; padding-top: 1em; margin-bottom: 1.5em; background: #e5eCf9; width: 100%; } .spacer { height: 15px; } </style></head>
<body>
<div id="summaryData"><h2>Friday, June 3, 2016</h2>
<table>
<tr><th>Type</th><td>WF</td></tr></table><h2>Totals</h2>
<table>
<tr><th>Area Burned</th><td>28.22 acres</td></tr><tr/>
<tr><th>CO2</th><td>17.91 tons</td></tr>
<tr><th>CO</th><td>2.44 tons</td></tr>
<tr><th>PM10</th><td>0.23 tons</td></tr>
<tr><th>VOC</th><td>0.57 tons</td></tr>
<tr><th>SO2</th><td>0.01 tons</td></tr>
<tr><th>NOX</th><td>0.01 tons</td></tr>
<tr><th>NH3</th><td>0.04 tons</td></tr>
<tr><th>CH4</th><td>0.12 tons</td></tr>
<tr><th>PM25</th><td>0.2 tons</td></tr>
</table></div></body></html>


4 replies

takashi
Influencer
  • Best Answer
  • June 2, 2016

Hi @colin_forsyth, I was able to convert your sample HTML doc to an XHTML doc with the HTMLToXHTMLConverter transformer. However, the schema is different from the example in the article that you linked, so you will have to define your own XQuery expression, e.g.

(: For every row of every table, use the <th> text as the attribute name and the <td> text as its value. :)
declare default element namespace "http://www.w3.org/1999/xhtml";
for $x in /html/body/div/table
for $y in $x/tr
return fme:set-attribute($y/th/text(), $y/td/text())

If the first table (Type: WF) is not necessary:

(: Same as above, but read only the rows of the second table ("Totals"). :)
declare default element namespace "http://www.w3.org/1999/xhtml";
for $x in /html/body/div/table[2]/tr
return fme:set-attribute($x/th/text(), $x/td/text())
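
For anyone who wants to check the same logic outside FME, here is a minimal Python sketch of what the two XQuery expressions above do. It is not FME code and not part of the original answer. It assumes the description has already been converted to well-formed XHTML in the http://www.w3.org/1999/xhtml namespace; the `SAMPLE` string is a trimmed copy of the HTML from the question, and the names `table_rows_to_dict` and `second_table_only` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace map for the XHTML default namespace (same URI as in the XQuery).
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Trimmed sample; assumed to resemble the converter's well-formed XHTML output.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<div id="summaryData"><h2>Friday, June 3, 2016</h2>
<table><tr><th>Type</th><td>WF</td></tr></table><h2>Totals</h2>
<table>
<tr><th>Area Burned</th><td>28.22 acres</td></tr><tr/>
<tr><th>CO2</th><td>17.91 tons</td></tr>
<tr><th>CO</th><td>2.44 tons</td></tr>
</table></div></body></html>"""


def table_rows_to_dict(xhtml, second_table_only=False):
    """Map each row's <th> text to its <td> text (hypothetical helper)."""
    root = ET.fromstring(xhtml)
    # Equivalent of /html/body/div/table, relative to the <html> root.
    tables = root.findall("x:body/x:div/x:table", XHTML_NS)
    if second_table_only:
        tables = tables[1:2]  # equivalent of table[2] in the XQuery
    attributes = {}
    for table in tables:
        for row in table.findall("x:tr", XHTML_NS):
            name = row.findtext("x:th", default=None, namespaces=XHTML_NS)
            value = row.findtext("x:td", default="", namespaces=XHTML_NS)
            if name:  # skip empty rows such as <tr/>
                attributes[name.strip()] = value.strip()
    return attributes


if __name__ == "__main__":
    print(table_rows_to_dict(SAMPLE))
    print(table_rows_to_dict(SAMPLE, second_table_only=True))
```

The sketch skips rows without a `<th>` (such as the stray `<tr/>` in the sample). Whether `fme:set-attribute` needs a similar guard is something to verify in your own workspace.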


Thanks Takashi. It is working for me now with the XQuery.


I don't know if anyone else wants to take a crack at another KML that is difficult to parse, but I'm at a loss:

 

<?xml version="1.0" encoding="UTF-8"?><kml xmlns="http://www.opengis.net/kml/2.2"><Document><description>Area: Left: 885936.156160; Right: 926105.126601; Bottom: 15171486.649022; Top: 15190745.601094</description>

<Region><LatLonAltBox><north>41.7898664686</north>

<west>-89.7675684189</west>

<east>-89.6182244316</east>

<south>41.7405034708</south></LatLonAltBox></Region>

<name>spatialNET View Around Query Boundary: RF</name><Folder><description>Fiber Entities</description>

<visibility>1</visibility>

<open>0</open>

<name>Fiber Network</name><Folder><description>All Fiber Splice Cases in the target area.</description>

<visibility>1</visibility>

<Snippet></Snippet>

<open>0</open>

<name>Fiber Splice Cases</name><Placemark id="{SPLICE_CASE,10199669}"><description><![CDATA[

<img src="google_header.png"/>

<table>

<tr><td><h1>Splice Case: RFAVEA-F-DS08</h1></tr>

<tr><td><table>

<tr><td><h3>Attributes:</h3></td></tr>

<tr><td><table>

<tr><th>CLLI Code</th><td>None</td></tr><tr><th>Nodal Location:</th><td>920787.625332,15181617.4465</td></tr><tr><th>Entity status</th><td>Proposed<br/>Modified<br/>New<br/>Design Change</td></tr><tr><th>Account Code</th><td>None</td></tr><tr><th>Street Address</th><td>None</td></tr><tr><th>Billing Address</th><td>None</td></tr><tr><th>Number of Cables Spliced</th><td>0</td></tr><tr><th>Site Code</th><td>None</td></tr><tr><th>Designation</th><td>RFAVEA-F-DS08</td></tr><tr><th>Symbol Scale</th><td>None</td></tr><tr><th>Alternate Name</th><td>None</td></tr><tr><th>Construction Status</th><td>None</td></tr><tr><th>Location</th><td>None</td></tr><tr><th>Contact</th><td>None</td></tr><tr><th>Owner</th><td>None</td></tr><tr><th>Fiber Design Profile</th><td>None</td></tr><tr><th>Site Type</th><td>FOSC 450-B (24)</td></tr><tr><th>Type Description</th><td>TFD - Distribution Splice Case - B (24)</td></tr><tr><th>State</th><td>None</td></tr><tr><th>Town</th><td>None</td></tr><tr><th>ZIP Code</th><td>None</td></tr><tr><th>Nodal Rotation:</th><td>0.154072566076</td></tr><tr><th>Service Status Code</th><td>I</td></tr><tr><th>Service Status Date</th><td>None</td></tr><tr><th>Owning Drawing</th><td>None</td></tr><tr><th>ID codes for owner</th><td></td></tr><tr><th>Media Type</th><td>F</td></tr><tr><th>Incoming Cables</th><td><a href="#{FIBER_CABLE_UNCON,10200655};balloonFlyto">24-Armor SMode Loose: RFAVEA-F-DF08</a></td></tr><tr><th>Outgoing Cables</th><td></td></tr><tr><th>Passthrough Cables</th><td></td></tr><tr><th>Equipment Attribute 1</th><td>None</td></tr><tr><th>Equipment Attribute 2</th><td>None</td></tr><tr><th>Size of equipment</th><td>None</td></tr><tr><th>Equipment Type</th><td>None</td></tr><tr><th>Installation Date</th><td>None</td></tr><tr><th>Plant Owner</th><td>None</td></tr><tr><th>Calculated Latitude and Longitude</th><td> 41.767841/-89.638834</td></tr><tr><th>Noun (class descriptor)</th><td>Fiber Splice Case</td></tr><tr><th>Format (entity 
descriptor)</th><td>Splice Case: RFAVEA-F-DS08</td></tr><tr><th>Operational State</th><td>0</td></tr><tr><th>Operational State</th><td>In Service</td></tr><tr><th>Service Status</th><td>New</td></tr><tr><th>Workflow State</th><td>0</td></tr><tr><th>Workflow State</th><td>Real World</td></tr>

</table></td></tr>

 

<tr><td><h3>Documents:</h3></td></tr>

<tr><td><table>

 

</table></td></tr>

</table></td></tr>

</table>

<img src="google_footer.png" />

]]></description>
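
The balloon HTML above is not well-formed (the first row never closes its `<td>`), so an XML-based approach may reject it. As a rough illustration only, and not an FME solution, the sketch below uses Python's tolerant `html.parser` to pull `<th>`/`<td>` pairs out of such a fragment. It assumes the CDATA content has already been read out of `<description>` as a string. The class and function names, the shortened `FRAGMENT`, and the `|` separator for `<br/>` are hypothetical choices. Pairs are returned as a list because some headers, such as "Operational State", occur twice.

```python
from html.parser import HTMLParser

# Shortened copy of the balloon HTML from the post (assumed already taken out of CDATA).
FRAGMENT = """<table>
<tr><td><h1>Splice Case: RFAVEA-F-DS08</h1></tr>
<tr><td><table>
<tr><th>CLLI Code</th><td>None</td></tr>
<tr><th>Entity status</th><td>Proposed<br/>Modified<br/>New<br/>Design Change</td></tr>
<tr><th>Operational State</th><td>0</td></tr><tr><th>Operational State</th><td>In Service</td></tr>
</table></td></tr>
</table>"""


class ThTdPairParser(HTMLParser):
    """Collect (header, value) pairs from <tr><th>..</th><td>..</td></tr> rows."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pairs = []
        self._cell = None    # "th" or "td" while inside such a cell
        self._header = None  # text of the <th> seen in the current row, if any
        self._parts = []     # text chunks of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._header = None
        elif tag in ("th", "td"):
            self._cell = tag
            self._parts = []
        elif tag == "br" and self._cell:
            self._parts.append("|")  # assumed separator for multi-valued cells

    def handle_endtag(self, tag):
        if tag != self._cell:
            return  # ignore end tags that do not close the current cell
        text = " ".join("".join(self._parts).split())  # collapse whitespace
        if tag == "th":
            self._header = text
        elif self._header is not None:
            self.pairs.append((self._header, text))
        self._cell = None

    def handle_data(self, data):
        if self._cell:
            self._parts.append(data)


def extract_pairs(html_fragment):
    """Return the list of (header, value) pairs found in the fragment."""
    parser = ThTdPairParser()
    parser.feed(html_fragment)
    parser.close()
    return parser.pairs


if __name__ == "__main__":
    for header, value in extract_pairs(FRAGMENT):
        print(f"{header}: {value}")
```

Layout-only cells such as the `<h1>` and `<h3>` rows are dropped because they have no `<th>` in the same row.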


xiaomengatsafe
Safer

Hi @mwilliamson, welcome to the FME Forum. I'd strongly encourage you to post this as a separate new question. That way you can describe exactly what information you'd like to parse out of the KML in your scenario, and a new question will likely get more attention.

