Solved

Extracting attributes from a KML in odd format

  • 25 May 2023
  • 8 replies
  • 7 views

Hello, 

 

I am having trouble extracting information from a KML file. The information is stored in the kml_description attribute as shown below. What kind of transformer should I use to extract it?

 

<style type="text/css">td {white-space:nowrap} th {align:left} td.padr {padding-right:0.5cm}</style><table><tr><th></th><th></th></tr><tr><td class="padr">GPS Date:</td><td>03/29/17</td></tr><tr><td class="padr">GPS Time:</td><td>02:26:15pm</td></tr><tr><td class="padr">Datafile:</td><td>R032913A.cor</td></tr><tr><td class="padr">Avg Horz Prec:</td><td>0.9</td></tr><tr><td class="padr">Worst Horz Prec:</td><td>1.2</td></tr></table>

 

Best answer by daveatsafe 25 May 2023, 23:07

Userlevel 2
Badge +17

Hi @dchow​,

FME doesn't have a transformer to extract the table data from HTML, but it does have a reader that can. So you can write the kml_description attribute to a temporary file, then read it back in through a FeatureReader:

[Screenshot: Screen Shot 2023-05-25 at 2.00.38 PM]

I have attached the workspace snippet for you. It could be wrapped up in a custom transformer if there is enough interest. The folder created by the TempPathnameCreator is automatically cleaned up when the workspace finishes.
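For anyone who wants to check what the table in kml_description contains outside FME, here is a minimal Python sketch that turns the two-column rows into name/value pairs. It is not the attached workspace, just an illustration using the standard library; the names `TableRowParser` and `description_to_attributes` are made up for this example, and the sample string is a shortened copy of the description from the question.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being parsed, or None
        self._cell = None  # text fragments of the cell being parsed, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a <td>; CSS in <style> is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def description_to_attributes(description):
    """Hypothetical helper: map the first cell of each row to the second."""
    parser = TableRowParser()
    parser.feed(description)
    parser.close()
    attributes = {}
    for row in parser.rows:
        # Skip header rows (no <td> cells) and rows with an empty name.
        if len(row) >= 2 and row[0]:
            attributes[row[0].rstrip(":")] = row[1]
    return attributes


sample = (
    '<style type="text/css">td {white-space:nowrap}</style>'
    "<table><tr><th></th><th></th></tr>"
    '<tr><td class="padr">GPS Date:</td><td>03/29/17</td></tr>'
    '<tr><td class="padr">GPS Time:</td><td>02:26:15pm</td></tr>'
    '<tr><td class="padr">Avg Horz Prec:</td><td>0.9</td></tr>'
    "</table>"
)

attributes = description_to_attributes(sample)
for name, value in attributes.items():
    print(f"{name} = {value}")
```

A description with no table rows gives an empty dictionary, which is a quick way to spot features whose kml_description is laid out differently.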

Hi @daveatsafe​ 

 

When I run the workflow at the FeatureReader I get an error "HTML Table Reader: No lists or tables were found in the HTML document".

 

Userlevel 2
Badge +17

Because the temporary folder lasts only as long as the workspace is running, I don't think this workspace will work properly with Feature Caching. Please use Run - Run Entire Workspace to run the full workspace every time.

Hi @daveatsafe​ 

 

Is there an email address I can send the data to so you can test with it? I ran it using Run Entire Workspace, but I am not getting values in the attributes that were exposed.

Userlevel 2
Badge +17

Sure, please send the data to dave.campanas@safe.com.

Badge +10

You could look at using the HTMLExtractor to get the table information, if the table is always in the same format.

Badge +10

You could also use the method described here:

https://community.safe.com/s/article/how-to-expose-feature-attributes-from-kml-tag

with the following XQuery:

declare default element namespace "http://www.w3.org/1999/xhtml";
for $x in /html/body/table/tr 
return {
 if ($x/td[1] ne '') then fme:set-attribute($x/td[1]/text(),$x/td[2]/text()) 
 else ()
 }
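
To make the logic of the query easier to follow, here is a rough Python equivalent of the same loop. It is only a sketch and assumes the description has already been turned into well-formed XHTML with an html/body wrapper in the XHTML namespace, which is what the /html/body/table/tr path expects. The function name `rows_to_attributes` and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map; "x" is an arbitrary prefix chosen for this sketch.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def rows_to_attributes(xhtml_text):
    """Hypothetical helper mirroring the XQuery: td[1] is the name, td[2] the value."""
    root = ET.fromstring(xhtml_text)  # root is the <html> element
    attributes = {}
    for tr in root.findall("./x:body/x:table/x:tr", XHTML_NS):
        cells = tr.findall("x:td", XHTML_NS)
        # Same guard as the query: skip rows whose first cell is missing or empty.
        if len(cells) >= 2 and (cells[0].text or "").strip():
            attributes[cells[0].text.strip()] = (cells[1].text or "").strip()
    return attributes


sample_xhtml = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body><table>'
    "<tr><th/><th/></tr>"
    '<tr><td class="padr">GPS Date:</td><td>03/29/17</td></tr>'
    '<tr><td class="padr">Avg Horz Prec:</td><td>0.9</td></tr>'
    "</table></body></html>"
)

result = rows_to_attributes(sample_xhtml)
print(result)
```

As in the query, the attribute names keep the trailing colon from the first cell; strip it if you want cleaner names.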

Hi @ebygomm​ ,

 

Those are great examples. I worked with Dave and we figured out a solution using the workspace above; the data had a mix of tables in the description.

 

But I will keep these examples and try them out.

 

Thanks
