Question

How to split an HTML-formatted attribute (containing <> tags) into several attributes and extract only the required ones?

  • 6 August 2019


I am working with a KML file, converting it to our database format and extracting some information along with the shapes. One of the KML attributes holds the info that I need. It looks like this (it appears to be HTML):

<center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor='#E3E3F3'><th>AlertID</th><td>18112330</td></tr><tr bgcolor=''><th>DateTime</th><td>20181222124132.000</td></tr><tr bgcolor='#E3E3F3'><th>AlertRegionID</th><td>2111030</td></tr><tr bgcolor=''><th>RegionTypeID</th><td>1</td></tr></table></center>

 

From that one attribute, I'd like to extract the important info as separate attributes, without the tags, brackets and formatting:

AlertID 18112330

DateTime 20181222124132.000

AlertRegionID 2111030

RegionTypeID 1

1 reply


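Hi @lidiad, the text is an HTML fragment, but it is also well-formed, so it can be parsed as an XML fragment. I would use the XMLXQueryExtractor to extract the required attributes from it, assuming that a feature attribute called "_html" stores the HTML/XML fragment. The expression below skips the first table row (the "Attributes" heading) and, for every other row, creates an attribute named after the th text with the td text as its value.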

XQuery Expression:

for $tr at $i in //tr
where 1 < $i
return fme:set-attribute(data($tr/th), data($tr/td))
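
If you want to sanity-check the row-pairing logic outside FME, here is a minimal Python sketch of the same idea. It is not part of the workspace; the function name `extract_attributes` and the optional `wanted` filter (for keeping only the attributes you need) are made up for illustration, and it assumes the fragment is well-formed XML, as the sample is.

```python
import xml.etree.ElementTree as ET

# Sample fragment from the question. It is well-formed, so an XML parser accepts it.
html = (
    "<center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr>"
    "<tr bgcolor='#E3E3F3'><th>AlertID</th><td>18112330</td></tr>"
    "<tr bgcolor=''><th>DateTime</th><td>20181222124132.000</td></tr>"
    "<tr bgcolor='#E3E3F3'><th>AlertRegionID</th><td>2111030</td></tr>"
    "<tr bgcolor=''><th>RegionTypeID</th><td>1</td></tr></table></center>"
)


def extract_attributes(fragment, wanted=None):
    """Return {th text: td text} for every table row after the first.

    `wanted` is a hypothetical optional set of names to keep;
    None keeps every row.
    """
    root = ET.fromstring(fragment)
    result = {}
    for i, tr in enumerate(root.iter("tr"), start=1):
        if i == 1:
            continue  # skip the heading row, like "where 1 < $i" in the XQuery
        name = "".join(tr.find("th").itertext()).strip()
        value = "".join(tr.find("td").itertext()).strip()
        if wanted is None or name in wanted:
            result[name] = value
    return result


attributes = extract_attributes(html)
subset = extract_attributes(html, wanted={"AlertID", "DateTime"})
print(attributes)
print(subset)
```

The `wanted` set is just one way to keep only the required attributes; in the workspace itself you could get a similar result by removing the unwanted attributes after the extractor.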

