Question

Using HTML Extractor

  • December 16, 2019
  • 1 reply
  • 14 views

Hi, I am trying to extract values from an HTML page.

I would appreciate it if one of the experts could explain how to extract values from the following:

<tr data-group="Batu Pahat" data-group-2="Sekolah Agama Seri Chomel">

<td data-field="Daerah" class="ew-rpt-grp-field-1">

<span data-class="tpx1_1_Maklumat_Bencana_Daerah_Aktif_Banjir_Johor_PusatPemindahan">Sekolah Agama Seri Chomel</span></td>

<td data-field="Keluarga" class="ew-table-alt-row"><span>6</span></td>

 

I would like to extract the bolded values from the HTML page using the HTMLExtractor.

Your help would be greatly appreciated.

1 reply

takashi
Evangelist
  • December 16, 2019

Assuming that an attribute called "html" stores the source HTML document, the HTMLExtractor with this setting populates the values of all span elements into a list attribute _list{}. See the help on the transformer to learn more.
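
For readers who want to see the same idea outside of the transformer, here is a minimal Python sketch of "collect the text of every span element into a list". It is not how the HTMLExtractor is implemented; it only uses the standard library's html.parser, and the names SpanTextCollector, extract_span_values and sample_html are made up for illustration. The sample HTML is the fragment from the question, with closing tags added as an assumption.

```python
from html.parser import HTMLParser


class SpanTextCollector(HTMLParser):
    """Collects the text of every top-level <span> element, in document order."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of <span> elements currently open
        self.buffer = []  # text pieces of the span being read
        self.values = []  # finished span texts, playing the role of _list{}

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.depth += 1

    def handle_data(self, data):
        # Only keep text that sits inside a span.
        if self.depth > 0:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.values.append("".join(self.buffer).strip())
                self.buffer = []


def extract_span_values(html):
    """Return the text of all span elements found in the given HTML string."""
    collector = SpanTextCollector()
    collector.feed(html)
    collector.close()
    return collector.values


# Fragment from the question; the closing </tr> is assumed.
sample_html = """
<tr data-group="Batu Pahat" data-group-2="Sekolah Agama Seri Chomel">
<td data-field="Daerah" class="ew-rpt-grp-field-1">
<span data-class="tpx1_1_Maklumat_Bencana_Daerah_Aktif_Banjir_Johor_PusatPemindahan">Sekolah Agama Seri Chomel</span></td>
<td data-field="Keluarga" class="ew-table-alt-row"><span>6</span></td>
</tr>
"""

print(extract_span_values(sample_html))
# ['Sekolah Agama Seri Chomel', '6']
```

The resulting list holds one entry per span, in the order the spans appear in the document, which is the same shape as the _list{} attribute described above.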

