Question

Extract HTML table


pflegpet
Contributor

Hi,

I'm trying to extract the table from http://skpos.gku.sk/en/stanice.php with the HTML Table reader or the HTMLExtractor, but in the output I only see the column headers, and there is no list I could explode further. How can I get the complete table? Thank you for your help!

3 replies

takashi
Influencer
  • January 20, 2020

Hi @kasparlov, FME can read HTML content only if it is present statically in the HTML source. From what I can see, the table body on this site is generated dynamically by JavaScript in the client browser, so unfortunately I don't think you can read it directly with FME.
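
To check whether a page shows this symptom (header cells present in the source, body cells filled in later by a script), a rough diagnostic can count the relevant tags in the raw HTML. This is only a sketch using Python's standard `html.parser`; the class and function names (`TagCounter`, `looks_script_filled`) and the sample markup are made up for illustration and are not part of FME or of the site in question.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count <th>, <td> and <script> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = {"th": 0, "td": 0, "script": 0}

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lowercase.
        if tag in self.counts:
            self.counts[tag] += 1


def looks_script_filled(html_text):
    """Heuristic: header cells and scripts exist, but no body cells do."""
    counter = TagCounter()
    counter.feed(html_text)
    counter.close()
    c = counter.counts
    return c["th"] > 0 and c["td"] == 0 and c["script"] > 0


if __name__ == "__main__":
    # Hypothetical page source: only the header row is in the static HTML.
    source = (
        "<table><thead><tr><th>Station</th><th>Status</th></tr></thead>"
        "<tbody id='rows'></tbody></table>"
        "<script>fillTable();</script>"
    )
    print(looks_script_filled(source))
```

The check is a heuristic only: a page with an empty table and an unrelated script would also match, so a positive result is a hint to inspect the page in a browser, not a proof.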

A possible workaround is to first save the page as an HTML file using a web browser, and then read the table with the HTML Table reader or the HTMLExtractor.

In my quick test, the HTML file saved with Google Chrome could be read with FME. The header (column names) needs to be adjusted afterwards.
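
As a cross-check outside FME, the rows of a browser-saved page can also be pulled out with a few lines of Python. The following is a minimal sketch using only the standard library `html.parser`; the names `TableRowParser` and `extract_rows`, the file name `stanice.html`, and the sample station values are assumptions for illustration. It does not handle nested tables or `rowspan`/`colspan`.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def _close_cell(self):
        # Join the collected text fragments and store the finished cell.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing closing tag
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr" and self._row is not None:
            self._close_cell()
            self.rows.append(self._row)
            self._row = None


def extract_rows(html_text):
    """Return all table rows found in html_text as lists of strings."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    # For a real run, read the saved page instead, e.g.:
    #   html_text = open("stanice.html", encoding="utf-8").read()
    html_text = (
        "<table><tr><th>Station</th><th>City</th></tr>"
        "<tr><td> AAA1 </td><td>Town <b>X</b></td></tr></table>"
    )
    for row in extract_rows(html_text):
        print(row)
```

The first returned row holds the header cells, so the column names can be taken from `rows[0]` and renamed as needed, which mirrors the header adjustment mentioned above.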


pflegpet
Contributor
  • Author
  • January 20, 2020

Thank you, Takashi! Very helpful answer, as always :)


bruceharold
Contributor
  • January 21, 2020

Takashi, as always, has some good insights. Another workaround might be to investigate manually where the data is ultimately hosted and grab it from there. For example, clicking on one station name, I find this log link:

 

ftp://epncb.oma.be/pub/center/oper/GKU.OC

 

