
Hi,

I'm trying to extract the table from http://skpos.gku.sk/en/stanice.php with the HTML Table reader or an HTMLExtractor, but in the output I only see the column headers and there is no list I could explode further. How can I get the complete table? Thank you for your help!

Hi @kasparlov, FME can read HTML content only if it is present statically in the HTML source. From what I can see, the table body on that site is created dynamically by JavaScript in the client browser, so unfortunately I don't think you can read it directly with FME.

A possible workaround is to first save the page as an HTML file using a web browser, and then read the table with the HTML Table reader or the HTMLExtractor.

In my quick test, an HTML file saved with Google Chrome could be read with FME. The header (column names) will need to be modified afterwards.
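To illustrate why the saved file works where the raw page source does not, here is a minimal Python sketch (standard library only, outside FME). The two HTML strings are hypothetical stand-ins, not the real page content: one mimics a static source whose table body is still empty, the other mimics a page saved from the browser after the script has filled in the rows. The class and function names are my own.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def _flush_cell(self):
        # Close the open cell (if any) and append its text to the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._flush_cell()  # tolerate a missing closing cell tag
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag == "tr" and self._row is not None:
            self._flush_cell()
            self.rows.append(self._row)
            self._row = None


def table_rows(html):
    """Return all table rows found in an HTML string as lists of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical examples: the static source has headers only, while the
# browser-saved copy also contains the rows generated by JavaScript.
STATIC_SOURCE = (
    "<table><thead><tr><th>Station</th><th>Status</th></tr></thead>"
    "<tbody></tbody></table>"
)
SAVED_FROM_BROWSER = (
    "<table><thead><tr><th>Station</th><th>Status</th></tr></thead>"
    "<tbody><tr><td>AAAA</td><td>ok</td></tr></tbody></table>"
)

print(table_rows(STATIC_SOURCE))
print(table_rows(SAVED_FROM_BROWSER))
```

The first call yields only the header row, which matches what the reader returns for the live URL. The second also yields the data row, which is what you get once the browser has rendered and saved the page.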



Thank you, Takashi! Very helpful answer, as always :)


Takashi, as always, has some good insights. Another workaround might be to investigate manually where the data is ultimately hosted and grab it from there. For example, after clicking on one station name I found this log link:

 

ftp://epncb.oma.be/pub/center/oper/GKU.OC
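If you want to pull that file outside FME, a rough Python sketch with the standard `ftplib` module could look like the following. It assumes the server accepts anonymous logins and that the file is plain text. I have not verified either, and the function names and the `latin-1` default are my own choices.

```python
from ftplib import FTP
from urllib.parse import urlsplit


def split_ftp_url(url):
    """Split an ftp:// URL into (host, remote path)."""
    parts = urlsplit(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError("not an ftp:// URL: " + url)
    return parts.hostname, parts.path


def fetch_ftp_text(url, timeout=30, encoding="latin-1"):
    """Download one file over anonymous FTP and return it as text.

    Assumption: anonymous login is allowed and the file is text.
    """
    host, path = split_ftp_url(url)
    chunks = []
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # no arguments means anonymous login
        ftp.retrbinary("RETR " + path, chunks.append)
    return b"".join(chunks).decode(encoding)


if __name__ == "__main__":
    # Needs network access; prints the start of the downloaded file.
    text = fetch_ftp_text("ftp://epncb.oma.be/pub/center/oper/GKU.OC")
    print(text[:500])
```

The downloaded text could then be written to disk and read in FME with a text reader, if the content turns out to be what you need.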

 

