Hi,

I am trying to extract a table that is embedded within a webpage served by an endpoint. I can get the table via the HTML extractor as a lump of HTML, but what I want to do is write the data out as an Excel table.

One option is to write it out as an HTML file and then use an HTML table reader, but I would like to cut out the step of writing the file out and reading it back in.

In essence, I would like to work directly with the attribute that holds the snippet of HTML representing the table data I want to manipulate.

 

Would the approach outlined in this post: attributes-from-kml-tag @deanatsafe​ apply?

However, I think I am missing something, as I cannot get the last step to work.

Attached are the workspace and sample HTML data.

Regards

Justin.

 

url example:

https://www.nratrafficdata.ie/c2/tfdaysreport.asp?sgid=ZvyVmXU8jBt9PJE$c7UXt6&spid=NRA_000000001803&reportdate=2021-01-01&enddate=2021-01-08&dir=-1&dim1bin=7

 

You can use the URL directly in an HTML table reader.


Hi @ebygomm​, that is almost what I am looking for, but I have a feature class that contains the URL, which is built from parameters. This is a test case to illustrate what I would like to happen. The real workflow will need to handle tens to hundreds of calls to the endpoint with different parameters, so in effect it would need to take these URLs as inputs.

What this reader does is exactly what I would like to do with the HTML fragment I have, but I also need to use information from the feature in other parts of the workflow.
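
For anyone following along, here is a rough sketch in plain Python (not FME) of what "a URL built from parameters" could look like, one URL per feature. The `features` list and the `build_report_url` helper are hypothetical names for illustration; the parameter names and values are simply copied from the example URL above.

```python
from urllib.parse import urlencode

# Base address taken from the example URL in this thread.
BASE_URL = "https://www.nratrafficdata.ie/c2/tfdaysreport.asp"


def build_report_url(params):
    """Build one report URL from a dict of query parameters.

    safe="$" leaves the "$" in the sgid value unescaped so the result
    matches the example URL character for character.
    """
    return BASE_URL + "?" + urlencode(params, safe="$")


# Hypothetical stand-in for the feature class: one dict per feature.
features = [
    {
        "sgid": "ZvyVmXU8jBt9PJE$c7UXt6",
        "spid": "NRA_000000001803",
        "reportdate": "2021-01-01",
        "enddate": "2021-01-08",
        "dir": -1,
        "dim1bin": 7,
    },
]

# One URL per feature; the other attributes stay available on the dict.
urls = [build_report_url(feature) for feature in features]
```

Because each URL is derived from its own feature, the remaining attributes can still be carried through to later steps.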


So just use a FeatureReader with the HTML Table format if you need the URL to come from input features.


@ebygomm​ Thanks for your help. I'm new to this, but I don't think that will work. I have successfully got the HTML snippet as an attribute, which is what I want to push into an Excel sheet. From what I can tell, FeatureReader will get me to the same place I am currently stuck at. Anyway, I'm going to go with the two-part solution of writing the table data to an HTML file and then using the HTML Table reader in another workspace.


See the attached workspace. I downloaded your workspace and added the logic from the article you mentioned here: https://community.safe.com/s/article/how-to-expose-feature-attributes-from-kml-tag

I used the XMLFragmenter rather than the XQueryExtractor because the former is easier to configure and doesn't depend on an exact XQuery expression. The other trick is that there are several different ways to store HTML tables, and KML tables are structured slightly differently from your HTML table. So first, it's a matter of extracting the table into separate features with the XMLFragmenter. Next, we use the AttributeExposer to expose the newly extracted attributes we want to work on. Then we use an AttributeCreator to define the row values from td{0} to td{10} in your case. Finally, we write this out to CSV.
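
If it helps to see the same idea outside the workspace, below is a minimal Python sketch of the steps above: split the HTML fragment into rows, collect the cell values of each row, and write them out as CSV, all in memory with no intermediate file. It is not what the transformers do internally. The class and function names and the tiny sample table are made up for illustration, and it assumes a simple table with no nested tables.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(fragment):
    """Return the rows of an HTML table fragment as CSV text."""
    parser = TableRowParser()
    parser.feed(fragment)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample fragment standing in for the HTML attribute value.
sample = (
    "<table><tr><th>Day</th><th>Total</th></tr>"
    "<tr><td>Fri</td><td>1200</td></tr></table>"
)
csv_text = html_table_to_csv(sample)
```

Each inner list in `parser.rows` plays the role of the td{0} to td{10} values for one row, so a row with eleven cells would give a list of eleven strings.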

