Solved

Extracting webpage and map content from url

  • July 8, 2020
  • 2 replies
  • 13 views


Hi,

I would like to extract the project listing from the link below, along with each project's Lat/Long from the map content and its attributes, such as Project Title, Location, Project Type and Status.

url: https://iaac-aeic.gc.ca/050/evaluations/exploration?active=true&showMap=true&document_type=project

Sample project url: https://iaac-aeic.gc.ca/050/evaluations/proj/80774

I have limited past exposure to the HTMLExtractor, so any assistance will be much appreciated.


2 replies

jdh
  • Contributor
  • 2002 replies
  • Best Answer
  • July 9, 2020

 

Will get you most of the way there. You'll need to manipulate the lists to get usable attributes. The lat/long {0}.part can just be renamed, but you'll probably want the ListKeyValuePairExtractor custom transformer to get metadata attributes.
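
For anyone wondering what "manipulate the lists" amounts to, here is a rough Python sketch of the general idea, not of how ListKeyValuePairExtractor itself is implemented. It assumes the extracted values arrive as list-style attributes on a feature, represented here as a plain dict. The names meta{n}.key, meta{n}.value and coords{n}.part are hypothetical stand-ins, as are the functions list_to_attributes and rename_attribute; substitute whatever list names your own workspace produces.

```python
import re


def list_to_attributes(feature, list_name="meta", key_part="key", value_part="value"):
    """Turn list attributes like meta{0}.key / meta{0}.value into
    ordinary attributes named after each key (hypothetical naming)."""
    pattern = re.compile(r"^%s\{(\d+)\}\.(\w+)$" % re.escape(list_name))

    # Group the list elements by their index: {0: {"key": ..., "value": ...}, ...}
    rows = {}
    for name, value in feature.items():
        match = pattern.match(name)
        if match:
            rows.setdefault(int(match.group(1)), {})[match.group(2)] = value

    # Keep every attribute that is not part of the list, then add one
    # attribute per key/value pair, trimming whitespace and a trailing colon.
    flat = {k: v for k, v in feature.items() if not pattern.match(k)}
    for index in sorted(rows):
        row = rows[index]
        if key_part in row:
            key = str(row[key_part]).strip().rstrip(":")
            flat[key] = str(row.get(value_part, "")).strip()
    return flat


def rename_attribute(feature, old, new):
    """Return a copy of the feature with one attribute renamed."""
    renamed = dict(feature)
    if old in renamed:
        renamed[new] = renamed.pop(old)
    return renamed


# Made-up sample values, for illustration only.
sample = {
    "coords{0}.part": "54.1",
    "coords{1}.part": "-125.3",
    "meta{0}.key": "Project Type:",
    "meta{0}.value": " Mine ",
    "meta{1}.key": "Status",
    "meta{1}.value": "In progress",
}
result = list_to_attributes(sample)
result = rename_attribute(result, "coords{0}.part", "latitude")
result = rename_attribute(result, "coords{1}.part", "longitude")
```

In a workspace, the same two steps would normally be done with transformers rather than code; the sketch only shows why the key/value lists need pairing up while the lat/long elements need nothing more than a rename.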

  • Author
  • 11 replies
  • July 11, 2020

Thanks @jdh.