Solved

Extracting webpage and map content from url



Hi,

I would like to extract the project listing from the link below, together with each project's Lat/Long from the map content and its attributes, such as Project Title, Location, Project Type, and Status.

url: https://iaac-aeic.gc.ca/050/evaluations/exploration?active=true&showMap=true&document_type=project

Sample project url: https://iaac-aeic.gc.ca/050/evaluations/proj/80774

I have only limited experience with the HTMLExtractor, so any assistance would be much appreciated.


2 replies

jdh
  • Contributor
  • Best Answer
  • July 9, 2020

 

Will get you most of the way there. You'll need to manipulate the lists to get usable attributes. The lat/long {0}.part can just be renamed, but you'll probably want the ListKeyValuePairExtractor custom transformer to get metadata attributes.
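
For anyone who would rather do the list handling in a script, here is a minimal Python sketch of the two steps mentioned above: renaming a single `{0}.part` list element, and turning key/value list elements into ordinary attributes. It is not the ListKeyValuePairExtractor itself, only an illustration of the same idea. The list names (`coords`, `meta`), the `key`/`value` component names, and the placeholder values are all assumptions; the real names depend on how the extraction is configured.

```python
import re


def rename_attribute(feature, old_name, new_name):
    """Return a copy of the feature with one attribute renamed, if present."""
    out = dict(feature)
    if old_name in out:
        out[new_name] = out.pop(old_name)
    return out


def list_to_attributes(feature, list_name, key_part="key", value_part="value"):
    """Pair up name{i}.key / name{i}.value list elements into a plain dict."""
    pattern = re.compile(
        rf"^{re.escape(list_name)}\{{(\d+)\}}\.{re.escape(key_part)}$"
    )
    attributes = {}
    for attr_name, key in feature.items():
        match = pattern.match(attr_name)
        if not match:
            continue
        index = match.group(1)
        value_attr = f"{list_name}{{{index}}}.{value_part}"
        # Drop surrounding whitespace and a trailing colon from the label.
        clean_key = str(key).strip().rstrip(":")
        attributes[clean_key] = feature.get(value_attr)
    return attributes


# Hypothetical feature, shaped like flattened list attributes.
feature = {
    "coords{0}.part": "LATLONG_PLACEHOLDER",
    "meta{0}.key": "Project Type:",
    "meta{0}.value": "TYPE_PLACEHOLDER",
    "meta{1}.key": "Status",
    "meta{1}.value": "STATUS_PLACEHOLDER",
}

renamed = rename_attribute(feature, "coords{0}.part", "latlong")
metadata = list_to_attributes(feature, "meta")
```

After this, `renamed` carries the coordinate string under `latlong`, and `metadata` maps each label to its value, so both can be merged into one record per project.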

  • Author
  • July 11, 2020

Thanks @jdh.

