Solved

HTMLExtractor help appreciated

2 November 2021

bruceharold

Hi, I'm not proficient with CSS selector statements, having no HTML authoring background. I want the CSV download links at this site, and only the CSV download links:

https://data.sandiego.gov/datasets/parking-citations/

I can get all 'a' tags and their href values without a problem, but I then have to test whether each result ends with '.csv'. Can anyone give me the right selector syntax for the element I'm after? Thanks all.


Best answer by ebygomm 2 November 2021, 15:32


redgeographics
2 November 2021

Rather than tinker with CSS selectors I decided to go the easy way:

Extract all links from the entire web page, save them as a list, explode the list, and select the ones ending in .csv.
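For anyone who wants to try the same "grab every link, then filter" idea outside FME, here is a minimal sketch in Python using only the standard library. The names `LinkCollector` and `csv_links` and the sample markup are made up for illustration; the markup is not copied from the real page, so treat this as a rough outline rather than a tested scraper for that site.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def csv_links(html):
    """Return the hrefs of all <a> tags whose target ends in .csv."""
    collector = LinkCollector()
    collector.feed(html)
    # Case-insensitive suffix test, equivalent to the "ends with .csv" check
    return [link for link in collector.links if link.lower().endswith(".csv")]


# Hypothetical sample markup (assumption, not the real page source)
sample = """
<a href="/datasets/parking-citations/">About</a>
<a href="https://example.com/files/citations_2020.csv">CSV</a>
<a href="https://example.com/files/dictionary.pdf">PDF</a>
"""

print(csv_links(sample))
```

On the selector side, standard CSS does have a suffix match for attributes, `a[href$=".csv"]`, which selects only anchors whose href ends in .csv. Whether the HTMLExtractor accepts it in your FME version is something I'd check rather than assume.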

In addition to that, the filenames are actually predictable, so you don't really have to go through the HTMLExtractor at all...