Question

How can I read through all files on a web server


I'm trying to automate my downloads of government environmental data. Unfortunately the zipped shapefiles change name because they put the date in the file name. I've found that they all end up in this location though: http://jncc.defra.gov.uk/files

So my current thinking is to get FME to read this page and filter on the zip file name, then use the HTTPCaller to download the right zip. When I go to this page in my web browser it displays a bit like a table. Is there a way to get FME to read it as a table?

4 replies

david_r
Celebrity
  • April 18, 2019

Try using the HTMLExtractor:

This will return a list zip_files{} with one element for every zip file referenced (HREF) on that page.
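
For anyone who wants to see the same idea outside the transformer, for example in a PythonCaller or a standalone script, here is a minimal sketch of collecting the zip HREFs from a listing page. It is not how the HTMLExtractor works internally. The names ZipLinkParser, find_zip_links and name_contains are made up for illustration, and the sample HTML, file names and example.org URL are placeholders rather than the real JNCC listing.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ZipLinkParser(HTMLParser):
    """Collects the href of every <a> tag that points at a .zip file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.zip_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip anchors without an href and links that are not zip files.
        if href and href.lower().endswith(".zip"):
            # Resolve relative links against the page URL.
            self.zip_links.append(urljoin(self.base_url, href))


def find_zip_links(html, base_url, name_contains=""):
    """Return absolute URLs of zip links whose file name contains name_contains."""
    parser = ZipLinkParser(base_url)
    parser.feed(html)
    parser.close()
    needle = name_contains.lower()
    return [
        url for url in parser.zip_links
        if needle in url.rsplit("/", 1)[-1].lower()
    ]


# Hypothetical listing page; the real page content will differ.
sample_html = """
<table>
  <tr><td><a href="SAC_20190401.zip">SAC_20190401.zip</a></td></tr>
  <tr><td><a href="/files/SPA_20190315.ZIP">SPA_20190315.ZIP</a></td></tr>
  <tr><td><a href="readme.txt">readme.txt</a></td></tr>
  <tr><td><a name="top">no link here</a></td></tr>
</table>
"""

all_zips = find_zip_links(sample_html, "http://example.org/files/")
sac_only = find_zip_links(sample_html, "http://example.org/files/", "sac")
print(all_zips)
print(sac_only)
```

Filtering on a fixed part of the file name (here "sac") rather than the whole name is what lets the dated suffix change without breaking the download step. In practice the page HTML would come from an HTTP request instead of the sample string.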


david_r wrote:

Try using the HTMLExtractor:

This will return a list zip_files{} with one element for every zip file referenced (HREF) on that page.

This looks promising, thanks. At the moment I'm getting this error: HTMLExtractor: <type 'exceptions.RuntimeError'>: maximum recursion depth exceeded. Is the recursion limit in my settings?


david_r
Celebrity
  • April 18, 2019
whatahitson wrote:

This looks promising, thanks. At the moment I'm getting this error: HTMLExtractor: <type 'exceptions.RuntimeError'>: maximum recursion depth exceeded. Is the recursion limit in my settings?

Which version of FME are you using? I tested with FME 2019 and it worked fine.


david_r wrote:

Which version of FME are you using? I tested with FME 2019 and it worked fine.

2018. I was using 32-bit, but I'll give 64-bit a try.

