Question

How can I read through all files on a web server?

  • April 18, 2019
  • 4 replies
  • 69 views

I'm trying to automate my downloads of government environmental data. Unfortunately the zipped shapefiles change name because they put the date in the file name. I've found that they all end up in this location though: http://jncc.defra.gov.uk/files

So my current thinking is to get FME to read this page and filter on the zip file name, then use the HTTPCaller to download the right zip. When I go to this page in my web browser it displays a bit like a table. Is there a way to get FME to read it as a table?

 

 


4 replies

david_r
Celebrity
  • 8394 replies
  • April 18, 2019

Try using the HTMLExtractor:

This will return a list zip_files{} with one element for every zip file referenced (HREF) on that page.
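For anyone who wants to see the same idea outside a transformer, here is a minimal Python sketch of what "one element for every zip file referenced (HREF)" amounts to. It is not the HTMLExtractor's implementation, just a standard-library illustration. The class and function names, the sample HTML, and the file names in it (SAC_20190401.zip and so on) are all made up for the example; only the base URL comes from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ZipLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a .zip file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.zip_files = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".zip"):
            # Resolve relative links against the page URL
            self.zip_files.append(urljoin(self.base_url, href))


def find_zip_links(html, base_url, name_contains=""):
    """Return absolute zip URLs whose file name contains name_contains."""
    parser = ZipLinkParser(base_url)
    parser.feed(html)
    parser.close()
    needle = name_contains.lower()
    return [
        url for url in parser.zip_files
        if needle in url.rsplit("/", 1)[-1].lower()
    ]


# Hypothetical listing page; the real page layout and file names may differ.
sample_html = """
<table>
  <tr><td><a href="/files/SAC_20190401.zip">SAC_20190401.zip</a></td></tr>
  <tr><td><a href="/files/SPA_20190315.zip">SPA_20190315.zip</a></td></tr>
  <tr><td><a href="/files/readme.pdf">readme.pdf</a></td></tr>
</table>
"""
base = "http://jncc.defra.gov.uk/files"
links = find_zip_links(sample_html, base, name_contains="sac_")
print(links)
```

Filtering on a stable part of the file name (the hypothetical "sac_" prefix here) rather than the full name is what lets the workflow survive the date changing. In FME the equivalent would be exploding zip_files{} and testing each element before passing the URL on to the HTTPCaller.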


  • Author
  • 10 replies
  • April 18, 2019


This looks promising, thanks. At the moment I'm getting this error: HTMLExtractor: <type 'exceptions.RuntimeError'>: maximum recursion depth exceeded. Is the recursion limit set somewhere in my settings?


david_r
Celebrity
  • 8394 replies
  • April 18, 2019

Which version of FME are you using? I tested with FME 2019 and it worked fine.


  • Author
  • 10 replies
  • April 18, 2019

FME 2018. I was using the 32-bit version, but I'll give the 64-bit one a try.