Question

wfs downloader

  • 2 December 2016

We use the same datasets on a number of projects. These are sourced from http://environment.data.gov.uk/ds/catalogue/index.jsp#/catalogue, currently via manual download and management. We would like to build a process where users can set a tool running to download and manage the data. My thought is along these lines:

  1. Go to the website.
  2. Tick the WFS box for each layer you want on your project.
  3. As you do this, the WFS URL at the top of the page changes.
  4. Copy this URL and paste it into a pre-made workbench.
  5. Define the MBR of the project within the reader so only the relevant data is included. This could be done with a Clipper and a project polygon in the workbench, but then we would have to pull the entire WFS. Specifying an MBR in the reader instead moves the extent query to the server, greatly reducing the data we have to download and the time the final tool takes to run.
  6. Run the workbench. It would create a File Geodatabase (or any other format) from the services on the URL it has been given, whether that includes 1 layer or 100. It would also populate an Excel sheet or database recording which data was downloaded, when, from where, and other metadata-type attributes. (A sketch of what steps 2–6 amount to outside FME follows this list.)
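
For what it’s worth, a minimal sketch of what steps 2–6 amount to outside FME, assuming a placeholder endpoint, layer name and MBR (none of these are real values):

```python
# Minimal sketch: fetch one layer clipped to the project MBR and log what
# was downloaded. Endpoint, layer name and bbox below are placeholders.
import csv
import datetime

import requests

WFS_URL = "http://environment.data.gov.uk/wfs"   # placeholder endpoint
LAYER = "ea:flood_zones"                          # placeholder type name
BBOX = (400000, 300000, 410000, 310000)           # project MBR (example coords)

params = {
    "service": "WFS",
    "version": "2.0.0",
    "request": "GetFeature",
    "typeNames": LAYER,
    # Passing the MBR as a bbox parameter pushes the spatial filter to the
    # server, so only features intersecting the project area come back.
    "bbox": ",".join(map(str, BBOX)),
}
response = requests.get(WFS_URL, params=params, timeout=300)
response.raise_for_status()

with open("flood_zones.gml", "wb") as f:
    f.write(response.content)

# Step 6's metadata log: one row per layer downloaded.
with open("download_log.csv", "a", newline="") as f:
    csv.writer(f).writerow(
        [LAYER, datetime.datetime.now().isoformat(), WFS_URL, len(response.content)]
    )
```

The key point is step 5: sending the MBR with the request makes the server do the spatial filtering, so the client never downloads features outside the project area.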

The issues I’m having include (amongst others):

  • FME wanting to read the content of the WFS service before adding anything to the workbench. Some of these services hold hundreds of thousands of records, so this initial read of everything can take a long time. I don’t really want the workbench to read every feature; the feature types are already defined. Maybe this is related to ignoring the schema? Without ticking that box, though, nothing happens at all.
  • When I paste a new URL into an existing workbench, the new URL is not read and none of the feature type changes are recognised.
  • Related to the first two: you have to set the number of features to read. If this is too small, it only reports the feature types it encountered before the limit was reached, even though the constraints box of the reader tells it to read all the layers. Once read in, the feature types don’t seem to be re-readable.
  • When I make the workbench dynamic it no longer writes any point geometry. WFS points seem to be multipoint, but an fGDB defaults to point, so they are ignored (the sketch after this list shows the multipoint-to-point collapse involved).
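
To show that last issue concretely, this is the collapse I believe the point geometry needs before a point feature class will accept it, sketched outside FME with shapely and a made-up coordinate:

```python
# Illustration only: collapsing the single-member multipoints a WFS delivers
# into plain points, so a point feature class will accept them.
from shapely.geometry import MultiPoint, Point

def coerce_to_point(geom):
    """Collapse a one-member multipoint to a point; leave other geometry alone."""
    if isinstance(geom, MultiPoint) and len(geom.geoms) == 1:
        return Point(geom.geoms[0])
    return geom

mp = MultiPoint([(531000, 181000)])   # what the WFS typically delivers
print(coerce_to_point(mp))            # POINT (531000 181000)
```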

My thought is this has to be dynamic: adding each layer separately, while it would simplify the build process, would be very time-consuming and a nightmare to keep up to date. I can make multiple layers translate to an unfussy format such as TAB, but when I introduce a format such as fGDB I can’t write the output. And when I change the URL it all falls apart. Do you think what I’m trying to do is even possible?


2 replies


Hi @har40428,

I think it is possible to achieve. For the feature type issue, I would first query the service for its feature types before passing them on to the reader for the service itself.
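
For example, a GetCapabilities request returns the feature type list without touching any actual features. A rough sketch (the endpoint is a placeholder; the namespace is the standard WFS 2.0 one):

```python
# Ask the service for its feature types via GetCapabilities instead of
# letting the reader scan actual features. Endpoint is a placeholder.
import xml.etree.ElementTree as ET

import requests

WFS_URL = "http://environment.data.gov.uk/wfs"   # placeholder endpoint

params = {"service": "WFS", "version": "2.0.0", "request": "GetCapabilities"}
tree = ET.fromstring(requests.get(WFS_URL, params=params, timeout=60).content)

ns = {"wfs": "http://www.opengis.net/wfs/2.0"}
for ft in tree.findall(".//wfs:FeatureTypeList/wfs:FeatureType", ns):
    print(ft.findtext("wfs:Name", namespaces=ns))
```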

The issue with the number of features can possibly be solved by using the built-in response paging in the WFS reader, using version 2.0.0:

http://impossibleaddress2find.blogspot.nl/2016/11/wtwfs.html
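
Under the hood that paging is just two GetFeature parameters, count and startIndex. A quick illustration (the layer name is a placeholder):

```python
# WFS 2.0.0 response paging in a nutshell: "count" caps the page size and
# "startIndex" sets the offset. The layer name is a placeholder.
page_1 = {"service": "WFS", "version": "2.0.0", "request": "GetFeature",
          "typeNames": "ea:flood_zones", "count": 20000, "startIndex": 0}
page_2 = dict(page_1, startIndex=20000)   # the next 20,000 features
```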

To make this complex solution work, I suggest reaching out to Safe for guidance on how to accomplish it.

Good luck!


Hi @har40428,

Adding to itay's comments, maybe try using the FeatureReader transformer with the WFS reader set to version 2.0.0.

You would then set the "Max Features" and "Count" parameters to the same constant, say 20000 (or vary them depending on how many pages you want per WFS instance), to constrain the number of features per WFS reader instance, and make the "Start Index" parameter settable from an attribute value passed from the "Initiator" into the FeatureReader.

The first "Initiator" feature has its "Start Index" set to "0", which fetches the first 20,000 features read from the WFS instance. The second "Initiator" has its "Start Index" set to "20000" for the next 20,000 features, the third to "40000" for the 20,000 after that, and so on.

I believe this should work. I quickly tried it with http://wfs.data.linz.govt.nz:80/0aaff66e22aa46d696b6c47029198e2b/v/x1694/wfs?SERVICE=WFS&REQUEST=GetCapabilities&VERSION=2.0.0; for testing I set "Max Features" and "Count" to "100", then varied "Start Index" for each batch of 100 features.

Alternatively, depending on your use case, you could use a single "Initiator" feature and set "Max Features", "Count", and "Start Index" accordingly; in that case one WFS reader keeps paging until "Max Features" is reached.
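
In plain terms, that single-reader variant amounts to the loop below; the endpoint and layer are placeholders, and PAGE mirrors the reader's "Count"/"Max Features" setting:

```python
# Sketch of one reader paging to completion: keep raising startIndex until a
# page comes back short. Endpoint and layer name are placeholders.
import xml.etree.ElementTree as ET

import requests

WFS_URL = "http://environment.data.gov.uk/wfs"   # placeholder
LAYER = "ea:flood_zones"                          # placeholder
PAGE = 20000                                      # Count / Max Features

start = 0
while True:
    params = {"service": "WFS", "version": "2.0.0", "request": "GetFeature",
              "typeNames": LAYER, "count": PAGE, "startIndex": start}
    resp = requests.get(WFS_URL, params=params, timeout=300)
    resp.raise_for_status()

    # WFS 2.0 feature collections report numberReturned on the root element.
    returned = ET.fromstring(resp.content).get("numberReturned", "0")
    n = int(returned) if returned.isdigit() else 0
    print(f"startIndex={start}: {n} features")

    if n < PAGE:          # a short (or empty) page means we have everything
        break
    start += PAGE
```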

Hope this helps.
