Skip to main content

I have a workbench that reads features from a large database to find feature that match a certain set of criteria. I only need 2000 features to be output. At the moment I use a counter to count the matching features and only route features to the writer whilst the count is less than 2000. However the workbench continues to read features from the database until all features have been read. What I would like to be able to do is to stop the reader when 2000 suitable features have been found.

 

 

The workaround I have at the moment is to guess at how many feature I will need to check and set that as the maximum number of features to read. It can take several hours to find 2000 suitable features and several hours more to continue reading the remaining features that are not required from the database. I already have a reader Where clause to do as much filtering on the database as possible

I don't think you can dynamically change the number of features read once the workspace has started.

 

 

Since it's a database could you not use a SELECT TOP 2000 or LIMIT 2000 in your sql statement?

Depending on the database, there are also various pagination options, that you may be able to loop through until you reach your desired feature count.


If you expand the reader parameters in the navigator, have a look under the section "Features to read", where you can define "Max features to read" or "Max features to read per feature type":

You can also link these values to a published parameter, if you want.


Hi @roykirby, I haven't tested this and I don't know how performant this would be, but perhaps you could use the FeatureWriter to write out the 2000 features that you've found using the Counter, then have the summary port feature go to a Terminator and end the translation. This might stop some of the reading.


If you expand the reader parameters in the navigator, have a look under the section "Features to read", where you can define "Max features to read" or "Max features to read per feature type":

You can also link these values to a published parameter, if you want.

Thanks but the problem is that I don't know how many features I will need to read to get enough features that will pass all the tests in the workspace to the writer.

 

 


I don't think you can dynamically change the number of features read once the workspace has started.

 

 

Since it's a database could you not use a SELECT TOP 2000 or LIMIT 2000 in your sql statement?

Depending on the database, there are also various pagination options, that you may be able to loop through until you reach your desired feature count.

Thanks. I'm already filtering the with the WHERE clause as much as possible and limiting the input to 2000 would probably result in only 20 to 30 features being written as most features get rejected by the tests in the workbench.

 

 


Hi @roykirby, I haven't tested this and I don't know how performant this would be, but perhaps you could use the FeatureWriter to write out the 2000 features that you've found using the Counter, then have the summary port feature go to a Terminator and end the translation. This might stop some of the reading.

Thanks, I'll have a look and see if that is possible

 

 


Thanks. I'm already filtering the with the WHERE clause as much as possible and limiting the input to 2000 would probably result in only 20 to 30 features being written as most features get rejected by the tests in the workbench.

 

 

What database are you using?

 

 


Oracle 12c


Thanks but the problem is that I don't know how many features I will need to read to get enough features that will pass all the tests in the workspace to the writer.

 

 

Yeah, I realised that after I posted 🙂

In addition to the FeatureWriter, you could also look into writing your features using the SQLExecutor. If your features to write aren't too complex, it's fairly easy (geometries can e.g. be cast as WKT). The upside is that you'll get an immediate feedback out the Initiator port where you could use a Counter and a Terminator.

The FeatureWriter won't give you any feedback before everything has been written, which may be too late in some cases.


can you not transfer the criteria to the SQL using an SQL_Executor? Then test within the SQL.

And add a row_num <=2000 to the where clause.

Basically putting part of your transformers into the SQL.


Maybe it is possible to create a loop inside a custom transformer? Read 500 features, count the amount of features that meet your criteria and if this counter is below your threshold loop back one feature to read another 500 features.


Oracle 12 has FETCH and OFFSET

https://docs.oracle.com/javadb/10.8.3.0/ref/rrefsqljoffsetfetch.html

You could create a loop and use the SQLExecuter to fetch n number of rows, do you analysis, and if needed loop back to fetch the next n number of rows, until you have the amount of results you need.

 

At worst case you will be reading n-1 extra records.

Reply