Skip to main content
Are there any plans to add an "ORDER BY' parameter to readers that query SDE data?  I'm trying to process millions of records from versioned SDE feature classes in FME.  I need to read data from an SDE version so I think I'm limited to the ESRI Geodatabase reader to do that.  Since I'm using the StatisticsCalculator transformer to summarize values from the rows based on a common ID, I'd prefer to have the data read in order by those ID's.  That way I can utilize the "Input is Ordered by Group" option to make the transformer more efficient.  I could use a Sorter transformer to do the ordering before statistics are calculated, but that means I have to load millions of features into memory.  I'd rather just have the data ordered for me by the database on it's way into the workspace. 

 

 

Another alternative I tried was using the SQLCreator and have the database run the summary statistics for me with a GROUP BY statement.  I only need attributes, and not the SHAPE field, so it's okay to just make an ORACLE query. However, this "reader" cannot read SDE data in a version.  I tried using ESRI's new versioned view function in the sql statement so that the query would read data from the version in question. Here's what the sql looks like with those lines added:

 

 

FME_SQL_DELIMITER ;

 

EXEC sde.version_util.set_current_version('USERNAME.MySDEVersion');

 

select GROUPID,SUM(Field1),SUM(Field2)

 

from MyVersionedFeatureClass

 

group by GROUPID;

 

 

 This returned an error saying: 

 

 

An error occurred while accessing a table result for feature type `QueryFactory_SQLExecute'; message was `Execution of statement `EXEC sde.version_util.set_current_version('FME_SERVICE.duf_fme_updates')' did not succeed; error was `ORA-00900: invalid SQL statement'. (serverType=`ORACLE8i', serverName=`giswpdr', userName=`FME_SERVICE', password=`***', dbname=`')' (server type is `ORACLE8i', server name is `giswpdr', username is `FME_SERVICE', password is `***', database name is `')Error executing SQL statement: EXEC sde.version_util.set_current_version('FME_SERVICE.duf_fme_updates')

 

 

 

Any Ideas on making things go a little faster?  I just need to work in a version that's below default and be able to handle about 1.5 million records as efficiently as possible.
Hi,

 

 

It might be possible to add "order by" clause to the "WHERE Clause" parameter, like this.

 

-----

 

WHERE Clause: 1=1 order by ID

 

-----

 

 

Takashi
Thanks Takashi!  

 

 

That actually worked.  Records got loaded in order and I could leverage that option for StatisticsCalculator.  Can definitely see using this in the future to help with grouping data for parallel processing in other workflows.

 

 

Although for the record, for this excercise StatisticsCalculator was much slower using that group by function with the ORDER BY, so I reverted to loading all the records into memory and letting the statistics transformer chew on them in one go.  Out of the millions of records there are only about 5-30 features in each group on average.  I'm assuming when you tell it the data is ordered it's starting up some process for every group which slows things down if there are lots of little groups...as opposed to millions of records in just a few groups. 
If your requirement is only to calculate average of attributes for each group, the Aggregator could be more efficient than the StatisticsCalculator.

 

Group By: ID

 

Input is Ordered by Group: Yes

 

Mode: Attributes Only

 

Keep Input Attributes: No

 

Attributes to Average: <select attribute(s) that should be averaged>

Reply