Skip to main content
Solved

Extract unique values from attribute table using PythonCaller


dustin
Influencer
Forum|alt.badge.img+30

Hello - I have a file geodatabase where most feature classes contains an attribute 'text_desc'. I would like to extract all unique values, and concatenate those into a single comma-separated attribute to generate a report. Due to slow gdb read times in FME, I would like to do this in PythonCaller for efficiency.

 

Is this possible?

Best answer by ebygomm

dustin wrote:

I can, but with my limited experience, so far I have been unsuccessful in getting it work.

import fme
import fmeobjects
import arcpy
 
def getUnique(feature):
    
    arcpy.env.workspace = FME_MacroValues['SourceDataset_GEODATABASE_FILE']
    fc = feature.getAttribute('featureclass{}')
    fn = feature.getAttribute('fieldname')
    valueList = []
    for f in fc:
        with arcpy.da.SearchCursor(f,fn) as SCur:            
            for row in SCur:
                if not row[0in valueList:
                    valueList.append(row[0])
    feature.setAttribute("unique_values",",".join(valueList))

You could do this if you have the featureclasses stored in a list. There's probably more efficient ways

 

View original
Did this help you find an answer to your question?

11 replies

david_r
Evangelist
  • April 8, 2021

If this is an Enterprise Geodatabase (SDE), why not let the database do the work for you? It's going to so much faster than anything you could do with FME. Example SQL that you can use in e.g. a SQLExecutor:

SELEC DISTINCT text_desc
FROM my_feature_class

You will then only get the unique values back, and you can then use e.g. an Aggregator to concatenate the values.


dustin
Influencer
Forum|alt.badge.img+30
  • Author
  • Influencer
  • April 8, 2021
david_r wrote:

If this is an Enterprise Geodatabase (SDE), why not let the database do the work for you? It's going to so much faster than anything you could do with FME. Example SQL that you can use in e.g. a SQLExecutor:

SELEC DISTINCT text_desc
FROM my_feature_class

You will then only get the unique values back, and you can then use e.g. an Aggregator to concatenate the values.

Sorry, this is a File Geodatabase.


ebygomm
Influencer
Forum|alt.badge.img+31
  • Influencer
  • April 8, 2021

Can you use arcpy?


dustin
Influencer
Forum|alt.badge.img+30
  • Author
  • Influencer
  • April 8, 2021
ebygomm wrote:

Can you use arcpy?

I can, but with my limited experience, so far I have been unsuccessful in getting it work.


ebygomm
Influencer
Forum|alt.badge.img+31
  • Influencer
  • April 8, 2021
dustin wrote:

I can, but with my limited experience, so far I have been unsuccessful in getting it work.

A couple of examples, need a parameter holding the gdb location and the feature triggering the pythoncaller needs to have attributes holding the featureclass name and the fieldname

import fme
import fmeobjects
import arcpy
 
def getUnique(feature):
    #get parameter with location of fileGDB
    arcpy.env.workspace = FME_MacroValues['SourceDataset_GEODATABASE_FILE']
    #get featureclassname
    fc = feature.getAttribute('featureclass')
    #get field name
    fn = feature.getAttribute('fieldname')
    row=[x[0for x in arcpy.da.TableToNumPyArray(fc,fn)]
    uniquelist = set(row)
    #create new attribute with concatenated list joined with comma
    feature.setAttribute("unique_values",",".join(uniquelist))

or

import fme
import fmeobjects
import arcpy
 
def getUnique(feature):
    
    arcpy.env.workspace = FME_MacroValues['SourceDataset_GEODATABASE_FILE']
    fc = feature.getAttribute('featureclass')
    fn = feature.getAttribute('fieldname')
    with arcpy.da.SearchCursor(fc,fn) as SCur:
        valueList = []
        for row in SCur:
            if not row[0in valueList:
                valueList.append(row[0])
    feature.setAttribute("unique_values",",".join(valueList))

 

 


dustin
Influencer
Forum|alt.badge.img+30
  • Author
  • Influencer
  • April 8, 2021
dustin wrote:

I can, but with my limited experience, so far I have been unsuccessful in getting it work.

Thanks @ebygomm, the second works well with one feature class being input. If I wanted to query multiple feature class, such as 'Rivers' and 'Lakes', how could I go about that?


ebygomm
Influencer
Forum|alt.badge.img+31
  • Influencer
  • April 8, 2021
dustin wrote:

I can, but with my limited experience, so far I have been unsuccessful in getting it work.

Is it the same fieldname for each feature class? Do you want a single attribute containing all values in both featureclasses or something else?


dustin
Influencer
Forum|alt.badge.img+30
  • Author
  • Influencer
  • April 8, 2021
dustin wrote:

I can, but with my limited experience, so far I have been unsuccessful in getting it work.

Yes, it is the same fieldname "text_desc" on each feature class. I would need a single attribute output containing unique values of "text_desc" from all the feature classes. So, if a feature from Rivers had text_desc=1, and another Rivers had text_desc=2, and a Lakes feature had text_desc=1, the result would be "unique_values=1,2"


ebygomm
Influencer
Forum|alt.badge.img+31
  • Influencer
  • Best Answer
  • April 8, 2021
dustin wrote:

I can, but with my limited experience, so far I have been unsuccessful in getting it work.

import fme
import fmeobjects
import arcpy
 
def getUnique(feature):
    
    arcpy.env.workspace = FME_MacroValues['SourceDataset_GEODATABASE_FILE']
    fc = feature.getAttribute('featureclass{}')
    fn = feature.getAttribute('fieldname')
    valueList = []
    for f in fc:
        with arcpy.da.SearchCursor(f,fn) as SCur:            
            for row in SCur:
                if not row[0in valueList:
                    valueList.append(row[0])
    feature.setAttribute("unique_values",",".join(valueList))

You could do this if you have the featureclasses stored in a list. There's probably more efficient ways

 


dustin
Influencer
Forum|alt.badge.img+30
  • Author
  • Influencer
  • April 8, 2021
dustin wrote:

I can, but with my limited experience, so far I have been unsuccessful in getting it work.

Thanks @ebygomm​ ! It worked great.


david_r
Evangelist
  • April 8, 2021
dustin wrote:

I can, but with my limited experience, so far I have been unsuccessful in getting it work.

This is a great solution, but it can be somewhat slow if there are a lot of unique values. Here's a slightly different take that has a more linear performance, since it uses the Python built-in set function rather than doing repeated list lookups:

import fme
import fmeobjects
import arcpy
 
def getUnique(feature):
   arcpy.env.workspace = FME_MacroValues['SourceDataset_GEODATABASE_FILE']
   fc = feature.getAttribute('featureclass{}')
   fn = feature.getAttribute('fieldname')
   valueList = []
   for f in fc:
       with arcpy.da.SearchCursor(f, fn) as cursor:
           valueList.extend(list({row[0for row in cursor}))
   feature.setAttribute("unique_values",",".join(set(valueList)))

Tested with ~1.5 million records containing ~130k unique values (medium length strings):

  • Using list lookup: 15 minutes
  • Using sets: 17 seconds

If the number of unique values is very small, the two solutions are practically as fast.


Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings