You could create your list initially by using a StringSearcher with the regular expression . and setting a list name for all matches, then using a ListDuplicateRemover to get a list of unique characters.
No idea how that would compare performance-wise.
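For comparison outside FME, the same match-everything-then-deduplicate idea can be sketched in plain Python. This is only an illustration, not FME code: `unique_chars_regex` is a made-up name, and Python's `re` module stands in for the transformer's regex engine, which may treat `.` differently. In Python, `.` skips newline characters unless `re.DOTALL` is set.

```python
import re

def unique_chars_regex(text):
    """Return the distinct characters of text in first-seen order."""
    # '.' matches one character at a time; re.DOTALL lets it match newlines too.
    matches = re.findall(r'.', text, flags=re.DOTALL)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(matches))

print(unique_chars_regex("banana"))  # ['b', 'a', 'n']
```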
Hi @rob14, I think using a Python script could be more efficient. Assuming that an attribute called "_text" stores a text string, a PythonCaller with this script creates a list containing the unique characters.
# PythonCaller Script Example
def processFeature(feature):
    # A set keeps only the distinct characters of the string.
    s = set(feature.getAttribute('_text'))
    feature.setAttribute('_char{}', list(s))
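The core of the script is that `set()` applied to a string yields its distinct characters. A minimal sketch that runs outside FME is below; `unique_chars` is a hypothetical helper, and the `None` check rests on the assumption that a missing attribute arrives as `None`. Sorting is an addition, since a set has no defined order.

```python
def unique_chars(text):
    """Return the distinct characters of text as a sorted list."""
    # Assumption: a missing attribute arrives as None, so treat it as empty
    # rather than letting set(None) raise a TypeError.
    if text is None:
        return []
    # A set has no defined order; sorting makes the result repeatable.
    return sorted(set(text))

print(unique_chars("hello world"))  # [' ', 'd', 'e', 'h', 'l', 'o', 'r', 'w']
```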
Hi @takashi,
Thanks very much, nearly there, but I have 2 questions:
1. The script worked and I can see the unique chars. However, how do I expose and explode the list "_char"? When I tried to use the ListExploder, the list was not visible. Do I need to complete additional configuration in the PythonCaller transformer?
2. If I also needed to do this globally, to find a unique list across all records (instead of, or as well as, unique to a given record), is there a quick way to do that too (rather than PythonCaller -> ListExploder -> duplicate remover)?
I am interested in being able to do both.
Thanks,
Rob
1. You can expose the list name "_char{}" with the Attributes to Expose parameter in the PythonCaller parameters dialog.
2. This script builds a list from all the input features, then outputs a single feature carrying the list at the end.
# PythonCaller Script Example 2
import fmeobjects
class FeatureProcessor(object):
    def __init__(self):
        # Characters seen so far across all input features.
        self.chars = set()
    def input(self, feature):
        # Merge this feature's characters into the running set.
        self.chars |= set(feature.getAttribute('_text'))
    def close(self):
        # Output one feature carrying the complete list.
        feature = fmeobjects.FMEFeature()
        feature.setAttribute('_char{}', list(self.chars))
        self.pyoutput(feature)
In addition, if you ultimately need to explode the feature on the list, the close method can be modified like this instead of using the ListExploder afterwards.
    def close(self):
        # Output one feature per unique character.
        for i, c in enumerate(self.chars):
            feature = fmeobjects.FMEFeature()
            feature.setAttribute('_char', c)
            feature.setAttribute('_element_index', i)
            self.pyoutput(feature)
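To try the accumulate-then-output pattern without FME, a rough stand-in is sketched below. `CharCollector` is a hypothetical name, plain dicts replace FME features, and `close` returns rows instead of calling `pyoutput`. Sorting the set is an addition to the script above: set iteration order is arbitrary, so sorting keeps `_element_index` the same from run to run.

```python
class CharCollector:
    """Stand-in for the PythonCaller class; plain dicts replace FME features."""

    def __init__(self):
        self.chars = set()

    def input(self, record):
        # Merge this record's characters into the running set;
        # a missing or None '_text' counts as an empty string.
        self.chars |= set(record.get('_text') or '')

    def close(self):
        # One output row per character; sorting keeps the index stable.
        return [{'_char': c, '_element_index': i}
                for i, c in enumerate(sorted(self.chars))]

collector = CharCollector()
for rec in [{'_text': 'abc'}, {'_text': 'bcd'}, {'_text': None}]:
    collector.input(rec)
rows = collector.close()
print(rows)
```

Here the three records yield four rows, one each for a, b, c and d, indexed 0 to 3.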
Hi @takashi
You are a Star.
Thanks very much, that was lightning quick to run!
Regards,
Rob