I have a very simple workspace that just tests whether a
string has a “.” or a “:” in it, and if it does, these characters are deleted.
I am using the TestFilter to test for the characters, then a StringReplacer
to delete them, and then I must make sure the string is all caps. Is there a way
to do these three operations by using the String Functions in the TestFilter? I
haven’t been able to figure out how the string functions work. Also, will using
the string functions speed up the translation? I have 47 million records to test
and right now it takes 4 hours to run. I am using FME 2018. I have included the
workspace and source CSV file. Thanks.
Hi @bd,
You can indeed do all of this using String Functions; however, I do not believe they are available in the TestFilter, so I would use the AttributeManager and set up Conditional Values for the STRING attribute.
Set up two tests to check whether the string contains "." or ":", and then set the new attribute value to
@UpperCase(@ReplaceString(@Value(STRING),.,""))
@UpperCase makes sure the outcome is all caps, and it wraps
@ReplaceString(<string>,<before>,<after>), where you change <before> to . or : accordingly. For <after> you must put "", otherwise FME doesn't recognize that the matched text is to be replaced with nothing and will fail.
Lastly, set anything else to also be changed to upper case.
Having quickly tried both methods on your test data, I did find using string functions faster: 4.1 vs 7.1 seconds using FME 2018.1 (approx. 40% less time), so I do believe this would improve performance. P.S. Your fmw file was not uploaded, so this used a mock-up based on how I believe you did it before.
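To make the conditional logic above easier to follow, here is a rough Python analogy (plain Python, not FME syntax). It assumes the conditional values behave like if / else-if / else, where only the first passing test is applied. The function names are made up for illustration.

```python
def conditional_clean(value: str) -> str:
    """Mimic two conditional tests plus an 'else' branch.

    Only the first matching branch runs, as in an if / else-if / else chain.
    """
    if "." in value:
        return value.replace(".", "").upper()
    elif ":" in value:
        return value.replace(":", "").upper()
    return value.upper()


def unconditional_clean(value: str) -> str:
    """Strip both characters from every string, with no tests at all."""
    return value.replace(".", "").replace(":", "").upper()


if __name__ == "__main__":
    for sample in ["a.b", "c:d", "e.f:g", "plain"]:
        print(sample, conditional_clean(sample), unconditional_clean(sample))
```

Under that assumption, a string containing both characters only has the first one removed by the conditional version, which is worth checking against your data.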
You can also do this with a single statement in an AttributeManager, using the UpperCase and ReplaceRegEx string functions.
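The post does not spell out the expression, so this is only a Python analogy of the idea (one regex replace wrapped in an upper-case call), not the FME expression itself. The character class `[.:]` and the function name are my own choices.

```python
import re

# One pattern that matches either a literal "." or a literal ":".
_DOT_OR_COLON = re.compile(r"[.:]")


def clean_with_regex(value: str) -> str:
    """Remove every '.' and ':' in one pass, then upper-case the result."""
    return _DOT_OR_COLON.sub("", value).upper()


if __name__ == "__main__":
    print(clean_with_regex("ab.cd:ef"))
```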
PythonCaller:
class FeatureProcessor(object):
    def __init__(self):
        pass

    def input(self, feature):
        # Strip "." and ":" from STRING, then upper-case the result.
        value = feature.getAttribute('STRING')
        feature.setAttribute('STRING', value.replace('.', '').replace(':', '').upper())
        self.pyoutput(feature)

    def close(self):
        pass
I think this is better, because it will also replace both characters if they occur in the same string.
No need to do any checks, just brute-force every string.
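One thing to watch with the brute-force approach is a feature where STRING is missing. Below is a small sketch that can be run outside FME. `FakeFeature` is a hypothetical stand-in I made up so the logic can be tested without fmeobjects, and the sketch assumes `getAttribute` returns None when the attribute does not exist.

```python
class FakeFeature:
    """Hypothetical stand-in for an FME feature, for testing outside FME."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def getAttribute(self, name):
        return self._attrs.get(name)  # None when the attribute is missing

    def setAttribute(self, name, value):
        self._attrs[name] = value


# Translation table that deletes "." and ":" in a single pass.
_DELETE_TABLE = str.maketrans("", "", ".:")


def clean_feature(feature, attr="STRING"):
    """Strip '.' and ':' and upper-case the attribute; skip missing values."""
    value = feature.getAttribute(attr)
    if value is None:
        return feature
    feature.setAttribute(attr, str(value).translate(_DELETE_TABLE).upper())
    return feature


if __name__ == "__main__":
    f = clean_feature(FakeFeature({"STRING": "ab.cd:ef"}))
    print(f.getAttribute("STRING"))
```

Inside a PythonCaller, the body of `clean_feature` could go in the `input` method before `self.pyoutput(feature)`.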
You can use
[regexp {\\.*\\:*}]!=0 = 1 to test in the Tester, if you need to.
But as you intend to replace the characters anyway, why test at all?
Just use a StringReplacer.
Mode: Replace Regular Expression
Text to replace: \\.|\\:
Replacement: (leave empty)
(You can use the AttributeManager/AttributeCreator to do the same.)
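To illustrate why the test can be skipped, here is a hedged Python sketch (not FME or Tcl syntax). The pattern is written the way Python's `re` module expects it, with a single backslash before the dot, and the function names are made up. It also shows that a pattern made only of starred items matches every string, so I used a plain alternation for the test.

```python
import re

# Literal "." or literal ":"; the dot needs escaping, the colon does not.
TARGET = re.compile(r"\.|:")


def has_target(value: str) -> bool:
    """True when the string contains at least one '.' or ':'."""
    return TARGET.search(value) is not None


def strip_targets(value: str) -> str:
    """Replace every match with nothing; strings without matches are unchanged."""
    return TARGET.sub("", value)


if __name__ == "__main__":
    for sample in ["ab.cd", "ab:cd", "abcd"]:
        print(sample, has_target(sample), strip_targets(sample))
    # A pattern where every item is optional also matches the empty string,
    # so it "matches" strings that contain neither character.
    print(re.search(r"\.*:*", "abcd") is not None)
```

Since replacing on a string with no matches simply returns it unchanged, running the replace on every record gives the same result as testing first.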
Hopefully one of the solutions provided has worked for you. We here at Safe would love to use a sample of your data to test whether changes we are currently working on are helping improve performance using functions. If possible please could you upload a larger dataset (~ 20 times the current test.csv) to ftp://ftp.safe.com. You should be able to enter as a guest and submit the file to the top level. This would be greatly appreciated as with a bigger chunk of data we will really be able to see if we are speeding things up.
Many thanks,
Holly
@redgeographics
Yeah, I tried that too, but it was 2 seconds slower.
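Timings like this depend heavily on the data and the machine. As a sketch only, this is one way to compare the pure-Python variants outside FME with `timeit`. It does not measure FME transformers or string functions, and the sample string and function names are made up.

```python
import re
import timeit

_RE = re.compile(r"[.:]")
_TABLE = str.maketrans("", "", ".:")


def via_replace(value: str) -> str:
    """Two chained str.replace calls."""
    return value.replace(".", "").replace(":", "").upper()


def via_regex(value: str) -> str:
    """One precompiled regex substitution."""
    return _RE.sub("", value).upper()


def via_translate(value: str) -> str:
    """One str.translate call with a deletion table."""
    return value.translate(_TABLE).upper()


def benchmark(sample: str = "ab.cd:ef", number: int = 20_000) -> dict:
    """Return seconds taken by each variant for `number` calls on `sample`."""
    return {
        func.__name__: timeit.timeit(lambda: func(sample), number=number)
        for func in (via_replace, via_regex, via_translate)
    }


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

Running it on a sample that resembles your real strings should give a better idea than any single number quoted here.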
def processFeature(feature):
    feature.setAttribute('STRING', feature.getAttribute('STRING')
                         .replace('.', '').replace(':', '').upper())
@UpperCase(@ReplaceString(@ReplaceString(@Value(STRING),.,""),:,""))
No conditionals needed.
Hi all and @hollyatsafe, I am trying to do something similar, but replacing the month with the number of the month (e.g. JAN replaced with 01). I am using the AttributeManager (I noticed paalped also commented on this method in this thread).
See below. This is the condition:
It is detecting that it has JAN inside, but not replacing it with 01. Why? What am I doing wrong?
Here is the output:
Thanks!
Hi, remove the quotes in the formula in AttributeValue. Also there appears to be no need for @UpperCase.
@ReplaceString(@Value(CALENDAR_FORMATTED_DATE),JAN,01)
If you want to do this for all months, take a look at the StringPairReplacer.
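For the all-months case, the idea of replacing pairs can be sketched in Python as below. This is only an analogy, not how the StringPairReplacer is implemented. It assumes the months appear as upper-case three-letter English abbreviations, and the date layout in the example is made up.

```python
import re

# Assumed pairs: three-letter upper-case month abbreviation -> two-digit number.
MONTHS = {
    "JAN": "01", "FEB": "02", "MAR": "03", "APR": "04",
    "MAY": "05", "JUN": "06", "JUL": "07", "AUG": "08",
    "SEP": "09", "OCT": "10", "NOV": "11", "DEC": "12",
}

# One alternation built from the keys, e.g. "JAN|FEB|...|DEC".
_MONTH_RE = re.compile("|".join(MONTHS))


def month_to_number(value: str) -> str:
    """Replace each month abbreviation in the string with its number."""
    return _MONTH_RE.sub(lambda match: MONTHS[match.group(0)], value)


if __name__ == "__main__":
    # Hypothetical date layout, for illustration only.
    print(month_to_number("15-JAN-2018"))
```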