Solved

Using String Functions in the TestFilter

  • July 27, 2018
  • 14 replies
  • 605 views

I have a very simple workspace that tests whether a string contains a "." or a ":" and, if it does, deletes those characters. I am using a TestFilter to test for the characters, then a StringReplacer to delete them, and then I must make sure the string is all caps. Is there a way to do these three operations using the String Functions in the TestFilter? I haven't been able to figure out how the string functions work. Also, will using the string functions speed up the translation? I have 47 million records to test, and right now it takes 4 hours to run. I am using FME 2018. I have included the workspace and source CSV file. Thanks.

Best answer by hollyatsafe

Hi @bd,

You can indeed do all of this using String Functions. However, I do not believe they are available in the TestFilter, so I would use the AttributeManager and set up Conditional Values for the STRING attribute.

Set up two tests to check whether the string contains "." or ":", and then set the new attribute value to

@UpperCase(@ReplaceString(@Value(STRING),.,""))

@UpperCase makes sure the outcome is all caps, and it wraps

@ReplaceString(<string>,<before>,<after>), where you change <before> to . or : accordingly. For <after> you must put "", otherwise FME doesn't recognize that the string should be replaced with nothing, and it will fail.

Lastly, set the value for anything else to be changed to upper case as well.

Having quickly tried both methods on your test data, I did find the string functions faster: 4.1 vs 7.1 seconds using FME 2018.1 (approx. 40% less time), so I do believe this would improve performance. P.S. Your .fmw file was not uploaded, so this was done using a mock-up based on how I believe you did it before.
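
As a rough illustration of the conditional setup described above, here is the same logic in plain Python rather than FME syntax. It assumes the Conditional Values behave like an if / else-if / else chain in which the first matching test wins (an assumption, not something this thread confirms), and the function name conditional_clean is made up for the example.

```python
def conditional_clean(value):
    """Mimic two conditional tests plus a fallback, first match wins (assumed)."""
    if "." in value:
        # Test 1: the string contains "."
        return value.replace(".", "").upper()
    elif ":" in value:
        # Test 2: the string contains ":"
        return value.replace(":", "").upper()
    # Anything else: only change the case
    return value.upper()


for sample in ("ab.cd", "ab:cd", "abcd", "a.b:c"):
    print(sample, "->", conditional_clean(sample))
```

Under that assumption, a string that contains both characters only has the first one removed, which may or may not matter for the data at hand.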


14 replies


ebygomm
  • Influencer
  • July 30, 2018

You can also do this with a single statement in an AttributeManager using the UpperCase and ReplaceRegEx string functions


paalped
  • Contributor
  • July 30, 2018

PythonCaller:

class FeatureProcessor(object):
    def __init__(self):
        pass
    def input(self, feature):
        # Strip '.' and ':' from STRING, then upper-case the result.
        value = feature.getAttribute('STRING')
        if value is not None:  # skip features without a STRING attribute
            feature.setAttribute('STRING',
                value.replace('.', '').replace(':', '').upper())
        self.pyoutput(feature)
    def close(self):
        pass

I think this is better, because it will also replace both characters if they occur in the same string.

No need to do any checks, just brute-force every string.
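
The same brute-force idea can be tried outside FME as a small helper. This is only a sketch: it assumes Python 3 (str.maketrans with three arguments is not available on Python 2 strings), and the name clean_string is invented for the example. It removes both characters in a single pass with str.translate instead of two chained replace calls.

```python
# Translation table that deletes '.' and ':' (third argument = characters to drop)
_DELETE_MARKS = str.maketrans("", "", ".:")


def clean_string(value):
    """Remove '.' and ':' and upper-case the rest; pass None through unchanged."""
    if value is None:
        return None
    return value.translate(_DELETE_MARKS).upper()


print(clean_string("a.b:c"))
```

Whether translate is actually faster than two replace calls on real data is something to measure on a sample rather than assume.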


redgeographics
Celebrity

A similar approach would be a StringReplacer followed by a StringCaseChanger. Not sure whether it'd be faster than the Python one that @paalped suggested, but it's worth a try with a small sample.

 


gio
  • Contributor
  • July 30, 2018

You can use

[regexp {[.:]} <string>] = 1

to test in the Tester, if you need to.

But as you intend to replace the characters anyway, why test at all?

Just use a StringReplacer.

Mode: Replace Regular Expression

Text to replace: \.|:

Replacement: (leave empty)

(You can use the AttributeManager/AttributeCreator to do the same.)
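
To try the regular expression outside FME, a Python sketch like the following may help. It uses Python's re module, not the Tcl or FME regex engine, so treat it as an approximation; the name strip_marks is made up for the example. It also shows why a pattern built only from `*`-quantified parts is a poor test: it matches the empty string, so it succeeds on every input.

```python
import re

# Character class matching either a literal '.' or ':'
MARKS = re.compile(r"[.:]")


def strip_marks(text):
    """Delete every '.' and ':' from text."""
    return MARKS.sub("", text)


print(strip_marks("ab.cd:ef"))

# A pattern of only '*' quantifiers matches even when neither character is present
print(bool(re.search(r"\.*:*", "abc")))
# The character class only matches when one of the characters is present
print(bool(MARKS.search("abc")))
```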


Forum|alt.badge.img+2
Hi @bd,

Hopefully one of the solutions provided has worked for you. We here at Safe would love to use a sample of your data to test whether changes we are currently working on are helping improve performance using functions. If possible, could you please upload a larger dataset (~ 20 times the current test.csv) to ftp://ftp.safe.com? You should be able to enter as a guest and submit the file to the top level. This would be greatly appreciated, as with a bigger chunk of data we will really be able to see if we are speeding things up.

Many thanks,

Holly

Forum|alt.badge.img
  • Author
  • July 30, 2018
Can I send you an email about uploading a larger CSV file, so we can discuss this off forum?

 

 


Forum|alt.badge.img+2
Of course @bd - please use the Report a Problem form and I will pick up the case from this inbox.

Forum|alt.badge.img
  • Author
  • July 30, 2018

Thanks for the help. Using the AttributeManager will be about 33% faster. That will save a lot of time when I am processing millions of records.
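
For a sense of scale, the figures quoted in this thread can be turned into a rough estimate. The sketch below assumes the saving scales linearly from the small test file to the full 47 million records, which nobody in the thread has verified; the variable names are made up for the example.

```python
# Timings reported for the small test file (seconds)
old_seconds = 7.1
new_seconds = 4.1

# Fraction of run time saved on the test file
test_saving = (old_seconds - new_seconds) / old_seconds
print("saving on test data: {:.0%}".format(test_saving))

# Full run time and saving quoted by the original poster
full_run_hours = 4.0
quoted_saving = 0.33

# Linear extrapolation (an assumption, not a measurement)
estimated_hours = full_run_hours * (1 - quoted_saving)
print("estimated full run: {:.2f} hours".format(estimated_hours))
```

Reading, writing and other fixed costs may not shrink at the same rate, so the real figure could differ noticeably.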

 

 


paalped
  • Contributor
  • July 31, 2018

@redgeographics Yeah, I tried that too, but it was 2 seconds slower.

paalped
  • Contributor
  • July 31, 2018

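It is a tiny bit faster to use only a function call, since methods are slightly slower:

def processFeature(feature):
    # Strip '.' and ':' from STRING, then upper-case the result.
    value = feature.getAttribute('STRING')
    if value is not None:  # skip features without a STRING attribute
        feature.setAttribute('STRING',
            value.replace('.', '').replace(':', '').upper())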
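
The function-versus-method claim can be checked on any machine with timeit. This is a sketch in plain Python with no FME involved, so it only measures call overhead, not PythonCaller behaviour; the names clean, Cleaner and the sample string are invented for the example.

```python
import timeit


def clean(value):
    """Plain function: strip '.' and ':' and upper-case."""
    return value.replace('.', '').replace(':', '').upper()


class Cleaner(object):
    def clean(self, value):
        """Same work as the function, but reached through a bound method."""
        return value.replace('.', '').replace(':', '').upper()


sample = 'a.b:c'
cleaner = Cleaner()

# Time the same number of calls through each route
function_seconds = timeit.timeit(lambda: clean(sample), number=100000)
method_seconds = timeit.timeit(lambda: cleaner.clean(sample), number=100000)

print('function: {:.4f} s'.format(function_seconds))
print('method:   {:.4f} s'.format(method_seconds))
```

The difference, if any, is likely to be small next to the cost of reading and writing the features, so it is worth timing the whole workspace as well.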

 


paalped
  • Contributor
  • July 31, 2018

It is a little bit faster with the AttributeManager, using no conditionals:

@UpperCase(@ReplaceString(@ReplaceString(@Value(STRING),.,""),:,""))

 

 

 


maaw306
  • Participant
  • November 30, 2023

Hi all and @hollyatsafe, I am trying to do something similar, but replacing the month with the number of the month (e.g. JAN replaced with 01). I am using the AttributeManager (I noticed paalped also commented on this method in this thread).

See below: [image: AM]

This is the condition: [image: SET]

It is detecting that it has JAN inside, but not replacing it with 01. Why? What am I doing wrong?

Here is the output: [image: MONTH]

Thanks!


geomancer
  • Evangelist
  • December 1, 2023

Hi, remove the quotes in the formula in AttributeValue. Also there appears to be no need for @UpperCase.

@ReplaceString(@Value(CALENDAR_FORMATTED_DATE),JAN,01)

If you want to do this for all months, take a look at the StringPairReplacer.
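
The idea of replacing many pairs at once can be sketched in plain Python as a lookup table. This is not how the StringPairReplacer is implemented; the input format "15-JAN-2023" and the name replace_months are assumptions made for the example.

```python
# Month abbreviation -> two-digit month number
MONTHS = {
    "JAN": "01", "FEB": "02", "MAR": "03", "APR": "04",
    "MAY": "05", "JUN": "06", "JUL": "07", "AUG": "08",
    "SEP": "09", "OCT": "10", "NOV": "11", "DEC": "12",
}


def replace_months(text):
    """Replace every upper-case month abbreviation in text with its number."""
    for name, number in MONTHS.items():
        text = text.replace(name, number)
    return text


print(replace_months("15-JAN-2023"))
```

Plain substring replacement like this would also alter longer words that happen to contain an abbreviation (for example a full month name), so it only suits values with a known, fixed layout.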