Question

Removing parts of a string with the AttributeTrimmer?

  • 7 November 2017
  • 13 replies
  • 206 views

Badge

Hello FME users,

I need to trim parts of a string and have been using the AttributeTrimmer's ReplaceRegEx function. However, I noticed that some fields are trimmed correctly while others aren't. I have not been able to find a pattern or correlation between the length or type of the strings that aren't trimmed correctly and the ones that are. It looks like this:

@ReplaceRegEx(<|| Toronto>,<|| Scarborough>,<|| Scarborough|| Toronto>,<|| East York>,<|| North York>,<|| York>,<|| Etobicoke>)

The result:

Notice how || Etobicoke is still there. I have tried different AttributeTrimmer functions like ReplaceString or TrimLeft, and the part I want to keep is sometimes trimmed off.

Would you have any suggestions for a more effective method, or is this just a bug?

Thanks in advance,


13 replies

Userlevel 2
Badge +12

I would use the StringReplacer transformer.

Use the regex [Scarborough|Scarborough|Toronto|East York|North York|York|Etobicoke]
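A caveat on the brackets: in most regex dialects, square brackets define a character class (any single one of the listed characters), while parentheses group whole-word alternatives. Here is a minimal sketch of the difference. It uses Python's `re` module purely for illustration (FME's regex engine may differ in details), and the sample string is invented.

```python
import re

sample = "123 Kingston Rd || Etobicoke"  # invented sample value

# Square brackets: a character class that matches ONE character from the set.
char_class = re.compile(
    r"[Scarborough|Scarborough|Toronto|East York|North York|York|Etobicoke]"
)
# Parentheses: a group of alternative whole words.
group = re.compile(
    r"(Scarborough|Toronto|East York|North York|York|Etobicoke)"
)

# Replace every match with an empty string.
class_result = char_class.sub("", sample)
group_result = group.sub("", sample)

print(repr(class_result))  # letters from the street name are removed too
print(repr(group_result))  # only the city name is removed; "|| " remains
```

With the character class, any letter that happens to appear in one of the city names is deleted wherever it occurs, which is why the grouped form is usually what is wanted here.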

Badge

Or StringPairReplacer

 

 

Userlevel 4
Badge +25

So you have the AttributeTrimmer transformer, but where are you using the ReplaceRegex function? Is it inside a text editor for the Trim Characters parameter? If so, I'm surprised this does anything at all. That would be the characters to trim, not a function to carry out the trimming (so whatever is trimmed inside that editor dialog, the result is what is used to trim the Attributes to Trim).

Or am I reading that wrong? Can you post a screenshot of the AttributeTrimmer parameters?

In general you could use the AttributeManager instead and put your function inside the text editor. I think that would work. Or - as others suggest - try the StringReplacer in regex mode.

nb: if you are using the AttributeTrimmer transformer the Trim Characters parameter is exactly that, a list of characters to trim. It's not a substring to be removed. So if the "Kingston Rd" part is not being trimmed, it's only because the letter "d" isn't part of the character list. Otherwise that would be chopped too. That makes me wonder if you are using the AttributeTrimmer transformer or not.
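To illustrate the "list of characters" behaviour, here is a rough analogy using Python's `str.rstrip`, which also trims by character set rather than by substring. This is only an analogy under that assumption, not FME's implementation, and the address value is invented.

```python
address = "123 Kingston Rd || Etobicoke"  # invented sample value

# The argument is a SET of characters to strip from the right-hand end,
# not a substring. Trimming stops at the first character outside the set.
trim_chars = "| Etobicke"
trimmed = address.rstrip(trim_chars)

# Add "d" to the set and the end of "Rd" is chopped as well.
over_trimmed = address.rstrip(trim_chars + "d")

print(repr(trimmed))
print(repr(over_trimmed))
```

In the first call the trimming happens to stop at the "d" of "Rd" only because "d" is not in the set. That matches the symptom in the question: some values look correct by luck and others lose part of the text that should be kept.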

 

Userlevel 2
Badge +17
What is the original string, and what part(s) should be removed (trimmed) from that?

 

 

Badge
Hello Mark2AtSafe,

 

I am trying to trim only the "|| (city)" portion of the string using the AttributeTrimmer transformer. I posted this because my method is working for the most part but not perfectly (as per my original post, the "|| Etobicoke" is still there).

 

Userlevel 2
Badge +12

I have created a sample workspace (see attachment) that will remove the || City.

Your regex should be \|\| (Scarborough|Toronto|East York|North York|York|Etobicoke).

city-remover.fmw
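For readers without the attachment, here is a minimal sketch of the same idea in Python, with each pipe escaped by a single backslash. The function name `remove_city` and the sample values are hypothetical, and the final `rstrip()` (to drop the space left before the removed part) is my own addition rather than part of the workspace.

```python
import re

# Literal "||", one space, then one of the city names.
CITY_SUFFIX = re.compile(
    r"\|\| (Scarborough|Toronto|East York|North York|York|Etobicoke)"
)

def remove_city(value: str) -> str:
    """Remove the '|| City' part and any trailing whitespace left behind."""
    return CITY_SUFFIX.sub("", value).rstrip()

print(repr(remove_city("123 Kingston Rd || Etobicoke")))
print(repr(remove_city("5 Main St || East York")))
```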

 

Userlevel 4
Badge +25

 

 

 

Right, then you should definitely NOT be using the AttributeTrimmer. The ReplaceRegEx function might be fine in an AttributeManager, but not here. That field is just a list of characters to remove/trim. It will trim past the || part in your source string if you have characters that match. So try either the StringReplacer or an AttributeManager, but not the AttributeTrimmer.

 

I'll ask for a documentation update so that parameter in the AttributeTrimmer is clearer.

 

Badge
@Mark2AtSafe

 

Thanks
Badge

Thanks @erik_jan,

 

 

This has worked for me except for the double-city parts of the string, e.g. (|| York||Toronto), where only the "(|| York)" was removed and the (||Toronto) is still present. I fixed this by outputting the file and re-running the original regex with the last portion, e.g. ||Toronto, on the new file. Would you be able to recommend a more automated option?

 

Userlevel 2
Badge +12
I think changing the regex to:

 

(\|\| (Scarborough|Toronto|East York|North York|York|Etobicoke))*

 

will remove multiple city entries.

 

 

Userlevel 2
Badge +17
Perhaps the reason is that there isn't a space between '||' and 'Toronto'. The regex @erik_jan provided matches the string part only if there is a single space between '||' and the city name. If there could be cases where there isn't a space, modify the regex like this:

 

\|\|\s*(Scarborough|Scarborough|Toronto|East York|North York|York|Etobicoke)
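A quick way to see the effect of the optional whitespace is to compare the two patterns on a double-city value. This is a sketch in Python, not FME. The sample value is invented, the leading `\s*` in the second pattern is my own addition to swallow the space before the pipes, and it assumes a replace-all operation, which is what `re.sub` does.

```python
import re

CITIES = r"(Scarborough|Toronto|East York|North York|York|Etobicoke)"

# Requires exactly one space between "||" and the city name.
strict = re.compile(r"\|\| " + CITIES)
# Allows zero or more whitespace characters on either side of "||".
flexible = re.compile(r"\s*\|\|\s*" + CITIES)

value = "10 Elm Ave || York||Toronto"  # invented sample value

strict_result = strict.sub("", value)      # "||Toronto" is left behind
flexible_result = flexible.sub("", value)  # both city parts are removed

print(repr(strict_result))
print(repr(flexible_result))
```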
Userlevel 2
Badge +17

If you just want to remove the trailing part starting with ||, this expression would also work.

@ReplaceRegEx(@Value(SERVICE REQUEST LOCATION),\s*\|\|.*$,"")
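The same pattern can be tried outside FME. Below is a small Python sketch in which the function name `strip_after_pipes` and the sample values are hypothetical. Note that it drops everything from the first "||" onward, whatever follows, so it only suits data where nothing after the pipes needs to be kept.

```python
import re

def strip_after_pipes(value: str) -> str:
    """Remove the first '||', any whitespace before it, and everything after."""
    return re.sub(r"\s*\|\|.*$", "", value)

print(repr(strip_after_pipes("10 Elm Ave || York||Toronto")))
print(repr(strip_after_pipes("123 Kingston Rd || Etobicoke")))
print(repr(strip_after_pipes("No pipes here")))  # unchanged
```

Because no city list is involved, this version does not need updating when a new city name shows up in the data.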
