
I want to create a new attribute from a regex pattern found in another existing attribute. If no pattern is found, the new attribute should default to a static value. For example, attribute bID has values 1<a>4, 44<a>, 98. My new attribute, code1, would have values a, a, d: the regex finds the 'a', and 'd' is the default if no 'a' is found. I have played around with a series of transformers, but my solutions are clunky. Can someone propose an efficient solution? I'm struggling a bit with how to use the text editor to return a found regex pattern. Still learning! Also, I want to stick with regex because my actual problem may have a more complex pattern.

Thank you,

Tyler

FME2020.2.1.0 Win64

I think this would be the most straightforward way: use a regex in the StringSearcher, then set a default value on anything coming out of the not matched port.

You could do something like this in an AttributeCreator, but returning the matched regex is more complicated, especially if, like you say, the regex you will be using is more complex. This example tests for the presence of the regex. If -1 is returned, there is no match and a default value is set. Otherwise everything that does NOT match the regex is replaced with nothing, leaving just the matched value.
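The test-then-strip logic described above can be sketched outside FME. This is only an illustration in Python, not the AttributeCreator expression from the screenshot; the function name `code_by_stripping` is made up, and the single-character pattern `a` is assumed from the question.

```python
import re

def code_by_stripping(value: str, default: str = "d") -> str:
    """Return the matched text by deleting everything that does not match."""
    # Step 1: look for the pattern; treat "no match" as position -1.
    match = re.search(r"a", value)
    position = match.start() if match else -1
    if position == -1:
        return default
    # Step 2: replace every character that is NOT part of the pattern
    # with nothing, so only the matched characters remain.
    return re.sub(r"[^a]", "", value)

for bid in ["1<a>4", "44<a>", "98"]:
    print(bid, code_by_stripping(bid))
```

The second step needs a negated version of the pattern (`[^a]` here), which is easy for one character but gets harder as the pattern grows. That is the complication mentioned above.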

Thanks for the quick response. I contemplated that solution and may end up using it, but it does have an undesirable aspect that violates the DRY principle of coding. Don't Repeat Yourself. The Code1 attribute has to be created twice, setting up a booby trap for my future self, or future others.


Keep the '_first_match' attribute from the StringSearcher. Connect all of its output ports to an AttributeCreator and create a conditional value along the lines of: if _first_match has a value, then a, else d.

[Screenshot 2021-01-14 122423]


You also could skip the StringSearcher and just use the AttributeCreator.

[Screenshot 2021-01-14 123008]


I flipped the order to put the AttributeCreator first, filled all values with 'd', then ran the StringSearcher and overwrote the value where there was a match. That's a bit cleaner and adheres to DRY. I'm still up for any other ideas y'all have.
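As a rough picture of that ordering, here is a small Python sketch. The features are modelled as plain dictionaries and `add_code1` is a hypothetical helper, not anything from FME; the pattern `a` stands in for the real, more complex regex.

```python
import re

PATTERN = re.compile(r"a")  # stand-in for the real pattern

def add_code1(features):
    """Set the default first, then overwrite it where the regex matches."""
    for feature in features:
        feature["code1"] = "d"  # default written exactly once
        match = PATTERN.search(feature["bID"])
        if match:
            feature["code1"] = match.group(0)  # overwrite with matched text
    return features

rows = add_code1([{"bID": "1<a>4"}, {"bID": "44<a>"}, {"bID": "98"}])
print([row["code1"] for row in rows])
```

The default and the pattern each appear in one place only, which is what keeps this arrangement DRY.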


@townest​ why aren't you using conditional values in the attributecreator? You could eliminate stringsearcher


I'm learning which is why I'm asking. Conditional values sound like a good plan, but I'm stuck on a few details. Primarily, how to return a matched pattern. How do you write:

 

If regexpattern(a) in 'abc' then return regexpattern(a);

else 'd'?

 

bid (string)    code1 (result)
abc             a
xyz             d
baw             a

 

Remember, my regex is more complex than 'a', so the result has to be the actual matched pattern and not just a constant 'a' as shown in the screenshots @alexlynch3450 recommended above.

 

Thanks for your help.


Okay, to help you set things straight, use a Tester transformer before production.

  1. In the Tester, add the left attribute, change the operator to "Contains Regex", and choose "Open Regex Editor". Test your complex pattern there.
  2. Run data through to verify that the test clauses work.
  3. Disable the Tester and add an AttributeCreator.
    1. Pick a conditional value and reuse the regex test clause from the Tester.

[Screenshot 2021-01-14 134622]

The issue is that returning the matched pattern is not straightforward in an AttributeCreator. I've demonstrated one way you can do it, but the regex becomes more complicated. If the match is a fixed length, it's possible to do it by finding the position of the matched pattern and then using that position to get the substring.

 

The stringsearcher is a much easier way to get the match and it's likely much easier for someone else reading the workspace to understand what is going on.
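The fixed-length idea mentioned above (find the position, then take a substring) might look like this in Python. It is a sketch only: `fixed_length_match` and the example pattern `[A-Z]\d\d` are invented for illustration and are not from the original workspace.

```python
import re

def fixed_length_match(value: str, pattern: str, length: int,
                       default: str = "d") -> str:
    """Find where the pattern starts, then cut a fixed-length substring."""
    match = re.search(pattern, value)
    start = match.start() if match else -1  # -1 means "not found"
    if start == -1:
        return default
    # Only valid when every possible match has exactly `length` characters.
    return value[start:start + length]

print(fixed_length_match("xxB12yy", r"[A-Z]\d\d", 3))
print(fixed_length_match("no code here", r"[A-Z]\d\d", 3))
```

If the pattern can match text of varying length, the slice no longer lines up with the match, which is why this only works for fixed-length patterns.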


@townest For a DRY solution with only one AttributeCreator, I think you should be able to use a capture group and the back-reference \\1 in @ReplaceRegEx().

In the AttributeCreator, use a conditional with Contains Regex to find your pattern. Then, in the Attribute Value field, use @ReplaceRegEx() with a capture group and the back-reference \\1.

I've attached an example workspace.
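For readers unfamiliar with the capture-group trick, here is the same idea in Python's `re` module. This is not the FME expression from the attached workspace, just an analogous sketch; `extract_with_backreference` is a hypothetical name, and wrapping the pattern as `^.*?(pattern).*$` is one assumed way of matching the whole string around the group.

```python
import re

def extract_with_backreference(value: str, pattern: str,
                               default: str = "d") -> str:
    """Match the whole string, then replace it with capture group 1."""
    # Conditional part: does the value contain the pattern at all?
    if re.search(pattern, value) is None:
        return default
    # Value part: the lazy .*? skips text before the first match, the
    # group captures the match, and .* swallows the rest of the string.
    whole = r"^.*?(" + pattern + r").*$"
    return re.sub(whole, r"\1", value, flags=re.DOTALL)

for bid in ["abc", "xyz", "baw"]:
    print(bid, extract_with_backreference(bid, r"a"))
```

Because the replacement is just the back-reference, the result is the actual matched text rather than a constant, so it keeps working when the pattern is more complex than a single character.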


I never knew you could use capture groups in the AttributeCreator. Matching everything and then returning just the group works, but once you start getting into lookahead and lookbehind regex it gets a bit complicated. I've never understood why there isn't a simple ReturnRegExMatch option?


@ebygomm Well, we both learnt something today! We are planning a @SubstringRegex() function, so I have added your thoughts to the change request. Not sure when it will appear in FME.

 


I have a similar approach that only makes one regex call:

[Screenshot: AM_regex]

The conditional for 'Code' simply tests whether 'Code' has a value and, if not, sets 'Code' = "<Default>".

From there I get these results:

[Screenshot: AM_regex_results]
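The single-call approach can be pictured in Python as follows. This is a sketch of the logic only, not the transformer settings shown in the screenshots; `code_with_single_call` is a made-up name and the sample values come from the table earlier in the thread.

```python
import re

def code_with_single_call(value: str, pattern: str,
                          default: str = "<Default>") -> str:
    """Run the regex once, then fall back to the default if nothing came back."""
    match = re.search(pattern, value)    # the only regex call
    code = match.group(0) if match else ""
    # Second step: test whether 'code' has a value and default it if not.
    return code if code else default

for bid in ["abc", "xyz", "baw"]:
    print(bid, code_with_single_call(bid, r"a"))
```

The pattern is written once, and the default is applied by a plain "has a value" test rather than a second regex.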


@Mark Stoakes​, Thank you. I'll give your solution a whirl when I get a chance. Tyler

