I've been trying, to no avail, to put a regular expression into the StringSearcher that matches this pattern: one to three digits, followed by either a double quote character or two single ' characters in succession, followed by an x, followed by one to three digits, followed (again) by either a double quote character or two single ' characters in succession. I put together the expression ^[0-9]{1,3}[",'']{1,2}+[x][0-9]{1,3}[",'']{1,2}+ and it works in the regex101.com tester, but FME fails to execute when I put that expression in the StringSearcher: I get a "failed to evaluate expression" message. What am I doing wrong?

Regular expression for finding pattern in string searcher

jeroenstiers
178 replies
8 years ago
March 27, 2017

Hi @alpheus

Does this one work for you?

^[0-9]{1,3}(\"|\'{2})x[0-9]{1,3}(\"|\'{2})$

This is what I have changed:

1/ I have removed the [] around the x, since [] is for defining a range or set of characters and x is just a single normal character.

2/ I have changed [",''] into ("|'') because the quote is one of two alternatives. I have escaped the quotes (\") so they are treated as literal characters and no longer signal the start of a string, and I have put {2} right after the single quote since only that character may appear twice.

3/ I have removed all plus signs. A plus sign means "one or more of the preceding character" (e.g. ^4+$ matches at least one '4', possibly more). Since the repetition was already specified with the curly brackets, the plus signs were redundant.
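To sanity-check the suggested expression outside FME, a minimal sketch in Python follows. It assumes Python's re module behaves like the StringSearcher for this simple pattern, which is not guaranteed; the function name is_dimension and the sample strings are made up for illustration. The pattern is written with a single backslash before each quote.

```python
import re

# Suggested pattern: 1-3 digits, then " or '', then x, then 1-3 digits, then " or ''.
# Escaping the quotes is harmless in Python's re; they match literally either way.
PATTERN = re.compile(r"""^[0-9]{1,3}(\"|\'{2})x[0-9]{1,3}(\"|\'{2})$""")

def is_dimension(text):
    """Return True if the whole string fits the digits/quote/x/digits/quote shape."""
    return PATTERN.match(text) is not None

# Hypothetical sample values, not taken from any real dataset.
samples = [
    '100"x200\'\'',    # double quote first, two single quotes last
    "12''x7\"",        # two single quotes first, double quote last
    "100',x200\"\"",   # comma and mixed quotes: should be rejected
    '1000"x2"',        # four leading digits: should be rejected
]

for sample in samples:
    print(repr(sample), is_dimension(sample))
```

The first two samples are accepted and the last two are rejected, because the alternation only allows exactly one double quote or exactly two single quotes in each position.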

jeroenstiers
178 replies
8 years ago
March 27, 2017

FYI, it's great that you are using regex101 to check your regular expressions. I think it is the best website out there. When I am creating a regex to be used in FME, I tend to select the Python 'Flavor' (on the left side). I have noticed that it 'debugs' similarly to FME.

alpheus
Author
Participant
5 replies
8 years ago
March 27, 2017

Works great, much obliged! Was it the missing quote escapes that were confusing the FME compiler?

jeroenstiers
178 replies
8 years ago
March 27, 2017

Yes, I think that was what made the compiler fail.

takashi
7703 replies
8 years ago
March 27, 2017

No, I don't think it's essential to escape double/single quotation marks in a regex for the StringSearcher. This regex should work as well.

^[0-9]{1,3}("|'{2})x[0-9]{1,3}("|'{2})$

Your original regex matches this string, with the StringSearcher (FME 2016, 2017).

100"x200''

However, it also matches this string.

100',x200""

That is because [",'']{1,2} matches one or two of any of these characters: double quotation mark, comma, or single quotation mark. I don't think that is what you want.
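The difference between the character class and the alternation can be reproduced with a small Python sketch. This uses Python's re module as a stand-in, not the FME engine, and the names LOOSE and STRICT are made up here. The trailing + signs from the original expression are left out of LOOSE so that the pattern compiles on any Python version.

```python
import re

# Original expression minus the "+" signs: the class [",''] means
# "any one of: double quote, comma, single quote", repeated 1 to 2 times.
LOOSE = re.compile(r"""^[0-9]{1,3}[",'']{1,2}[x][0-9]{1,3}[",'']{1,2}""")

# Alternation: exactly one double quote, or exactly two single quotes.
STRICT = re.compile(r"""^[0-9]{1,3}("|'{2})x[0-9]{1,3}("|'{2})$""")

for text in ['100"x200\'\'', "100',x200\"\""]:
    loose_ok = LOOSE.match(text) is not None
    strict_ok = STRICT.match(text) is not None
    print(repr(text), "loose:", loose_ok, "strict:", strict_ok)
```

Both strings pass the character-class version, while only the first passes the alternation version.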

jeroenstiers
178 replies
8 years ago
March 28, 2017

You are right @takashi, the unescaped quote also works. I have tested the original expression in both FME 2016.1 and 2017, and in neither version do I get the 'failed to evaluate expression' message. @alpheus, which version are you working in?

alpheus
Author
Participant
5 replies
8 years ago
March 28, 2017

I'm using the ESRI-licensed Data Interoperability version; the About box says FME 20150114 - Build 15245.

gio
Contributor
2252 replies
8 years ago
March 28, 2017

[] defines a character class.

For a very good tutorial, check out "Regular Expressions: The Complete Tutorial" by Jan Goyvaerts (Belgian). It's still out there to be downloaded...

Python uses an advanced regex engine; it has lookbehind, for instance. Regexes created in that flavor will often fail in FME.

The regexp tester in the StringSearcher works pretty well nowadays. I prefer using Rubular or the one in Notepad++.

jeroenstiers
178 replies
8 years ago
March 30, 2017

Hi @alpheus

Could you close your question if you've got an answer to it?

takashi
7703 replies
8 years ago
March 30, 2017

The regex engine within FME was upgraded in FME 2016. Your original regex could not be compiled with FME 2015 and earlier, because the expression contains unnecessary + symbols. The regex engine in FME 2016 or later seems to just ignore the + symbols, but be aware that it is not a correct use of + anyway.
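One way to see how a given engine treats a + placed directly after {1,2} is to try compiling the original expression and report the outcome. The sketch below uses Python's re module purely as a stand-in; it is not the engine inside FME, and try_compile is a hypothetical helper name. In PCRE-style engines, a + after another quantifier is read as a possessive quantifier, which may be why regex101 accepted the expression; that explanation is an assumption and is not confirmed in this thread. Python's re accepts this syntax from version 3.11 onward and rejects it with a "multiple repeat" error in older versions.

```python
import re
import sys

# The original expression from the question, including the "+" after {1,2}.
ORIGINAL = r"""^[0-9]{1,3}[",'']{1,2}+[x][0-9]{1,3}[",'']{1,2}+"""

def try_compile(pattern):
    """Return (compiled_ok, message) instead of raising on a bad pattern."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return False, str(exc)
    return True, "compiled"

ok, message = try_compile(ORIGINAL)
# The result depends on the interpreter version, so print it alongside.
print(sys.version_info[:2], ok, message)
```

Whichever way a particular engine reads it, dropping the + signs removes the ambiguity, since {1,2} already states the repetition.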
