Solved

Regex in StringSearcher

March 15, 2020

aviveiro

Hello,

I'm trying to match only the letter D on its own in a string such as the following: B,G,D,DM,TD,DM,TD,D. In this example, that would be 2 D's (the third item and the last item).

Unfortunately, all the regular expressions I've tried also match the D within TD (e.g. using D,|D$).

Thanks.

Tony

Best answer by takashi

Hi @aviveiro, the metacharacter '\b', which matches a word boundary (the position between a word character and a non-word character such as a space, comma or period, or the start/end of the text), might help you. For example, this regex matches a single character 'D' sandwiched between word boundaries.

(?<=\b)D(?=\b)
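For anyone who wants to check this outside FME, here is a minimal sketch using Python's re module. It assumes Python's regex flavor behaves like the one in the StringSearcher for these patterns, which may not hold in every detail; the variable names are made up for illustration.

```python
import re

text = "B,G,D,DM,TD,DM,TD,D"

# The original attempt also hits the D at the end of each "TD".
naive = [m.start() for m in re.finditer(r"D,|D$", text)]

# Word boundaries on both sides keep only a D that stands alone.
bounded = [m.start() for m in re.finditer(r"(?<=\b)D(?=\b)", text)]

print(naive)    # [4, 10, 16, 18]
print(bounded)  # [4, 18]
```

Offsets 4 and 18 are the third item and the last item of the sample string.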

takashi
Best Answer
March 15, 2020

(?<=\b)D(?=\b)

Why not inspect features with Visual/Data Preview and Feature/Record Information before writing them into a destination dataset?

bwn
Evangelist
March 15, 2020

@aviveiro Don't forget that for simple cases such as delimiter-separated strings, there is also the AttributeSplitter.
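The split-then-compare idea can be sketched in plain Python. This is only an analogy for what an AttributeSplitter followed by a test on the list elements would do, not FME code, and the variable names are hypothetical.

```python
text = "B,G,D,DM,TD,DM,TD,D"

# Split on the delimiter, then compare whole items instead of pattern matching.
items = text.split(",")
positions = [i for i, item in enumerate(items) if item == "D"]

print(items)      # ['B', 'G', 'D', 'DM', 'TD', 'DM', 'TD', 'D']
print(positions)  # [2, 7]
```

Because each item is compared as a whole, "TD" and "DM" can never be mistaken for "D".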

arnold_bijlsma
Enthusiast
March 17, 2020

(?<=\b)D(?=\b)

@takashi: Excellent answer. I am still not fully familiar with the power of \b.

Just to note: In 'normal' RegEx you indeed need the lookahead and lookbehind assertions. But in the StringSearcher, you don't need them, as it will capture all instances, and \b by definition captures nothing, so you just use

\bD\b

and specify the first list in the Advanced section.
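As a rough cross-check outside FME: in Python's re module, at least, the shorter pattern returns the same matches as the lookaround version, since \b consumes no characters either way. Whether other engines agree is an assumption to verify per flavor.

```python
import re

text = "B,G,D,DM,TD,DM,TD,D"

# Same matches and same positions with or without the lookaround wrappers.
with_lookarounds = [m.span() for m in re.finditer(r"(?<=\b)D(?=\b)", text)]
plain_boundaries = [m.span() for m in re.finditer(r"\bD\b", text)]

print(with_lookarounds)  # [(4, 5), (18, 19)]
print(plain_boundaries)  # [(4, 5), (18, 19)]
```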

takashi
March 17, 2020

@arnold_bijlsma, you are right. Lookbehind and lookahead aren't essential here. Thanks for pointing it out.

gio
Contributor
March 17, 2020

@takashi

@aviveiro

@arnold_bijlsma

The word boundary \b matches at the transition between word and non-word characters, so \bD\b matches the string because it matches some part(s) of it.

The following is not correct, to say the least:

"In 'normal' RegEx you indeed need the lookahead and lookbehind assertions"

To capture the match you need to enclose it in parentheses: \b(D)\b

(Of course, there is also a non-capturing version, (?:...), which groups but does not report a capture.)

Contrary to popular belief, \b is a (zero-length) assertion. So if you enclose it in parentheses, it will be captured. The same goes for the beginning and end anchors.

A regex result always includes the entire match; in the StringSearcher that goes to the All Matches List Name.

To get the individual captured D's, you will need to enclose the D in parentheses and use the Subexpression Matches List Name.
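The difference between the whole match and the captured groups can be illustrated with Python's re module. This is a sketch only: group 0 here plays the role of the entire match and the numbered groups play the role of subexpression matches, and the mapping onto the StringSearcher's list attributes is an assumption.

```python
import re

text = "B,G,D,DM,TD,DM,TD,D"

# Group 0 is the whole match; groups 1..3 are the parenthesized parts.
# A captured \b is an empty string, because it matches a position only.
results = [
    (m.group(0), m.group(1), m.group(2), m.group(3))
    for m in re.finditer(r"(\b)(D)(\b)", text)
]
print(results)  # [('D', '', 'D', ''), ('D', '', 'D', '')]

# A non-capturing group still groups, but adds no numbered capture.
non_capturing = re.search(r"\b(?:D)\b", text)
print(non_capturing.groups())  # ()
```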

Furthermore, there are not many flavors that support lookbehind. Python's version does, which will please you guys, I guess.

There is a site that compares all the flavors and their capabilities.

Lookbehind can be emulated by lookahead and some more regexp fiddling.
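One common workaround, sketched here in Python, is to consume the preceding delimiter in a non-capturing group and read the position of a capturing group instead. This is just one way to do it, written for the comma-separated sample string, and not necessarily the technique the poster had in mind.

```python
import re

text = "B,G,D,DM,TD,DM,TD,D"

# Instead of (?<=^|,) the prefix is matched normally and left out of group 1.
# The lookahead keeps the trailing comma unconsumed for the next match.
pattern = re.compile(r"(?:^|,)(D)(?=,|$)")
starts = [m.start(1) for m in pattern.finditer(text)]

print(starts)  # [4, 18]
```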

Of course, it may have changed since I last read up on it.

Read up on the matter: Jan Goyvaerts' work is awesomely suited for that (see RegexBuddy; his documentation is there). But there are plenty of good ones.

arnold_bijlsma
Enthusiast
March 17, 2020

The key thing is that for the RegEx implementation in the StringSearcher you don't need the lookahead/-behind assertions nor any grouping brackets to capture both instances of the single letter D in the test string.

But you're right that other implementations outside FME could give a different output.
