Solved

How can I create a regex with a lookahead after a certain text?

1 year ago
1 December 2022
7 replies
1 view

koenterralytics
Contributor
32 replies

I've got an issue with the StringSearcher. It does not seem to work as I expect it to. It might be my regex skills, but I think I have tried all the possibilities.

I've got the following text: "324324" "maatvoering; hoogte; bouwhoogte; maximum bouwhoogte (m)"="8", "maatvoering; hoogte; goothoogte; maximum goothoogte (m)"="4" asdfasdf sadfsadfsf 343244

In this text I would like to find the first number after the word 'goot'. In this example that should be the number 4 (from "maximum goothoogte (m)"="4"). I'm trying to do this with the following expression: (?=goot.*")[0-9]*(?=")

But it does not return any results.

Any ideas what I'm missing here?
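A likely reason the expression above finds nothing: a lookahead only checks what follows the current position and consumes no text. So after (?=goot.*") succeeds, [0-9]* still has to match at the "g" of "goot", matches zero digits, and the final (?=") then fails because the next character is "g", not a quote. The sketch below reproduces this in Python, whose re module is assumed here to handle lookaheads the same way as the regex engine in the StringSearcher. The consuming pattern is an illustration, not something proposed in the thread.

```python
import re

# Sample text as given in the question.
TEXT = ('"324324" "maatvoering; hoogte; bouwhoogte; maximum bouwhoogte (m)"="8", '
        '"maatvoering; hoogte; goothoogte; maximum goothoogte (m)"="4" '
        'asdfasdf sadfsadfsf 343244')

# Original attempt: the lookahead asserts "goot" at the current position
# but consumes nothing, so [0-9]* is tried at the "g" itself.
original = re.search(r'(?=goot.*")[0-9]*(?=")', TEXT)

# Illustrative alternative: consume the keyword and the non-digits after it,
# then capture the digits in group 1.
consuming = re.search(r'goot\D*(\d+)', TEXT)

print(original)            # None
print(consuming.group(1))  # 4
```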

Best answer by geomancer 1 December 2022, 13:54

geomancer
Evangelist
637 replies
1 year ago
1 December 2022

Regex always provides nice puzzles 😀

.*goot.*"="\K\d+

According to https://perldoc.perl.org/perlre#Regular-Expressions, \K means 'Keep the stuff left of the \K, don't include it in $&', so everything matched before the \K is left out of the returned match.
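For anyone testing this outside FME: \K is a Perl/PCRE feature, and Python's built-in re module does not accept it. A rough equivalent, assuming a capture group is acceptable in place of \K, is to wrap the digits in parentheses and read group 1. This is a swapped-in technique for illustration, not the pattern from the reply above.

```python
import re

# Sample text as given in the question.
TEXT = ('"324324" "maatvoering; hoogte; bouwhoogte; maximum bouwhoogte (m)"="8", '
        '"maatvoering; hoogte; goothoogte; maximum goothoogte (m)"="4" '
        'asdfasdf sadfsadfsf 343244')

# Python's re treats \K as an unknown escape and refuses to compile it.
try:
    re.compile(r'.*goot.*"="\K\d+')
    supports_keep = True
except re.error:
    supports_keep = False

# Capture-group stand-in for \K: everything before the group is still
# matched, but only group 1 is read back.
match = re.search(r'.*goot.*"="(\d+)', TEXT)

print(supports_keep)   # False
print(match.group(1))  # 4
```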

koenterralytics
Author
Contributor
32 replies
1 year ago
1 December 2022

Thanks for your quick reply @geomancer . That is a completely different approach from mine. It partly works, but not perfectly yet. For example: if I replace 'goot' with 'bouw', I would expect it to return the value 8, but it still returns the value 4. So at the moment it seems to return the last number it can find instead of the first number after the keyword. Any idea how to fix this?
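The behaviour described here is consistent with greedy matching: the .* between the keyword and "=" grabs as much text as it can, so the match ends at the last "=" followed by digits. The Python sketch below shows the effect, using a capture group because Python's re has no \K. The lazy variant with .*? is only an illustration of the difference and is not a pattern from the thread.

```python
import re

# Sample text as given in the question.
TEXT = ('"324324" "maatvoering; hoogte; bouwhoogte; maximum bouwhoogte (m)"="8", '
        '"maatvoering; hoogte; goothoogte; maximum goothoogte (m)"="4" '
        'asdfasdf sadfsadfsf 343244')

# Greedy: the second .* runs to the end of the text and backtracks only
# as far as the LAST "=" that is followed by digits.
greedy = re.search(r'.*bouw.*"="(\d+)', TEXT)

# Lazy: .*? stops at the FIRST "=" followed by digits after the keyword.
lazy = re.search(r'bouw.*?"="(\d+)', TEXT)

print(greedy.group(1))  # 4
print(lazy.group(1))    # 8
```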

geomancer
Evangelist
637 replies
1 year ago
1 December 2022
Best Answer

Ah, I didn't test that. This works for both 'bouw' and 'goot':

.*bouw[\D]*\K\d+
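To check this pattern with different keywords outside FME, it can be wrapped in a small helper. The sketch below is a Python approximation: number_after is a hypothetical name, a capture group stands in for \K (which Python's re does not support), and \D is used in place of [\D], which means the same thing. Note that the leading greedy .* makes the pattern lock onto the last occurrence of the keyword that is followed by a number, which is what the sample text needs.

```python
import re
from typing import Optional

# Sample text as given in the question.
TEXT = ('"324324" "maatvoering; hoogte; bouwhoogte; maximum bouwhoogte (m)"="8", '
        '"maatvoering; hoogte; goothoogte; maximum goothoogte (m)"="4" '
        'asdfasdf sadfsadfsf 343244')


def number_after(keyword: str, text: str) -> Optional[str]:
    """Return the digits that follow the keyword, or None if there are none.

    Hypothetical helper mirroring .*<keyword>[\\D]*\\K\\d+ with a capture group.
    """
    # re.escape keeps keywords containing regex metacharacters literal.
    pattern = r'.*' + re.escape(keyword) + r'\D*(\d+)'
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(number_after("bouw", TEXT))  # 8
print(number_after("goot", TEXT))  # 4
print(number_after("dak", TEXT))   # None
```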

dustin
Influencer
588 replies
1 year ago
1 December 2022

This could also work, in case the "=" might not be present.

goot(?![\s\S]*goot)\K(\D*)\K\d+

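goot(?![\s\S]*goot)
- Matches the last goot in the string (the negative lookahead rejects any goot that is followed by another goot)
\K(\D*)
- Matches the non-digit text after the last goot; the \K drops everything before it from the match
\K\d+
- Drops the non-digit text as well and returns the digits that follow it (after the last goot 😁 )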
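The same idea can be tried in Python as a rough sketch. Negative lookaheads work in Python's re module, but \K does not, so a capture group stands in for it here. number_after_last is a hypothetical helper name, and the keyword is substituted in both places where the pattern mentions it.

```python
import re
from typing import Optional

# Sample text as given in the question.
TEXT = ('"324324" "maatvoering; hoogte; bouwhoogte; maximum bouwhoogte (m)"="8", '
        '"maatvoering; hoogte; goothoogte; maximum goothoogte (m)"="4" '
        'asdfasdf sadfsadfsf 343244')


def number_after_last(keyword: str, text: str) -> Optional[str]:
    """Return the digits after the last occurrence of keyword, or None."""
    kw = re.escape(keyword)
    # kw(?![\s\S]*kw): an occurrence with no further occurrence after it.
    # \D*(\d+): skip non-digits, then capture the digits (replaces \K).
    pattern = kw + r'(?![\s\S]*' + kw + r')\D*(\d+)'
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(number_after_last("goot", TEXT))  # 4
print(number_after_last("bouw", TEXT))  # 8
```

If only the first goot in the pattern is replaced by another keyword, the lookahead still tests for goot, which changes what the pattern matches.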

koenterralytics
Author
Contributor
32 replies
1 year ago
1 December 2022

@geomancer Your pattern .*bouw[\D]*\K\d+ works perfectly. Many thanks for your quick reply again!

koenterralytics
Author
Contributor
32 replies
1 year ago
1 December 2022

Many thanks for your input @dustin . This regex looks like some kind of rocket science :-). I also tried this solution, but it seems to return the last digit in all cases where the text contains the word goot. In the meantime geomancer came up with the perfect solution, so I'll go for that one.

dustin
Influencer
588 replies
1 year ago
1 December 2022

Regex can be that way sometimes. 😂 Glad you found the solution. 😎
