Question

capturing code block with new regular expression editor


fme4fsj
Participant

I am using the 16.1 MacOS version and the regular expression editor. I would like to capture the following code block example:

type yellow {

yellow green

green yellow

sunshine yellow

}

In the regex editor I can capture "type yellow {" with the regex (type.*), but because everything else comes after a line break I cannot get any further. I am also fine with ({.*}), which works if everything is on a single line. However, the \s option (which enables multi-line or single-line mode in some regex flavors) does not work here except to match an extra space. I have also tried "\s\S", but that does not work either. Is there another way to match everything between the brackets, including line break/newline characters?

3 replies

ebygomm
Influencer
  • June 1, 2016

I'm not familiar with the FME version you are using, but this regex works for me in FME:

type yellow \{(\s*?.*?)*?\}

*Edit: this works in the StringSearcher, with the code block contained in the matched characters attribute. I've assumed there is more than one grouping of curly brackets in the text you are searching.
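
The point about multiple groupings of curly brackets comes down to lazy versus greedy matching. Here is a minimal sketch in Python, whose `re` module is not the engine FME uses, so it only illustrates the idea. The sample text and the simplified pattern `type \w+ \{.*?\}` are assumptions made for this example and are not the exact regex from the post.

```python
import re

# Hypothetical sample text with two brace-delimited blocks.
text = (
    "type yellow {\n\nyellow green\n\ngreen yellow\n\n}\n"
    "type blue {\n\nblue sky\n\n}\n"
)

# Lazy: .*? stops at the first closing bracket it can reach.
# re.DOTALL lets the dot match newline characters in Python.
lazy = re.findall(r"type \w+ \{.*?\}", text, flags=re.DOTALL)

# Greedy: .* runs on to the last closing bracket in the text.
greedy = re.findall(r"type \w+ \{.*\}", text, flags=re.DOTALL)

print(len(lazy))    # number of separate blocks found
print(len(greedy))  # number of matches when the quantifier is greedy
```

With a lazy quantifier each block is returned on its own, while the greedy version returns a single match that swallows both blocks.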


david_r
Evangelist
  • June 1, 2016

Hi

Try the following regex:

{((.|\n|\r)*)}

It will match all characters, including Carriage Return and Line Feed, inside the brackets. The "\r" might not be necessary, but it depends on a few factors, so I've included it for the sake of completeness.

The "_first_match" attribute will contain the brackets and everything between them.

You can also supply a list name for "Subexpression matches list name" and you will get a list item that contains only what's between the brackets, without the brackets themselves.

David
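
To see what the whole match and the subexpression match would contain, the pattern above can be tried outside FME. This is only a sketch in Python's `re` module, which is a different engine from the one FME uses. The brackets are escaped here as a precaution, and the variable names are made up for the example.

```python
import re

# The code block from the question, with \n line endings assumed.
text = "type yellow {\n\nyellow green\n\ngreen yellow\n\nsunshine yellow\n\n}"

# (.|\n|\r)* matches any character, including line feeds and carriage returns.
pattern = re.compile(r"\{((.|\n|\r)*)\}")
match = pattern.search(text)

whole = match.group(0)  # brackets plus everything between them
inner = match.group(1)  # only what is between the brackets

print(repr(whole))
print(repr(inner))
```

Group 0 plays the role of the "_first_match" attribute, and group 1 corresponds to the list item holding only the text between the brackets.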


gio
Contributor
  • June 1, 2016

Without newline-sensitive matching (I believe this is the default), a dot also matches a newline (CR LF, \n or \r\n).

So, without newline-sensitive matching:

^type.*$ OR ^type.*\r OR ^type.*\r\n$ OR ^type.*\n

will find everything, even the newline after the closing bracket.

This will find everything up to and including the last closing bracket:

^type.*\r (so it will not grab the ending CR LF)

(based on your input string)
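
The effect of newline-sensitive matching can be illustrated in Python, with one caveat: Python's `re` module is a different engine and its default is the opposite of the one described above. There the dot does not match a newline unless `re.DOTALL` is set. The sample text uses plain \n line endings, and the pattern `^type.*\}` is an assumption added for this sketch.

```python
import re

# Sample input with \n line endings and a trailing newline (assumed).
text = "type yellow {\nyellow green\ngreen yellow\n}\n"

# Newline-sensitive behaviour (Python's default): the dot stops at the line break.
stops_at_newline = re.search(r"^type.*", text).group(0)

# Dot matches newlines too: the match runs to the end of the text.
spans_lines = re.search(r"^type.*", text, flags=re.DOTALL).group(0)

# Ending the pattern on the bracket leaves out the trailing newline.
up_to_bracket = re.search(r"^type.*\}", text, flags=re.DOTALL).group(0)

print(repr(stops_at_newline))
print(repr(spans_lines))
print(repr(up_to_bracket))
```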

