Question

StringSearcher regex \\r\\n

7 years ago
September 26, 2017
12 replies
197 views

deanrother
21 replies

I want to find each line of the string below with a regular expression in StringSearcher that looks like this: .*\\r\\n. That is, I expect each line to end up in the list of all matches produced by the StringSearcher.

METER_NUM,PRODUCTION_DATE,FLOW_TIME_MINUTES,VOLUME,ENERGY,DIFF_PRESS,STATIC_PRESS,FLOW_TEMP,MOL_CO2,ENERGY_FACTOR

1006-2001307,20170829,132.283,13.8851,,9.45286000000001,1088.28,83.9825000000001,,

1006-2001283,20170829,1440,311.974,,27.9496,162.085,88.4964,,

It works in my regex testing tool Expresso, but StringSearcher doesn't seem to recognize \\r\\n.

lenaatsafe
275 replies
7 years ago
September 26, 2017

Hi @deanrother

Please try ^.*\\n as the StringSearcher regular expression. You can always experiment using the StringSearcher RegEx Editor (see Open RegEx Editor... for the Contains Regular Expression parameter).

Would you like to split your source value into parts? If yes, you might also want to take a look at AttributeSplitter.

deanrother
Author
21 replies
7 years ago
September 26, 2017

Lena, this doesn't work for me. It matches everything (fig. 1). The output displayed by the Inspector bears this out (fig. 2).

I want each line as a list item, e.g.,

_list{0}.value = METER_NUM,PRODUCTION_DATE,FLOW_TIME_MINUTES,VOLUME,ENERGY,DIFF_PRESS,STATIC_PRESS,FLOW_TEMP,MOL_CO2,ENERGY_FACTOR

_list{1}.value = 1006-2001307,20170829,132.283,13.8851,,9.45286000000001,1088.28,83.9825000000001,,

_list{2}.value = 1006-2001283,20170829,1440,311.974,,27.9496,162.085,88.4964,,

Figure 1

Figure 2

webklaas
6 replies
7 years ago
September 26, 2017

Here are my thoughts:

Somewhere in version 2016 or 2017 the behaviour of .* changed (I struggled with it myself for a fair amount of time). It now matches newlines as well, which it didn't before (like your Expresso, I guess).

So a solution could be making the .* non-greedy, that is adding a question mark to it.

The complete regex then would be '.*?\\r\\n' (without the quotes)

Hope this helps.
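
To illustrate the greedy versus non-greedy difference described above, here is a minimal sketch in Python rather than in FME. It assumes the `re.DOTALL` flag as a stand-in for a dot that also matches newlines, and the shortened sample text is made up for the example. It only shows general regex semantics; StringSearcher's own engine may behave differently.

```python
import re

# Three CRLF-terminated lines standing in for the CSV text in the question
# (shortened, hypothetical sample).
text = "METER_NUM,VOLUME\r\n1006-2001307,13.8851\r\n1006-2001283,311.974\r\n"

# Greedy: with DOTALL, ".*" runs to the end of the text and backtracks
# only as far as the last CRLF, so the whole text is a single match.
greedy = re.findall(r".*\r\n", text, flags=re.DOTALL)

# Non-greedy: ".*?" stops at the first CRLF it can reach,
# so every line becomes its own match.
lazy = re.findall(r".*?\r\n", text, flags=re.DOTALL)

print(len(greedy))  # 1
print(len(lazy))    # 3
```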

deanrother
Author
21 replies
7 years ago
September 26, 2017

webklaas wrote:

Here are my thoughts:

So a solution could be making the .* non-greedy, that is adding a question mark to it.

The complete regex then would be '.*?\\r\\n' (without the quotes)

Hope this helps.

Thanks, but that didn't work, didn't match anything. I'm on 2017.0. I'll have to try 2017.<newest> to see how that works.

erik_jan
Contributor
2181 replies
7 years ago
September 26, 2017

Just an alternative thought:

If the source is a text file (and csv is), why not use the Text File reader and read it line by line?

If you want the lines in a list, you can follow the reader with an Aggregator and aggregate the attributes into a list.

If the source is a text attribute, you can follow @LenaAtSafe's suggestion and use the AttributeSplitter to split on \\n.

In any case, if you want all lines to be separated, I would not use the StringSearcher and regex.

danilo_fme
Evangelist
2059 replies
7 years ago
September 26, 2017

Hi @deanrother,

I'm using FME 2017 and I got this result:

Is this what you want?

StringSearcher: ([^ ]* +)

Your edited workspace is attached: workspace-stringsearcher.fmw

Thanks,

Danilo

deanrother
Author
21 replies
7 years ago
September 26, 2017

erik_jan wrote:

Just an alternative thought:

If the source is a text file (and csv is), why not use the Text File reader and read it line by line?

If you want the lines in a list, you can follow the reader with an Aggregator and aggregate the attributes into a list.

If the source is a text attribute, you can follow @LenaAtSafe's suggestion and use the AttributeSplitter to split on \\n.

In any case, if you want all lines to be separated, I would not use the StringSearcher and regex.

FTPCaller dumps the file into an attribute as a string, so I need to parse that string. The first thing I want to do is break the lines apart.

lenaatsafe
275 replies
7 years ago
September 26, 2017

deanrother wrote:

Lena, this doesn't work for me. It matches everything (fig. 1). The output displayed by the Inspector bears this out (fig. 2).

I want each line as a list item, e.g.,

_list{0}.value = METER_NUM,PRODUCTION_DATE,FLOW_TIME_MINUTES,VOLUME,ENERGY,DIFF_PRESS,STATIC_PRESS,FLOW_TEMP,MOL_CO2,ENERGY_FACTOR

_list{1}.value = 1006-2001307,20170829,132.283,13.8851,,9.45286000000001,1088.28,83.9825000000001,,

_list{2}.value = 1006-2001283,20170829,1440,311.974,,27.9496,162.085,88.4964,,

Figure 1

Figure 2

. means any character, including \\n and \\r. So your regex means: starting from the beginning of the string, any number of any characters, followed by <new line> - which is exactly the whole source string.

Could you please try AttributeSplitter? This would be your #1 choice for the task.

For StringSearcher, one regex option would be [a-zA-Z0-9_.,-]*\\n . If you need to split the sample string into three strings, your regex should not start with ^, as this would automatically exclude the second and third parts.
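
As a rough illustration of the two points above, the following Python sketch compares the anchored pattern with the character-class pattern. It assumes `re.DOTALL` to imitate a dot that matches newlines, and a shortened, made-up sample terminated with plain \n. It is not FME's regex engine, so treat it as a sketch of the idea only.

```python
import re

# Shortened, hypothetical sample with "\n" line endings.
text = "METER_NUM,VOLUME\n1006-2001307,13.8851\n1006-2001283,311.974\n"

# Anchored at the start, with a dot that matches newlines:
# one match that swallows the whole text.
anchored = re.findall(r"^.*\n", text, flags=re.DOTALL)

# The character class cannot cross a newline, and there is no "^",
# so the pattern matches once per line.
per_line = re.findall(r"[a-zA-Z0-9_.,-]*\n", text)

print(len(anchored))  # 1
print(len(per_line))  # 3
```

Note that the class lists the characters allowed in a line explicitly, so a line containing anything else (a space or \r, for example) would not be matched in full.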

deanrother
Author
21 replies
7 years ago
September 26, 2017

danilo_fme wrote:

Hi @deanrother,

I'm using FME 2017 and I got this result:

Is this what you want?

StringSearcher: ([^ ]* +)

Your edited workspace is attached: workspace-stringsearcher.fmw

Thanks,

Danilo

That worked. Thanks!

danilo_fme
Evangelist
2059 replies
7 years ago
September 26, 2017

deanrother wrote:

That worked. Thanks!

Perfect, @deanrother. I'm happy to help. :)

takashi
7725 replies
7 years ago
September 27, 2017

Hi @deanrother, in this case, I would use the AttributeSplitter (Delimiter: <newline character>) as @LenaAtSafe suggested at first. One reason is that the text might not end with a newline. Other advantages of the AttributeSplitter are that you can trim leading and/or trailing spaces in the split strings, and optionally remove empty lines.
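
The reasoning above can be sketched in Python (not FME): a pattern that requires a trailing line break misses a last line without one, while splitting on the newline character, trimming, and dropping empty parts keeps every line. The sample text and variable names are hypothetical.

```python
import re

# Hypothetical sample: a trailing space on line 2, an empty line,
# and no line break after the last line.
text = "METER_NUM,VOLUME\r\n1006-2001307,13.8851 \r\n\r\n1006-2001283,311.974"

# Regex that requires CRLF at the end of each match: the last line is lost
# and the empty line shows up as a bare "\r\n" match.
regex_lines = re.findall(r"[^\r\n]*\r\n", text)

# Split on "\n", trim whitespace (including the leftover "\r"),
# and drop empty parts.
split_lines = [part.strip() for part in text.split("\n") if part.strip()]

print(regex_lines)
print(split_lines)
```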

I also like @erik_jan's suggestion - read the text with the Text File reader. You can download the text to a temporary file and then read it with the FeatureReader. The workflow would consist of the TempPathnameCreator, FTPCaller (Transfer Type: Download to a File), and FeatureReader (Format: Text File) connected in series.
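
As a loose analogy to that workflow, here is a Python sketch of the same idea: put the downloaded text in a temporary file, then read it back one line at a time. The FTP download itself is not shown; the `payload` string and the file name `download.csv` are stand-ins invented for the example.

```python
import os
import tempfile

# Stand-in for the content that would be downloaded (hypothetical sample).
payload = "METER_NUM,VOLUME\n1006-2001307,13.8851\n1006-2001283,311.974\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Temporary path playing the role of the download target.
    path = os.path.join(tmp_dir, "download.csv")

    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(payload)

    # Reading the file line by line yields one item per line,
    # with no regular expression involved.
    with open(path, encoding="utf-8", newline="") as handle:
        lines = [line.rstrip("\r\n") for line in handle]

print(lines)
```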
