Hi Howard,
The pipe (|) means "OR", so your regex matches one of "\d+:\d", "\d+" or "\n".
Rubular shows all the matched parts, but the StringSearcher assigns only the first matched part to the "_matched_characters" attribute.
I think that is the reason for the difference between them.
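To illustrate the difference, here is a minimal sketch in Python. Python is only an assumption for illustration; FME's StringSearcher and Rubular use their own regex engines. With the pattern \d+:\d+|\d+, re.search stops at the first match, while re.findall collects every match, which is what the test beds display.

```python
import re

# Sample text and pattern chosen for illustration only.
text = "<abc>1:00<def>34<def>12<def>9<def>0<def>8<xyz>"
pattern = r"\d+:\d+|\d+"

# First match only (comparable to what ends up in _matched_characters).
first_match = re.search(pattern, text).group(0)

# Every non-overlapping match (comparable to what Rubular displays).
all_matches = re.findall(pattern, text)

print(first_match)
print(all_matches)
```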
For example, this expression can be used if you want to extract every matched part.
The StringSearcher assigns the part matched by each parenthesized group to an element of the "_matched_parts" list.
-----
(\d+:\d+)\D+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)
-----
Input: <abc>1:00<def>34<def>12<def>9<def>0<def>8<xyz>
Result:
`_matched_characters' has value `1:00<def>34<def>12<def>9<def>0<def>8'
`_matched_parts{0}' has value `1:00'
`_matched_parts{1}' has value `34'
`_matched_parts{2}' has value `12'
`_matched_parts{3}' has value `9'
`_matched_parts{4}' has value `0'
`_matched_parts{5}' has value `8'
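The same grouping idea can be checked outside FME. This is a rough sketch using Python's re module, which is my assumption for illustration and not how the StringSearcher is implemented. Here group(0) plays the role of _matched_characters and groups() the role of _matched_parts.

```python
import re

text = "<abc>1:00<def>34<def>12<def>9<def>0<def>8<xyz>"
# One capturing group per value, separated by runs of non-digits.
pattern = r"(\d+:\d+)\D+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)\D+(\d+)"

match = re.search(pattern, text)

# Whole matched span, from the first group to the last.
matched_characters = match.group(0)

# Contents of the individual capturing groups, in order.
matched_parts = list(match.groups())

print(matched_characters)
for index, part in enumerate(matched_parts):
    print(index, part)
```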
Takashi
Takashi,
Thanks very much for your reply.
Yes, I was aware of the alternation symbol [pipe (|) = OR]. The main issue in the original question was that three completely different test beds (Rubular, RegExr and RegexBuddy) all show matched characters one way (by showing all possible matches), while so far it seems that FME alone shows them another way, by showing only the first match (in the _matched_characters attribute). It would be really good if the test beds and FME were all consistent; that way it would be a little easier to move from a test bed to FME. If Safe could change this I think it would be really useful.
.............
Your tip about using the _matched_parts is really helpful
.............
Thanks once again for your reply
Howard L'
A TclCaller with this Tcl expression will return all matched parts as a space-delimited list (a result similar to the other tools).
Tcl Expression Example:
-----
return [regexp -all -inline -- {\d+:\d+|\d+} [FME_GetAttribute "text_line_data"]]
-----
FYI
There are many Regexp flavors.
There is a list of which flavor has which facility.
Rubulator can show 3 versions at the moment.
Rubular by default shows results as if the "-all" switch were on.
I find this very handy.
Switches can be used in AttributeCreators, though FME 2015 has an issue with expression outputs when they contain non-numerals: you cannot remove the @Evaluate() icon. In 2014 you could just remove it because it was not a fixed icon.
If you use the same regexp but with the switches "-all", "-indices", and "-inline", you can extract all the hits. There are posts on this in this forum. I found a couple I made in 2010. Shame to bug the evaluator like this... :(
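For anyone who wants to see what "all hits with their positions" looks like, here is a hedged sketch in Python rather than Tcl (the language choice and the sample text are assumptions for illustration). re.finditer yields every hit together with its span. Note that Python spans exclude the end position, while Tcl's -indices reports the index of the last matched character, so the end values differ by one.

```python
import re

# Sample text and pattern chosen for illustration only.
text = "<abc>1:00<def>34<def>12<def>9<def>0<def>8<xyz>"
pattern = r"\d+:\d+|\d+"

# Collect (matched text, start, end) for every hit; end is exclusive.
hits = [(m.group(0), m.start(), m.end()) for m in re.finditer(pattern, text)]

for value, start, end in hits:
    print(value, start, end)
```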
At www.tcl.tk/ you can find all you wish to know about it, in all flavors.