We need to use FME to split up character strings like this:

 

“F29920110716104845F37920120929203346TC20080508184800”

 

into separate substrings:

 

"F29920110716104845", "F37920120929203346", "TC20080508184800".

 

 

The substrings are <personnel-ids>, each made up of two sub-elements, <operatorid><datetime>, where the <operatorid> element always starts with [A-Z] and is two to five characters in length, and the <datetime> element is always 14 decimal digits long (YYYYMMDDhhmmss).
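For reference, the format described above could be checked in Python roughly like this. It is only a sketch: the helper name parse_personnel_id is made up for illustration, and it assumes that the characters after the first letter of <operatorid> may be letters or digits, which the description does not actually state.

```python
import re
from datetime import datetime

# Assumption: <operatorid> is one capital letter followed by 1-4 letters or digits
# (2-5 characters in total); <datetime> is exactly 14 digits.
PERSONNEL_ID = re.compile(r"([A-Z][A-Z0-9]{1,4})(\d{14})")


def parse_personnel_id(text):
    """Split one <personnel-id> into (operatorid, datetime). Hypothetical helper."""
    m = PERSONNEL_ID.fullmatch(text)
    if m is None:
        raise ValueError(f"not a personnel-id: {text!r}")
    operator_id, stamp = m.groups()
    # strptime also rejects impossible dates such as month 13.
    return operator_id, datetime.strptime(stamp, "%Y%m%d%H%M%S")
```

Because the whole string must match, the operator id is simply whatever is left in front of the last 14 digits.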

 

 

The following JavaScript regular expression will parse this string and extract each substring: 

 

     (  

 

However, in the FME StringSearcher Transformer, it only ever extracts the first substring into _matched_parts{0}.     

 

 

Q1.   Does FME StringSearcher support a “global match” parameter, so that it finds all matches in the string, rather than just the first, and if so how do you set it?   

 

 

Q2.  Failing that, can anybody please recommend an alternate FME method to split out our <personnel-id> substrings?

 

 

  

 

 

 

 

Hi,

 

 

Unfortunately, the StringSearcher does not seem to support a parameter like "global match".

 

If the string always has 3 parts and each of them consists of one or more alphabetical characters followed by digits, the following expression would be effective: ^([A-Z]+[0-9]+)([A-Z]+[0-9]+)([A-Z]+[0-9]+)$   This is a simplified example; you can consider a stricter expression if necessary.
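To see how an anchored three-group expression of this kind behaves, here is a small Python sketch (outside FME, so the StringSearcher itself may differ in details). The function name split_three_parts is only illustrative.

```python
import re

# Three capture groups, each "letters then digits"; fullmatch anchors both ends.
THREE_PARTS = re.compile(r"([A-Z]+[0-9]+)([A-Z]+[0-9]+)([A-Z]+[0-9]+)")


def split_three_parts(text):
    """Return the three parts as a tuple, or None if the string does not fit."""
    m = THREE_PARTS.fullmatch(text)
    return m.groups() if m else None


sample = "F29920110716104845F37920120929203346TC20080508184800"
parts = split_three_parts(sample)
```

Note that a string with only two parts (or four) does not match at all, which is the main limitation of this approach.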

 

Hope this helps.

 

 

Takashi
Hi,

 

 

I think it would be quite easy to implement this using a PythonCaller and the re.findall() method. Give us a word if you need more details.
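A minimal sketch of the re.findall() idea, written as plain Python so it can be tried outside FME. The function name and the pattern are assumptions based on the format given in the question (a capital letter, 1-4 further letters or digits, then 14 digits); wiring it into a PythonCaller (reading the attribute and writing the results back to the feature) is not shown here.

```python
import re

# Assumption: operator id = capital letter + 1-4 letters/digits, then 14 digits.
PERSONNEL_ID = re.compile(r"[A-Z][A-Z0-9]{1,4}\d{14}")


def find_personnel_ids(text):
    """Return every non-overlapping <personnel-id> found in text, in order."""
    # findall plays the role of a "global match": it keeps scanning after
    # the first hit instead of stopping there.
    return PERSONNEL_ID.findall(text)


sample = "F29920110716104845F37920120929203346TC20080508184800"
ids = find_personnel_ids(sample)
```

Unlike a fixed three-group expression, this works for any number of ids in the string.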

 

 

David
You can use a regular expression in a StringReplacer to search for the alpha characters, replace those with a comma plus the matched characters, and then use an AttributeSplitter to split at the inserted commas.
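The same replace-then-split idea, sketched in Python to show what the two steps do to the string. This only illustrates the logic; the function name is made up, and it assumes the only letters in the string are the ones that start each id (as in the sample) and that the data never contains a comma.

```python
import re


def split_by_inserted_delimiter(text, delimiter=","):
    """Insert a delimiter before each run of letters, then split on it."""
    # Step 1 (StringReplacer equivalent): "F299..." becomes ",F299..."
    marked = re.sub(r"([A-Z]+)", delimiter + r"\1", text)
    # Step 2 (AttributeSplitter equivalent): split, dropping the empty
    # first element caused by the leading delimiter.
    return [part for part in marked.split(delimiter) if part]


sample = "F29920110716104845F37920120929203346TC20080508184800"
parts = split_by_inserted_delimiter(sample)
```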

 

 

 


Use the RegularExpressionMatcher from the FME Store.  Connect it to a ListExploder set to read List Attribute: REM_matched_parts{}.  I used the following regular expression: (\w{1,4}\d{13,17})
If the string pattern is consistent then try AttributeSplitter with a format string 18s18s16s
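For comparison, a fixed-width split like 18s18s16s amounts to cutting the string at known offsets. A rough Python sketch of that logic (the function name is illustrative, not an FME API):

```python
def split_fixed_width(text, widths=(18, 18, 16)):
    """Cut text into consecutive pieces of the given widths."""
    if len(text) != sum(widths):
        # Fixed-width splitting only makes sense if every string has this layout.
        raise ValueError(f"expected {sum(widths)} characters, got {len(text)}")
    pieces, start = [], 0
    for width in widths:
        pieces.append(text[start:start + width])
        start += width
    return pieces


sample = "F29920110716104845F37920120929203346TC20080508184800"
parts = split_fixed_width(sample)
```

This is the simplest option, but it breaks as soon as an operator id has a different length or the number of ids changes.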
Hi,

 

 

Use a Tcl regexp with the -inline switch and maybe the -all switch.

 

This will give u all matches.

 

Like this in an AttributeCreator:

 

@Evaluate([regexp -all -inline {your expression} "@Value(Object)"])

 

 

If u know the amount of hits, i.e. 3  u can also do

 

@Evaluate([regexp -all {your expression matchedparts Match1 Match2 Match3} "@Value(Object)"])

 

It will then write them to variables named Match1 etc.

 

U can then assign those variables to attributes.

 

 

Check www.tcl.tk/man/tcl8.4/TclCmd, very powerful.
...sorry

 

 

@Evaluate([regexp -all {your expression matchedparts Match1 Match2 Match3} "@Value(Object)"])

 

 

should be

 

@Evaluate([regexp -all {your expression} "@Value(Object)" matchedparts Match1 Match2 Match3])

 

 

 

 

too many copy 'n pasting 🙂
Hi, me again,

 

 

I changed ur regexp to ([A-Z]{1,4}[0-9]{14,17})(?=[A-Z]|$) so i get just 3 matches. Yours gives a space match as well 3x.

 

 

In an AttributeCreator, arithmetic editor:

 

[regexp -all -inline {([A-Z]{1,4}[0-9]{14,17})(?=[A-Z]|$)} "@Value(Object)"]

 

 

then

 

AttributeSplitter, delimiter: space.

 

listduplicateremover.

 

listexploder.

 

 

U should now have 3 attributes with separate <personnel-ids>
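To show why the duplicate-removal step is in that chain, here is a rough Python imitation of the steps. It assumes, as the Tcl regexp documentation describes for -all -inline, that each match contributes the overall match plus one element per capture group to a flat list, so with one pair of parentheses every id shows up twice. The function names are made up for illustration.

```python
import re

PATTERN = re.compile(r"([A-Z]{1,4}[0-9]{14,17})(?=[A-Z]|$)")


def inline_all(text):
    """Imitate a flat 'regexp -all -inline' result as a space-separated string."""
    flat = []
    for m in PATTERN.finditer(text):
        flat.append(m.group(0))  # overall match
        flat.append(m.group(1))  # capture group (same text here)
    return " ".join(flat)


def split_and_dedupe(joined):
    """AttributeSplitter on space, then drop duplicates while keeping order."""
    return list(dict.fromkeys(joined.split(" ")))


sample = "F29920110716104845F37920120929203346TC20080508184800"
joined = inline_all(sample)
ids = split_and_dedupe(joined)
```

One caveat of removing duplicates this way: if the same <personnel-id> genuinely occurs twice in a string, it would be collapsed to one.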

 

 

greetings

 

 

 

