Solved

One attribute to multiple using regEx

16 October 2017
7 replies
29 views

fariyafarhad
62 replies

Hello,

I am having difficulty finding something similar to this in the Knowledge Center so I thought I would ask.

I have an attribute whose value looks like this (it is a full line from a badly formatted text file).

Ticket No: 4653 Nearest Intersection: Chinton Street Seq No: 34

I need it split into a list of attribute names and a corresponding list of values.

Eg:

Attribute_1 = Ticket No

Value_1 = 4653

Attribute_2 = Nearest Intersection

Value_2 = Chinton Street

Attribute_3 = Seq No

Value_3 = 34

It doesn't have to be a "list" as long as I can separate those parts of the text. I am guessing there should be a RegEx way of doing this.

The attribute names "Ticket No", "Nearest Intersection" and "Seq No" will always remain the same; only their values change. I am trying to build a script that always separates them.

Any suggestions? Thank you!

Addition:

For the first one, if there was a way to extract between 'Ticket No:' and 'Nearest' that would be fine. I can work with the formatting of the value afterwards.

I have 19 lines with the same formatting problems, with 2-3 attributes per line. I am trying to avoid too many transformers per line. If I could use an AttributeCreator where I create the new attribute and its value is the substring between two known attribute names on either side, that would be great.

Best answer by fariyafarhad 16 October 2017, 21:21

dustin
Influencer
589 replies
6 years ago
16 October 2017

Here is my first thought. In the StringConcatenator, I put in 4 elements from the list to account for 4 separate 'words' in the intersection name, e.g. Huntsville Browns Ferry Road. You can add more list elements there if you think you might get longer names.

Hello @cartoscro

I appreciate your answer. That seems like a good way to do it. What I have though are many different lines with the same formatting issue. Using so many transformers per line may be a bit too much. I am trying to see if there is an easier Regex way of doing it.

dustin
Influencer
589 replies
6 years ago
16 October 2017

@fariyafarhad There likely is a cleaner way to do it with Regex. FME is very efficient in its text string processing, so I have always defaulted to using multiple transformers.

dustin
Influencer
589 replies
6 years ago
16 October 2017

Slightly smaller workflow:

@cartoscro I like the shorter workflow, though I think I stumbled upon an answer myself. Thanks!

fariyafarhad
Author
62 replies
6 years ago
16 October 2017
Best Answer

I decided to use a StringSearcher.

So for that one line, I set up three StringSearchers in series. If the string is matched, the transformer automatically puts the matching text in an attribute, and I specified the attribute name. The first StringSearcher is set up this way.

This means one transformer per attribute. If anyone has a better suggestion, it is still welcome :)
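
For anyone who wants to try the one-pattern-per-attribute idea outside FME, here is a minimal Python sketch. It is not the StringSearcher configuration from this post (that screenshot is not reproduced here); the patterns, the `patterns` dictionary and the `extract_in_series` function are assumptions made up for illustration. Each pattern captures the text between one known label and the next, much like one StringSearcher per attribute.

```python
import re

line = "Ticket No: 4653 Nearest Intersection: Chinton Street Seq No: 34"

# Hypothetical patterns: one per output attribute, each capturing the text
# between a known label and the next known label (or the end of the line).
patterns = {
    "Ticket No": r"Ticket No\s*:\s*(.*?)\s+Nearest Intersection\s*:",
    "Nearest Intersection": r"Nearest Intersection\s*:\s*(.*?)\s+Seq No\s*:",
    "Seq No": r"Seq No\s*:\s*(.*?)\s*$",
}

def extract_in_series(text):
    """Apply each pattern in turn and collect the captured values."""
    result = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:  # skip attributes whose label is missing from the line
            result[name] = match.group(1)
    return result

values = extract_in_series(line)
print(values)
```

Because every pattern is anchored on fixed labels, a value with several words (such as a long street name) should still come out whole, at the cost of one pattern per attribute.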

takashi
Contributor
7538 replies
6 years ago
16 October 2017

Hi @fariyafarhad, why not use the StringSearcher with this regex?

^Ticket No\s*:\s*(.*)\s+Nearest Intersection\s*:\s*(.*)\s+Seq No\s*:\s*(.*)$

And set the Subexpression Matches List Name (e.g. _sub). You can then rename "_sub{0}.part" to "Ticket No", and so on.
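
To check the expression outside FME, the same regex can be run through Python's `re` module. This is only a sketch: the `split_line` function and the mapping of capture groups to names are assumptions for illustration, standing in for the renaming of the `_sub{}` list elements described above.

```python
import re

# The regex from this reply, unchanged: three capture groups, one per value.
PATTERN = re.compile(
    r"^Ticket No\s*:\s*(.*)\s+Nearest Intersection\s*:\s*(.*)\s+Seq No\s*:\s*(.*)$"
)

# Hypothetical helper: pairs each capture group with its attribute name.
def split_line(text):
    match = PATTERN.match(text)
    if match is None:
        return None  # the line does not follow the expected layout
    names = ("Ticket No", "Nearest Intersection", "Seq No")
    return dict(zip(names, match.groups()))

line = "Ticket No: 4653 Nearest Intersection: Chinton Street Seq No: 34"
print(split_line(line))
```

Note that the last group, `(.*)`, keeps any trailing whitespace on the line, so the final value may need trimming if the source file is untidy.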
