Solved

One attribute to multiple using regEx

  • October 16, 2017
  • 7 replies
  • 122 views

Hello,

I am having difficulty finding something similar to this in the Knowledge Center so I thought I would ask.

I have an attribute whose value looks like this (it is a full line from a badly formatted text file).

Ticket No: 4653 Nearest Intersection: Chinton Street Seq No: 34

I need it split into a list of attribute names and a matching list of values.

E.g.:

Attribute_1: Ticket No

Value_1: 4653

Attribute_2: Nearest Intersection

Value_2: Chinton Street

Attribute_3: Seq No

Value_3: 34

It doesn't have to be a "list" as long as I can separate those parts of the text. I am guessing there should be a RegEx way of doing this.

The attribute names "Ticket No", "Nearest Intersection" and "Seq No" always stay the same; only their values change. I am trying to build a script that always separates them.

Any suggestions? Thank you!

Addition:

For the first one, if there was a way to extract between 'Ticket No:' and 'Nearest' that would be fine. I can work with the formatting of the value afterwards.

I have 19 lines with the same formatting problem, with 2-3 attributes per line. I am trying to avoid too many transformers per line. If I could use an AttributeCreator where I create the new attribute and its value is the substring between two known attribute names on either side, that would be great.
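
Outside FME, the "substring between two known attribute names" idea can be sketched in Python roughly as follows. This is only an illustration: the helper name `split_labeled` is hypothetical, and it assumes every label is followed by a colon and that no label text appears inside a value.

```python
import re

def split_labeled(line, labels):
    """Return {label: value}, where each value is the text between one
    known label and the next known label (or the end of the line)."""
    # Alternation of the known labels, escaped so they match literally.
    label_pattern = "|".join(re.escape(label) for label in labels)
    pattern = re.compile(
        rf"({label_pattern})\s*:\s*(.*?)\s*(?=(?:{label_pattern})\s*:|$)"
    )
    return {m.group(1): m.group(2) for m in pattern.finditer(line)}

line = "Ticket No: 4653 Nearest Intersection: Chinton Street Seq No: 34"
fields = split_labeled(line, ["Ticket No", "Nearest Intersection", "Seq No"])
print(fields)
```

Because the labels are passed in as a list, the same helper could in principle be reused for each of the other lines by supplying that line's own labels.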

Best answer by fariyafarhad

I decided to use a StringSearcher.

So for that one line, I set up three StringSearchers in series. If the string is matched, the transformer automatically puts the matching text in an attribute, and I specified the attribute name. The first StringSearcher is set up this way.

This means one transformer per attribute. If anyone has a better suggestion, it is still welcome :)
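
The StringSearcher settings themselves are not shown in the post, so the following Python sketch only mirrors the general idea of one search per attribute, applied in series. The three patterns are assumptions made for illustration, not the expressions actually used in the workspace.

```python
import re

line = "Ticket No: 4653 Nearest Intersection: Chinton Street Seq No: 34"

# One pattern per attribute (assumed patterns, one per "searcher").
patterns = {
    "Ticket No": r"Ticket No\s*:\s*(.*?)\s*Nearest Intersection\s*:",
    "Nearest Intersection": r"Nearest Intersection\s*:\s*(.*?)\s*Seq No\s*:",
    "Seq No": r"Seq No\s*:\s*(.*?)\s*$",
}

attributes = {}
for name, pattern in patterns.items():
    match = re.search(pattern, line)
    if match:  # only create the attribute when its pattern matched
        attributes[name] = match.group(1)

print(attributes)
```

Each search is independent of the others, so a line that lacks one of the fields would still yield the attributes whose patterns do match.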


7 replies

dustin
  • Influencer
  • 629 replies
  • October 16, 2017

Here is my first thought. In the StringConcatenator, I used 4 elements from the list to allow for up to 4 separate words in the intersection name, e.g. Huntsville Browns Ferry Road. You can add more list elements there if you expect longer names.


  • Author
  • 62 replies
  • October 16, 2017

Hello @cartoscro,

I appreciate your answer. That seems like a good way to do it. What I have, though, are many different lines with the same formatting issue. Using so many transformers per line may be a bit too much. I am trying to see if there is an easier regex way of doing it.

dustin
  • Influencer
  • 629 replies
  • October 16, 2017
@fariyafarhad There is likely a cleaner way to do it with regex. FME is very efficient at text string processing, so I have always defaulted to using multiple transformers.

 


dustin
  • Influencer
  • 629 replies
  • October 16, 2017
Slightly smaller workflow:

 


  • Author
  • 62 replies
  • October 16, 2017
@cartoscro I like the shorter workflow, though I think I stumbled upon an answer myself. Thanks!

 



takashi
Celebrity
  • 7843 replies
  • October 16, 2017

Hi @fariyafarhad, why not use the StringSearcher with this regex?

^Ticket No\s*:\s*(.*)\s+Nearest Intersection\s*:\s*(.*)\s+Seq No\s*:\s*(.*)$
And set the Subexpression Matches List Name (e.g. _sub). You can then rename "_sub{0}.part" to "Ticket No", and so on.
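
For anyone who wants to check the expression outside FME, here is a minimal Python sketch using the same regex. The function name `parse_ticket_line` is hypothetical. Note that Python numbers capture groups from 1, whereas the reply above refers to the first subexpression as `_sub{0}.part`.

```python
import re

# The regex from the reply above, unchanged.
PATTERN = re.compile(
    r"^Ticket No\s*:\s*(.*)\s+Nearest Intersection\s*:\s*(.*)\s+Seq No\s*:\s*(.*)$"
)

def parse_ticket_line(line):
    """Return (ticket_no, intersection, seq_no), or None if the line
    does not follow the expected layout."""
    match = PATTERN.match(line)
    if match is None:
        return None
    # Greedy (.*) can keep extra spaces when fields are separated by
    # more than one space, so trim each captured part.
    return tuple(part.strip() for part in match.groups())

line = "Ticket No: 4653 Nearest Intersection: Chinton Street Seq No: 34"
print(parse_ticket_line(line))
```

A single expression like this handles the whole line in one step, at the cost of requiring all three labels to be present and in this order.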