Question

split attribute with delimiter but also ignore the same delimiter

Forum|Forum|6 years ago
February 18, 2019
6 replies
191 views

rva1
Contributor

Hello,

I'm trying to split an attribute into a list with the AttributeSplitter, using a delimiter.

The problem is that the same delimiter must sometimes be ignored.

The attribute I'm trying to split is:

'3Ku_PFwo2YcPZnv0WeDVdW',#25,$,$,#113, (#13017,#14678,#19207,#20344,#20598,#20727,#20858)

The list values should be:

0: 3Ku_PFwo2YcPZnv0WeDVdW

1: #25

2: $

3: $

4: #113

5: #13017,#14678,#19207,#20344,#20598,#20727,#20858

When using the delimiter "," the values in the last element are split as well...

Another problem is that the format of the attribute differs each time. So sometimes it could be: $,$,$,($,$,$,),$,$,($,$,$),$ or $,$,$,$ or ($,$,$),($,$,$),$,($) etc...

The order of the list must remain the same as the original order.

Any suggestion would be more than welcome;)

cheers,

ronald



redgeographics
Celebrity
February 18, 2019

Well... what you could try is to use a regex to find anything between ( and ) and store that in a substring, replace the , in that substring with something else, put it all back into the main string, split that, and then replace the something else with a , again (although if the format is dynamic that might be difficult too).
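For anyone who wants to see that idea outside a workspace, here is a minimal Python sketch of the mask, split, restore sequence. The function name and the placeholder character are hypothetical, and the sketch assumes the placeholder never occurs in the data and that the parentheses are not nested.

```python
import re

# Hypothetical placeholder; assumed never to occur in the real data.
PLACEHOLDER = "\x00"

def split_masking_parens(value, delimiter=","):
    """Split on delimiter, ignoring delimiters inside one level of (...)."""
    # 1. Mask the delimiters inside each parenthesised group.
    masked = re.sub(
        r"\([^()]*\)",
        lambda m: m.group(0).replace(delimiter, PLACEHOLDER),
        value,
    )
    # 2. Split on the remaining (top-level) delimiters.
    parts = masked.split(delimiter)
    # 3. Put the masked delimiters back and trim surrounding whitespace.
    return [p.replace(PLACEHOLDER, delimiter).strip() for p in parts]

sample = "'3Ku_PFwo2YcPZnv0WeDVdW',#25,$,$,#113, (#13017,#14678,#19207,#20344,#20598,#20727,#20858)"
for i, val in enumerate(split_masking_parens(sample)):
    print("{}: {}".format(i, val))
```

The order of the parts is preserved because nothing is moved, only masked. The quotes and parentheses are still part of the values and would need a separate clean-up step.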

FME rocks! \m/


itay
Supporter
February 18, 2019

Hi @rva1,

The fact that the attribute value will change makes it quite difficult to come up with a completely automated solution.

There are many ways to go about it, I would try the following:

1. Search for everything in between the parentheses and use that to erase it from the original value.
2. Remove the parentheses from the substring.
3. Split the remaining value and use the substring.
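A rough Python sketch of these steps, under a few assumptions: the function name is hypothetical, the parentheses are not nested, and instead of erasing the group completely a numbered token (built from a null character assumed absent from the data) is left behind so the original order can be kept.

```python
import re

def split_with_extracted_groups(value):
    """Pull out (...) groups, split the remainder, then put the groups back."""
    groups = []

    def stash(match):
        # Keep the group's content without its parentheses and leave a
        # numbered token in its place (hypothetical token format).
        groups.append(match.group(1))
        return "\x00{}\x00".format(len(groups) - 1)

    remainder = re.sub(r"\(([^()]*)\)", stash, value)

    result = []
    for part in remainder.split(","):
        part = part.strip()
        token = re.fullmatch(r"\x00(\d+)\x00", part)
        # Swap a token back for the stored group, otherwise keep the part.
        result.append(groups[int(token.group(1))] if token else part)
    return result

print(split_with_extracted_groups("$,$,($,$,$),$"))
```

Because the parentheses are dropped while stashing, the list elements come back without brackets.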

Hope this helps,

Itay


rva1
Author
Contributor
February 19, 2019

It's been a bit of work finding the right regex... but it seems \$#(.*?)\$\\)|\$#(.*?)\$ is working...

The problem was that sometimes I also had to deal with nested cases like $,$,($,$,$),$,($,($)).... ;)

So I'm using a pipe (|) in the regex to match both.

I also used the All Matches List Name parameter to get all results... hope it's watertight...
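For the nested case, an alternative to a regex is to walk the string once and keep track of the bracket depth, splitting only at depth zero. This is a minimal sketch, not what the workspace above does; the function name and parameters are hypothetical.

```python
def split_top_level(value, delimiter=",", opener="(", closer=")"):
    """Split on delimiter only where the bracket depth is zero."""
    parts = []
    current = []
    depth = 0
    for ch in value:
        if ch == opener:
            depth += 1
        elif ch == closer and depth > 0:
            # A stray closing bracket at depth zero is ignored.
            depth -= 1
        if ch == delimiter and depth == 0:
            # Top-level delimiter: close the current part.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts

print(split_top_level("$,$,($,$,$),$,($,($))"))
```

The depth counter handles any nesting level, which a single lookahead pattern generally cannot do.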


ebygomm
Influencer
February 20, 2019

I'd probably use a StringReplacer with a regex to replace all commas outside the brackets with another character that's not going to cause a conflict elsewhere, then use that character to split the string in an AttributeSplitter.

,\s*(?![^()]*\))

Or use Python to split at commas outside the brackets:

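import fme
import fmeobjects
import re

# Template Function interface:
# When using this function, make sure its name is set as the value of
# the 'Class or Function to Process Features' transformer parameter
def splitString(feature):
    # Read the attribute to split; leave the feature untouched if it is missing.
    string = feature.getAttribute('string')
    if string is None:
        return
    # Split at commas that are not followed by a closing bracket,
    # i.e. commas outside (non-nested) brackets.
    split_string = re.split(r',\s*(?![^()]*\))', string)
    # Store the parts as list elements string{0}.split, string{1}.split, ...
    for i, val in enumerate(split_string):
        feature.setAttribute('string{%d}.split' % i, val)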

You'd still need to remove the brackets themselves in either option
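As a rough illustration of that last clean-up step, here is a standalone Python sketch (no FME objects needed) that applies the same regex to the sample value from the question and then strips the brackets. The function name is hypothetical, and stripping the single quotes around the first value is an assumption based on the desired output in the question.

```python
import re

# Same pattern as above: a comma (plus trailing whitespace) that is not
# followed by a closing bracket before any other bracket.
TOP_LEVEL_COMMA = re.compile(r",\s*(?![^()]*\))")

def split_and_clean(value):
    cleaned = []
    for part in TOP_LEVEL_COMMA.split(value):
        part = part.strip()
        # Drop one pair of enclosing brackets, if present.
        if part.startswith("(") and part.endswith(")"):
            part = part[1:-1]
        # Assumption: surrounding single quotes are unwanted as well.
        cleaned.append(part.strip("'"))
    return cleaned

sample = "'3Ku_PFwo2YcPZnv0WeDVdW',#25,$,$,#113, (#13017,#14678,#19207,#20344,#20598,#20727,#20858)"
for i, val in enumerate(split_and_clean(sample)):
    print("{}: {}".format(i, val))
```

Note that the lookahead only looks for the next bracket, so it does not cope with nested brackets such as ($,($)); a comma inside the outer pair but before the inner pair would still be treated as a split point.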


rva1
Author
Contributor
February 20, 2019

ahh very nice!

this helps with some other issues as well

tnx!


markatsafe
February 22, 2019

@rva1 Leveraging @ebygomm's great regular expression to replace the comma delimiter with something else, e.g. a pipe (|) character, here's an equivalent workspace (2018.1) for those of us not so comfortable in Python!

stringsplitter.zip
