Question

Attribute Manipulation: Splitting a list at discontinuities



How do I retrieve the lower and upper bounds of each continuous run in a discontinuous list of (alphanumeric) values?

Example of list: {5, 6, 7, 8, 10a, 11, 15, 16x, 17, 18, 19, 20, 25}

The extraction I'm looking for would be: 5-8, 10a-11, 15-20, 25

Thank you

8 replies

david_r
Evangelist
  • February 27, 2017

I'm sure this can be done using transformers, but I suspect it'll be rather convoluted. Here's a possible solution using a PythonCaller:

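import fmeobjects
from itertools import groupby
from operator import itemgetter

def str2int(s):
    # Keep only the digits of a value, e.g. '10a' -> 10
    return int(''.join([x for x in s if x.isdigit()]))

def split_list_at_bounds(feature):
    data = feature.getAttribute('values')
    if data:
        # Remove the surrounding braces and split into individual values
        data = [x.strip() for x in data.replace('{', '').replace('}', '').split(',')]
        bounds = []
        # Consecutive values share the same (index - value) difference,
        # so grouping on that difference splits the list at each gap
        for k, g in groupby(enumerate(data), lambda p: p[0] - str2int(p[1])):
            bounds.append(list(map(itemgetter(1), g)))
        for n, boundary in enumerate(bounds):
            if len(boundary) == 1:
                # A run of one value is written as-is
                s = boundary[0]
            else:
                s = '-'.join([boundary[0], boundary[-1]])
            feature.setAttribute('boundary{%s}' % n, s)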

Assuming the input attribute values = "{5, 6, 7, 8, 10a, 11, 15, 16x, 17, 18, 19, 20, 25}", it will output a list attribute boundary{} like the following:

[Screenshot: the resulting boundary{} list attribute]
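For anyone who wants to try the grouping logic outside FME first, here is a minimal standalone sketch of the same index-minus-value trick. The names collapse_runs and digits_to_int are made up for this illustration and are not part of the PythonCaller script, and the sketch assumes the values are already sorted and each contains at least one digit.

```python
from itertools import groupby

def digits_to_int(s):
    # Keep only the digits of a value, e.g. '10a' -> 10 (hypothetical helper)
    return int(''.join(ch for ch in s if ch.isdigit()))

def collapse_runs(values):
    # Within a run of consecutive numbers, (index - number) stays constant,
    # so grouping on that difference starts a new group at every gap.
    runs = []
    for _, group in groupby(enumerate(values),
                            lambda p: p[0] - digits_to_int(p[1])):
        items = [value for _, value in group]
        if len(items) == 1:
            runs.append(items[0])
        else:
            runs.append(items[0] + '-' + items[-1])
    return runs

values = ['5', '6', '7', '8', '10a', '11', '15', '16x',
          '17', '18', '19', '20', '25']
print(collapse_runs(values))  # ['5-8', '10a-11', '15-20', '25']
```

The printed result matches the extraction asked for in the question.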

The grouping relies on str2int(), which strips away all non-digits, so be careful if you have values like 12x3: they will be interpreted as 123 and not 12.
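If values like 12x3 can occur and should count as 12, one possible alternative is to read only the leading digits. This is just a sketch: leading_int is a hypothetical replacement for str2int(), and it assumes every value starts with its number.

```python
import re

def leading_int(s):
    # Read only the digits at the start of the value, e.g. '12x3' -> 12
    match = re.match(r'\s*(\d+)', s)
    if match is None:
        # Assumption: a value without leading digits is treated as an error
        raise ValueError('no leading digits in %r' % s)
    return int(match.group(1))
```

Swapping it in for str2int() in the groupby key should leave the rest of the script unchanged.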


ebygomm
Influencer
  • February 27, 2017

takashi
Contributor
  • February 28, 2017

A geometric approach could also be an alternative.


  • Author
  • February 28, 2017

I am admiring the flexibility of this tool and the genius of its users.

Thank you


  • Author
  • February 28, 2017

I need a little time to test all these solutions. Thank you


david_r
Evangelist
  • February 28, 2017
40_eme wrote:

I need a little time to test all these solutions. Thank you

Performance-wise, I would expect the Python solution to be the fastest by quite a margin, followed by the solution from @egomm. The suggestion from @takashi is really cool and a great demonstration of the flexibility of FME, but I suspect it will be relatively slow if you have a lot of data.

 

 


david_r
Evangelist
  • February 28, 2017
takashi wrote:

A geometric approach could also be an alternative.

Very cool.

takashi
Contributor
  • February 28, 2017
40_eme wrote:

I need a little time to test all these solutions. Thank you

The geometric approach is interesting and demonstrates the flexibility of FME, but its performance is not good, as @david_r pointed out. If performance is critical, I would not recommend using the geometric method in a production workspace; consider adopting the Python solution instead.
