Question

Attribute Manipulation : Splitting a list at discontinuities

  • February 27, 2017
  • 8 replies
  • 14 views


How do I retrieve the lower and upper bounds of each consecutive run in a discontinuous list of (alphanumeric) values?

Example of list: {5, 6, 7, 8, 10a, 11, 15, 16x, 17, 18, 19, 20, 25}

The extraction I'm looking for would be: 5-8, 10a-11, 15-20, 25

Thank you



david_r
Celebrity
  • February 27, 2017

I'm sure this can be done using transformers, but I suspect it'll be rather convoluted. Here's a possible solution using a PythonCaller:

import fmeobjects
from itertools import groupby

def str2int(s):
    # Keep only the digits of a value, e.g. '10a' -> 10.
    return int(''.join([x for x in s if x.isdigit()]))

def split_list_at_bounds(feature):
    data = feature.getAttribute('values')
    if data:
        # Turn "{5, 6, 10a}" into ['5', '6', '10a'].
        data = [x.strip() for x in data.replace('{', '').replace('}', '').split(',')]
        bounds = []
        # Consecutive values share the same (index - number) difference,
        # so grouping on that difference splits the list at each gap.
        # A one-argument lambda is used because "lambda (i, x)" fails on Python 3.
        for k, g in groupby(enumerate(data), lambda pair: pair[0] - str2int(pair[1])):
            bounds.append([x for i, x in g])
        for n, boundary in enumerate(bounds):
            if len(boundary) == 1:
                s = boundary[0]
            else:
                s = '-'.join([boundary[0], boundary[-1]])
            feature.setAttribute('boundary{%s}' % n, s)

Assuming the input attribute values = "{5, 6, 7, 8, 10a, 11, 15, 16x, 17, 18, 19, 20, 25}" it will output list boundary{} like the following:

[Screenshot of the resulting boundary{} list]
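
For anyone who wants to check the grouping logic outside FME first, here is a minimal plain-Python sketch of the same idea. The helper name ranges_from_string is hypothetical (it is not part of the PythonCaller script), and it returns a plain list instead of setting feature attributes.

```python
from itertools import groupby

def str2int(s):
    # Keep only the digit characters, as in the PythonCaller script.
    return int(''.join(ch for ch in s if ch.isdigit()))

def ranges_from_string(text):
    # Hypothetical helper: parse "{5, 6, 10a}" style text into range labels.
    items = [x.strip() for x in text.strip('{} ').split(',') if x.strip()]
    labels = []
    # Consecutive numbers share the same (position - number) difference,
    # so each change in that difference marks a discontinuity.
    for _, group in groupby(enumerate(items), lambda pair: pair[0] - str2int(pair[1])):
        run = [item for _, item in group]
        labels.append(run[0] if len(run) == 1 else '-'.join([run[0], run[-1]]))
    return labels

print(ranges_from_string("{5, 6, 7, 8, 10a, 11, 15, 16x, 17, 18, 19, 20, 25}"))
# ['5-8', '10a-11', '15-20', '25']
```

Note that this assumes the values are already sorted and that every value contains at least one digit; a value without digits would make int() raise a ValueError.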

The grouping relies on str2int(), which strips away all non-digits, so be careful if you have values like 12x3: it will be interpreted as 123 and not 12.
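
If values like 12x3 can occur and should count as 12, one possible workaround is to read only the leading digits. The sketch below is an assumption about what is wanted, not part of the original script; leading_int is a hypothetical name, and all_digits_int simply repeats the behaviour of str2int() for comparison.

```python
import re

def all_digits_int(s):
    # Same behaviour as str2int(): every digit is kept, wherever it appears.
    return int(''.join(ch for ch in s if ch.isdigit()))

def leading_int(s):
    # Hypothetical alternative: use only the digits at the start of the value.
    match = re.match(r'\s*(\d+)', s)
    if match is None:
        raise ValueError('no leading digits in %r' % s)
    return int(match.group(1))

print(all_digits_int('12x3'), leading_int('12x3'))
# 123 12
```

Swapping leading_int in for str2int in the groupby key should be enough, provided every value starts with its number.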


ebygomm
Influencer
  • February 27, 2017

takashi
Celebrity
  • February 28, 2017

A geometric approach could also be an alternative.


  • Author
  • February 28, 2017

I am admiring the flexibility of this tool and the genius of its users.

Thank you


  • Author
  • February 28, 2017

I need a little time to test all these solutions. Thank you


david_r
Celebrity
  • February 28, 2017

I need a little time to test all these solutions. Thank you

Performance-wise, I would expect the Python solution to be the fastest by quite a margin, followed by the solution from @egomm. The suggestion from @takashi is really cool and a great demonstration of the flexibility of FME, but I suspect it is relatively slow if you have a lot of data.


david_r
Celebrity
  • February 28, 2017

A geometric approach could also be an alternative.

Very cool.

takashi
Celebrity
  • February 28, 2017

I need a little time to test all these solutions. Thank you

The geometric approach is interesting and demonstrates the flexibility of FME, but its performance is not good, as @david_r pointed out. If performance is critical, I would not recommend using the geometric method in a production workspace; consider adopting the Python solution instead.