
Question

Discarding a feature if all values in a list attribute are in any other feature

5 years ago
October 17, 2019
6 replies
6 views

+11

dms2
Contributor
46 replies

I have a number of features which have an attribute named ID and a list attribute named TIPS{}, like this:

ID   TIPS{}
1    TIPS{0}.trip = B
     TIPS{1}.trip = E
     TIPS{2}.trip = D
     TIPS{3}.trip = C
2    TIPS{0}.trip = E
     TIPS{1}.trip = C
     TIPS{2}.trip = D
3    TIPS{0}.trip = A
     TIPS{1}.trip = C
     TIPS{2}.trip = D
     TIPS{3}.trip = E
4    TIPS{0}.trip = B
     TIPS{1}.trip = A

For my purpose some features are redundant because all their trip values are already in another feature. So for every feature I want to check whether all of its trip values appear in the trips of some other single feature, and if so, delete the feature.

In my sample list, feature ID=2 should be deleted as trips C, D and E are also in features ID=1 and ID=3. The resulting features should be:

ID   TIPS{}
1    TIPS{0}.trip = B
     TIPS{1}.trip = E
     TIPS{2}.trip = D
     TIPS{3}.trip = C
3    TIPS{0}.trip = A
     TIPS{1}.trip = C
     TIPS{2}.trip = D
     TIPS{3}.trip = E
4    TIPS{0}.trip = B
     TIPS{1}.trip = A

I tried the following: explode the list, aggregate features concatenating 'trip' attribute with commas, and use a PythonCaller to build a list of features with concatenated trips converted to a python list (i.e. a list of lists) and then use a loop inside a loop to check if all trips of a feature are in any other feature.

The Python part is taking me a while, and I'm guessing there's a simpler or better way of doing it. Maybe an advanced use of an existing list transformer? Using regular expressions? I don't know. Any ideas?

+31

ebygomm
Influencer
3234 replies
5 years ago
October 17, 2019

Why is ID 4 kept when B and A are both in other features?

+16

itay
Supporter
1439 replies
5 years ago
October 17, 2019

Have a look at the ListBasedFeatureMerger; it may be what you are looking for.

+11

dms2
Author
Contributor
46 replies
5 years ago
October 17, 2019

ebygomm wrote:

Why is ID 4 kept when B and A are both in other features?

@ebygomm It's because they BOTH must be in the same other feature. A is in ID=3 and B is in ID=1, but there's no feature containing both values.
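The rule can be expressed as a subset test between sets. Below is a minimal sketch in plain Python, outside FME, using the sample data from the question. The function name `redundant_ids` and the dictionary layout are made up for illustration.

```python
# Sample data from the question: feature ID -> set of trip values.
trips_by_id = {
    1: {"B", "E", "D", "C"},
    2: {"E", "C", "D"},
    3: {"A", "C", "D", "E"},
    4: {"B", "A"},
}


def redundant_ids(trips_by_id):
    """Return the IDs whose trips are all contained in one other single feature."""
    return sorted(
        fid
        for fid, trips in trips_by_id.items()
        if any(
            trips <= other_trips  # subset test against one feature at a time
            for other_id, other_trips in trips_by_id.items()
            if other_id != fid
        )
    )


print(redundant_ids(trips_by_id))  # [2]
```

ID=4 is not reported because neither ID=1 nor ID=3 contains both A and B on its own; the check never combines trips from several features.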

+11

dms2
Author
Contributor
46 replies
5 years ago
October 17, 2019

itay wrote:

Have a look at the ListBasedFeatureMerger; it may be what you are looking for.

I did but can't figure out how to make it do what I want.

+31

ebygomm
Influencer
3234 replies
5 years ago
October 17, 2019

dms2 wrote:

@ebygomm It's because they BOTH must be in the same other feature. A is in ID=3 and B is in ID=1, but there's no feature containing both values.

I think Python is going to be the solution here. Are there ever going to be cases where the lists are identical?

+31

ebygomm
Influencer
3234 replies
5 years ago
October 17, 2019

This might get you part way there. It outputs the ID of any feature whose list values are all present in another single feature. You could then use a FeatureMerger against the original features and keep only the ones that are unmerged. (If features can have identical lists, it won't handle them correctly!)

import fme
import fmeobjects


class FeatureProcessor(object):
    def __init__(self):
        # Maps each feature ID to the set of its trip values.
        self.trips = {}

    def input(self, feature):
        feature_id = feature.getAttribute('ID')
        # Asking for 'TIPS{}.trip' is expected to return the trip value
        # of every list element as a Python list.
        values = feature.getAttribute('TIPS{}.trip')
        if values:
            self.trips.setdefault(feature_id, set()).update(values)

    def close(self):
        # An ID is discarded when all of its trips are present in one
        # other feature. Features with identical lists discard each other.
        discard = set()
        for x, trips_x in self.trips.items():
            for y, trips_y in self.trips.items():
                if x != y and trips_y <= trips_x:
                    discard.add(y)
        # Output one feature per discarded ID.
        for val in discard:
            new_feature = fmeobjects.FMEFeature()
            new_feature.setAttribute('ID', val)
            self.pyoutput(new_feature)

discard_list_in_list.fmw

Disclaimer: python people might be able to suggest a better method
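For the identical-lists case, one possible variation is to treat exact duplicates separately from proper subsets. This is only a sketch in plain Python, not tested inside a PythonCaller. The function name `ids_to_discard`, the feature ID=5, and the tie-break rule (keep the smaller ID among identical lists) are assumptions made for illustration, not something stated in the thread.

```python
def ids_to_discard(trips_by_id):
    """Return IDs to drop, keeping one representative of each identical trip set.

    Assumption (not from the thread): when two features have exactly the same
    trips, the one with the smaller ID is kept.
    """
    discard = set()
    for fid, trips in trips_by_id.items():
        for other_id, other_trips in trips_by_id.items():
            if other_id == fid:
                continue
            if trips < other_trips:
                # Proper subset: fid adds nothing beyond other_id.
                discard.add(fid)
            elif trips == other_trips and other_id < fid:
                # Identical sets: drop only the later duplicate.
                discard.add(fid)
    return sorted(discard)


sample = {
    1: {"B", "E", "D", "C"},
    2: {"E", "C", "D"},
    3: {"A", "C", "D", "E"},
    4: {"B", "A"},
    5: {"A", "B"},  # hypothetical duplicate of ID 4, added for illustration
}
print(ids_to_discard(sample))  # [2, 5]
```

With this rule ID=4 survives and only its duplicate is dropped, whereas a plain subset test in both directions would discard both.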
