Question

Discarding a feature if all values in a list attribute are in any other feature


dms2
Contributor
  • Contributor

I have a number of features, each with an attribute named ID and a list attribute named TIPS{}, like this:

ID    TIPS{}
1     TIPS{0}.trip = B
      TIPS{1}.trip = E
      TIPS{2}.trip = D
      TIPS{3}.trip = C
2     TIPS{0}.trip = E
      TIPS{1}.trip = C
      TIPS{2}.trip = D
3     TIPS{0}.trip = A
      TIPS{1}.trip = C
      TIPS{2}.trip = D
      TIPS{3}.trip = E
4     TIPS{0}.trip = B
      TIPS{1}.trip = A

For my purpose some features are redundant because all their trip values are already contained in another feature. So for every feature I want to check whether all of its trip values appear together in the trips of any single other feature and, if so, delete the feature.

In my sample list, feature ID=2 should be deleted as trips C, D and E are also in features ID=1 and ID=3. The resulting features should be:

ID    TIPS{}
1     TIPS{0}.trip = B
      TIPS{1}.trip = E
      TIPS{2}.trip = D
      TIPS{3}.trip = C
3     TIPS{0}.trip = A
      TIPS{1}.trip = C
      TIPS{2}.trip = D
      TIPS{3}.trip = E
4     TIPS{0}.trip = B
      TIPS{1}.trip = A

I tried the following: explode the list, aggregate the features while concatenating the 'trip' attribute with commas, and use a PythonCaller to build a list of features with the concatenated trips converted to a Python list (i.e. a list of lists). A loop inside a loop then checks whether all trips of a feature are in any other feature.

The Python part is taking me time and I'm guessing there's a simpler or better way of doing it. Maybe an advanced use of an existing list transformer? Regular expressions? I don't know. Any ideas?

 

 

6 replies

ebygomm
Influencer
  • Influencer
  • October 17, 2019

Why is ID 4 kept when B and A are both in other features?


itay
Supporter
  • Supporter
  • October 17, 2019

Have a look at the ListBasedFeatureMerger; it may be what you are looking for.


dms2
Contributor
  • Author
  • Contributor
  • October 17, 2019
ebygomm wrote:

Why is ID 4 kept when B and A are both in other features?

@ebygomm It's because they BOTH must be in the same other feature. A is in ID=3 and B is in ID=1, but there's no single feature containing both values.
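
To make the rule concrete, here is a minimal Python sketch run against the sample data from the question. It is only an illustration: the trips are modelled as plain Python sets rather than FME list attributes, and the helper name `covering_ids` is hypothetical.

```python
# Sample data from the question: feature ID -> set of trip values.
trips = {
    1: {"B", "E", "D", "C"},
    2: {"E", "C", "D"},
    3: {"A", "C", "D", "E"},
    4: {"B", "A"},
}

def covering_ids(fid, trips):
    """Return the IDs of the other features that contain ALL trips of fid."""
    return sorted(
        other
        for other, values in trips.items()
        if other != fid and trips[fid] <= values  # <= is the subset test
    )

for fid in trips:
    print(fid, covering_ids(fid, trips))
```

Only ID=2 gets a non-empty result (it is covered by both ID=1 and ID=3). ID=4 gets an empty one, because no single feature holds both A and B.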


dms2
Contributor
  • Author
  • Contributor
  • October 17, 2019
itay wrote:

Have a look at the ListBasedFeatureMerger; it may be what you are looking for.

I did, but I can't figure out how to make it do what I want.


ebygomm
Influencer
  • Influencer
  • October 17, 2019
dms2 wrote:

@ebygomm It's because they BOTH must be in the same other feature. A is in ID=3 and B is in ID=1, but there's no single feature containing both values.

I think Python is going to be the solution here. Are there ever going to be cases where the lists are identical?


ebygomm
Influencer
  • Influencer
  • October 17, 2019

This might get you part of the way there. It outputs the ID of any feature whose list values are all present in another single feature. You could then use a FeatureMerger with the original features to keep only the features that are unmerged. (If features can have identical lists, it won't handle them correctly: each one is contained in the other, so both are flagged!)

import fme
import fmeobjects

class FeatureProcessor(object):
    def __init__(self):
        # (ID, trip) pairs collected from every incoming feature
        self.trips = []

    def input(self, feature):
        id = feature.getAttribute('ID')
        # The .trip values of the TIPS{} list on this feature
        trip = feature.getAttribute('TIPS{}.trip')
        for i in trip:
            self.trips.append((id, i))

    def close(self):
        ids = list(set([i[0] for i in self.trips]))
        discard = []
        for x in ids:
            list1 = [i[1] for i in self.trips if i[0] == x]
            for y in ids:
                if x == y:
                    pass
                else:
                    list2 = [i[1] for i in self.trips if i[0] == y]
                    # y is redundant if every one of its trips is also in x
                    result = all(elem in list1 for elem in list2)
                    if result:
                        discard.append(y)
        # Output one feature per redundant ID
        for val in list(set(discard)):
            newFeature = fmeobjects.FMEFeature()
            newFeature.setAttribute('ID', val)
            self.pyoutput(newFeature)

discard_list_in_list.fmw

Disclaimer: python people might be able to suggest a better method
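
One possible variation, sketched in plain Python so it can be tested outside FME: compare sets instead of lists, and break ties between identical lists so that only one copy survives. The function name `redundant_ids` and the tie-break rule (keep whichever feature arrives first) are assumptions for illustration, not something the original post specifies. Inside a PythonCaller, the dictionary would be filled in `input()` and the function called from `close()`.

```python
def redundant_ids(trips_by_id):
    """Return the IDs whose trips are all contained in another feature.

    Assumption: when two features have identical trip sets, the one that
    appears first in trips_by_id is kept and the later ones are discarded.
    """
    sets = {fid: set(values) for fid, values in trips_by_id.items()}
    order = {fid: pos for pos, fid in enumerate(sets)}  # arrival order
    discard = []
    for fid, own in sets.items():
        for other, theirs in sets.items():
            if other == fid:
                continue
            # Proper subset, or an identical set that arrived earlier
            if own < theirs or (own == theirs and order[other] < order[fid]):
                discard.append(fid)
                break
    return discard


sample = {
    1: ["B", "E", "D", "C"],
    2: ["E", "C", "D"],
    3: ["A", "C", "D", "E"],
    4: ["B", "A"],
}
print(redundant_ids(sample))
```

With the sample data from the question, only ID=2 is reported. The comparison is still pairwise, so the cost grows quadratically with the number of features, as in the code above.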

