Question

Discarding a feature if all values in a list attribute are in any other feature

  • October 17, 2019
  • 6 replies
  • 20 views

dms2
Contributor

I have a number of features, each with an attribute named ID and a list attribute named TIPS{}, like this:

ID   TIPS{}
1    TIPS{0}.trip = B
     TIPS{1}.trip = E
     TIPS{2}.trip = D
     TIPS{3}.trip = C
2    TIPS{0}.trip = E
     TIPS{1}.trip = C
     TIPS{2}.trip = D
3    TIPS{0}.trip = A
     TIPS{1}.trip = C
     TIPS{2}.trip = D
     TIPS{3}.trip = E
4    TIPS{0}.trip = B
     TIPS{1}.trip = A


For my purposes, some features are redundant because all their trip values are already in other features. So for every feature I want to check whether all its trip values are in any other feature's trips and, if so, delete the feature.

In my sample list, feature ID=2 should be deleted because trips C, D and E are also in features ID=1 and ID=3. The resulting features should be:

ID   TIPS{}
1    TIPS{0}.trip = B
     TIPS{1}.trip = E
     TIPS{2}.trip = D
     TIPS{3}.trip = C
3    TIPS{0}.trip = A
     TIPS{1}.trip = C
     TIPS{2}.trip = D
     TIPS{3}.trip = E
4    TIPS{0}.trip = B
     TIPS{1}.trip = A

 

 

I tried the following: explode the list, aggregate the features while concatenating the 'trip' attribute with commas, and use a PythonCaller to convert each concatenated string into a Python list (giving a list of lists), then use a loop inside a loop to check whether all trips of a feature are in any other feature.

The Python part is taking me time, and I'm guessing there's a simpler or better way of doing it. Maybe an advanced use of an existing list transformer? Regular expressions? I don't know. Any ideas?
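
The loop-inside-a-loop check described above could be sketched in plain Python, outside FME, roughly as follows. The `aggregated` dictionary stands in for the aggregated features (one comma-concatenated string per ID), and the function name `redundant_ids` is made up for illustration. Inside a PythonCaller the same logic would presumably have to run once all features have arrived.

```python
def redundant_ids(aggregated):
    """Return the IDs whose trips are all contained in one other feature.

    `aggregated` maps a feature ID to its comma-concatenated trips
    (an assumed stand-in for the aggregated FME features).
    """
    # Convert each concatenated string into a set of trip values.
    trip_sets = {fid: set(text.split(',')) for fid, text in aggregated.items()}

    redundant = []
    for fid, trips in trip_sets.items():
        for other_id, other_trips in trip_sets.items():
            # issubset is True when every trip of `fid` is also in `other_id`.
            if fid != other_id and trips.issubset(other_trips):
                redundant.append(fid)
                break  # one containing feature is enough
    return redundant


# Sample data from the question, after exploding and aggregating.
aggregated = {1: 'B,E,D,C', 2: 'E,C,D', 3: 'A,C,D,E', 4: 'B,A'}
print(redundant_ids(aggregated))  # [2]
```

Using sets instead of lists keeps the membership test short. Note that two features with identical trips would flag each other in this sketch.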

 

 


6 replies

ebygomm
Influencer
  • Influencer
  • October 17, 2019

Why is ID 4 kept when B and A are both in other features?


itay
Supporter
  • Supporter
  • October 17, 2019

Have a look at the ListBasedFeatureMerger; it may be what you are looking for.


dms2
Contributor
  • Author
  • Contributor
  • October 17, 2019

Why is ID 4 kept when B and A are both in other features?

@egomm It's because they BOTH must be in another feature. A is in ID=3 and B is in ID=1, but there's no feature containing both values.
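
The difference between the two readings can be shown with a small Python sketch over the sample data. Both function names are hypothetical and only illustrate the two rules; they do not correspond to any FME transformer.

```python
features = {
    1: {'B', 'E', 'D', 'C'},
    2: {'E', 'C', 'D'},
    3: {'A', 'C', 'D', 'E'},
    4: {'B', 'A'},
}


def each_value_found_somewhere(fid):
    """Looser rule: every trip appears in some other feature, not necessarily the same one."""
    others = set().union(*(t for k, t in features.items() if k != fid))
    return features[fid] <= others


def all_values_in_one_feature(fid):
    """Rule used in the question: a single other feature holds all the trips."""
    return any(features[fid] <= t for k, t in features.items() if k != fid)


print(each_value_found_somewhere(4))  # True: A is in ID=3 and B is in ID=1
print(all_values_in_one_feature(4))   # False: no single feature has both
print(all_values_in_one_feature(2))   # True: ID=1 (and ID=3) contain C, D and E
```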


dms2
Contributor
  • Author
  • Contributor
  • October 17, 2019

Have a look at the ListBasedFeatureMerger; it may be what you are looking for.

I did but can't figure out how to make it do what I want.


ebygomm
Influencer
  • Influencer
  • October 17, 2019

@egomm It's because they BOTH must be in another feature. A is in ID=3 and B is in ID=1, but there's no feature containing both values.

I think Python is going to be the solution here. Are there ever going to be cases where the lists are identical?


ebygomm
Influencer
  • Influencer
  • October 17, 2019

This might get you part of the way there. It outputs the ID of any feature whose list values are all present in another single feature. You could then use a FeatureMerger with the original features and keep only those that are unmerged. (If features can have identical lists, it won't handle them correctly!)

import fme
import fmeobjects

class FeatureProcessor(object):
    def __init__(self):
        # Maps each feature ID to the set of its trip values.
        self.trips = {}

    def input(self, feature):
        feature_id = feature.getAttribute('ID')
        # Expected to be a list with the trip value of every list element;
        # features without any trips are skipped.
        trips = feature.getAttribute('TIPS{}.trip')
        if trips:
            self.trips.setdefault(feature_id, set()).update(trips)

    def close(self):
        discard = set()
        for x, trips_x in self.trips.items():
            for y, trips_y in self.trips.items():
                # y is redundant when every one of its trips is also in x.
                # Features with identical trips end up discarding each other.
                if x != y and trips_y <= trips_x:
                    discard.add(y)
        # Output one feature per redundant ID.
        for val in discard:
            new_feature = fmeobjects.FMEFeature()
            new_feature.setAttribute('ID', val)
            self.pyoutput(new_feature)

discard_list_in_list.fmw

Disclaimer: Python people might be able to suggest a better method.
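
For the identical-lists case mentioned above, one possible variation (not taken from the attached workspace) is to discard a feature when another feature's trips are a strict superset of its own, and to keep just one feature from each group of identical lists. The sketch below is plain Python on made-up data: `ids_to_discard` is a hypothetical helper, ID=5 is an invented duplicate, and keeping the first ID seen is an arbitrary tie-break.

```python
def ids_to_discard(trips_by_id):
    """Return the IDs to drop, keeping one feature per group of identical lists."""
    trip_sets = {fid: frozenset(trips) for fid, trips in trips_by_id.items()}
    discard = set()
    first_seen = {}  # trip set -> first ID that had it
    for fid, trips in trip_sets.items():
        if trips in first_seen:
            # Exact duplicate of an earlier feature: keep only the first one.
            discard.add(fid)
            continue
        first_seen[trips] = fid
        # `<` on sets is a strict subset test, so identical sets never match here.
        if any(trips < other for other in trip_sets.values()):
            discard.add(fid)
    return discard


sample = {
    1: ['B', 'E', 'D', 'C'],
    2: ['E', 'C', 'D'],
    3: ['A', 'C', 'D', 'E'],
    4: ['B', 'A'],
    5: ['B', 'A'],  # made-up duplicate of ID=4
}
print(sorted(ids_to_discard(sample)))  # [2, 5]
```

The same idea could be moved into the `close()` method of a PythonCaller, emitting one feature per discarded ID as in the code above.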