
I have a number of features which have an attribute named ID and a list attribute named TIPS{} like this:

ID   TIPS{}
1    TIPS{0}.trip = B
     TIPS{1}.trip = E
     TIPS{2}.trip = D
     TIPS{3}.trip = C
2    TIPS{0}.trip = E
     TIPS{1}.trip = C
     TIPS{2}.trip = D
3    TIPS{0}.trip = A
     TIPS{1}.trip = C
     TIPS{2}.trip = D
     TIPS{3}.trip = E
4    TIPS{0}.trip = B
     TIPS{1}.trip = A

For my purpose, some features are redundant because all their trip values already appear in another feature. So for every feature I want to check whether all of its trip values are contained in the trips of a single other feature and, if so, delete the feature.

In my sample list, feature ID=2 should be deleted as trips C, D and E are also in features ID=1 and ID=3. The resulting features should be:

ID   TIPS{}
1    TIPS{0}.trip = B
     TIPS{1}.trip = E
     TIPS{2}.trip = D
     TIPS{3}.trip = C
3    TIPS{0}.trip = A
     TIPS{1}.trip = C
     TIPS{2}.trip = D
     TIPS{3}.trip = E
4    TIPS{0}.trip = B
     TIPS{1}.trip = A

I tried the following: explode the list, aggregate the features while concatenating the 'trip' attribute with commas, and use a PythonCaller to convert each concatenated string into a Python list (i.e. build a list of lists). A loop inside a loop then checks whether all trips of a feature are in any other feature.

The Python part is taking me time and I'm guessing there's a simpler or better way of doing it. Maybe an advanced use of an existing list transformer? Using regular expressions? I don't know. Any ideas?
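For reference, the check described above can be sketched in plain Python with sets instead of nested list lookups. This is only an illustration outside FME: the dictionary `trips_by_id` and the function `redundant_ids` are hypothetical names, and the data is the sample from the question. Note that under this rule two features with identical trips would both be flagged.

```python
# Sample data from the question: feature ID -> set of trip values.
# (Hypothetical structure; in FME this would come from the TIPS{} list.)
trips_by_id = {
    1: {"B", "E", "D", "C"},
    2: {"E", "C", "D"},
    3: {"A", "C", "D", "E"},
    4: {"B", "A"},
}


def redundant_ids(trips_by_id):
    """Return IDs whose trips are all contained in one single other feature."""
    redundant = set()
    for fid, trips in trips_by_id.items():
        for other_id, other_trips in trips_by_id.items():
            # "<=" on sets is the subset test (same as trips.issubset(...)).
            if other_id != fid and trips <= other_trips:
                redundant.add(fid)
                break  # one containing feature is enough
    return redundant


to_delete = redundant_ids(trips_by_id)
kept = sorted(set(trips_by_id) - to_delete)
print(to_delete)  # {2}
print(kept)       # [1, 3, 4]
```

ID=4 survives because neither {A, B} nor any superset of it appears in a single other feature, even though A and B each occur elsewhere.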

 

 

Why is ID 4 kept when B and A are both in other features?


Have a look at the ListBasedFeatureMerger; it may be what you are looking for.


Why is ID 4 kept when B and A are both in other features?

@egomm It's because they BOTH must be in another feature. A is in ID=3 and B is in ID=1, but there's no feature containing both values.


Have a look at the ListBasedFeatureMerger; it may be what you are looking for.

I did but can't figure out how to make it do what I want.


@egomm It's because they BOTH must be in another feature. A is in ID=3 and B is in ID=1, but there's no feature containing both values.

I think Python is going to be the solution here. Are there ever going to be cases where the lists are identical?


This might get you part of the way there. It outputs the ID of any feature whose list values are all present in another single feature. You could then use a FeatureMerger with the original features and keep only the features that are unmerged. (If features can have identical lists, it won't handle them correctly!)

import fme
import fmeobjects

class FeatureProcessor(object):
    def __init__(self):
        # Collected (ID, trip) pairs from every incoming feature
        self.trips = []

    def input(self, feature):
        id = feature.getAttribute('ID')
        # Read all 'trip' values of the TIPS{} list
        trip = feature.getAttribute('TIPS{}.trip')
        for i in trip:
            self.trips.append((id, i))

    def close(self):
        # Unique feature IDs
        ids = list(set([i[0] for i in self.trips]))
        discard = []
        for x in ids:
            # Trips of the candidate "containing" feature x
            list1 = [i[1] for i in self.trips if i[0] == x]
            for y in ids:
                if x == y:
                    pass
                else:
                    # Trips of feature y, tested against feature x
                    list2 = [i[1] for i in self.trips if i[0] == y]
                    result = all(elem in list1 for elem in list2)
                    if result:
                        # Every trip of y is also in x, so y is redundant
                        discard.append(y)
        # Output one feature per redundant ID
        for val in list(set(discard)):
            newFeature = fmeobjects.FMEFeature()
            newFeature.setAttribute('ID', val)
            self.pyoutput(newFeature)

discard_list_in_list.fmw

Disclaimer: python people might be able to suggest a better method
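On the identical-lists caveat: one possible way around it, sketched below in plain Python, is to treat an exact duplicate as redundant only when the other feature has a lower ID, so that one copy always survives. This is an assumption about the desired behaviour, not something confirmed in the thread. The function name `redundant_ids_with_ties`, the dictionary layout, and feature ID=5 (a made-up duplicate of ID=4) are hypothetical, and IDs are assumed to be comparable (e.g. integers).

```python
def redundant_ids_with_ties(trips_by_id):
    """Return IDs to discard, keeping the lowest ID among identical trip sets."""
    redundant = set()
    for fid, trips in trips_by_id.items():
        for other_id, other_trips in trips_by_id.items():
            if other_id == fid:
                continue
            # "<" on sets is the proper-subset test: contained and not equal.
            strictly_contained = trips < other_trips
            # Exact duplicates: only the later (higher) ID is discarded.
            duplicate_of_earlier = trips == other_trips and other_id < fid
            if strictly_contained or duplicate_of_earlier:
                redundant.add(fid)
                break
    return redundant


# Sample from the question plus a hypothetical ID=5 duplicating ID=4.
sample = {
    1: {"B", "E", "D", "C"},
    2: {"E", "C", "D"},
    3: {"A", "C", "D", "E"},
    4: {"B", "A"},
    5: {"B", "A"},
}
print(sorted(redundant_ids_with_ties(sample)))  # [2, 5]
```

The same comparison could replace the inner test in close(), with the trips of each ID stored as sets rather than rebuilt as lists on every pass.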

