I have a number of features, each with an attribute named ID and a list attribute named TIPS{}, like this:
ID   TIPS{}
1    TIPS{0}.trip = B
     TIPS{1}.trip = E
     TIPS{2}.trip = D
     TIPS{3}.trip = C
2    TIPS{0}.trip = E
     TIPS{1}.trip = C
     TIPS{2}.trip = D
3    TIPS{0}.trip = A
     TIPS{1}.trip = C
     TIPS{2}.trip = D
     TIPS{3}.trip = E
4    TIPS{0}.trip = B
     TIPS{1}.trip = A
For my purpose some features are redundant, because all of their trip values already appear in another feature. So for every feature I want to check whether all of its trip values are contained in the trips of some other single feature and, if so, delete that feature.
In my sample list, feature ID=2 should be deleted as trips C, D and E are also in features ID=1 and ID=3. The resulting features should be:
ID   TIPS{}
1    TIPS{0}.trip = B
     TIPS{1}.trip = E
     TIPS{2}.trip = D
     TIPS{3}.trip = C
3    TIPS{0}.trip = A
     TIPS{1}.trip = C
     TIPS{2}.trip = D
     TIPS{3}.trip = E
4    TIPS{0}.trip = B
     TIPS{1}.trip = A
I tried the following: explode the list, then aggregate the features, concatenating the 'trip' attribute with commas. After that, a PythonCaller builds a list of features in which each concatenated trips string is converted to a Python list (i.e. a list of lists), and a loop inside a loop checks whether all trips of a feature are in any other feature.
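To make the loop-inside-a-loop check concrete, here is a minimal sketch in plain Python, outside of FME. The function name `redundant_ids` and the dict-of-lists input are hypothetical, chosen only for illustration. It assumes "redundant" means that a feature's trips are a subset of the trips of one single other feature, and that when two features have identical trips one of them should be kept.

```python
def redundant_ids(trips_by_id):
    """Return the IDs whose trips are all contained in one other feature.

    trips_by_id: dict mapping a feature ID to its list of trip values
    (hypothetical input shape, e.g. built from the exploded list).
    """
    # Convert each trip list to a set so the subset test is cheap.
    trip_sets = {fid: set(trips) for fid, trips in trips_by_id.items()}
    redundant = set()

    for fid, trips in trip_sets.items():
        for other_id, other_trips in trip_sets.items():
            # Skip the feature itself, and skip features already marked
            # redundant so that identical features do not delete each other.
            if other_id == fid or other_id in redundant:
                continue
            # "<=" on sets is the subset test.
            if trips <= other_trips:
                redundant.add(fid)
                break

    return redundant


# Sample data from the listing above.
sample = {
    1: ["B", "E", "D", "C"],
    2: ["E", "C", "D"],
    3: ["A", "C", "D", "E"],
    4: ["B", "A"],
}

result = sorted(redundant_ids(sample))
print(result)  # prints [2]
```

Feature 4 is kept here even though B appears in feature 1 and A in feature 3, because no single feature contains both. The comparison is still quadratic in the number of features, which is likely why this step is slow on larger inputs.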
The Python part is taking me a long time, and I'm guessing there's a simpler or better way of doing it. Maybe an advanced use of an existing list transformer? Regular expressions? I don't know. Any ideas?