Solved

How to compare strings from two data sets (two excel tables)?

  • 2 February 2022
  • 4 replies
  • 28 views

Badge

Hello,

I need to compare strings from two sets of data and to find strings which contains the strings from other set.

 

For example, in first table I have strings:

 column1

ABC

ABCD

ABB

ABBD

ACC

ACCD

 

In second table I have strings

column1

ABC

AC

 

I need to compare them and have the result something like this:

column1; column2

ABC; ABC

ABCD; ABC

ABB

ABBD

ACC; AC

ACCD; AC

 

Rows without data in second column may also be omitted.

 

What would be the best approach?

 

icon

Best answer by chrisatsafe 3 February 2022, 16:54

View original

4 replies

Badge +2

Hi @mario_bla​ ,

 

Great question. I bet there are a few ways to get this done. One solution that came to mind for me was to merge a list of each feature (record) in table 2 onto table 1. Then, explode the list and conditionally set the attributes based on whether table 1 features contain the table 2 attribute values

imageAs a result, if there is a match, the attribute value is set accordingly.

 

Once the attribute values are set, it's as simple as cleaning up the temporary attributes used (i.e. tid, _element_index, and tempcolumn1), and using a Sorter & Sampler to get the features you are interested in.

imageYou could also do this without exploding the list, but then you have to set a new condition for each element in the list:

imageI've uploaded both workspaces in case you wanted to see the different methods. You can get a general idea of what's going on from the annotations and by inspecting each feature cache to see how the data changes.

Badge

Hello!

 

Thank you very much for your help!

 

I think I understand the general idea of your suggestion. If I understood it correctly than by examining your conditional testing (where you are explicitly testing against first and second element from the list) I presume this will work only if I have two strings in second table? Am I right? The fact is that I can have many strings in both tables.

 

The practical use of this would be for example: I have list of all street names of the country in first table and I want to find all the streets named by some plant (like oakstreet, applestreet, street of flowers,...) so I compare the list of streets with list of all/many plant names from plant dictionary.

 

Another thing is that I couldn't make it work even with this two strings. I understand the beginning and the end, but couldn't merge list on every feature. How to do that? Can you please put the screenshot of featuremerger settings? How should I join the requestor and supplier?

 

EDIT: after several trials and errors I came up with this setting of feature merger and was able to get the result - but I'm not sure if that is the proper use of featuremerger...

 2022-02-03_032640

Badge +2

Hello!

 

Thank you very much for your help!

 

I think I understand the general idea of your suggestion. If I understood it correctly than by examining your conditional testing (where you are explicitly testing against first and second element from the list) I presume this will work only if I have two strings in second table? Am I right? The fact is that I can have many strings in both tables.

 

The practical use of this would be for example: I have list of all street names of the country in first table and I want to find all the streets named by some plant (like oakstreet, applestreet, street of flowers,...) so I compare the list of streets with list of all/many plant names from plant dictionary.

 

Another thing is that I couldn't make it work even with this two strings. I understand the beginning and the end, but couldn't merge list on every feature. How to do that? Can you please put the screenshot of featuremerger settings? How should I join the requestor and supplier?

 

EDIT: after several trials and errors I came up with this setting of feature merger and was able to get the result - but I'm not sure if that is the proper use of featuremerger...

 2022-02-03_032640

Sorry looks like the workspace failed to attach! I'll re-attach it - you'll likely want to follow the example in StringCompare.fmwt so you don't have to set the conditions for all of the possible values. And yes you've got it right for the FeatureMerger!

Badge

Sorry looks like the workspace failed to attach! I'll re-attach it - you'll likely want to follow the example in StringCompare.fmwt so you don't have to set the conditions for all of the possible values. And yes you've got it right for the FeatureMerger!

Thank you very much for your help!

 

Everything works, although I have noticed it works much slower when conditions are set in this other way so I still have to check how will it work with large amount of data (time is not so important but I had experience before with time consuming workbenches which crash after hours of running...)

 

Reply