Question

How to test if the sub-set value of an attribute is correct?


Badge

Hi,

I have a dataset I want to check matches with another one. In particular I need to make sure that if the business area names match between the 2 datasets that the sub-business area listed is one named under that business area. For example if in the 1st spreadsheet;

Business area = Environment, Sub-business area = River or Forest

Business area = Air, Sub-business area = Quality or Pollution.

Then in the 2nd spreadsheet, if the Business area is Environment then the sub-business area should only be River or Forest. If is it Quality then I need to know this is incorrect. Also is there are way of doing this without having to list out all the sub business areas (e.g. if you used a tester with environment = river, forest,etc).

Thanks in advance and also happy to clarify if what im asking isn't clear.


13 replies

Userlevel 2
Badge +19

I would use a FeatureMerger on both fields: Business area and sub-area.

Requestor = 2nd spreadsheet. Supplier = 1st spreadsheet.

Through the "UnmergedRequestor" port you should have the incorrect ones.

Badge

I would use a FeatureMerger on both fields: Business area and sub-area.

Requestor = 2nd spreadsheet. Supplier = 1st spreadsheet.

Through the "UnmergedRequestor" port you should have the incorrect ones.

Hi, thank you. I'm not sure this gives me what im looking for though, or I'm not understanding it. I just get a list of business and sub-business areas, it doesn't seem to check if the sub-business areas have been assigned correctly.

 

 

Maybe I wasn't clear in my explanation, but the bus & sub-bus areas are correct in s.sheet 1, I need to check they have been correctly allocated in s.sheet 2.

 

Userlevel 2
Badge +19
Hi, thank you. I'm not sure this gives me what im looking for though, or I'm not understanding it. I just get a list of business and sub-business areas, it doesn't seem to check if the sub-business areas have been assigned correctly.

 

 

Maybe I wasn't clear in my explanation, but the bus & sub-bus areas are correct in s.sheet 1, I need to check they have been correctly allocated in s.sheet 2.

 

Wait, I understood incorrectly your issue. In the same cell of the first spreadsheet are all the possible values. Quality or Pollution is one cell with two possible values. My solution was for one cell and one value.

 

 

It would help if you could share a sample of the spreadsheets.

 

 

Userlevel 2
Badge +19

Please, check the file I have attached to see if I understand your problem.

There are 2 spreadsheets and a worskspace to compare them.

Badge +3

@cam

A way to do this could be by

creating aggregate on business area on Sheet1. Have sub-business concatenated with a comma (call it for instance sub_bis_list.

Then use featuremerger, merge on business (use unconditional merge 1=1, list dup and explode). First bulkattribute rename sheet 2 to avoid naming conflict.

Then use a string searcher regexp mode. Search for @Value(Sub-business area) in @Value(Subbis_list.)

Badge +3

Please, check the file I have attached to see if I understand your problem.

There are 2 spreadsheets and a worskspace to compare them.

compare.fmw

 

@oscard

 

 

 

Userlevel 2
Badge +17

Hi @cam, if I understand the requirement correctly, this workflow would be a solution.

Assumed that the 1st-spreadsheet looks like this.

If the 1st-spreadsheet looks like below, consider using the ListBuilder (Group By: Business area) or the Aggregator (Group By: Business area) to create a list that will be merged to the features from the 2nd-spreadsheet, instead of the AttributeSplitter in the workflow above.

Userlevel 2
Badge +17

Hi @cam, if I understand the requirement correctly, this workflow would be a solution.

Assumed that the 1st-spreadsheet looks like this.

If the 1st-spreadsheet looks like below, consider using the ListBuilder (Group By: Business area) or the Aggregator (Group By: Business area) to create a list that will be merged to the features from the 2nd-spreadsheet, instead of the AttributeSplitter in the workflow above.

In the second case (the 1st-spreadsheet looks like this), the following workflow might also be possible.

 

1st-spreadsheet:

Workflow:

 

Badge +3

compare.fmw

@oscard

I had wrong sheet at wrong input. Here is correct order.

Or in pic

form

Badge

Hi @cam, if I understand the requirement correctly, this workflow would be a solution.

Assumed that the 1st-spreadsheet looks like this.

If the 1st-spreadsheet looks like below, consider using the ListBuilder (Group By: Business area) or the Aggregator (Group By: Business area) to create a list that will be merged to the features from the 2nd-spreadsheet, instead of the AttributeSplitter in the workflow above.

Brilliant thank you very much for your help. I used the example below (second case - spreadsheet 1 looks like the below one)

 

 

Badge

compare.fmw

@oscard

I had wrong sheet at wrong input. Here is correct order.

Or in pic

form

Thanks for your help

 

 

Badge

@cam

A way to do this could be by

creating aggregate on business area on Sheet1. Have sub-business concatenated with a comma (call it for instance sub_bis_list.

Then use featuremerger, merge on business (use unconditional merge 1=1, list dup and explode). First bulkattribute rename sheet 2 to avoid naming conflict.

Then use a string searcher regexp mode. Search for @Value(Sub-business area) in @Value(Subbis_list.)

Thanks for your help

 

 

Badge

Please, check the file I have attached to see if I understand your problem.

There are 2 spreadsheets and a worskspace to compare them.

thanks for your help

 

 

Reply