Question

How to test if the sub-set value of an attribute is correct?

7 years ago
January 29, 2018
13 replies
3 views

cam
8 replies

Hi,

I have a dataset I want to check matches with another one. In particular I need to make sure that if the business area names match between the 2 datasets that the sub-business area listed is one named under that business area. For example if in the 1st spreadsheet;

Business area = Environment, Sub-business area = River or Forest

Business area = Air, Sub-business area = Quality or Pollution.

Then in the 2nd spreadsheet, if the Business area is Environment then the sub-business area should only be River or Forest. If is it Quality then I need to know this is incorrect. Also is there are way of doing this without having to list out all the sub business areas (e.g. if you used a tester with environment = river, forest,etc).

Thanks in advance and also happy to clarify if what im asking isn't clear.

+21

oscard
Influencer
342 replies
7 years ago
January 29, 2018

I would use a FeatureMerger on both fields: Business area and sub-area.

Requestor = 2nd spreadsheet. Supplier = 1st spreadsheet.

Through the "UnmergedRequestor" port you should have the incorrect ones.

cam
Author
8 replies
7 years ago
January 29, 2018

oscard wrote:

I would use a FeatureMerger on both fields: Business area and sub-area.

Requestor = 2nd spreadsheet. Supplier = 1st spreadsheet.

Through the "UnmergedRequestor" port you should have the incorrect ones.

Hi, thank you. I'm not sure this gives me what im looking for though, or I'm not understanding it. I just get a list of business and sub-business areas, it doesn't seem to check if the sub-business areas have been assigned correctly.

Maybe I wasn't clear in my explanation, but the bus & sub-bus areas are correct in s.sheet 1, I need to check they have been correctly allocated in s.sheet 2.

+21

oscard
Influencer
342 replies
7 years ago
January 29, 2018

cam wrote:

Maybe I wasn't clear in my explanation, but the bus & sub-bus areas are correct in s.sheet 1, I need to check they have been correctly allocated in s.sheet 2.

Wait, I understood incorrectly your issue. In the same cell of the first spreadsheet are all the possible values. Quality or Pollution is one cell with two possible values. My solution was for one cell and one value.

It would help if you could share a sample of the spreadsheets.

+21

oscard
Influencer
342 replies
7 years ago
January 29, 2018

Please, check the file I have attached to see if I understand your problem.

There are 2 spreadsheets and a worskspace to compare them.

+15

gio
Contributor
2252 replies
7 years ago
January 29, 2018

@cam

A way to do this could be by

creating aggregate on business area on Sheet1. Have sub-business concatenated with a comma (call it for instance sub_bis_list.

Then use featuremerger, merge on business (use unconditional merge 1=1, list dup and explode). First bulkattribute rename sheet 2 to avoid naming conflict.

Then use a string searcher regexp mode. Search for @Value(Sub-business area) in @Value(Subbis_list.)

+15

gio
Contributor
2252 replies
7 years ago
January 29, 2018

oscard wrote:

Please, check the file I have attached to see if I understand your problem.

There are 2 spreadsheets and a worskspace to compare them.

compare.fmw

@oscard

takashi
7715 replies
7 years ago
January 30, 2018

Hi @cam, if I understand the requirement correctly, this workflow would be a solution.

Assumed that the 1st-spreadsheet looks like this.

If the 1st-spreadsheet looks like below, consider using the ListBuilder (Group By: Business area) or the Aggregator (Group By: Business area) to create a list that will be merged to the features from the 2nd-spreadsheet, instead of the AttributeSplitter in the workflow above.

takashi
7715 replies
7 years ago
January 30, 2018

takashi wrote:

Hi @cam, if I understand the requirement correctly, this workflow would be a solution.

Assumed that the 1st-spreadsheet looks like this.

In the second case (the 1st-spreadsheet looks like this), the following workflow might also be possible.

1st-spreadsheet:

Workflow:

+15

gio
Contributor
2252 replies
7 years ago
January 30, 2018

compare.fmw

@oscard

I had wrong sheet at wrong input. Here is correct order.

Or in pic

form

cam
Author
8 replies
7 years ago
January 30, 2018

takashi wrote:

Hi @cam, if I understand the requirement correctly, this workflow would be a solution.

Assumed that the 1st-spreadsheet looks like this.

Brilliant thank you very much for your help. I used the example below (second case - spreadsheet 1 looks like the below one)

cam
Author
8 replies
7 years ago
January 30, 2018

gio wrote:

compare.fmw

@oscard

I had wrong sheet at wrong input. Here is correct order.

Or in pic

form

Thanks for your help

cam
Author
8 replies
7 years ago
January 30, 2018

gio wrote:

@cam

A way to do this could be by

creating aggregate on business area on Sheet1. Have sub-business concatenated with a comma (call it for instance sub_bis_list.

Then use featuremerger, merge on business (use unconditional merge 1=1, list dup and explode). First bulkattribute rename sheet 2 to avoid naming conflict.

Then use a string searcher regexp mode. Search for @Value(Sub-business area) in @Value(Subbis_list.)

Thanks for your help

cam
Author
8 replies
7 years ago
January 30, 2018

oscard wrote:

Please, check the file I have attached to see if I understand your problem.

There are 2 spreadsheets and a worskspace to compare them.

thanks for your help

Reply

Rich Text Editor, editor1

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

Cookie settings

We use 3 different kinds of cookies. You can choose which cookies you want to accept. We need basic cookies to make this site work, therefore these are the minimum you can select. Learn more about our cookies.

Basic
Functional

Normal
Functional + analytics

Complete
Functional + analytics + social media + embedded videos + marketing

How to test if the sub-set value of an attribute is correct?