Skip to main content

Hi, I could probably work this out over days (weeks!) but I figured asking the question here may be a quicker method to the end goal!

As a first line, I am working in FME(R) 2018.1.0.0 (20180717 - Build 18520 - WIN32) which is unchangeable due to it being what is installed/available at my place of work. Just in case this affects a solution.

Also, this is my first time posting here, so apologies if this is vague/long winded/wrong format etc.

I have a spreadsheet with each row being a project and multiple columns showing the names and addresses of people involved in the project. I have successfully run this through FME to amend the various raw details into the format required to distribute as I require.

However, I now need to generate a duplicate free list of all the 'contact name'/'company name'/'postcode' which I will then be able to reference back to the original format of the spreadsheet (i.e. a row per project with multiple 'contact name'/'company name'/'postcode' columns for each).

The concept of creating a duplicate free list seems simple enough, but I can't think how I would reference back to the original dataset. I'm thinking lists are the answer and have seen there are many 'List...' transformers but I don't really know where to start.

The aim is to pass a duplicate free list (with three columns: contact name/company name/postcode) to a colleague for them to reference against their systems and then to be able to highlight any matches in my original dataset (row per project and multiple columns as above).

I'm thinking I'd need some sort of unique reference number per row/column combination for each, but what about 'contacts' that appear on more than one project and in different columns? Once the list is free of duplicates, can each have more than one unique reference number?

Does this even make sense!

Cheers!

The format of the original data is sort of like this, except for their being up to 13 sets of contacts and possibly quite a lot of duplication between the various projects and contacts

Project No.

 

Contact1

 

Company1

 

Postcode1

 

Contact2

 

Company2

 

Postcode2

 

etc.

 

10000

 

Bob Jones

 

Indigo Cleaning

 

A1 1AA

 

Simon Wise

 

Soundsmiths Ltd

 

YV7 8RD

 

 

10001

 

Sam Tibble

 

Tibble & Son

 

SS1 1UB

 

Shirley Curtin

 

Intercreate

 

OK7 6HB

 

 

10002

 

Simon Wise

 

Soundsmiths Ltd

 

YV7 8RD

 

Bob Jones

 

Indigo Cleaning

 

A1 1AA

 

 

 

I would like to get a list like below and then be able to easily highlight the matches (against another dataset) by, for example, highlighting the specific match, as well as the Project No. to highlight there is a match on that line somewhere:

 

Unique Ref

 

Contact Name

 

Company Name

 

Postcode

 

 

Bob Jones

 

Indigo Cleaning

 

A1 1AA

 

 

Simon Wise

 

Soundsmiths Ltd

 

YV7 8RD

 

 

Sam Tibble

 

Tibble & Son

 

SS1 1UB

 

 

Shirley Curtin

 

Intercreate

 

OK7 6HB

 


Hi @_ds Can you upload a spreadsheet showing the original data and the final output. I noticed that when you view on the phone or zoom in on the computer, the first spreadsheet looks different


Hi,

 

You know how to´create the list of unique Contacts, right?

 

So, with the created list the easiest way is to use the aggregator grouping by Contact and creating a list in the project no.

And when you need to reference back to the original list you use a list exploder.

 


As requested below by @australia, here is an example spreadsheet of the source dataset (much curtailed vertically and horizontally), as well as the desired edited list. Of course the crux being the ability to reference back to the former. Thanks

FME_forum_example_spreadsheet.xlsx


Hi @_ds Can you upload a spreadsheet showing the original data and the final output. I noticed that when you view on the phone or zoom in on the computer, the first spreadsheet looks different

Done. Cheers!


Hi,

 

You know how to´create the list of unique Contacts, right?

 

So, with the created list the easiest way is to use the aggregator grouping by Contact and creating a list in the project no.

And when you need to reference back to the original list you use a list exploder.

 

Thanks, I'll give this a go


Hi,

 

You know how to´create the list of unique Contacts, right?

 

So, with the created list the easiest way is to use the aggregator grouping by Contact and creating a list in the project no.

And when you need to reference back to the original list you use a list exploder.

 

PS. I'm actually struggling now to create the list of unique contacts! If they were all in one column that'd be ok, but as they are in different columns...stumped!


PS. I'm actually struggling now to create the list of unique contacts! If they were all in one column that'd be ok, but as they are in different columns...stumped!

Hi,

 

Have you achieved this? I am lost in this new layout of the forum haha.

But I have simulated your data and achieved it with some manual work. If your data has always the same number of columns it will be hard to create the first flow, but will not be problem after, but if the number are always change, will be more trickier.

 

I´ve started with a "Cloner", put the number of copies to the number of Collums.

 

Then you can filter by the field "copynum" and create the unique field contact by simple rename it. So the copynum 0 you rename contact1 to contact, the copynum 1 you rename contact2 to contact and goes through all your data.

 

Them you can aggregate it by contact (new field) and create list in projectNo field.

 

 


HI @_ds​ ,

 

can you please provide an example spreadsheet of the source dataset? I can't find one in your previous post.


HI @_ds​ ,

 

can you please provide an example spreadsheet of the source dataset? I can't find one in your previous post.

I'm not sure what happened before and why it isn​'t showing, but hopefully it'll work this time.

 

Thanks, David


Reply