Question

Duplicate remover problems

  • January 15, 2015
  • 8 replies
  • 35 views

Hi!

I have applied the DuplicateRemover to a simple Excel file containing a single column of postcodes, some of which are duplicated. I want to end up with an Excel file of postcodes with no duplicates. However, when I inspect the 'unique values' output in the inspector, these values are in fact duplicated. Am I using this correctly, or is there another tool I should use?

8 replies

fmelizard
Safer
  • January 15, 2015
Hi,

You are using the correct transformer for the job. Are there possibly spaces in the 'duplicates'? You can test that by calculating the length of the string.

Here, postcodes are made up of 4 numeric and 2 alphanumeric characters, so anything longer than 6 characters may contain whitespace, and anything shorter is not a valid postcode.
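
As a rough illustration of that length check outside FME, here is a minimal plain-Python sketch. The sample values and the helper name `suspicious_postcodes` are made up for the example; the expected length of 6 comes from the postcode format described above and would need adjusting for other formats.

```python
def suspicious_postcodes(postcodes, expected_length=6):
    """Return (value, length) pairs whose length differs from expected_length."""
    return [(p, len(p)) for p in postcodes if len(p) != expected_length]


# Hypothetical sample: one clean value, two with stray spaces, one too short.
sample = ["1234AB", "1234AB ", " 1234AB", "123AB"]
flagged = suspicious_postcodes(sample)

for value, length in flagged:
    # repr() makes leading and trailing spaces visible in the output.
    print(repr(value), length)
```

Values longer than expected are candidates for hidden whitespace; shorter ones are probably invalid postcodes.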

takashi
Influencer
  • January 15, 2015
Hi,

I agree that the "duplicate" postcodes could contain whitespace.

In addition, in some countries a postcode consists of digits and letters. If yours does, also check the letter case: FME is generally case-sensitive when comparing character strings.

Takashi

  • Author
  • January 16, 2015
Hi,

My postcode list does indeed contain whitespace, digits and letters, e.g. PL23 4NG. Will I need to remove the whitespace for the DuplicateRemover to work?

Thanks

fmelizard
Safer
  • January 16, 2015
Hi,

Not necessarily; it depends on whether the whitespace between the 3 and the 4 in your example is correct.

My initial idea was to remove whitespace before and after the string so that the DuplicateRemover can do its work. If the whitespace in between is correct, you could try a regex that removes only the whitespace before and after the string.
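
To make the regex idea concrete, here is a small sketch in plain Python rather than in FME. The pattern and the helper name `trim_edges` are illustrative assumptions, not something taken from the workspace in question. The pattern matches whitespace only at the start or the end of the string, so the space inside the postcode is left alone.

```python
import re

# Matches a run of whitespace at the very start OR the very end of the string.
EDGE_WHITESPACE = re.compile(r"^\s+|\s+$")


def trim_edges(value):
    """Remove leading and trailing whitespace, keeping any inner spaces."""
    return EDGE_WHITESPACE.sub("", value)


trimmed = trim_edges("  PL23 4NG \t")
print(repr(trimmed))
```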

  • Author
  • January 16, 2015
Brilliant, I will try that and let you know if I have any luck - thanks again! :-)

takashi
Influencer
  • January 16, 2015
FYI, the AttributeTrimmer can be used to remove excess leading and trailing whitespace.

Additionally, if there are both uppercase and lowercase postcodes (e.g. "PL23 4NG", "pl23 4ng", "PL23 4ng" ...), consider using the StringCaseChanger to change every character to uppercase.
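
For anyone who wants to see the combined effect of trimming and uppercasing before removing duplicates, here is a minimal plain-Python sketch of the same logic. It is not FME code; the function name `unique_postcodes` and the sample values are assumptions made for the example.

```python
def unique_postcodes(postcodes):
    """Trim and uppercase each postcode, keeping the first of each normalized value."""
    seen = set()
    unique = []
    for raw in postcodes:
        key = raw.strip().upper()  # trim the edges, then unify the letter case
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical sample: three spellings of one postcode plus a different one.
sample = ["PL23 4NG", "pl23 4ng ", " PL23 4ng", "PL24 2SG"]
result = unique_postcodes(sample)
print(result)
```

Without the normalization step, the first three sample values would all count as distinct, which matches the behaviour described in the original question.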

fmelizard
Safer
  • January 16, 2015
Good call, Takashi, I didn't think about that!

  • Author
  • January 16, 2015
Hi guys, I tried the AttributeTrimmer before feeding the data through the DuplicateRemover and it is working perfectly... thank you so much for all your help :-)
