Skip to main content
Solved

Remove duplicate entries


Forum|alt.badge.img

Dear FME community,

I have a dataset which has for example the following two attributes:

- grundbescheid_nr

- bescheid_nr

If there are two identical rows with the same value in grundbescheid_nr (e.g. grundbescheid_nr= 183101), then only the entry with the higher number in bescheid_nr (in the example bescheid_nr = 186754) should be preserved. The entry marked with the blue background (bescheid_nr = 185582) and it's geometry should be deleted. The lower one(s) should therefore be eliminated.

I suppose I could do it with a ListBuilder somehow, but I'm not yet sure how to do it exactly. Is there an easy way to solve this?

Kind regards

Thomas

Best answer by david_r

You can use a Sorter and sort descending on bescheid_nr. Then use a DuplicateFilter (was called DuplicateRemover before FME 2016) on grundbescheid_nr and only take the unique record.

This works since the unique output port will always output the first unique value entering.

View original
Did this help you find an answer to your question?
This post is closed to further activity.
It may be a question with a best answer, an implemented idea, or just a post needing no comment.
If you have a follow-up or related question, please post a new question or idea.
If there is a genuine update to be made, please contact us and request that the post is reopened.

4 replies

david_r
Celebrity
  • Best Answer
  • October 27, 2016

You can use a Sorter and sort descending on bescheid_nr. Then use a DuplicateFilter (was called DuplicateRemover before FME 2016) on grundbescheid_nr and only take the unique record.

This works since the unique output port will always output the first unique value entering.


Forum|alt.badge.img
  • Author
  • October 27, 2016
david_r wrote:

You can use a Sorter and sort descending on bescheid_nr. Then use a DuplicateFilter (was called DuplicateRemover before FME 2016) on grundbescheid_nr and only take the unique record.

This works since the unique output port will always output the first unique value entering.

Wonderful, that works perfectly! Many thanks.

 


takashi
Influencer
  • October 28, 2016

Hi @thomask, I also use the combination of Sorter and DuplicateFilter usually in a similar case, but it's not the only way. For example, the InlineQuerier with this SQL statement could also be effective. FYI.

select a.* from features as a
inner join (select grundbescheid_nr, max(bescheid_nr) as maxBescheidNr
    from features group by grundbescheid_nr) as b
on a.grundbescheid_nr = b.grundbescheid_nr and a.bescheid_nr = b.maxBescheidNr

Forum|alt.badge.img
  • Author
  • October 28, 2016

Thank you very much @takashi! This look good, but I'm not so experienced in SQL. It's been a while since we had it in our lectures :-)


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings