Hello,
I am trying to pull PDF files from an attachment table and avoid having any duplicate documents. There is nothing in either the attachment table or the related table that I can use as a unique identifier for the documents. I have tried two different methods and both only partially work.
1) Use an aggregator grouped by DATA. This works OK, but some of the DATA differs even though the documents themselves are the same.
2) Use an aggregator grouped by DATA_SIZE. This works OK as well, but I'm still getting duplicates. I investigated a little, and it looks like even though the DATA_SIZE appears the same in the inspector table, some of the sizes are slightly different when the aggregator runs. When I pull up the properties for each of the duplicate PDFs in the output, the difference appears to be between the file size and size on disk.
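To make the comparison concrete, here is a minimal Python sketch of the check both methods are attempting, run outside the workspace. It assumes the DATA column can be read as raw bytes. The function name `group_by_content` and the sample rows are hypothetical and only there for illustration.

```python
import hashlib
from collections import defaultdict


def group_by_content(attachments):
    """Group attachment names by (byte length, SHA-256 digest) of their data."""
    groups = defaultdict(list)
    for name, data in attachments:
        # Two attachments share a key only if their bytes are identical.
        key = (len(data), hashlib.sha256(data).hexdigest())
        groups[key].append(name)
    return dict(groups)


# Hypothetical sample rows standing in for (file name, DATA) pairs.
sample = [
    ("a.pdf", b"%PDF-1.4 same bytes"),
    ("b.pdf", b"%PDF-1.4 same bytes"),
    ("c.pdf", b"%PDF-1.4 same bytes "),  # one extra trailing byte
]

for (size, digest), names in group_by_content(sample).items():
    print(size, digest[:12], names)
```

If two PDFs that look the same end up in different groups here, their stored bytes really do differ, which would explain why grouping on DATA or DATA_SIZE keeps them apart. "Size on disk" is rounded up to the file system's allocation unit, so it is probably not a reliable value to compare.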
Any ideas on how to get these to aggregate properly?