To make sure we fully understand what you’re looking for, could you describe a situation where Group-By with the Matcher would make a difference in your workflow?
Any concrete example would be really helpful for us to explore this further.
I need to analyse several hundred (potentially thousands of) files, and the files to be analysed are only determined at run time. Potential duplicates will only occur within the same file, and the features arrive at the Matcher in file order. If the Matcher had a Group By with Complete Groups set to When Group Changes, the features could be released to the FeatureWriter as each group completes, rather than being stored in memory at the Matcher.
My workflow is: analysis to determine files of interest → FeatureReader → analyse features → subset of derived data → Matcher → single matched/unmatched to FeatureWriter (fanout based on input file).
Given the nature of the analysis before the Matcher, a parent workspace with a WorkspaceRunner for each input file is not optimal.
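To illustrate the behaviour being requested, here is a minimal Python sketch of matching with "complete groups when group changes". It is not FME code and not how the Matcher is implemented. The function name `match_by_group`, the dict-based features, and the `"single_matched"` / `"not_matched"` labels are assumptions made for the example, loosely modelled on the outputs mentioned above. It relies on the features arriving already ordered by group (here, by file).

```python
from itertools import groupby
from typing import Callable, Hashable, Iterable, Iterator, Tuple

Feature = dict  # hypothetical stand-in for a feature; not an FME type


def match_by_group(
    features: Iterable[Feature],
    group_key: Callable[[Feature], Hashable],
    match_key: Callable[[Feature], Hashable],
) -> Iterator[Tuple[str, Feature]]:
    """Yield (label, feature) pairs, releasing results one group at a time."""
    # groupby splits the stream into consecutive runs that share a key,
    # so it only works if the input is already ordered by group.
    for _, group in groupby(features, key=group_key):
        buckets: dict = {}
        for feature in group:
            buckets.setdefault(match_key(feature), []).append(feature)
        # The group is complete here: release its results. The buckets are
        # rebuilt for the next group, so only one group is held in memory.
        for bucket in buckets.values():
            if len(bucket) > 1:
                yield "single_matched", bucket[0]  # one representative per match set
            else:
                yield "not_matched", bucket[0]


if __name__ == "__main__":
    stream = [
        {"file": "a", "id": 1},
        {"file": "a", "id": 1},
        {"file": "b", "id": 1},
    ]
    for label, feature in match_by_group(
        stream, lambda f: f["file"], lambda f: f["id"]
    ):
        print(label, feature)
```

Because results are yielded as soon as the group key changes, a downstream writer can start on the first file while later files are still being read. Without grouping, every feature would have to be held until the last one arrives.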
Thanks @jdh for sharing more about your use case.
Have you tried setting “Attributes That Must Differ” to fme_basename? That may achieve a similar effect by keeping matches file-specific. If you’ve already tested this, I’m interested to learn what’s missing from that approach.
@jdh, I frequently face a similar situation. For your reference, as an interim workaround I sometimes define a custom transformer that wraps a Matcher and publishes the Group By and Complete Groups parameters. See the attachment, which includes an example.
@andreaatsafe, jdh is talking about the "When Group Changes (Advanced)" option in general group processing. The “Attributes That Must Differ” setting in the Matcher is not a substitute for it.
See the attached workspace example which demonstrates how the "When Group Changes (Advanced)" option would work.
Similarly, it would be great if the LineCombiner had group processing, including the Complete Groups options.
@jdh, I frequently face a similar situation. For your reference, as an interim workaround I sometimes define a custom transformer that wraps a Matcher and publishes the Group By and Complete Groups parameters.
That’s the approach I took, but it’s a hack that shouldn’t be necessary, hence the idea.