Similar to this idea (https://knowledge.safe.com/content/idea/19290/two-input-port-ordered-by-group-asynchronously.html, which I still really need in order to optimise many workspaces), it would be good if the FeatureMerger could handle Join On fields that come in synchronously, i.e. with Requesters and Suppliers interleaved in Join On order rather than grouped as all Suppliers, then all Requesters. (Note: Join On, not Group By; assume Group By is empty for this example.)
Currently if I have this input:
Requester_id = 1
....Supplier_id = 1
Requester_id = 2
....Supplier_id = 2
FeatureMerger will cache both the Requester and Supplier features entirely before doing the merges. That is a massive problem for anything larger than a medium-sized dataset, and it makes quite a few workspaces effectively impossible.
I guess a cardinality parameter like the one DatabaseJoiner has could be used, but that's optional.
This would make a lot of workspaces that handle large datasets actually workable. Currently I spend a lot of time splitting up datasets in overly complex ways so that I can process them in chunks to work around memory limits.
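
To illustrate the behaviour being asked for, here is a rough Python sketch. It is not how FeatureMerger works internally and it is not FME API code: the function name `streaming_merge`, the `(port, key, attrs)` tuple layout and the "merged"/"unmerged" labels are all made up for the example. It assumes every feature sharing a Join On value arrives consecutively, and that Requester attributes win when names clash.

```python
from itertools import groupby


def streaming_merge(features):
    """Yield merged features while buffering only one Join On group at a time.

    `features` is an iterable of (port, key, attrs) tuples, where port is
    "requester" or "supplier" (hypothetical layout, for illustration only).
    Assumes all features with the same key arrive consecutively.
    """
    # groupby only groups *consecutive* items, which is exactly the
    # ordering assumption; nothing outside the current key is held in memory.
    for _key, group in groupby(features, key=lambda f: f[1]):
        requesters, suppliers = [], []
        for port, _, attrs in group:
            (requesters if port == "requester" else suppliers).append(attrs)

        for req in requesters:
            if not suppliers:
                yield ("unmerged", req)
            for sup in suppliers:
                # Assumption: Requester attributes override Supplier ones.
                yield ("merged", {**sup, **req})
        # Suppliers with no matching Requester are simply dropped here.


stream = [
    ("requester", 1, {"Requester_id": 1}),
    ("supplier", 1, {"Supplier_id": 1}),
    ("requester", 2, {"Requester_id": 2}),
    ("supplier", 2, {"Supplier_id": 2}),
]

for status, feature in streaming_merge(stream):
    print(status, feature)
```

With the input from the example above, only one Requester and one Supplier are held at any moment, so memory use depends on the largest Join On group rather than on the whole dataset. A cardinality setting could shrink this further, e.g. releasing a Requester as soon as its single Supplier has arrived.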