The problem
The documentation for FeatureMerger mentions a "Suppliers First" mode that can reportedly be very beneficial to performance (and, I would imagine, crucial for Streams), but comes with the constraint that all suppliers must have arrived before the first requestor comes in. To my knowledge, there is currently no way in FME to reliably uphold that guarantee. The timing between Readers, Creators, FeatureReaders and SQLExecutors is not something I can claim to understand, and the completion order can change based on whether caching is turned on or not. This is already troublesome when editing workbenches, but it can be especially problematic inside custom transformers, where you don't have control over the order in which features are delivered to your input ports.
This isn't merely an issue with FeatureMerger; it's going to be a problem any time a transformer relies on input ordering to work properly, particularly with Streams and Feature tables.
If I have a PythonCaller that configures itself based on data coming from TransformerA before it's ready to accept data from TransformerB, there aren't many options for reliably dealing with out-of-order input. Holding onto features is illegal when bulk mode support is advertised, so the PythonCaller must opt out of it (which hurts downstream performance) in order to accumulate any features it's not ready to process until the configuration features have come in. Even then, it can't know when TransformerA has closed, so unless it only expects one configuration feature, it's dangerous to start processing before close() has been called.
This might be clearer when considering the attached screenshot: Python_MapAttributes needs the output line from JsonTemplater in order to work with the data coming from MappedInputLines. If MappedInputLines starts sending features first, the PythonCaller can't do anything yet, and can only crash (undesirable) or start buffering features (illegal in Bulk mode).
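To make the workaround concrete, here is a minimal sketch of the "opt out of bulk mode and buffer until close()" approach. It is a plain-Python stand-in rather than real FME code: features are dicts, the `_source` and `mapping` keys are hypothetical, and the `emitted` list stands in for `pyoutput()`. The `has_support_for` method mirrors the hook I understand recent PythonCaller templates to expose, but treat that as an assumption.

```python
class BufferingCaller:
    """Stand-in for a PythonCaller class; features are plain dicts here."""

    def __init__(self):
        self.config = []   # features from the configuration branch
        self.pending = []  # data features we cannot process yet
        self.emitted = []  # stand-in for self.pyoutput(feature)

    def has_support_for(self, support_type):
        # Opt out of bulk mode, because this class holds onto features.
        return False

    def input(self, feature):
        # Hypothetical "_source" key tells the two branches apart.
        if feature.get("_source") == "config":
            self.config.append(feature)
        else:
            self.pending.append(feature)

    def close(self):
        # Only at close() is it certain that no more config can arrive.
        if not self.config:
            raise RuntimeError("no configuration feature received")
        mapping = {}
        for cfg in self.config:
            mapping.update(cfg["mapping"])
        for feature in self.pending:
            renamed = {
                mapping.get(key, key): value
                for key, value in feature.items()
                if key != "_source"
            }
            self.emitted.append(renamed)


caller = BufferingCaller()
caller.input({"_source": "data", "a": 1})  # data arrives first
caller.input({"_source": "config", "mapping": {"a": "alpha"}})
caller.close()
```

Nothing is emitted until close(), which is exactly the cost described above: every data feature is held in memory and downstream transformers get nothing until the very end.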
The solution
The idea would be to have some sort of Semaphore transformer, something like a FeatureHolder, but with (at least?) two input ports: a "Priority" port, which lets features through normally, and a "Held" port, which buffers features until the Priority port has closed and no more features can go through it. This would ensure that no "Held" feature can ever arrive before a feature from the "Priority" side, thus allowing workflow and transformer designers to guarantee feature ordering downstream without breaking bulk mode.
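The intended behaviour of such a transformer can be sketched in a few lines of plain Python. This is only a simulation of the semantics, not an FME implementation: `FeatureGate`, its method names, and the `emit` callback are all made up for illustration, and `close_priority()` stands in for FME signalling that the Priority port has no more features.

```python
class FeatureGate:
    """Simulates a two-port gate: Priority passes through, Held waits."""

    def __init__(self, emit):
        self._emit = emit              # callback standing in for the output port
        self._held = []                # features buffered from the Held port
        self._priority_closed = False

    def priority(self, feature):
        if self._priority_closed:
            raise RuntimeError("Priority port is already closed")
        self._emit(feature)            # passes straight through

    def held(self, feature):
        if self._priority_closed:
            self._emit(feature)        # gate is open, no need to buffer
        else:
            self._held.append(feature)

    def close_priority(self):
        # Priority side is done: release everything buffered, in arrival order.
        self._priority_closed = True
        for feature in self._held:
            self._emit(feature)
        self._held.clear()


out = []
gate = FeatureGate(out.append)
gate.held("h1")        # arrives early, gets buffered
gate.priority("p1")
gate.held("h2")
gate.priority("p2")
gate.close_priority()  # h1 and h2 are released here
gate.held("h3")        # passes straight through now
```

Whatever the arrival order, everything from the Priority side reaches the output before anything from the Held side, and once the gate has opened, Held features no longer need to be buffered at all.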
Other relevant use cases
One might also consider having a Terminator node which should stop the translation when an assertion or a join fails in some unexpected way, but should wait until every faulty feature has arrived to give proper context instead of immediately stopping at the first one. Bad features could be sent through the Priority port, with the Priority output routed to a Terminator, so that no feature can be passed to the next step until it has been verified that none exist that would trip the Terminator. This would also allow the Terminator to be changed to wait for all features to have arrived before stopping the translation, instead of aborting at the first one (which currently makes sense, since the longer you wait, the greater the risk that downstream writers will have already started to write incomplete data).
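As a rough illustration of that Terminator use case, the sketch below drains the failure branch completely before deciding anything, then either aborts with the full list of offenders or releases the held features. The names `TranslationAborted` and `verify_then_release` and the `id` key are hypothetical, and features are plain dicts again.

```python
class TranslationAborted(Exception):
    """Raised once, after every faulty feature has been collected."""


def verify_then_release(failed, held):
    # Drain the Priority side (the failures) completely before deciding.
    failed = list(failed)
    if failed:
        ids = ", ".join(str(f["id"]) for f in failed)
        raise TranslationAborted(f"{len(failed)} faulty feature(s): {ids}")
    # Nothing tripped the check, so the held features may continue downstream.
    return list(held)


released = verify_then_release([], [{"id": 1}, {"id": 2}])

try:
    verify_then_release([{"id": 7}, {"id": 9}], [{"id": 1}])
    message = None
except TranslationAborted as exc:
    message = str(exc)
```

Because the held features are never released when a failure exists, a downstream writer would not see partial data, and the error message can name every faulty feature instead of just the first.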

