Skip to main content

for example:

TableA has attributeA and keyA. TableB has attributeB and keyB.

if keyA is found in keyB, copy attributeA to attributeB. Don't create a new record; just populate attributeB in TableB

There's a 1 to many relationship between TableA and TableB.

Does this require multiple transformers and if so which ones and how would they be setup?

If both tables are in a (different) database, I would use the SQLExecutor transformer, connected to reading TableA.

The SQL statement can be use to update TableB.

Hope this helps.


I just started looking at the SQLExecutor. Both my tables are from an Excel spreadsheet (sorry, I should have mentioned that) and the SQL transformers seem to only work with database tables or can it work with a spreadsheet?


Addition: Both my tables are from an Excel spreadsheet


Is it KeyB = KeyA or KeyB contains KeyA? That will affect the best approach


In the case both are Excel sheets, the option to write to an existing Sheet (TableB) in update mode exists.

That should provide you with a solution.


Update: it's a Zero to Many relationship between TableA.KeyA to TableB.KeyB


Is it KeyB = KeyA or KeyB contains KeyA? That will affect the best approach

TableA.KeyA may be in TableB.KeyB

There can be zero or more matching TableA.KeyA in TableB.KeyB. It's a zero to many relationship between between TableA ->TableB


In the case both are Excel sheets, the option to write to an existing Sheet (TableB) in update mode exists.

That should provide you with a solution.

Not sure I'm following your suggestion. Does the Writer have a join and a copy attribute? or is this a method that allows copying the attribute back into the original excel sheet?


Not sure I'm following your suggestion. Does the Writer have a join and a copy attribute? or is this a method that allows copying the attribute back into the original excel sheet?

According to the specs in the documentation it updates the fields using the row attribute:

https://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_ReadersWriters/xlsx/XLSX_writer.htm

That row attribute is available if you read TableB and use the FeatureJoiner transformer to join to TableA.


Hi @rsuinn, sounds that one of FeautreMerger, FeatureJoiner, or DatabseJoiner is what all you need. See their help documents to find which one exactly fit to your requirement and conditions.


Hi @rsuinn, sounds that one of FeautreMerger, FeatureJoiner, or DatabseJoiner is what all you need. See their help documents to find which one exactly fit to your requirement and conditions.

Arigat? gozaimasu. I am currently trying a few suggested methods.


Similar to @takashi, I find the best general purpose tool for this is FeatureJoiner. It is supports 1 to Many (this is what "0 to Many" is. The generalised form of 1-M doesn't have a predicate that there has to be a match between the two tables) and is high performance as it uses Attribute indices generated within the Transformer.

In order to "update" Table B, in conjunction with FeatureJoiner you just need to change the name of the Value Attribute in Table A to the same name as the "destination" Attribute in Table B. Eg. Say Table A Value Attribute was "FromValue" and Table B Value Attribute was "ToValue". If you rename "FromValue" to "ToValue" in Table A (Say using, AttributeRenamer or AttributeManager), then this will replace the original attribute value in Table B with the value from Table A. That's just generally how FME works, when dealing with 2 Attributes of the same name, in order to merge them together on a feature-by-feature level it will destroy one of them and replace it with the other through "Conflict Resolution". Conflict resolution happens feature-by-feature, one-by-one. If they Join, then conflict resolution occurs. If they don't then no conflict resolution occurs.

The renaming can be done either before or after the FeatureJoiner. If you rename "FromValue" before the FeatureJoiner, then set "Conflict Resolution" to "Right Features" with Table B entering on the "Left (Hand Side" as the Table to Update, and Table A entering on the "Right (Hand Side") as the Table to Update With. Then wherever there is a match, Table Bs value attribute will get overwritten with the value in Table A since they have the same Attribute Name and in order to resolve the "conflict" between the two values, Table A's will get used to replaced the original value in Table B. If there is no match for any given Feature, then there is no conflict to resolve and the original Attibute value will stay as it is.

Alternatively, you can also rename the attribute after Featurejoiner. On the Joined feature output port from FeatureJoiner, if you rename the "FromValue" Attribute to "ToValue", it will delete any existing "ToValue" attribute and replace it with the renamed "FromValue" Attribute. This can be done in say AttributeManager: As long as "ToValue" is set to "Do Nothing" and "FromValue" is set to "Rename" to "ToValue", and this operation is put lower on the list of AttributeManager operations, this will take precedence as the last-to-run operation and replace the original Attribute with the new Attribute, because if you rename an attribute to the name of an already existing attribute, it needs to destroy the original attribute value first to replace it with the renamed attribute.


Similar to @takashi, I find the best general purpose tool for this is FeatureJoiner. It is supports 1 to Many (this is what "0 to Many" is. The generalised form of 1-M doesn't have a predicate that there has to be a match between the two tables) and is high performance as it uses Attribute indices generated within the Transformer.

In order to "update" Table B, in conjunction with FeatureJoiner you just need to change the name of the Value Attribute in Table A to the same name as the "destination" Attribute in Table B. Eg. Say Table A Value Attribute was "FromValue" and Table B Value Attribute was "ToValue". If you rename "FromValue" to "ToValue" in Table A (Say using, AttributeRenamer or AttributeManager), then this will replace the original attribute value in Table B with the value from Table A. That's just generally how FME works, when dealing with 2 Attributes of the same name, in order to merge them together on a feature-by-feature level it will destroy one of them and replace it with the other through "Conflict Resolution". Conflict resolution happens feature-by-feature, one-by-one. If they Join, then conflict resolution occurs. If they don't then no conflict resolution occurs.

The renaming can be done either before or after the FeatureJoiner. If you rename "FromValue" before the FeatureJoiner, then set "Conflict Resolution" to "Right Features" with Table B entering on the "Left (Hand Side" as the Table to Update, and Table A entering on the "Right (Hand Side") as the Table to Update With. Then wherever there is a match, Table Bs value attribute will get overwritten with the value in Table A since they have the same Attribute Name and in order to resolve the "conflict" between the two values, Table A's will get used to replaced the original value in Table B. If there is no match for any given Feature, then there is no conflict to resolve and the original Attibute value will stay as it is.

Alternatively, you can also rename the attribute after Featurejoiner. On the Joined feature output port from FeatureJoiner, if you rename the "FromValue" Attribute to "ToValue", it will delete any existing "ToValue" attribute and replace it with the renamed "FromValue" Attribute. This can be done in say AttributeManager: As long as "ToValue" is set to "Do Nothing" and "FromValue" is set to "Rename" to "ToValue", and this operation is put lower on the list of AttributeManager operations, this will take precedence as the last-to-run operation and replace the original Attribute with the new Attribute, because if you rename an attribute to the name of an already existing attribute, it needs to destroy the original attribute value first to replace it with the renamed attribute.

that was a very helpful explanation! I had tried FeatureJoiner earlier but it didn't produce the dataset I needed so from your explanation I think I know what I need to do. BTW, I experimented with InlineQuerier and was able to both join and populate attributeB using a SQL statement and it created a correct TableB. I'm still going to try FeatureJoiner and AttributeManager.


that was a very helpful explanation! I had tried FeatureJoiner earlier but it didn't produce the dataset I needed so from your explanation I think I know what I need to do. BTW, I experimented with InlineQuerier and was able to both join and populate attributeB using a SQL statement and it created a correct TableB. I'm still going to try FeatureJoiner and AttributeManager.

@rsuinn, InlineQuerier is my 2nd most used Transformer for this, when dealing with non-SQL capable source data. Whilst FeatureJoiner does most of the Merging/Joins I need, particularly when there are only 2 Tables, InlineQuerier does have some advantages:

  • In performance, it is roughly equal to a simple FeatureJoiner situation, however the downside is that the user does have to write the SQL statement manually. It also (in its current form) writes the underlying temporary SQLIte database, used to execute the user's SQL statement, to disk instead of using a RAM option, so its performance is also dictated by the underlying file storage device(s). There is also no easy option to create Indices on the SQLite tables within InlineQuerier, although in most situations the brute power of the SQLite engine can often still process results quickly just through executing JOIN queries in a naive/non-indexed way.
  • It can, because it is SQL, handle NULL values in the JOIN fields. FeatureJoiner requires these to be stripped out from the features first, such as using a Tester to divert these Features around the FeatureJoiner Transformer
  • It outperforms FeatureJoiner when there are Mutilple Lookup tables to join to the Fact Table. Eg. You can use "SELECT * FROM FactTable INNER JOIN LookupTableA ON (...) INNER JOIN LookupTableB ON (...) INNER JOIN LookupTableC ON (...) to do a multi-table join within one operation.
  • When the "Update" value is complex, say in a Many-to-Many relationship where you want only one value returned for a Base Feature rather than multiple matching values, then having access to GROUP BY, WHERE, DISTINCT etc. helps set the constraints for calculating the value to return per Feature (and InlineQuerier typically significantly outperforms StatisticsCalculator)

Reply