Solved

How to use a DatabaseJoiner with list values?

  • 29 September 2022
  • 5 replies
  • 4 views

Badge +7

I'm using a DatabaseJoiner to look up rows in an Excel file. This Excel file has about 6,000 rows and about 20 columns. I have an integer FME attribute that finds a matching field in a row in Excel. About 5 'Fields to Add' are concatenated into a new FME attribute. This works great! 

 

I have 4 more FME attributes that contain 1 to 5 comma separated lookup values. One sample value looks like this: 13045,52136,38010. I need to look up a row in the Excel file (for each of the 3 values in this example), then cancatentate the 'Fields to Add' then cancatentate the three final strings into one big string.

 

Do I have to expand "13045,52136,38010" into three new attributes then use each of them in the DatabaseJoiner and create a string attribute for each of them? 

 

Since I have 4 FME attributes that contain 1 to 5 comma separated lookup values, this means that I'll need to add 20 (4*5) integer attributes for lookup and 20 (4*5) string attributes for output? That is 40 temporary attributes. Is there a better way? Is there some way to use a list attribute? Thanks.

icon

Best answer by markatsafe 29 September 2022, 20:49

View original

5 replies

Badge +20

I don't completely understand your case. Some sample data would help but try ListBasedFeatureMerger first.

Badge +2

@datablue​ I think the answer is probably yes, you need to split the attributes using AttributeSplitter, separate them into separate feature (ListExploder) do the lookup and then merge things back together. Or, you might be able to convert your "13045,52136,38010" into a list attribute (AttributeSplitter again) and then mage use of the ListBasedFeatureMerger. But, as @caracadrian​ suggests, a small sample dataset would let someone create a minimum working example for you.

Badge +7

Thanks for your response. Quick recap:

I have 4 attributes that each contain 0 to 5 lookup values like this: 

13045,52136,38010

 

Yesterday, I made some progress towards this goal.

  1. Used an AttributeSplitter to build a list of integers.
  2. Used a ListSorter because I'd like my integers (lookup values) in order.
  3. Used an AttributeCreator to add 5 new attributes and set them to each list element (_WarnCode_IntList{0}, etc.)

Next, I was going to use a DatabaseJoiner for each lookup value, add an AttributeCreator to create a string (of concatenated 'Fields to Add' ). Finally, I would concatenate the 0 to 5 strings to build one big string, then remove the unneeded attributes.

 

This is very involved. I tried putting most of the work in a separate workspace then calling that workspace 4 times. I was unable to get this working. And it was running very slow. I wanted to pass that workspace a string (e.g. "13045,52136,38010") and return a string (e.g. "My 13045 code info,My 52136 code info, My 38010 code info"). I successfully passed the string into the Workspace and I'm not sure how to return it.

 

Perhaps, I should get working in one workspace before attempting to optimize it.

Badge +7

@datablue​ I think the answer is probably yes, you need to split the attributes using AttributeSplitter, separate them into separate feature (ListExploder) do the lookup and then merge things back together. Or, you might be able to convert your "13045,52136,38010" into a list attribute (AttributeSplitter again) and then mage use of the ListBasedFeatureMerger. But, as @caracadrian​ suggests, a small sample dataset would let someone create a minimum working example for you.

@Mark Stoakes​  I FTPed a PDF to Safe with more details. (I don't think that I can publicly post my data.)

Thanks.

Badge +7

Thanks for your response. Quick recap:

I have 4 attributes that each contain 0 to 5 lookup values like this: 

13045,52136,38010

 

Yesterday, I made some progress towards this goal.

  1. Used an AttributeSplitter to build a list of integers.
  2. Used a ListSorter because I'd like my integers (lookup values) in order.
  3. Used an AttributeCreator to add 5 new attributes and set them to each list element (_WarnCode_IntList{0}, etc.)

Next, I was going to use a DatabaseJoiner for each lookup value, add an AttributeCreator to create a string (of concatenated 'Fields to Add' ). Finally, I would concatenate the 0 to 5 strings to build one big string, then remove the unneeded attributes.

 

This is very involved. I tried putting most of the work in a separate workspace then calling that workspace 4 times. I was unable to get this working. And it was running very slow. I wanted to pass that workspace a string (e.g. "13045,52136,38010") and return a string (e.g. "My 13045 code info,My 52136 code info, My 38010 code info"). I successfully passed the string into the Workspace and I'm not sure how to return it.

 

Perhaps, I should get working in one workspace before attempting to optimize it.

I finally got it working. Thanks to @Mark Stoakes​ and @caracadrian​ for responding.

 

Here's a screenshot of my partially implemented solution.

Lookup values to stringsMy goal is to use a comma separated list in an attribute (e.g. "13045,52136,38010") to create a new attribute  with descriptive strings for each lookup value (e.g. "String for code 13045,String for code 52136,String for code 38010"). When I use DatabaseJoiner, I use four strings from the "Fields to Add" field to build the descriptive string.

 

One issue I ran into was that I would get duplicate descriptive strings. I fixed this by adding an AttributeRemover to remove the four "Fields to Add" attributes. Apparently, without removing them, their values were being cached and re-used.

 

My partial implementation is only looking at the first 2 lookup values. I need to look at 5 lookup values (3 more). At the end, I will need to concatenate the 0 to 5 descriptive strings into a new attribute. In that attribute, I need each descriptive string to be separated by a comma and not have a comma at the end.

 

Finally, I need to repeat my entire solution three more times because I have a total of 4 attributes that each contain 0 to 5 lookup values ("13045,52136,38010"). I had hoped to encapsulate this solution into a separate workspace (for each  attribute) then call it 4 times, but I don't think that's the correct use of a WorkspaceRunner.

Reply