Skip to main content

I have an excel spreadsheet with years of maintenance data (2006-2015). I need to join that to our current pavement database, but in order. So if something was done to a road in 2007, and again in 2014, I need it to join the 2014 data (over write the 2007 data). Not sure if this matters, but not everything will match from the spreadsheet to the existing database (many fewer records in the excel sheet as we do not do something to every road every year). I've tried joiner, but no luck. In the parameters you can select all of the spreadsheets to read, but in the join section, you can only pick one year? Doesn't make sense to me.

Could you use a duplicate filter to remove the superceeded records (sort by the year first) prior to joining to the database?


@jdh - not sure how the duplicate remover works on multiple worksheets within an excel spreadsheet? Can that be done? Also how does it know to keep the latest record? Does it automatically delete the first record found of the two? And does it read the worksheets one by one in the order they are found in excel? Also, that doesn't fix the problem that I can only get the joiner to join one of the worksheets, not all of them.


@jdh - not sure how the duplicate remover works on multiple worksheets within an excel spreadsheet? Can that be done? Also how does it know to keep the latest record? Does it automatically delete the first record found of the two? And does it read the worksheets one by one in the order they are found in excel? Also, that doesn't fix the problem that I can only get the joiner to join one of the worksheets, not all of them.

You can use a merge filter on your Excel reader to have all the sheets come through one single merged feature type, or you can combine all the streams in a sorter. The DuplicateFilter will route the first feature through the Unique output port, and all other features through the Duplicate output port (based on a key attribute - presumably your road id).

 

 

You would use a featureMerger rather than a joiner to merge the data to your pavement database if you're reading in your pavement db features as part of your workflow, otherwise you can use the SQLExecuter to update your db directly

Wow, that is a totally different direction than I wanted to go. I only need 4 attributes from this spreadsheet, so thought the joiner would be a better option (not doing anything else with it). I am going to have to talk to the owner of the spreadsheet to see if he'll add a year field to each worksheet and populate that. I'll try the merger, then sort by the year then remove duplicates, then use the feature merger. My output/writer is technically not a database but an obscure GIS viewing software (GTViewer) where we just want to add these attributes to that system, not our full pavement database.

Thanks for your help and reply. I'll see how this works.


Wow, that is a totally different direction than I wanted to go. I only need 4 attributes from this spreadsheet, so thought the joiner would be a better option (not doing anything else with it). I am going to have to talk to the owner of the spreadsheet to see if he'll add a year field to each worksheet and populate that. I'll try the merger, then sort by the year then remove duplicates, then use the feature merger. My output/writer is technically not a database but an obscure GIS viewing software (GTViewer) where we just want to add these attributes to that system, not our full pavement database.

Thanks for your help and reply. I'll see how this works.

The joiner reads an external file (in this case the xls), but can only read one feature type at a time. If you read the xls file in as an independent feature type you have much more control as to what/how you join with the featureMerger. You can drop all the extraneous attributes from the excel file, and set suppliers first to reduce the memory overhead (make sure the xls file is the first reader)


The joiner reads an external file (in this case the xls), but can only read one feature type at a time. If you read the xls file in as an independent feature type you have much more control as to what/how you join with the featureMerger. You can drop all the extraneous attributes from the excel file, and set suppliers first to reduce the memory overhead (make sure the xls file is the first reader)

As an alternative, read the excel file, keep only the relevant attribute, and then on the feature merger set process duplicate suppliers to yes, and generate a list.

 

 

That way you get

 

_list{0}.fme_feature_type = 2006

 

_list{0}.attrA = x

 

_list{1}.fme_feature_type = 2014

 

_list{1}.attrA = y

 

 

you can sort the list and promote the attributes you want, or do other analyses.

Reply