Skip to main content
Question

Joining multiple worksheets in Excel to another database in order


Forum|alt.badge.img

I have an excel spreadsheet with years of maintenance data (2006-2015). I need to join that to our current pavement database, but in order. So if something was done to a road in 2007, and again in 2014, I need it to join the 2014 data (over write the 2007 data). Not sure if this matters, but not everything will match from the spreadsheet to the existing database (many fewer records in the excel sheet as we do not do something to every road every year). I've tried joiner, but no luck. In the parameters you can select all of the spreadsheets to read, but in the join section, you can only pick one year? Doesn't make sense to me.

6 replies

jdh
Contributor
Forum|alt.badge.img+28
  • Contributor
  • March 16, 2016

Could you use a duplicate filter to remove the superceeded records (sort by the year first) prior to joining to the database?


Forum|alt.badge.img
  • Author
  • March 16, 2016

@jdh - not sure how the duplicate remover works on multiple worksheets within an excel spreadsheet? Can that be done? Also how does it know to keep the latest record? Does it automatically delete the first record found of the two? And does it read the worksheets one by one in the order they are found in excel? Also, that doesn't fix the problem that I can only get the joiner to join one of the worksheets, not all of them.


jdh
Contributor
Forum|alt.badge.img+28
  • Contributor
  • March 16, 2016
candimk wrote:

@jdh - not sure how the duplicate remover works on multiple worksheets within an excel spreadsheet? Can that be done? Also how does it know to keep the latest record? Does it automatically delete the first record found of the two? And does it read the worksheets one by one in the order they are found in excel? Also, that doesn't fix the problem that I can only get the joiner to join one of the worksheets, not all of them.

You can use a merge filter on your Excel reader to have all the sheets come through one single merged feature type, or you can combine all the streams in a sorter. The DuplicateFilter will route the first feature through the Unique output port, and all other features through the Duplicate output port (based on a key attribute - presumably your road id).

 

 

You would use a featureMerger rather than a joiner to merge the data to your pavement database if you're reading in your pavement db features as part of your workflow, otherwise you can use the SQLExecuter to update your db directly

Forum|alt.badge.img
  • Author
  • March 16, 2016

Wow, that is a totally different direction than I wanted to go. I only need 4 attributes from this spreadsheet, so thought the joiner would be a better option (not doing anything else with it). I am going to have to talk to the owner of the spreadsheet to see if he'll add a year field to each worksheet and populate that. I'll try the merger, then sort by the year then remove duplicates, then use the feature merger. My output/writer is technically not a database but an obscure GIS viewing software (GTViewer) where we just want to add these attributes to that system, not our full pavement database.

Thanks for your help and reply. I'll see how this works.


jdh
Contributor
Forum|alt.badge.img+28
  • Contributor
  • March 16, 2016
candimk wrote:

Wow, that is a totally different direction than I wanted to go. I only need 4 attributes from this spreadsheet, so thought the joiner would be a better option (not doing anything else with it). I am going to have to talk to the owner of the spreadsheet to see if he'll add a year field to each worksheet and populate that. I'll try the merger, then sort by the year then remove duplicates, then use the feature merger. My output/writer is technically not a database but an obscure GIS viewing software (GTViewer) where we just want to add these attributes to that system, not our full pavement database.

Thanks for your help and reply. I'll see how this works.

The joiner reads an external file (in this case the xls), but can only read one feature type at a time. If you read the xls file in as an independent feature type you have much more control as to what/how you join with the featureMerger. You can drop all the extraneous attributes from the excel file, and set suppliers first to reduce the memory overhead (make sure the xls file is the first reader)


jdh
Contributor
Forum|alt.badge.img+28
  • Contributor
  • March 16, 2016
jdh wrote:

The joiner reads an external file (in this case the xls), but can only read one feature type at a time. If you read the xls file in as an independent feature type you have much more control as to what/how you join with the featureMerger. You can drop all the extraneous attributes from the excel file, and set suppliers first to reduce the memory overhead (make sure the xls file is the first reader)

As an alternative, read the excel file, keep only the relevant attribute, and then on the feature merger set process duplicate suppliers to yes, and generate a list.

 

 

That way you get

 

_list{0}.fme_feature_type = 2006

 

_list{0}.attrA = x

 

_list{1}.fme_feature_type = 2014

 

_list{1}.attrA = y

 

 

you can sort the list and promote the attributes you want, or do other analyses.

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings