Skip to main content
Question

Using a Python startup script to delete a row in an Excel spreadsheet


Hi,

 

 

I need to delete the first row of an Excel spreadsheet before reading and importing it in FME (column headers are in the second row).

 

 

I found this script to use at startup (from http://stackoverflow.com/questions/12311345/manipulating-excel-2007-files-using-python):

 

 

#First we need to access the module that lets us connect to Excel

 

import win32com.client

 

 

 

# Next we want to create a variable that represents Excel

 

app = win32com.client.Dispatch("Excel.Application")  

 

 

 

# Lastly we will assume that the workbook is active and get the first sheet

 

wbk = app.ActiveWorkbook

 

sheet = wbk.Sheets(1)

 

 

 

# delete the first row in your active sheet

 

sheet.Rows(1).Delete()

 

 

I tried it, but I get an error message saying I don’t have the win32com module on my computer. (I do not want people to have to download anything to use my workspace.)

 

 

 

Reading further, I found that maybe I should use someting like that instead :

 

 

 

import os

 

app = FME_MacroValues['SourceDataset_XLS_ADO']

 

wbk = app.ActiveWorkbook

 

sheet = wbk.Sheets(1)

 

sheet.Rows(1).Delete()

 

 

 

Something is still missing as I now get this message :

 

Python Exception <AttributeError>: 'str' object has no attribute 'ActiveWorkbook'

 

 

 

If you can provide any help ...

 

I am new both to FME and Python.

9 replies

david_r
Evangelist
  • April 23, 2013
Hi Cath,

 

 

if you just want to ignore the first n number of lines in your translation, it would be a lot easier to insert a Counter after the reader, followed by a tester with something like "_count > 2" as the test clause. You then only connect the PASSED port to the rest of your workflow, effectively dropping all other features.

 

 

David

takashi
Supporter
  • April 23, 2013
Hi Cath,

 

 

'FME_MacroValues['SourceDataset_XLS_ADO']' returns a character string of the Excel file path, you can't control Excel object unless using the win32com module.   I think using a combination of Counter and Tester would be a general solution in such a case, I also often use it. I would provide another option, that is to set 'Start Feature' parameter of the reader to N (0-based row number to start reading) on the Navigator window of your Workbench.

 

 

 

Takashi

 

 

 

  • Author
  • April 23, 2013
Thanks  David and  Takashi.

 

 

I should add some extra information:

 

I need to use the second row of my Excel spreadsheet as column headers. Thus, when I use a counter and tester, or StartFeature parameter, my file has already been imported with either first row as headers or no headers specified (column automatically designated as F1, F2, F3, ...). I have also tried using the bulkrenamer, but unsuccessfully.

 

 

Thanks in advance for any comment !

 

Catherine

david_r
Evangelist
  • April 23, 2013
Hi Catherine,

 

 

here is a way to accomplish this (tested with FME2013sp1):

 

  • Make a copy of your Excel input file without any header row(s)
  • Open FME and create your reader input type. This will create an input feature type with the correct field name definitions.
  • Set the Start Feature value (=2 if you have one header line), as per Takashi's suggestion
  • Point your reader to the Excel file with the header
This should hopefully give the desired result.

 

 

David

takashi
Supporter
  • April 23, 2013
Hi Catherine,

 

 

I'm not sure whether there is a way to read the second row of worksheet as attribute names. If number of columns is not so many, AttributeRenamer would be a simple solution. If not, SchemaMapper and AttributeExposer would be a more flexible way. Have a look at those transformers.

 

 

Takashi

  • Author
  • April 23, 2013
Hi,

 

 

I have tried both of your solutions.

 

 

David, I tried your suggestion, but although I do get the header columns I want, the data is not imported (looking at the table in the Data Inspector. Am I doing something wrong?

 

 

Takashi, Unfortunately I have a very large number of columns (Itried the bulkRenamer) and I cannot be assured that the column names will always be exactly the same.

 

 

Thanks again,

 

Catherine

david_r
Evangelist
  • April 23, 2013
Hi Catherine,

 

 

I tested my last suggestion on my pc using FME2013sp1 and the data was read correctly. Try sending the result to a logger rather than to the Data Inspector to see if that makes a difference.

 

 

David

  • Author
  • April 24, 2013
Hi David,

 

 

I use the same FME version on pc, and still, I do not get any data. I have tried the logger and 2 output files. The logger shows that values are attributed, but still, the result is empty.

 

 

Any other suggestion?

Forum|alt.badge.img+2
  • August 14, 2014
I am having a similar issue. I have two styles of Excel files; Ones that start with a header row and then data and ones that have 7 lines of general information, row 8 are my headers and row 9 starts my data. 

 

 

This is coming in as an email attachment, and all the headers are ordered differently and have different names. I am trying to normalize each of them and put the data into a SQL Non-spatial DB. 

 

 

I can't use the Start Feature parm because that is too general. I have tried the Py script above and as well, I am getting the same error message (not sure what to put for a universal workbook name since they are all named differently). When I read in the data FME reads that the headers start on row 8, but I they are attribute values instead of headers. 

 

 I get this warning message XLSX Reader: Feature type 'WorkOrderDetail' of dataset 'J:\\READ\\RTASQ\\EDW 7.26.14 (1).xlsx' has attribute 'Record Changed' that does not match a header column in the file. This attribute will not be produced. 

 

 

SO.....how can I make row 8 my header column?

Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings