Skip to main content
Question

Read Excel File from SharePoint Online as Attribute, Rather than Downloading

  • 30 June 2023
  • 4 replies
  • 307 views

I am trying to create a workspace to read the contents of an Excel file from SharePoint directly into memory to then operate on, rather than download the file to a temporary location first.

 

I understand that I should be able to read files to memory by using the SharePointOnlineConnector with 'Action: Download' and 'Download as: Attribute' settings, and have had a degree of success with a simple text file that I'm using for test purposes.

The '_contents' attribute returns the full content of my text file, as below:

SharePoint Text FileHowever, I'm not sure how to deal with this attribute to make it usable after this point…

I've tried passing the '_contents' attribute to a CSV FeatureReader as the 'Dataset', but this didn't work, as the reader tries to open a file at the location specified within that attribute. I guess I've already read the CSV file with the SharePointOnlineConnector, so now need to work with the attribute directly...

Reading an Excel file from SharePoint seems to work the same way, although the '_contents' attribute is more obtuse. I can only assume that it represents the contents of the Excel file in some way:

SharePoint Excel File 

Clearly the Connector is working, but how do I make the '_contents' attribute useful?

4 replies

Badge +41

I don't think you can read the _contents attribute directly. As far as I know, you always need to write the file and then read it. But what you can do is make use of the FME temp files so you wont have to clean up the downloaded files after the process is done. The transformer you need for this is the TempPathnameCreator.

 

Different approaches I know of:

  • Directly. This is what I use most of the time.
    • Create temp path using the TempPathnameCreator.
    • Download the file using the SharePointOnlineConnector to the temp path.
    • Read the file using the FeatureReader.
  • Using the AttributeFileWriter. It does work, but I do not know why would be a better way.
    • Download the file using the SharePointOnlineConnector to an attribute.
    • Create temp path using the TempPathnameCreator.
    • Write the file from the attribute using the AttributeFileWriter.
    • Read the file using the FeatureReader.
  • Workaround. Some transformers (I'm talking about you, AzureBlobStorageConnector!) can't write to a dynamic defined folder. So the TempPathnameCreator is a no go.
    • Download the file using the SharePointOnlineConnector to a folder.
    • Create temp path using the TempPathnameCreator.
    • Use the FeatureWriter, FileCopy writer to move the file to the temp path.
    • Read the file using the FeatureReader.
  • Manual.
    • Download the file using the SharePointOnlineConnector to a folder.
    • Read the file using the FeatureReader.
    • Use a PythonCaller to delete the file.
Badge +3

I don't think you can read the _contents attribute directly. As far as I know, you always need to write the file and then read it. But what you can do is make use of the FME temp files so you wont have to clean up the downloaded files after the process is done. The transformer you need for this is the TempPathnameCreator.

 

Different approaches I know of:

  • Directly. This is what I use most of the time.
    • Create temp path using the TempPathnameCreator.
    • Download the file using the SharePointOnlineConnector to the temp path.
    • Read the file using the FeatureReader.
  • Using the AttributeFileWriter. It does work, but I do not know why would be a better way.
    • Download the file using the SharePointOnlineConnector to an attribute.
    • Create temp path using the TempPathnameCreator.
    • Write the file from the attribute using the AttributeFileWriter.
    • Read the file using the FeatureReader.
  • Workaround. Some transformers (I'm talking about you, AzureBlobStorageConnector!) can't write to a dynamic defined folder. So the TempPathnameCreator is a no go.
    • Download the file using the SharePointOnlineConnector to a folder.
    • Create temp path using the TempPathnameCreator.
    • Use the FeatureWriter, FileCopy writer to move the file to the temp path.
    • Read the file using the FeatureReader.
  • Manual.
    • Download the file using the SharePointOnlineConnector to a folder.
    • Read the file using the FeatureReader.
    • Use a PythonCaller to delete the file.

Thanks so much for the suggestions, Niels.

 

I'll have a go downloading the file and using the TempPathnameCreator as a workaround, but it really does make me wonder what the point of downloading the file as an attribute is... it currently seems like half a solution, that doesn't result in anything different than just downloading the file in the first place.

 

I'll leave this question open for a little longer to see if anyone else has ideas on how to utilise the download to attribute functionality.

Badge

Thanks so much for the suggestions, Niels.

 

I'll have a go downloading the file and using the TempPathnameCreator as a workaround, but it really does make me wonder what the point of downloading the file as an attribute is... it currently seems like half a solution, that doesn't result in anything different than just downloading the file in the first place.

 

I'll leave this question open for a little longer to see if anyone else has ideas on how to utilise the download to attribute functionality.

How are you able to download as files/folders, I can't seem to figure out what I need to add to my workspace but my the SharePoint Online Connector is working fine for me (says translation is successful).

Badge +3

I am also interested in the download as attribute capability.

Reason: I have downloaded an xlsx file to disk: this changes the format of embedded URL cell values and assumes they are relative (to the workbench running folder, or the temppathfolder ) path values. This is not useful, as the original sharepoint edition has sharepoint URL values to other resources embedded in the data.

I assumed that bringing the file across as an attribute would alleviate this issue, by leaving the raw URL values untouched, but the question is how to utilize the attribute contents to pass to a reader without touching down on a disk first…..? Anyone @safe got some direction for this enquiry?

Reply