
I am able to upload PDFs to SharePoint using FME, but the content I get from another system's API comes in as a BLOB. I then have to encode this binary to Base64 and decode it again to Unicode or Binary (System Default), like this:

The output of the BinaryEncoder transformer is testably correct.

But the TextDecoder is producing a trimmed output.

When I upload this file to SharePoint, it is incomplete and lacks the actual PDF content, although it can be opened and is not reported as corrupt.

 

Is there a known limitation here, a setting that can be changed, or alternative workflows that would work here?

 

Thank you

How big is the PDF (file size)? Also, it seems weird that you have to take the detour through base64, the BLOB contents should already have the binary file contents, unencoded.



One of the test PDFs is ~21.1 MB, and most will be around this size (10 MB to 30 MB). I have tried the BLOB directly with many different content types, but it created a corrupt PDF each time. This experience led me to this other community thread: How to extract PDF attachment from ArcGIS Online and upload to online location with HTTPCaller? (safe.com), and that got me to where I am now, with it almost working. Both the original data and the Base64-encoded version produce a valid and complete PDF locally using a FileWriter, but not through the SharePoint API being used here.



There are several things that could be the problem. What I'm saying about the BLOB is that if you save it directly (as a binary file) with the file extension .pdf, it should open just fine in e.g. Acrobat. In other words, the BLOB should contain exactly the same bytes as the file on the file system.
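As a rough way to test that claim outside FME, here is a minimal Python sketch that checks whether a byte string looks like a raw PDF. It relies on the standard PDF header (`%PDF-`) and end-of-file marker (`%%EOF`); the function name `looks_like_pdf` and the 1024-byte tail window are my own choices for illustration, not anything FME or SharePoint provides.

```python
import base64


def looks_like_pdf(data: bytes) -> bool:
    """Heuristic check: a raw PDF starts with '%PDF-' and has '%%EOF' near the end."""
    has_header = data.startswith(b"%PDF-")
    # The end-of-file marker normally sits in the last few bytes of the file;
    # the 1024-byte window is an assumption made for this sketch.
    has_trailer = b"%%EOF" in data[-1024:]
    return has_header and has_trailer


# A tiny stand-in for real file contents (not a valid, renderable PDF).
sample = b"%PDF-1.7\n...object data...\n%%EOF\n"
encoded = base64.b64encode(sample)

raw_ok = looks_like_pdf(sample)       # raw bytes pass the check
encoded_ok = looks_like_pdf(encoded)  # Base64 text does not
```

If the BLOB fails this check but its first characters are `JVBERi0`, it is probably still Base64 text (that is how `%PDF-` encodes) rather than the raw file contents.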

If you're holding a 30 MB file in an attribute and then Base64-encode it, you're suddenly holding a 40 MB attribute (Base64 adds ~33%). I'm not sure if there are any limits somewhere in your workflow that could silently truncate the attribute contents. Even if that is not the case, my experience is that having so much data in each feature can seriously slow down the workspace.
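The size arithmetic can be checked with a few lines of Python. Base64 emits 4 output characters for every 3 input bytes, padding the final group. The helper name `base64_length` is just for this sketch.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Length of the padded Base64 encoding of n_bytes of input."""
    # Every group of up to 3 input bytes becomes exactly 4 output characters.
    return 4 * ((n_bytes + 2) // 3)


MIB = 1024 * 1024
size_30mb_encoded = base64_length(30 * MIB)

# Cross-check the formula against the standard library on a small input.
small = b"x" * 1000
actual_small = len(base64.b64encode(small))
```

For a 30 MiB input the formula gives exactly 40 MiB, which matches the ~33% overhead mentioned above.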

Then there is also the SharePoint API to contend with.

My recommendation is to try and isolate more precisely where the problem occurs, and ideally also look if it's possible to streamline the process somewhat, e.g. hold smaller amounts of data in attributes.
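One way to isolate the failing step is to record the byte length and a hash of the content after each stage and find the first stage where they stop matching the original. The sketch below is plain Python with simulated stages; the stage names and the truncation are made up for illustration, and in a real workspace you would feed in the actual attribute values (for example, written out to files at each step).

```python
import base64
import hashlib
from typing import List, Optional, Tuple


def fingerprint(data: bytes) -> Tuple[int, str]:
    """Return (length, SHA-256 hex digest) so two stages can be compared cheaply."""
    return len(data), hashlib.sha256(data).hexdigest()


def first_divergent_stage(
    original: bytes, stages: List[Tuple[str, bytes]]
) -> Optional[str]:
    """Name of the first stage whose bytes differ from the original, or None."""
    expected = fingerprint(original)
    for name, data in stages:
        if fingerprint(data) != expected:
            return name
    return None


# Simulated data: stand-in bytes, a clean Base64 round trip,
# and an artificially truncated copy (hypothetical failure).
original = b"%PDF-1.7\n" + bytes(range(256)) * 4 + b"\n%%EOF\n"
round_tripped = base64.b64decode(base64.b64encode(original))
truncated = original[:100]

culprit = first_divergent_stage(
    original,
    [("after_base64_round_trip", round_tripped), ("after_text_decode", truncated)],
)
```

Comparing lengths alone is often enough to spot truncation; the hash additionally catches changes that leave the length intact.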

