Skip to main content
Solved

How to decrypt Excel Files from HTTP Caller?

  • January 14, 2025
  • 5 replies
  • 71 views

hodsen96
Contributor
Forum|alt.badge.img+4

Hi all,

Currently I have a process that gets an excel file from an open data site, saves it into a Network File share location, and then i read it back into my workbench to continue my ETL process.
(Download I am using from open.canada.ca: 
https://data.ontario.ca/dataset/7a049187-cf29-4ffe-9028-235b95c61fa3/resource/6545c5ec-a5ce-411c-8ad5-d66363da8891/download/private_school_contact_information_eng.xlsx)

I would like to stream line the process by avoiding the need to download the file and then reading it into my workbench. I’d rather save it to an attribute and go from there. When I use a BinaryDecoder with default settings, it’s able to read the attributes (see attached image) from the attached HTTPCaller, however, when I run it, It just fails with this huge hexadecimal error log. 

FYI, I am running FME Workbench 2022.2, and I shared a portion of my workbench in the attached zip.

 

Anyone have a similar process that can provide some guidance? Am I limited to only downloading? Any and all help is appreciated :)

Kind Regards,

Senad
 

Best answer by virtualcitymatt

hodsen96 wrote:
ebygomm wrote:

Have you tried just using the url as the source in an excel reader?

Yeah I have. I get a “File Does Not Exist” error and then my translation fails. I am not sure why exactly since I am using the Direct Download Link. I am not sure if I require the headers (Which I’ve set in the HTTPCaller).

One thing I noticed is this:

Excel Reader: Failed to open the dataset 'HTTP:\data.ontario.ca\dataset\7a049187-cf29-4ffe-9028-235b95c61fa3\resource\6545c5ec-a5ce-411c-8ad5-d66363da8891\download\private_school_contact_information_eng.xlsx'.  Error message was 'File does not exist' 

It’s accessing a different link than the one i’ve inserted and not sure if that’s the reason why its not working? Thoughts?

Ha, I see you’ve figured out you need to change the user agent - this is why the request from the reader isn’t working. I have to say, it would be nice to have access to all of the HTTPCaller functionality inside normal readers, then it wouldn’t be an issue.

I recommend that you use just use a TempPathNameCreator before the HttpCaller followed by a FeatureReader.  

 


The downloaded file will automatically get cleaned up when the workspace is finished. 
 

View original
Did this help you find an answer to your question?

5 replies

ebygomm
Influencer
Forum|alt.badge.img+32
  • Influencer
  • January 14, 2025

Have you tried just using the url as the source in an excel reader?


hodsen96
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • January 14, 2025
ebygomm wrote:

Have you tried just using the url as the source in an excel reader?

Yeah I have. I get a “File Does Not Exist” error and then my translation fails. I am not sure why exactly since I am using the Direct Download Link. I am not sure if I require the headers (Which I’ve set in the HTTPCaller).

One thing I noticed is this:

Excel Reader: Failed to open the dataset 'HTTP:\data.ontario.ca\dataset\7a049187-cf29-4ffe-9028-235b95c61fa3\resource\6545c5ec-a5ce-411c-8ad5-d66363da8891\download\private_school_contact_information_eng.xlsx'.  Error message was 'File does not exist' 

It’s accessing a different link than the one i’ve inserted and not sure if that’s the reason why its not working? Thoughts?


virtualcitymatt
Celebrity
Forum|alt.badge.img+35
  • Celebrity
  • Best Answer
  • January 15, 2025
hodsen96 wrote:
ebygomm wrote:

Have you tried just using the url as the source in an excel reader?

Yeah I have. I get a “File Does Not Exist” error and then my translation fails. I am not sure why exactly since I am using the Direct Download Link. I am not sure if I require the headers (Which I’ve set in the HTTPCaller).

One thing I noticed is this:

Excel Reader: Failed to open the dataset 'HTTP:\data.ontario.ca\dataset\7a049187-cf29-4ffe-9028-235b95c61fa3\resource\6545c5ec-a5ce-411c-8ad5-d66363da8891\download\private_school_contact_information_eng.xlsx'.  Error message was 'File does not exist' 

It’s accessing a different link than the one i’ve inserted and not sure if that’s the reason why its not working? Thoughts?

Ha, I see you’ve figured out you need to change the user agent - this is why the request from the reader isn’t working. I have to say, it would be nice to have access to all of the HTTPCaller functionality inside normal readers, then it wouldn’t be an issue.

I recommend that you use just use a TempPathNameCreator before the HttpCaller followed by a FeatureReader.  

 


The downloaded file will automatically get cleaned up when the workspace is finished. 
 


virtualcitymatt
Celebrity
Forum|alt.badge.img+35

Note: I just created an idea to have the UI expanded when reading a file from the web:
 

 


hodsen96
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • January 15, 2025
virtualcitymatt wrote:
hodsen96 wrote:
ebygomm wrote:

Have you tried just using the url as the source in an excel reader?

Yeah I have. I get a “File Does Not Exist” error and then my translation fails. I am not sure why exactly since I am using the Direct Download Link. I am not sure if I require the headers (Which I’ve set in the HTTPCaller).

One thing I noticed is this:

Excel Reader: Failed to open the dataset 'HTTP:\data.ontario.ca\dataset\7a049187-cf29-4ffe-9028-235b95c61fa3\resource\6545c5ec-a5ce-411c-8ad5-d66363da8891\download\private_school_contact_information_eng.xlsx'.  Error message was 'File does not exist' 

It’s accessing a different link than the one i’ve inserted and not sure if that’s the reason why its not working? Thoughts?

Ha, I see you’ve figured out you need to change the user agent - this is why the request from the reader isn’t working. I have to say, it would be nice to have access to all of the HTTPCaller functionality inside normal readers, then it wouldn’t be an issue.

I recommend that you use just use a TempPathNameCreator before the HttpCaller followed by a FeatureReader.  

 


The downloaded file will automatically get cleaned up when the workspace is finished. 
 

Hey Thank you for replying as well as creating an idea post about the HTTPCaller. I am just unsure of how your settings are set up. I think I get your process (I never used TempPathnameCreator), that you would use the _pathname default value as the “Output File Name” in the HTTPCaller. Therefore it saves it in that temp folder. After that, connect it to the Feature Reader to read that file. But I am just not sure how to configure the Feature Reader? Do I set the Format as Microsoft Excel with the Dataset parameter as “_response_file_path”? Do I copy and paste the File Path that gets output in the feature cache from the TempPathnameCreator? Would I need to do that once for it to do it automatically everytime?

 


Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings