Question

Reading data from a URL

  • 5 February 2016

Badge +4

Hi,

I'm downloading data from a URL, an Excel spreadsheet in this case, and I want to download and then read the file in the same workbench. I discover the URL in the first part of my workbench, so I don't know it at the start.

I can get the file into an attribute using the HTTPCaller. However, to open it I think I would then have to use the AttributeFileWriter to write the file to disk, and then use the FeatureReader to read that temp file.

This works, but I was wondering whether there is any way to skip the temp file step?

I have tried using the download URL in the FeatureReader transformer, but I think the download is guessing at the file encoding and getting it wrong, selecting fme-binary instead of unicode. I can't see a way of telling the FeatureReader what encoding to use when obtaining the URL response.

Or is there another way to read my file from the attribute directly rather than writing it out?

Any help appreciated.


6 replies

Userlevel 2
Badge +17

Did you try the Excel reader?

Badge +4

Hi @takashi,

I can give it a go, but I don't know the URL until midway through my pipeline, and I'm trying not to branch to another workbench.

I'm attempting to use the FeatureReader, in Excel mode, instead.

I'll give the Excel reader a quick go to see if it handles the file download better, but I was assuming the library behind the reader and the FeatureReader would be very similar, if not the same?

Userlevel 2
Badge +17

The internal reader module is surely the same. If you got an error from the Excel reader, the error message might help to identify the issue.

Userlevel 2
Badge +17

Both the Excel reader and the FeatureReader (Excel format) download the file from the specified URL and save it into the TEMP folder, and then the reader module reads the downloaded file. The file is downloaded as-is, so you don't need to worry about encoding.

If you use the HTTPCaller, the following settings are possible:

  • Request URL: <URL of the xlsx file>
  • HTTP Method: GET
  • Save Response Body: File
  • Output File Name: <specify an xlsx file path to which the downloaded file will be saved>
  • File Path Attribute: _response_file_path (you can modify this)

You can then use the FeatureReader to read the downloaded file by setting its Dataset parameter to the "_response_file_path" attribute.
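For comparison, the same download-then-read pattern can be sketched in plain Python using only the standard library. This is not how FME implements it; the function name download_to_temp_file, the optional headers argument (for example a session cookie) and the .xlsx suffix are illustrative assumptions.

```python
import os
import shutil
import tempfile
import urllib.request


def download_to_temp_file(url, suffix=".xlsx", headers=None):
    """Download url byte-for-byte into a new temp file and return its path.

    headers is an optional dict of HTTP headers (hypothetical example:
    {"Cookie": "session=..."}) sent with the request.
    """
    request = urllib.request.Request(url, headers=headers or {})
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Binary mode: the payload is copied as-is, so no text encoding is involved.
        with os.fdopen(fd, "wb") as out, urllib.request.urlopen(request) as response:
            shutil.copyfileobj(response, out)
    except Exception:
        os.remove(path)  # do not leave a partial or empty temp file behind
        raise
    return path
```

The returned path plays the role of "_response_file_path" above: it can be handed to whatever reads the workbook, and the caller is responsible for deleting the file afterwards.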

Badge +4

I've just retried the FeatureReader with the URL directly, and I get the error "can't open file for reading" from the XLSX_READER.

I had a look in the temp directory and I see an empty (0 KB) .TMP file, so perhaps it's more an issue with the reader picking up the session cookie and authentication, rather than encoding.
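A quick check along these lines might help tell an empty or non-workbook response (such as a login page returned when authentication fails) apart from a real .xlsx file. It is only a sketch: it relies on .xlsx files being ZIP containers and assumes the conventional xl/workbook.xml part is present; the function name looks_like_xlsx is hypothetical.

```python
import io
import zipfile


def looks_like_xlsx(payload: bytes) -> bool:
    """Return True if payload is a non-empty ZIP archive containing xl/workbook.xml."""
    if not payload:
        return False  # e.g. a 0 KB download
    buffer = io.BytesIO(payload)
    if not zipfile.is_zipfile(buffer):
        return False  # e.g. an HTML page returned instead of the workbook
    with zipfile.ZipFile(buffer) as archive:
        return "xl/workbook.xml" in archive.namelist()
```

Because the check runs on bytes held in memory, it can be applied to a response body before anything is written to disk.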

As I have a viable solution using the HTTPCaller and the AttributeFileWriter (albeit with some file management I wasn't planning on), I think I'll go with that for now.

Maybe it's a future enhancement request to allow the readers, when using a web endpoint to retrieve a file, to set encoding, cookies, redirects and so on, as you can in the HTTPCaller?

Thanks for your help @takashi

Userlevel 4
Badge +13

Very good point re: authentication when using an http:// URL as the dataset, both in the readers and in the FeatureReader. I'll let the team know. Some complex user interface awaits, but I believe it would be necessary.

BTW in 2016.1 we'll have a TempPathname maker transformer that would at least clean up automatically for you in this scenario.
