Question

reading data from a url

9 years ago
February 5, 2016
6 replies
189 views

nrich_defra
Contributor
61 replies

Hi,

I'm downloading data from a URL (an Excel spreadsheet in this case), and I want to download and then read the file in the same workbench. I discover the URL in the first part of the workbench, so I don't know it at the start.

I can get the file into an attribute using the HTTPCaller. However, to open it I think I would then have to use the AttributeFileWriter to write a physical copy of the file, and then use the FeatureReader to read that temp file.

This works, but I was wondering whether there is any way to skip the temp file phase?

I have tried using the download URL in the FeatureReader transformer, but I think the download is guessing the file encoding and getting it wrong, selecting fme-binary instead of Unicode. I can't see a way of telling the FeatureReader which encoding to use when obtaining the URL response.

Or is there another way to read my file from the attribute directly rather than writing it out?

Any help appreciated.
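For readers wondering what "reading from the attribute directly" could look like, here is a minimal sketch in plain Python. It is not an FME API and not something the thread confirms FME supports; it only relies on the fact that an .xlsx workbook is a ZIP container, so its bytes can be opened in memory. The function name `list_xlsx_parts` is made up for illustration.

```python
import io
import zipfile


def list_xlsx_parts(data: bytes) -> list:
    """Return the names of the parts inside an .xlsx payload held in memory."""
    # An .xlsx workbook is a ZIP container, so the raw bytes can be wrapped
    # in a file-like object and opened without writing anything to disk.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()
```

Parsing the actual cell values would still need a spreadsheet library, which is outside the scope of this sketch.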

takashi
7703 replies
9 years ago
February 5, 2016

Did you try the Excel reader?

nrich_defra
Author
Contributor
61 replies
9 years ago
February 5, 2016

Hi @takashi,

I can give it a go, but I don't know the URL until midway through my pipeline, and I'm trying not to branch to another workbench.

I'm attempting to use the FeatureReader, in Excel mode, instead.

I'll give the Excel reader a quick go to see if it handles the file download better, but I was assuming the library behind the reader and the FeatureReader would be very similar, if not the same?

takashi
7703 replies
9 years ago
February 5, 2016

The internal reader module is almost certainly the same. If you got an error from the Excel reader, the error message might help pinpoint the issue.

takashi
7703 replies
9 years ago
February 5, 2016

Both the Excel reader and the FeatureReader (Excel format) download the file from the specified URL, save it into the TEMP folder, and then the reader module reads the downloaded file. The file is downloaded as-is, so you don't need to worry about encoding.

If you use the HTTPCaller, these settings let it save the response straight to a file:

Request URL: <URL of the xlsx file>
HTTP Method: GET
Save Response Body: File
Output File Name: <specify an xlsx file path to which download file will be saved>
File Path Attribute: _response_file_path (you can modify this)

You can then use the FeatureReader to read the downloaded file by setting its Dataset parameter to the "_response_file_path" attribute.
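The same download-then-read pattern can be sketched outside FME in plain Python. This is only an illustration of the idea, not how the HTTPCaller is implemented; the function name `download_to_temp` and the default `.xlsx` suffix are assumptions.

```python
import shutil
import tempfile
import urllib.request


def download_to_temp(url: str, suffix: str = ".xlsx") -> str:
    """Download the URL byte-for-byte into a new temp file; return its path."""
    # delete=False keeps the file on disk after it is closed so that a later
    # step can read it; the caller is responsible for removing it.
    with urllib.request.urlopen(url) as response, \
            tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as out:
        shutil.copyfileobj(response, out)  # binary copy, no text decoding
        return out.name
```

Because the copy is binary, text encoding never enters the picture, which matches the point above that the file is downloaded as-is.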

nrich_defra
Author
Contributor
61 replies
9 years ago
February 5, 2016

I've just retried the FeatureReader with the URL directly, and I get the error "can't open file for reading" from the XLSX_READER.

I had a look in the temp directory and I see an empty (0 KB) .TMP file, so perhaps the issue is the reader not picking up the session cookie and authentication, rather than encoding.

As I have a viable solution using the HTTPCaller and the AttributeFileWriter (albeit with some file management I wasn't planning on), I think I'll go with that for now.

Maybe it's a future enhancement request to allow the readers, when using a web endpoint to retrieve a file, to set encoding, cookies, redirects and so on, as you can in the HTTPCaller?

Thanks for your help @takashi
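A quick way to spot the symptom described above (an empty or otherwise unusable download) before handing the file to a reader is sketched below in plain Python. It assumes only that a valid .xlsx file is a ZIP container; the function name `check_xlsx_download` and the three result labels are made up for illustration.

```python
import os
import zipfile


def check_xlsx_download(path: str) -> str:
    """Classify a downloaded file as 'empty', 'not-xlsx' or 'ok'."""
    if os.path.getsize(path) == 0:
        return "empty"  # the server returned no body at all
    if not zipfile.is_zipfile(path):
        return "not-xlsx"  # for example an HTML login page saved as .xlsx
    return "ok"
```

An "empty" or "not-xlsx" result points at the request itself (cookies, authentication, redirects) rather than at the reader.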

+19

fmelizard
Safer
3725 replies
9 years ago
February 6, 2016

Very good point about authentication when an http:// URL is used as the dataset, both in the readers and in the FeatureReader. I'll let the team know. It will need a fairly complex user interface, but I believe it is necessary.

By the way, in 2016.1 we'll have a TempPathname maker transformer that will at least clean up automatically for you in this scenario.
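The "clean up automatically" idea can be illustrated in plain Python with a temporary directory that is removed once processing finishes. This is a sketch of the general pattern only, not of the transformer mentioned above; `process_with_temp_file`, the `reader` callback and the file name `download.xlsx` are hypothetical.

```python
import os
import tempfile


def process_with_temp_file(data: bytes, reader):
    """Write data to a temp file, pass its path to reader, then clean up."""
    # The directory and everything in it is deleted when the with-block
    # exits, even if reader raises an exception.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "download.xlsx")
        with open(path, "wb") as f:
            f.write(data)
        return reader(path)
```

The reader must finish with the file before returning, because the path no longer exists afterwards.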
