Solved

Read CSV File and split the content into lines and columns

  • July 18, 2023
  • 7 replies
  • 608 views

dataman
Contributor

Hey there,

I have to download several CSV files from a website. I do it with the HTTPCaller transformer, and it works perfectly. I get ALL the data (several columns and lines) in one "cell".

I need the information in rows and columns so that I can manipulate the data.

I have tried the PythonCaller, ListExploder, AttributeCreator, ... but nothing works.

Do you have any idea how I can do that? I think the problem is the configuration of the HTTPCaller or of the other transformers, but I have tried a lot of things and nothing works.

Many thanks in advance!

Best answer by nielsgerrits

Multiple ways:

  • Writing and reading as a file. This costs some I/O, but is probably the fastest. Using the TempPathnameCreator, you can write to a temporary file so there is nothing to clean up afterwards. This is the way I prefer.
    • Write the response as CSV, using a FeatureWriter or directly from the HTTPCaller.
    • Read the written file using a FeatureReader.
  • Splitting the attribute into a list, then creating attributes from the list elements.
    • Split the file content into lines on the newline character.
    • Split the lines into columns on the semicolon.
    • Create attributes from the list elements.
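
For readers who want to see the second approach outside of FME transformers, here is a minimal Python sketch of the same three steps (split on newline, split on semicolon, build named attributes). The function name `split_csv_text`, the sample data, and the assumption that the first line holds the column names are all illustrative and not taken from the original workspace.

```python
def split_csv_text(text, delimiter=";"):
    """Split raw CSV text (one big string) into a list of row dictionaries."""
    # Step 1: split the single "cell" into lines, skipping empty ones.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []

    # Step 2: split every line into columns on the delimiter.
    rows = [line.split(delimiter) for line in lines]

    # Step 3: build attributes from the list elements.
    # Assumption: the first line contains the column names.
    header, data = rows[0], rows[1:]
    return [dict(zip(header, values)) for values in data]


# Made-up sample standing in for the downloaded response body.
sample = "id;name;value\n1;alpha;10\n2;beta;20\n"
records = split_csv_text(sample)
for record in records:
    print(record)
```

A plain `split` like this does not handle quoted fields that contain a semicolon or a line break; for such data, Python's standard `csv` module is the safer choice.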

7 replies

nielsgerrits
VIP
  • Best Answer
  • July 18, 2023


ebygomm
Influencer
  • Influencer
  • July 18, 2023

No need for the FeatureWriter: in the HTTPCaller you can choose to save the response body to a file instead of an attribute.
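
As a rough analogy for the file-based approach outside of FME, the sketch below writes the downloaded text to a temporary file and reads it back as rows with Python's standard `csv` module. The function name `read_via_temp_file` and the sample text are hypothetical; the semicolon delimiter follows the answer above, and a header row is assumed.

```python
import csv
import os
import tempfile


def read_via_temp_file(text, delimiter=";"):
    """Write raw CSV text to a temporary file, then read it back as rows."""
    # delete=False keeps the file on disk so it can be reopened by name.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".csv", delete=False, encoding="utf-8", newline=""
    ) as handle:
        handle.write(text)
        path = handle.name

    try:
        # The file is closed at this point, so its content is fully written.
        with open(path, newline="", encoding="utf-8") as source:
            # Assumption: the first row contains the column names.
            return list(csv.DictReader(source, delimiter=delimiter))
    finally:
        # Clean up the temporary file once the rows are in memory.
        os.remove(path)


# Made-up sample standing in for the response body.
rows = read_via_temp_file("id;name\n1;alpha\n2;beta\n")
for row in rows:
    print(row)
```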


nielsgerrits
VIP

Attached workspace demonstrating this.

dataman
Contributor
  • Author
  • Contributor
  • July 18, 2023

Hello,

Thanks for the answer. If I save the URL as CSV with the FeatureWriter and then read it with a FeatureReader, it runs so fast that the data is not read in as attributes... I suppose the reader is too fast and the file has not been saved yet.


dataman
Contributor
  • Author
  • Contributor
  • July 18, 2023

The second way works perfectly. Awesome, thanks!


nielsgerrits
VIP

You need to expose the attributes manually, or by using the import option.


nielsgerrits
VIP

Cheers :)