Question

Appending updated csv file to another csv file

11 years ago
May 12, 2014
6 replies
113 views

mappymatty
Contributor
14 replies

Hello again!

I have a set of csv file pairs called e.g.

1111_AB.csv

1111_AB_type47.csv

1112_AB.csv

1112_AB_type47.csv

I need to manipulate the contents of the *_type47 files and then append them, minus the header line, to the foot of the master file.

In my workspace I've got a CSV reader --> FeatureTypeExtractor (to write the file name to an attribute) --> StringReplacer (to replace the _type47.csv with an empty string) --> AttributeCreator (to update the value of an attribute for each feature from 47 to 52).

Once updated I'd like to write the features to the foot of the master CSV files (based on the value in my FeatureTypeExtractor) in EXACTLY the same order and format as the source.

What's the best strategy for this? I'm a bit stuck...

Thanks,

Matt
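For readers working outside FME, the manipulation described in the question can be sketched in plain Python. This is only an illustration under assumptions: the column name `type` is hypothetical (the thread never names the attribute that holds 47), and the helper names `master_name` and `retype_rows` are made up for this example.

```python
import csv
import io

SUFFIX = "_type47.csv"


def master_name(update_name: str) -> str:
    """Map '1111_AB_type47.csv' to its master file name '1111_AB.csv'."""
    if not update_name.endswith(SUFFIX):
        raise ValueError(f"not a type47 file: {update_name}")
    # Strip the suffix (the StringReplacer step), then restore the extension.
    return update_name[: -len(SUFFIX)] + ".csv"


def retype_rows(csv_text: str, column: str = "type",
                old: str = "47", new: str = "52") -> list[list[str]]:
    """Return the data rows (header dropped) with `old` replaced by `new`.

    `column` is a hypothetical column name; adjust it to the real schema.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    idx = header.index(column)
    rows = []
    for row in reader:
        if row[idx] == old:
            row[idx] = new
        rows.append(row)
    return rows
```

The header is dropped here because the question asks for the rows to be appended without it.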

gio
Contributor
2252 replies
11 years ago
May 12, 2014

Hi,

You could use csv_line_number. Expose this attribute on the reader's format attributes tab.

First skip (or remove) the header line(s) of the manipulated file. Then add the last csv_line_number of the master file to the line numbers of the manipulated file.

Route both streams to a Sorter, sort on the line number and write. (The sort might not be needed, but it makes sure the order is right.)
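The line-number idea above can be sketched outside FME as follows. This is a rough Python analogue, not FME behaviour: plain 1-based line positions stand in for csv_line_number, and the function name `merge_by_line_number` is invented for the example.

```python
def merge_by_line_number(master_lines, update_lines, header_lines=1):
    """Offset the update's line numbers past the master's last line, then sort."""
    # Number the master lines 1..N (stand-in for csv_line_number).
    master = list(enumerate(master_lines, start=1))
    last = master[-1][0] if master else 0

    # Drop the header of the manipulated file, then shift its numbering.
    body = update_lines[header_lines:]
    shifted = [(last + n, line) for n, line in enumerate(body, start=1)]

    # Sort on the line number so the master rows stay first and in order.
    merged = sorted(master + shifted, key=lambda pair: pair[0])
    return [line for _, line in merged]
```

Because the shifted numbers are all larger than the master's last line number, the sort is effectively a safeguard, which matches the remark that it might not be needed.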

takashi
7703 replies
11 years ago
May 13, 2014

Hi Matt,

I would use the fanout option of the CSV writer feature type. You can control the reading order of the source datasets in the Navigator window. That is, the upper writer in the Navigator will be run first.

Takashi

gio
Contributor
2252 replies
11 years ago
May 13, 2014

If you do it that way, you may want to skip the header, using the CSV reader parameter panel.

mappymatty
Author
Contributor
14 replies
11 years ago
May 15, 2014

Thanks Gio and Takashi for your rapid responses. I've been working on another problem for the last couple of days so have only just checked back.

I've attempted the approach Takashi has provided in his screenshot. The problem I have is that the 1111_AB.csv and 1111_AB_type47.csv have records of different schemas so I receive datatype mismatch messages such as

Attribute of type decimal has an illegal value of '2014-04-30'. Value must be < 1000 or > -1000. Value set to missing value

String value `2014-04-30' contains invalid characters and could not be converted into a float

String value `%0' contains invalid characters and could not be converted into a float

when I run the workspace.

How do I accommodate this?

takashi
7703 replies
11 years ago
May 15, 2014

If the fields are allowed to contain mixed-type data, set the field types to "text" in the CSV writer feature type. Otherwise, use the AttributeClassifier to determine the data type of each value, and modify the Failed values appropriately.
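To illustrate the choice between the two options, here is a minimal Python sketch of a value classifier. It is not how the AttributeClassifier works internally; the function names and the three categories ("number", "date", "text") are assumptions made for this example, and the date format is guessed from the `2014-04-30` value in the log messages.

```python
from datetime import datetime


def classify(value: str) -> str:
    """Roughly classify a CSV value as 'number', 'date' or 'text'."""
    try:
        float(value)
        return "number"
    except ValueError:
        pass
    try:
        # Assumed ISO-style date, as in the '2014-04-30' log message.
        datetime.strptime(value, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "text"


def column_type(values) -> str:
    """Use one type only if every value agrees; otherwise fall back to text."""
    kinds = {classify(v) for v in values}
    return kinds.pop() if len(kinds) == 1 else "text"
```

A column holding both `12.5` and `2014-04-30` comes out as "text", which corresponds to the first option of writing mixed fields as text.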

mappymatty
Author
Contributor
14 replies
11 years ago
May 21, 2014

I ended up using two separate workspaces: one to perform the manipulation of attributes in the *type47.csv files, and a second with text line readers and writers, with the Read Whole File at Once parameter set to Yes, to put the files together in exactly the way they're formatted in the source files.
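The same "treat the files as plain text" idea can be sketched in Python. This is an assumption-laden illustration rather than a description of the FME workspaces: the function name `append_without_header` is invented, a single header line is assumed, and the header is assumed not to contain quoted line breaks.

```python
from pathlib import Path


def append_without_header(master: Path, update: Path, header_lines: int = 1) -> None:
    """Append the body of `update` to `master` byte for byte, skipping the header."""
    # Work on bytes so field formatting and line endings are left untouched.
    body = update.read_bytes().splitlines(keepends=True)[header_lines:]
    if not body:
        return

    existing = master.read_bytes()
    # Reuse the master's line-ending style if a separator has to be added.
    newline = b"\r\n" if b"\r\n" in existing else b"\n"

    with master.open("ab") as out:
        # Make sure the appended rows start on their own line.
        if existing and not existing.endswith((b"\n", b"\r")):
            out.write(newline)
        out.writelines(body)
```

For example, `append_without_header(Path("1111_AB.csv"), Path("1111_AB_type47.csv"))` would add the already-updated type47 rows to the foot of the master file. Because nothing is parsed as CSV, the schema mismatch between the two files never comes into play.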
