Skip to main content
Question

Appending updated csv file to another csv file


mappymatty
Contributor
Forum|alt.badge.img+3
Hello again!

 

 

I have a set of csv file pairs called e.g.

 

 

1111_AB.csv

 

1111_AB_type47.csv

 

 

1112_AB.csv

 

1112_AB_type47.csv

 

 

I need to manipulate the contents of the *_type47 files and then append them, minus the header line, to the foot of the master file.

 

 

In my workspace I've got a CSV reader --> FeatureTypeExtractor (to write the file name to an attribute) --> StringReplacer (to replace the _type47.csv with an empty string) --> AttributeCreator (to update the value of an attribute for each feature from 47 to 52).

 

 

Once updated I'd like to write the features to the foot of the master CSV files (based on the value in my FeatureTypeExtractor) in EXACTLY the same order and format as the source.

 

 

What's the best strategy for this? I'm a bit stuck...

 

 

Thanks,

 

 

Matt

 

 

6 replies

gio
Contributor
Forum|alt.badge.img+15
  • Contributor
  • May 12, 2014
Hi,

 

 

You could use the csv_line_number. Expose this attribute on the format attribute tab.

 

 

Add the last csv_line_number to the linenumbers from the manipulated file.

 

This, after u have skipped (or removed) the headerline(s) of the manipulated file.

 

Route both to a sorter, sort and write. (this last step might not be needed, but to make sure..)

 


takashi
Influencer
  • May 13, 2014
Hi Matt,

 

 

I would use the fanout option of the CSV writer feature type. You can control reading order for source datasets in Navigator window. That is, upper writer in Navigator will be run first.

 

 

 

Takashi

gio
Contributor
Forum|alt.badge.img+15
  • Contributor
  • May 13, 2014
When doing it that way you may want to skip header, using the csv reader parameter panel.

mappymatty
Contributor
Forum|alt.badge.img+3
  • Author
  • Contributor
  • May 15, 2014
Thanks Gio and Takashi for your rapid reponses. I've been working on another problem for the last couple of days so have only just checked back.

 

 

I've attempted the approach Takashi has provided in his screenshot. The problem I have is that the 1111_AB.csv and 1111_AB_type47.csv have records of different schemas so I receive datatype mismatch messages such as

 

 

Attribute of type decimal has an illegal value of '2014-04-30'. Value must be < 1000 or > -1000. Value set to missing value

 

String value `2014-04-30' contains invalid characters and could not be converted into a float

 

String value `%0' contains invalid characters and could not be converted into a float

 

String value `%0' contains invalid characters and could not be converted into a float

 

 

when I run the workspace.

 

 

How do I accommodate this?

 


takashi
Influencer
  • May 15, 2014
If it's allowed that the fields contain mixed type data, specify "text" to types for the fields in the CSV writer feature type. Otherwise, use the AttributeClassifier to determine data type, and modify Failed values appropriately.

mappymatty
Contributor
Forum|alt.badge.img+3
  • Author
  • Contributor
  • May 21, 2014
I ended up using two separate workbenches. 1 to perform the manipulation of attributes in the *type47.csv files and then a workspace with text line readers and writers with the Read Whole File at Once parameter set to Yes to put the files together in exactly the way they're formatted in the source files.

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings