Skip to main content
Solved

CSV "Flat file" Restructure


Forum|alt.badge.img+3

Hi,

I'd like to restructure a csv by separating what is essentially one column of data with a huge text string which makes up constituent attributes and the corresponding data below it into the attribute values. Something akin to the 'text to column' in Excel.

Apologies if I haven't made this very clear, hopefully the csv I've provided will explain further. The top row contains the column headings I wish to split out.

Any suggestions would be most appreciated.

Best answer by mark2catsafe

As a follow up to this...

FME will read the file and split it into fields using the CSV reader. Set the separator to the tab character and check the option to ignore duplicate separators.

However... what you also need to do is remove the quote characters (") from the beginning and end of each line. That's because separator characters inside quotes are taken to be part of the string, not separators. So FME assumes it is a single field of data. Remove the quotes and the data can be read quote easily.

Hope this helps

Mark

View original
Did this help you find an answer to your question?

8 replies

pratap
Contributor
Forum|alt.badge.img+11
  • Contributor
  • June 22, 2016

Hi,

I think, it is better to do in Excel itself. Use text to columns with tab delimiters.

Is there any reason that you are using FME for this activity.


redgeographics
Celebrity
Forum|alt.badge.img+47

This appears to be a tab-separated file. When you create a CSV reader you have the option to specify the separator character in the parameters. Please note that for CSV this can only be done when creating the reader, as it influences the way feature types are set up. You can also specify that the first line of the file holds the attribute names.

There's some tutorials here.

Hope this helps.


david_r
Evangelist
  • June 22, 2016

You can use either the CSV or the textfile reader, then use an AttributeSplitter and use a TAB as the delimiter, like this:

Make sure to set "Drop emtpy parts" to No.

You will then get a list with all the columns.

Alternatively you could do it in two steps

  1. Workspace with Textfile reader/writer and an AttributeTrimmer to get rid of the quotation marks
  2. Another workspace to read the now regular CSV file using TAB as delimiter

Forum|alt.badge.img+5
redgeographics wrote:

This appears to be a tab-separated file. When you create a CSV reader you have the option to specify the separator character in the parameters. Please note that for CSV this can only be done when creating the reader, as it influences the way feature types are set up. You can also specify that the first line of the file holds the attribute names.

There's some tutorials here.

Hope this helps.

The CSV reader should let you choose tab as a delimiter, but when I did this FME didn't identify the different fields (I'm using 2016.1) - so I filed a PR with our developers to see what they think. I agree with you; it should be possible to read this one fairly easily.


takashi
Contributor
Forum|alt.badge.img+19
  • Contributor
  • June 22, 2016

Hi @jake, in FME 2016, it might be possible to read the source CSV, write an intermediate CSV, and then read it and write all the records into an Excel file with Dynamic Schema.

This is an experimental dynamic workflow including FeatureWriter and FeatureReader. See also the attached template: experimental-dynamic-textline2xlsxw.fmwt (FME 2016.1)


Forum|alt.badge.img+3
  • Author
  • July 5, 2016

Hi All,

Apologies for the delay and thanks a lot for your ideas. Plenty to mull over there!

Jake


Forum|alt.badge.img+5
  • Best Answer
  • July 13, 2016

As a follow up to this...

FME will read the file and split it into fields using the CSV reader. Set the separator to the tab character and check the option to ignore duplicate separators.

However... what you also need to do is remove the quote characters (") from the beginning and end of each line. That's because separator characters inside quotes are taken to be part of the string, not separators. So FME assumes it is a single field of data. Remove the quotes and the data can be read quote easily.

Hope this helps

Mark


tim_wood
Contributor
Forum|alt.badge.img+8
  • Contributor
  • October 19, 2018
pratap wrote:

Hi,

I think, it is better to do in Excel itself. Use text to columns with tab delimiters.

Is there any reason that you are using FME for this activity.

Why would anyone NOT want to use FME for every activity? ;-)

 

 


Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings