Question

Remove commas within quotes

Forum|Forum|13 years ago
November 27, 2012
9 replies
220 views

odeneho

Hi,

I have a csv file which I would like to split by comma into separate columns with Attributesplitter. However, there are instances where there are commas within quotes which I don't want to split.

Is there any way I can split by comma whilst ignoring the ones within quotes or how can I remove commas within quotes? I have tried with regular expression but no success.

An example data;

"BASEMENT, GROUND","Y","W8 6AH","LONDON","",,,""

TO:

"BASEMENT GROUND","Y","W8 6AH","LONDON","",,,""

"BASEMENT, GROUND, FIRST","Y","SW3 3AD"

TO:

"BASEMENT GROUND FIRST","Y","SW3 3AD"

Thank you

This post is closed to further activity.
It may be an old question, an answered question, an implemented idea, or a notification-only post.
Please check post dates before relying on any information in a question or answer.
For follow-up or related questions, please post a new question or idea.
If there is a genuine update to be made, please contact us and request that the post is reopened.

+25

sigtill
Supporter
Forum|Forum|13 years ago
November 27, 2012

Hah, just working with the exact same issue: Solution:

Make sure all attributes are surrounded by "" and Split by:

","

The real SigTill

Upvote

odeneho
Author
Forum|Forum|13 years ago
November 27, 2012

SigTill thanks for your reply.

Not sure, but this is an exact example of the a record from the csv file, there are some areas with quotes and some without;

24,"U",470,217102840,"5600L000102488","ENG",1,2000-05-04,,2000-05-04,2012-03-05,,"",,"","GROUND FLOOR",17,"",,"","",21700001,"BASEMENT, GROUND","Y","W8 6AH","LONDON","",,,""

I would like to split by the commas ignoring those within the quotes.

I hope this make more sense.

Cheers.

Upvote

vindastra
Forum|Forum|13 years ago
November 27, 2012

Is the *.csv file being input to a "csv" reader or "text" reader? If it is a *.csv reader, then the reader would have already split the values based on the commas whilst ignoring the commas enclosed within quotes.

Upvote

odeneho
Author
Forum|Forum|13 years ago
November 27, 2012

vindastra

the csv file is being input as a "text" reader.

Upvote

vindastra
Forum|Forum|13 years ago
November 27, 2012

How about a string replacer in front of the splitter with the regex:

[*,*]

Upvote

odeneho
Author
Forum|Forum|13 years ago
November 27, 2012

Thanks Vindastra,

It doesn't work with my requirement. It changes the table structure.

Upvote

tdavis
Forum|Forum|13 years ago
November 28, 2012

If you don't want to use a csv reader, how about using a stringreplacer to change ,<space> to a | or similar character (I am assuming all your data is structured that way?). Then use an attributesplitter on the comma and then convert the | back to ,<space>

e.g. 21700001,"BASEMENT, GROUND","Y"

to 21700001,"BASEMENT|GROUND","Y"

then attribute split

then stringreplacer turning "BASEMENT|GROUND" back to "BASEMENT, GROUND"

Upvote

odeneho
Author
Forum|Forum|13 years ago
November 28, 2012

Thanks TDavis, your solution works.

Upvote

russell
Forum|Forum|13 years ago
November 28, 2012

I have found that the CSV reader in FME handles this quite well, however if you're looking for a more programmatical way you could write your own reader using a PythonCreator and the Python csv module. See http://docs.python.org/2/library/csv.html for details. There are some good examples, but the doco is a bit long winded. I think though the reader embedded in FME willalready hande this quite well. Most parsers take quotes into consideration. However if you have your heart set on removing the inner commas from all text then there are two methods I can think of...

Either outside or inside FME you could write a python parser that loops over each character in a row, keeping track of when it is in quotes, and when it is not, copying to a buffer only when the character is not a comma, or when it is a comma and it's not within quotes.

A fast method though that wouldn't involve coding would be to read the file with a CSVReader into FME, send each attribute through a StringReplacer replacing commas, then outputing the file through a CSVWriter.

Upvote

Remove commas within quotes

9 replies

Community Stats

Latest FME

Community Stats

Latest FME

Sign up

An FME Account is required to contribute

Login to the community

An FME Account is required to contribute

Scanning file for viruses.

This file cannot be downloaded