Hah, just working with the exact same issue: Solution:
Make sure all attributes are surrounded by "" and Split by:
","
SigTill thanks for your reply.
Not sure, but this is an exact example of the a record from the csv file, there are some areas with quotes and some without;
24,"U",470,217102840,"5600L000102488","ENG",1,2000-05-04,,2000-05-04,2012-03-05,,"",,"","GROUND FLOOR",17,"",,"","",21700001,"BASEMENT, GROUND","Y","W8 6AH","LONDON","",,,""
I would like to split by the commas ignoring those within the quotes.
I hope this make more sense.
Cheers.
Is the *.csv file being input to a "csv" reader or "text" reader? If it is a *.csv reader, then the reader would have already split the values based on the commas whilst ignoring the commas enclosed within quotes.
vindastra
the csv file is being input as a "text" reader.
How about a string replacer in front of the splitter with the regex:
>*,*]
Thanks Vindastra,
It doesn't work with my requirement. It changes the table structure.
If you don't want to use a csv reader, how about using a stringreplacer to change ,<space> to a | or similar character (I am assuming all your data is structured that way?). Then use an attributesplitter on the comma and then convert the | back to ,<space>
e.g. 21700001,"BASEMENT, GROUND","Y"
to 21700001,"BASEMENT|GROUND","Y"
then attribute split
then stringreplacer turning "BASEMENT|GROUND" back to "BASEMENT, GROUND"
I have found that the CSV reader in FME handles this quite well, however if you're looking for a more programmatical way you could write your own reader using a PythonCreator and the Python csv module. See
http://docs.python.org/2/library/csv.html for details. There are some good examples, but the doco is a bit long winded. I think though the reader embedded in FME willalready hande this quite well. Most parsers take quotes into consideration. However if you have your heart set on removing the inner commas from all text then there are two methods I can think of...
Either outside or inside FME you could write a python parser that loops over each character in a row, keeping track of when it is in quotes, and when it is not, copying to a buffer only when the character is not a comma, or when it is a comma and it's not within quotes.
A fast method though that wouldn't involve coding would be to read the file with a CSVReader into FME, send each attribute through a StringReplacer replacing commas, then outputing the file through a CSVWriter.