Question

CSV writer - quote if needed not handling " correctly

  • 30 November 2020
  • 8 replies
  • 64 views

Userlevel 1
Badge +21

I am writing to a CSV file with the following settings:

[image: Capture]

When writing a value such as 18" pipe, the field is written to the CSV file as

18" pipe

This is different behaviour to the 2015 writer where this would be represented as

"18"" pipe"

It looks like the 2019 writer doesn't consider the presence of a single double-quote character to warrant quoting the field value (although I would consider it an unsafe character).

Is this expected? 


8 replies

Badge +3

No, this is most likely a bug. Although compliance by different vendors has been variable, the governing specification for CSV is RFC4180.

https://www.ietf.org/rfc/rfc4180.txt

...Each field may or may not be enclosed in double quotes (however some programs, such as Microsoft Excel, do not use double quotes at all). If fields are not enclosed with double quotes, then double quotes may not appear inside the fields...

...Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes...

...If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote...
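
The three rules quoted above could be sketched in Python roughly as follows. This is only an illustration of the RFC4180 quoting rules, not how FME implements its writer; `quote_if_needed` is a hypothetical helper name chosen for this example.

```python
def quote_if_needed(field: str) -> str:
    """Quote a CSV field only when the RFC 4180 rules call for it."""
    # Hypothetical helper: these are the characters the RFC lists as
    # requiring the field to be enclosed in double quotes.
    needs_quoting = any(ch in field for ch in ('"', ',', '\r', '\n'))
    if not needs_quoting:
        return field
    # Escape each embedded double quote by doubling it, then enclose the field.
    return '"' + field.replace('"', '""') + '"'


print(quote_if_needed('18" pipe'))  # "18"" pipe"
print(quote_if_needed('plain'))     # plain
```

Under these rules the value from the question is both enclosed in double quotes and has its embedded quote doubled, which matches the 2015 writer output shown above.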

Userlevel 1
Badge +21

Thanks, that's my thought too. I'll raise it via the supplier

Userlevel 1
Badge +21

I see you also replied here :-)

https://community.safe.com/s/question/0D54Q000080hduOSAQ/issue-with-featurewriter-csv

 

Userlevel 1
Badge +21

In the python csv module csv.QUOTE_MINIMAL works as per the 2015 csv writer
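
For reference, a minimal sketch of that comparison using Python's standard `csv` module. The `write_row` helper is a name made up for this example; `csv.QUOTE_MINIMAL` is also the module's default quoting mode.

```python
import csv
import io


def write_row(values, quoting=csv.QUOTE_MINIMAL):
    """Write one row to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=quoting, lineterminator="\n")
    writer.writerow(values)
    return buffer.getvalue()


line = write_row(['18" pipe', 'plain'])
print(line, end="")  # "18"" pipe",plain

# Reading the line back recovers the original values.
print(next(csv.reader(io.StringIO(line))))  # ['18" pipe', 'plain']
```

Only the field containing the double quote is enclosed and escaped; the plain field is left unquoted.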

Badge +3

Yep, I remembered that post, which was about a somewhat similar issue. It probably comes down to an FME design decision: if "Qualify Field Values" is set to "Is Needed", should internal double quotes be escaped with a double quote and the entire field value enclosed in double quotes?

 

Strictly speaking, my interpretation of RFC4180 is that the answer is "Yes". "If Required" should mean "If RFC4180 requires double quotes to be used". This would yield output that can be interpreted by all applications that read data using the rules in RFC4180, rather than just a more limited set of applications like MS Excel. It would also save the user from having to set this parameter to "Yes", which enforces double quoting of everything, whether needed or not!

Userlevel 1
Badge +21

I notice that in 2021 the CSV writer now qualifies field values as I would expect.

 

@mark2atsafe​ - do you know how I can find out when this was updated?

Userlevel 4
Badge +25

Without the issue number (FMEENGINE-68298), it would need sorting through all the What's New files. But I had a look in our database and found that it was fixed for FME 2021.1 - build 21522 or greater.

Userlevel 1
Badge +21

Thanks, I'm currently working across 3 different FME versions so trying to keep track!
