Question

CSV file with comma-space as separator -> attribute name starting with space


arnold_bijlsma
Enthusiast

My input CSV file uses comma-space as the separator (see attached). The CSV reader doesn't seem to offer a way to specify a two-character separator, so I read the file in by specifying just a comma. For the data this is not an issue, as I can simply trim the values. For the header line containing the attribute names it is a problem, though: all my attribute names (bar the first one) are now prefixed by a space, and hardly any transformer can handle such names. I can't use the AttributeManager, AttributeCopier, AttributeRenamer or ExpressionEvaluator, as they all automatically strip the leading space from the name and then turn red because the trimmed name is not a recognised attribute. The only exception is the BulkAttributeRenamer, which lets me strip the leading space, so I do have a workaround.

Is there a way of setting comma-space as separator?

Should I report it as a bug that all these commonly used transformers are unable to handle an attribute name that gets provided by a commonly used reader?

Short example file attached:

2 replies

markatsafe
  • November 1, 2019

@arnold_bijlsma I don't think we've seen a comma/space used as a CSV separator before. You have a workaround with BulkAttributeRenamer, but you could post this as an idea and see if other users vote for it.

I think I'd lean towards creating a workspace to preprocess your data and clean it up. Workspace example (2019.1) ReplaceCommaSpaceDelimiter.fmwt
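If preprocessing outside FME is an option, the same clean-up can be sketched in a few lines of Python. This is only an illustration, not the contents of the attached workspace: the function name `normalise_comma_space` and the sample rows are made up, and it assumes the file is otherwise ordinary CSV. It relies on the standard `csv` module's `skipinitialspace=True` option, which ignores whitespace immediately after each delimiter.

```python
import csv
import io


def normalise_comma_space(src, dst):
    """Copy CSV rows from src to dst, dropping the space that follows each comma."""
    # skipinitialspace=True tells the reader to ignore whitespace
    # directly after a delimiter, so ", " behaves like ",".
    reader = csv.reader(src, skipinitialspace=True)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(row)


# Hypothetical sample data standing in for the attached file.
sample = io.StringIO("id, name, city\n1, Alice, Utrecht\n2, Bob, Leeds\n")
cleaned = io.StringIO()
normalise_comma_space(sample, cleaned)
print(cleaned.getvalue())
```

The rewritten file has plain commas in both the header and the data rows, so a reader configured with a comma separator should no longer produce attribute names with a leading space. For real files, pass file objects opened with `newline=""` instead of the `StringIO` buffers.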


arnold_bijlsma
Enthusiast
  • Author
  • November 5, 2019
@markatsafe Thanks for confirming I didn't miss anything obvious re. the CSV separator. I think comma-space as a separator is too rare to gain any traction as an idea, particularly since the workarounds are straightforward.

But I'm still not sure whether the Open Text Editor and Open Arithmetic Editor in the AttributeManager (and many other transformers) should be trimming attribute names automatically. Excel is another source where attribute names can occasionally, and often unintentionally, have a leading space. The Reader can handle such names, but the editors on the transformers cannot.
