You could use a regex in a StringReplacer to replace any comma that is followed by an even number of quote marks (i.e. a comma outside any quoted field) with some other character, and then use that character in the AttributeSplitter (assuming your quotes are balanced):
,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)
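As a rough illustration of how that pattern behaves outside FME, here is a minimal Python sketch. The sample line, the helper name `replace_unquoted_commas`, and the pipe replacement character are all made up for the example; it only demonstrates the regex, not the StringReplacer itself.

```python
import re

# Matches a comma only when the rest of the line contains an even number
# of double quotes, i.e. the comma is not inside a quoted field.
OUTSIDE_QUOTES = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')

def replace_unquoted_commas(line, replacement="|"):
    """Swap delimiter commas for `replacement`, leaving quoted commas alone."""
    return OUTSIDE_QUOTES.sub(replacement, line)

# Hypothetical sample record with commas inside two quoted fields.
sample = 'A1,"Smith, John",2017-01-01,"x,y,z",0.26'
converted = replace_unquoted_commas(sample)
print(converted)             # A1|"Smith, John"|2017-01-01|"x,y,z"|0.26
print(converted.split("|"))  # five fields, quoted commas intact
```

Note that the lookahead rescans to the end of the line for every comma it tests, so the cost grows quickly on very long lines.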
@david.benoit I think the CSV reader should be able to handle this. The Field Qualifier Character controls whether <quoted> fields can include the Delimiter Character. One problem you may have encountered is that the CSV reader has an 'auto' mode for the delimiter, so in this case it seems to use <space> as the default. So being explicit about the Delimiter Character might also help.
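The same idea can be sketched with Python's csv module, purely as an analogy to the reader settings (this is not how the FME CSV reader is implemented). The quoted site name in the sample line is invented for the example.

```python
import csv

# Hypothetical record: the second field is quoted and contains a comma.
line = 'JOSMCAABFMK,"Site A, north tower",2017-01-01,1.65'

# Explicit comma delimiter and double-quote qualifier:
# the comma inside the quoted field is kept as data.
explicit = next(csv.reader([line], delimiter=",", quotechar='"'))

# A wrong delimiter guess (space) splits the record in the wrong places.
guessed = next(csv.reader([line], delimiter=" ", quotechar='"'))

print(explicit)
print(guessed)
```

With the explicit settings the record comes back as four clean fields; with the space delimiter the pieces no longer line up with the real columns.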
Thank you!
One thing I'm noticing is that it seems to get stuck (no error, no finish, just stuck!) on a line break, e.g.:
-------
,JOSMCAABFMK,2017-01-01,00:01:00,2017-01-02,00:01:00,MST,2017-01-01,07:01:00,2017-01-02,07:01:00,1.65,0.03,V0,0.70,0.03,V0,4.31,0.10,V0,-999.00,-999.00,M1,0.68,0.06,V0,3.10,0.28,V0,0.95,0.10,V0,2.64,0.17,V0,-999.00,-999.00,M1,1.05,0.06,V0,1.69,0.05,V0,0.13,0.21,V1,0.01,0.05,V1,0.51,0.13,V0,0.12,0.04,V0,4.19,0.10,V0,0.07,0.03,V0,-999.00,-999.00,M1,0.04,0.10,V1,-999.00,-999.00,M1,0.07,0.03,V0,0.03,0.07,V1,0.04,0.08,V1,4.34,0.10,V0,1.55,0.12,V0,0.16,0.13,V0,0.26,0.07,V0,2.45,0.14,V0,0.07,0.07,V0,0.26,
---------
Thoughts?
Thanks @markatsafe.
My version of FME already has these settings as default. I will keep working on this option and follow up.
Thanks again,
Dave
Lots of good ideas here. I'll just add that it's also possible to use e.g. the Text Line reader to read either line by line or the entire file in one block, then use the Python csv module on a per-line basis, as needed.
Example PythonCaller:
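import fmeobjects
import csv

def SplitCSVLine(feature):
    # Raw line as delivered by the Text Line reader
    text = feature.getAttribute('text_line_data')
    if text:
        # csv.reader expects an iterable of lines, so wrap the single line in a list
        values = csv.reader([text])
        # Store the parsed fields in the list attribute values{}
        feature.setAttribute('values{}', list(values)[0])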
Sample output:
You can then either explode the list or rename the individual items as necessary.
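For the "entire file in one block" variant mentioned above, a minimal sketch of the parsing step might look like the following. It is plain Python without the FME wiring; `parse_csv_block` and the sample data are hypothetical, and in a PythonCaller the text would come from whatever attribute holds the file contents.

```python
import csv
import io

def parse_csv_block(text):
    """Parse a whole CSV text block into a list of rows (lists of strings)."""
    # newline="" lets the csv module handle line endings itself, so a line
    # break inside a quoted field stays part of that field.
    return list(csv.reader(io.StringIO(text, newline="")))

# Hypothetical block: the second record has a line break inside a quoted field.
block = 'id,name,note\n1,"Smith, John","first line\nsecond line"\n2,Jones,ok\n'
rows = parse_csv_block(block)
for row in rows:
    print(row)
```

Parsing the block as a whole only helps when embedded line breaks sit inside quoted fields; an unquoted line break would still start a new record.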
Okay, after filtering out NULLs (which aren't needed anyway) and some giant chunks that are also not needed, I was able to make this solution work. It takes 12 minutes to run compared to about two minutes before, but this might be due to a slow network on the SQL connection today. Thank you!