Solved

Linebreaks in a csv file

12 years ago
March 27, 2013
10 replies
183 views

bas
4 replies

I use a csv reader to read a csv file. Unfortunately this csv file contains line breaks in some records (CRLF). This causes a record to be split into more than one, which is undesirable.

Is there a way for the reader to ignore these linebreaks?


david_r
8355 replies
12 years ago
March 27, 2013

Hi,

Is the field value that contains the line break surrounded by quotation marks?

Example:

"this is a
multi-line text"

It would help if you could post a sample record.

David

bas
Author
4 replies
12 years ago
March 27, 2013

Hi David,

Thank you for the quick reply

Yes, the value is surrounded by quotation marks. I don't see a way to attach a csv file here but the record looks something like this:

"E","30-05-2007 22:00:00","16-07-2007 22:00:00","ACTIVE","BEMETERD","18062013XX259<crlf>

","18062013XX","SPZ REGIO 7 ZUIDWEST E"
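
For reference, here is a minimal sketch (Python 3, standard library only) of how a parser that honours quoting treats a record like the one above. The record text is copied from the post, with the <crlf> placeholder replaced by a real CRLF. The helper name parse_records is made up for illustration and is not part of FME.

```python
import csv
import io

# Sample record from the post; "<crlf>" is replaced by an actual CRLF.
SAMPLE = (
    '"E","30-05-2007 22:00:00","16-07-2007 22:00:00","ACTIVE",'
    '"BEMETERD","18062013XX259\r\n","18062013XX","SPZ REGIO 7 ZUIDWEST E"\r\n'
)


def parse_records(text, delimiter=",", quotechar='"'):
    """Return all records in `text` as lists of field values (hypothetical helper)."""
    # newline="" hands line endings to the csv module untouched, so a
    # CRLF inside a quoted field stays part of that field.
    stream = io.StringIO(text, newline="")
    return list(csv.reader(stream, delimiter=delimiter, quotechar=quotechar))


records = parse_records(SAMPLE)      # quote-aware parsing
naive_lines = SAMPLE.splitlines()    # what a purely line-based reader sees
```

A line-based split sees two lines here, while the quote-aware parse returns a single record whose sixth field still contains the line break.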

david_r
8355 replies
Best Answer
12 years ago
March 27, 2013

Hi,

It does not seem like FME supports CSV files with newlines, even when they are quoted.

Try inserting the following script into a PythonCreator. It uses the Python csv module, which supports newlines inside quoted fields:

import fmeobjects
import csv

class FeatureCreator(object):
    def __init__(self):
        self.inputfilename = FME_MacroValues['INPUT_CSV_FILE']
        self.csvdelimiter = ',' # Modify as needed
        self.csvquotechar = '"' # Modify as needed
        self.log = fmeobjects.FMELogFile()
        self.fieldnames = []
        
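    def close(self):
        # 'rb' is what the Python 2 csv module expects. On Python 3, use
        # open(self.inputfilename, newline='') instead.
        with open(self.inputfilename, 'rb') as csvfile: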
            csvreader = csv.reader(csvfile, 
                                   delimiter=self.csvdelimiter, 
                                   quotechar=self.csvquotechar)
            for n, row in enumerate(csvreader):
                if n == 0:
                    self.fieldnames = row
                    self.log.logMessageString("Attribute names to expose " + \
                        "in the PythonCreator:", fmeobjects.FME_WARN)
                    for field in row:
                        self.log.logMessageString("    "+field, fmeobjects.FME_WARN)
                else:
                    feature = fmeobjects.FMEFeature()
                    for m, value in enumerate(row):
                        feature.setAttribute(self.fieldnames[m], value)
                    self.pyoutput(feature)

Notes:

The CSV filename must be defined in a User Parameter (public or private) called INPUT_CSV_FILE.
To make the attribute names of the CSV file visible in Workbench, you will have to add the list of attribute names to the PythonCreator as "Attributes to expose". When you run the script, it will output this list for you in the FME log window (blue lines near the top).
Tested with the Wikipedia CSV test data and FME 2013.

Hope this helps.

David
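
The 'rb' mode in the script above is what the Python 2 csv module expects. Under Python 3, the csv module wants a text-mode file opened with newline=''. The following is a rough sketch of the same reading logic for Python 3, without any FME objects. The function name iter_csv_rows, the UTF-8 encoding, and the demo file contents are assumptions for illustration only.

```python
import csv
import os
import tempfile


def iter_csv_rows(path, delimiter=",", quotechar='"', encoding="utf-8"):
    """Yield one dict per data record, keyed by the header row (hypothetical helper)."""
    # Text mode with newline="" is what the Python 3 csv module expects;
    # it keeps line breaks inside quoted fields intact.
    with open(path, newline="", encoding=encoding) as csvfile:
        reader = csv.reader(csvfile, delimiter=delimiter, quotechar=quotechar)
        fieldnames = next(reader, None)
        if fieldnames is None:
            return  # empty file, nothing to yield
        for row in reader:
            # zip() stops at the shorter sequence, so surplus values in a
            # malformed row are dropped instead of raising IndexError.
            yield dict(zip(fieldnames, row))


# Demo with a throwaway file containing one quoted, multi-line value.
with tempfile.NamedTemporaryFile("w", newline="", suffix=".csv",
                                 delete=False, encoding="utf-8") as tmp:
    tmp.write('id,comment\r\n1,"first line\r\nsecond line"\r\n2,plain\r\n')
rows = list(iter_csv_rows(tmp.name))
os.remove(tmp.name)
```

Inside a PythonCreator, each yielded dict would presumably be copied onto an FMEFeature with setAttribute, as the original script does.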

bas
Author
4 replies
12 years ago
March 27, 2013

Thank you David, this works great!

markatsafe
1891 replies
9 years ago
April 19, 2016

fixcsvpython.fmw: This little workspace cleans up CSV files that have embedded linefeeds, so that the CSV reader can then process the data OK.
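
The workspace itself is not reproduced in this thread, so the following is only a guess at the general idea of such a clean-up step, not its actual contents: parse the text with a quote-aware reader, replace the line breaks found inside field values, and write the records back out one per line. The function name flatten_linebreaks and the choice of a space as the replacement are assumptions.

```python
import csv
import io
import re

# Matches CRLF, lone CR, or lone LF.
_LINEBREAK = re.compile(r"\r\n|\r|\n")


def flatten_linebreaks(text, replacement=" ", delimiter=",", quotechar='"'):
    """Return CSV text with line breaks inside fields replaced (hypothetical helper)."""
    source = io.StringIO(text, newline="")
    target = io.StringIO(newline="")
    reader = csv.reader(source, delimiter=delimiter, quotechar=quotechar)
    writer = csv.writer(target, delimiter=delimiter, quotechar=quotechar,
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
    for row in reader:
        # Every line break inside a value becomes `replacement`, so each
        # record ends up on exactly one physical line.
        writer.writerow([_LINEBREAK.sub(replacement, value) for value in row])
    return target.getvalue()


cleaned = flatten_linebreaks('id,comment\r\n1,"first line\r\nsecond line"\r\n')
```

Note that a line break at the very end of a value, as in the sample record earlier in the thread, also becomes a trailing replacement character with this approach.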

adamajm
Participant
2 replies
9 years ago
May 9, 2016

@MarkAtSafe - can you share that fmw file again? It seems to be coming up empty. Thanks!


tino
Contributor
27 replies
8 years ago
October 12, 2016

@david_r : Thank you very much, this is a great solution!

@MarkAtSafe : It would be nice if this could become an option for the default CSV reader and writer.


ebygomm
Influencer
3313 replies
8 years ago
October 12, 2016

tino wrote:

@david_r : Thank you very much, this is a great solution!

@MarkAtSafe : It would be nice if this could become an option for the default CSV reader and writer.

Add ability to read csv with linebreaks

Coming in 2017, see link above

takashi
7715 replies
8 years ago
October 12, 2016

Hi @tino and everyone, take a look at the CSV2 Reader/Writer in the latest FME 2017.0 beta!

markatsafe
1891 replies
8 years ago
October 12, 2016

The workspace to pre-process your CSV to remove embedded linefeeds / linebreaks is available in the KnowledgeBase article. Thanks @takashi for pointing out that this has been addressed in the FME 2017 beta releases in the updated CSV reader. To take advantage of the new reader in an existing workspace, you need to add a new CSV reader and then remove or disable the original one.
