Solved

Parse CSV data stored in database table

6 years ago
September 14, 2018
4 replies
122 views

+6

djmcdermott
Contributor
42 replies

Hi FME'er,

I have CSV data that is stored in a table in Oracle. The data is created using a stored procedure and is stored in a single field with one row per line. I would like to read and parse this data as a CSV file into FME for further processing. I figured I would start by aggregating the field (attribute) into a single row but don't know where to go from there. I can't use the AttributeSplitter because it uses " qualifiers around data that has commas.

Many Thanks,

David

Best answer by david_r

I'm thinking that a PythonCaller with the CSV module is perfect for this, it already has all the necessary mechanisms for dealing with lots of edge cases like quotations, newlines, etc.

Try something like:

from fmeobjects import *import csv class ParseCSVString(object):    def input(self, feature):        csv_string = [feature.getAttribute('CSV_LINE') or '']        csv_parser = csv.reader(csv_string)        for record in csv_parser:            f = feature.clone()            for n, value in enumerate(record):                f.setAttribute('value{%s}' % n, value)            self.pyoutput(f)

Expose the attribute list "value{}" in the PythonCaller.

csv-line.fmwt

View original

Did this help you find an answer to your question?

david_r
8354 replies
Best Answer
6 years ago
September 14, 2018

I'm thinking that a PythonCaller with the CSV module is perfect for this, it already has all the necessary mechanisms for dealing with lots of edge cases like quotations, newlines, etc.

Try something like:

from fmeobjects import *import csv class ParseCSVString(object):    def input(self, feature):        csv_string = [feature.getAttribute('CSV_LINE') or '']        csv_parser = csv.reader(csv_string)        for record in csv_parser:            f = feature.clone()            for n, value in enumerate(record):                f.setAttribute('value{%s}' % n, value)            self.pyoutput(f)

Expose the attribute list "value{}" in the PythonCaller.

csv-line.fmwt

+15

gio
Contributor
2252 replies
6 years ago
September 14, 2018

You can simple read this field. You would then get an attribute with the csv in it.

Separate/split by newline (texteditor->specialcharacters->newline). Explode list.

Connect a datainspector transformer.

(If fieldname row has different qualifers, separate it by choosing _element_index = 0, to treat is separately. Else, no need to.)

Export the fieldnames list to a txt list by copying it from the feature information window in the Data inspector. (select the list{} attribute names and the attribute values with "copy text with indentation" and save txt file.

If there is no attribute value containing a comma, split by comma, else first use stringreplacer to replace by for instance a backslash (again via texteditor or copying a backslash form another file). Then a stringreplacer to remove the qualifiers, use regexp so you can get the first and last (as to not remove non qualifying ') by regexp= ^'|'$

Now u can split by backslash (or whatever you chose).

Use a renamer to rename the _list{} attributes using the import function. (a addition in recent fme versions I am very happy about)

+15

gio
Contributor
2252 replies
6 years ago
September 14, 2018

gio wrote:

You can simple read this field. You would then get an attribute with the csv in it.

Separate/split by newline (texteditor->specialcharacters->newline). Explode list.

Connect a datainspector transformer.

(If fieldname row has different qualifers, separate it by choosing _element_index = 0, to treat is separately. Else, no need to.)

Export the fieldnames list to a txt list by copying it from the feature information window in the Data inspector. (select the list{} attribute names and the attribute values with "copy text with indentation" and save txt file.

If there is no attribute value containing a comma, split by comma, else first use stringreplacer to replace by for instance a backslash (again via texteditor or copying a backslash form another file). Then a stringreplacer to remove the qualifiers, use regexp so you can get the first and last (as to not remove non qualifying ') by regexp= ^'|'$

Now u can split by backslash (or whatever you chose).

Use a renamer to rename the _list{} attributes using the import function. (a addition in recent fme versions I am very happy about)

I'm assuming the csv is stored in a single record.

Else first part (splitting by newline) is not necessary.

+6

djmcdermott
Author
Contributor
42 replies
6 years ago
September 14, 2018

david_r wrote:

I'm thinking that a PythonCaller with the CSV module is perfect for this, it already has all the necessary mechanisms for dealing with lots of edge cases like quotations, newlines, etc.

Try something like:

from fmeobjects import *import csv class ParseCSVString(object):    def input(self, feature):        csv_string = [feature.getAttribute('CSV_LINE') or '']        csv_parser = csv.reader(csv_string)        for record in csv_parser:            f = feature.clone()            for n, value in enumerate(record):                f.setAttribute('value{%s}' % n, value)            self.pyoutput(f)

Expose the attribute list "value{}" in the PythonCaller.

csv-line.fmwt

Works perfectly. Thanks.

Parse CSV data stored in database table

4 replies

Reply

Helpful Members This Week

Recently Solved Questions

FME Flow version control how to use different branch

Parameters within group parameters not available in a webhook?

How to restart a REST Server in ArcGIS Server?

Remove last CR/LF from a CSV

1019 error with change detector and polygons

Community Stats

Latest FME

Cookie policy

Cookie settings

Reply

Related Topics

Helpful Members This Week

Recently Solved Questions

Popular Tags

Community Stats

Latest FME

Sign up

An FME Account is required to contribute

Login to the community

An FME Account is required to contribute

Scanning file for viruses.

This file cannot be downloaded

Cookie policy

Cookie settings