Question

splitting records with multiple delimiters?

8 years ago
January 10, 2017
8 replies
201 views

+10

nicholas
Contributor
112 replies

Hello,

I have a CSV dataset that consists mostly of records like this, with a "|" delimiter:

2185996|SP213604|1|C|P|UK|||||857||||GYMPIE|RD||CHERMSIDE|BRISBANE CITY|1000|-27.38682393|153.03160112|PC|GDA94

However, some of the records also use a "," delimiter inside individual fields:

2388577|SP246762|6|C|P|UK||||WHELLER ON THE PARK|950||||GYMPIE|RD||CHERMSIDE|BRISBANE CITY|1000|-27.38253041,-27.380533|153.02547625,153.027633|PC,BC|GDA94

This is a dataset of street addresses and property points. Most points are a "parcel centre" (PC), but some records also have a "building centre" (BC). Those records have two lats, two longs and two types, separated by commas.

Is there a transformer in FME that will split each of these records into two records?

+53

nielsgerrits
2781 replies
8 years ago
January 10, 2017

You could use the AttributeSplitter to solve this. "Splits a selected attribute into a list attribute. Each item in the list will contain a single token split from the list." Then you can duplicate the record and overwrite the coordinate columns with input from the list.

01-csv-ffs.fmwt

You could (and probably should) expand this with checks:

- Do the lon and lat lists contain the same number of elements within the record? (ListElementCounter, Tester)

- Are the new lon and lat values numeric? (Tester, Type is Numeric)

Workbench and data added as template.
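For anyone who wants to see the same logic outside FME, here is a minimal Python sketch of the split, duplicate and overwrite idea, including the two checks. It is not what the template workspace does internally. The column positions 20, 21 and 22 (lat, lon, type) are an assumption taken from counting the fields in the two sample lines in the question, and `split_record` is a made-up helper name.

```python
SAMPLE_PC = ("2185996|SP213604|1|C|P|UK|||||857||||GYMPIE|RD||CHERMSIDE|"
             "BRISBANE CITY|1000|-27.38682393|153.03160112|PC|GDA94")
SAMPLE_PC_BC = ("2388577|SP246762|6|C|P|UK||||WHELLER ON THE PARK|950||||GYMPIE|RD||"
                "CHERMSIDE|BRISBANE CITY|1000|-27.38253041,-27.380533|"
                "153.02547625,153.027633|PC,BC|GDA94")


def split_record(line, multi_cols=(20, 21, 22), numeric_cols=(20, 21),
                 sep="|", sub_sep=","):
    """Return one list of fields per comma-separated value set in the record.

    multi_cols and numeric_cols are assumed column positions; numeric_cols
    must be a subset of multi_cols.
    """
    fields = line.rstrip("\r\n").split(sep)

    # Split each multi-valued column into tokens (comparable to a list attribute).
    parts = {i: fields[i].split(sub_sep) for i in multi_cols}

    # Check 1: every multi-valued column must hold the same number of tokens.
    counts = {len(tokens) for tokens in parts.values()}
    if len(counts) != 1:
        raise ValueError(f"unequal token counts in columns {multi_cols}: {line!r}")

    # Check 2: coordinate tokens must be numeric (float raises ValueError if not).
    for i in numeric_cols:
        for token in parts[i]:
            float(token)

    records = []
    for k in range(counts.pop()):
        record = list(fields)          # duplicate the whole record
        for i, tokens in parts.items():
            record[i] = tokens[k]      # overwrite with the k-th token
        records.append(record)
    return records


if __name__ == "__main__":
    for record in split_record(SAMPLE_PC_BC):
        print("|".join(record))
```

Passing the column positions as parameters, rather than treating any field with a comma as multi-valued, avoids splitting a free-text field that happens to contain a comma.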

david_r
8318 replies
8 years ago
January 10, 2017

If you have a lot of different separators, you can simplify your workspace a bit by using the StringPairReplacer to homogenize all the separators into a single one before the AttributeSplitter.
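As a rough illustration of the idea in Python (not of the StringPairReplacer itself), the sketch below replaces every listed separator with a single target before splitting. `unify_separators` and the separator set are hypothetical choices for the example.

```python
import re


def unify_separators(text, separators=("|", ",", ";"), target="|"):
    """Replace every listed separator with one target separator."""
    # Escape each separator so characters like "|" are matched literally.
    pattern = "|".join(re.escape(s) for s in separators)
    return re.sub(pattern, target, text)


if __name__ == "__main__":
    tail = "1000|-27.38253041,-27.380533|PC,BC"
    print(tail.split("|"))                    # 3 tokens
    print(unify_separators(tail).split("|"))  # 5 tokens
```

Note that unifying the separators flattens everything into one token list, so a record with comma-separated sub-values ends up with more tokens than a record without them.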

jeroenstiers
178 replies
8 years ago
January 10, 2017

david_r wrote:

If you have a lot of different separators, you can simplify your workspace a bit by using the StringPairReplacer to homogenize all the separators into a single one before the AttributeSplitter.

I was about to give the same advice, but right before typing my answer I saw your comment appear. Damn, you're fast ;)

david_r
8318 replies
8 years ago
January 10, 2017

jeroenstiers wrote:

I was about to give the same advice, but right before typing my answer I saw your comment appear. Damn, you're fast ;)

Can't sleep around here! ;-)

+17

itay
Supporter
1441 replies
8 years ago
January 10, 2017

I would also opt for unifying the delimiters and then using a single AttributeSplitter.

+53

nielsgerrits
2781 replies
8 years ago
January 10, 2017

david_r wrote:

If you have a lot of different separators, you can simplify your workspace a bit by using the StringPairReplacer to homogenize all the separators into a single one before the AttributeSplitter.

I'm learning every day, so please be gentle. I would like to know how you would solve this. The rows with multiple separators will get more list elements / columns, and I can't get my head around how to handle that other than manually assigning the correct column names.

david_r
8318 replies
8 years ago
January 10, 2017

nielsgerrits wrote:

I just had another look at your data. If your CSV contains column names on the first row, I think the easiest would be to read it using the CSV reader, then split up any "double" values (e.g. coordinate pairs) using the AttributeSplitter.
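Outside FME, the same "read by column name, then split the double values" approach could look roughly like this in Python. The header row and the column names (`ID`, `LAT`, `LON`, `TYPE`, `DATUM`) are invented for the example, since the real column names are not shown in the thread, and the records are shortened to five columns.

```python
import csv
import io

# Hypothetical, shortened data with a header row; real column names are unknown.
TEXT = """ID|LAT|LON|TYPE|DATUM
2185996|-27.38682393|153.03160112|PC|GDA94
2388577|-27.38253041,-27.380533|153.02547625,153.027633|PC,BC|GDA94
"""


def explode_rows(rows, multi_cols, sub_sep=","):
    """Yield one row per value set found in the multi-valued columns."""
    for row in rows:
        split_values = [row[c].split(sub_sep) for c in multi_cols]
        # zip pairs the k-th token of every column; it stops at the shortest list.
        for values in zip(*split_values):
            yield {**row, **dict(zip(multi_cols, values))}


if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(TEXT), delimiter="|")
    for row in explode_rows(reader, ["LAT", "LON", "TYPE"]):
        print(row)
```

Because the reader splits only on "|", the comma-separated pairs arrive intact in their named columns, so no column names have to be reassigned after the split.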

takashi
7556 replies
8 years ago
January 10, 2017

I think the MultiAttributeSplitter from the FME Hub can be used effectively in conjunction with the CSV reader.

Result from the two "pipe separated values" sample lines: the second sample line has been separated into two records, one for PC and another for BC.
