Question

Splitting records with multiple delimiters?

January 10, 2017

nicholas
Contributor

Hello,

I have a CSV dataset in which most records look like this, with "|" as the delimiter:

2185996|SP213604|1|C|P|UK|||||857||||GYMPIE|RD||CHERMSIDE|BRISBANE CITY|1000|-27.38682393|153.03160112|PC|GDA94

However, some of the records also use "," as a delimiter inside certain fields:

2388577|SP246762|6|C|P|UK||||WHELLER ON THE PARK|950||||GYMPIE|RD||CHERMSIDE|BRISBANE CITY|1000|-27.38253041,-27.380533|153.02547625,153.027633|PC,BC|GDA94

This is a dataset of street addresses and property points. Most points are a "parcel centre" (PC), but some also have a "building centre" (BC). Those records have two lats, two longs and two types, separated by commas.

Is there a transformer in FME that will split each of these records into two records?

nielsgerrits
January 10, 2017

You could use the AttributeSplitter to solve this: "Splits a selected attribute into a list attribute. Each item in the list will contain a single token split from the list." You can then duplicate the record and overwrite the coordinate columns with the values from the list.

01-csv-ffs.fmwt

You could (and probably should) extend this with checks:

- Do the lon and lat lists contain the same number of elements within each record? (ListElementCounter, Tester)

- Are the new lon and lat values numeric? (Tester, Type is Numeric)

Workbench and data added as template.
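
For anyone reading along outside FME, the logic described above (split the multi-valued fields into lists, check them, then emit one copy of the record per list element) can be sketched in plain Python. This is only an illustration and not the contents of the attached template: the function names are made up, and the zero-based field positions 20, 21 and 22 for latitude, longitude and point type are assumptions read off the two sample records in the question.

```python
# Assumed zero-based positions of the multi-valued fields in the sample records.
LAT, LON, KIND = 20, 21, 22


def is_number(text):
    """Return True if text parses as a float (rough stand-in for a numeric test)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def split_record(line, field_sep="|", value_sep=",",
                 multi=(LAT, LON, KIND), numeric=(LAT, LON)):
    """Return one record per value found in the multi-valued fields."""
    fields = line.split(field_sep)
    lists = {i: fields[i].split(value_sep) for i in multi}

    # Check 1: every multi-valued field must hold the same number of values.
    counts = {len(values) for values in lists.values()}
    if len(counts) != 1:
        raise ValueError("multi-valued fields differ in length: " + line)

    # Check 2: the coordinate values must be numeric.
    for i in numeric:
        if not all(is_number(v) for v in lists[i]):
            raise ValueError("non-numeric coordinate in field %d: %s" % (i, line))

    records = []
    for n in range(counts.pop()):
        copy = list(fields)           # duplicate the record
        for i, values in lists.items():
            copy[i] = values[n]       # overwrite with the n-th list element
        records.append(field_sep.join(copy))
    return records


RECORD = ("2388577|SP246762|6|C|P|UK||||WHELLER ON THE PARK|950||||GYMPIE|RD||"
          "CHERMSIDE|BRISBANE CITY|1000|-27.38253041,-27.380533|"
          "153.02547625,153.027633|PC,BC|GDA94")

for rec in split_record(RECORD):
    print(rec)
```

A record without commas comes back unchanged as a single-element list, so the same function can be run over every line. Records with an empty coordinate field would be rejected by the numeric check in this sketch.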

Buckle your seatbelt Dorothy, cause Kansas is going bye-bye...

david_r
January 10, 2017

If you have a lot of different separators, you can simplify your workspace a bit by using the StringPairReplacer to homogenize all the separators into a single one before the AttributeSplitter.
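
To illustrate the idea of homogenizing separators before a single split, here is a rough Python sketch. It only mimics the replace-then-split step and is not the StringPairReplacer itself; the `homogenize` helper and the extra ";" separator are hypothetical.

```python
import re


def homogenize(line, separators=(",", ";"), target="|"):
    """Replace every listed separator with the target separator."""
    pattern = "|".join(re.escape(s) for s in separators)
    return re.sub(pattern, target, line)


record = ("2388577|SP246762|6|C|P|UK||||WHELLER ON THE PARK|950||||GYMPIE|RD||"
          "CHERMSIDE|BRISBANE CITY|1000|-27.38253041,-27.380533|"
          "153.02547625,153.027633|PC,BC|GDA94")

tokens = homogenize(record).split("|")
print(len(tokens))
print(tokens[-7:])
```

Keep in mind that a record which had commas inside some fields now yields more tokens than one that did not (27 versus 24 for the two sample records), so any mapping from token position to column has to account for that.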

jeroenstiers
January 10, 2017

I was about to give the same advice, but right before typing my answer I saw your comment appear. Damn, you're fast ;)

david_r
January 10, 2017

Can't sleep around here! ;-)

itay
Supporter
January 10, 2017

I would also opt for unifying the delimiters and then using a single AttributeSplitter.

nielsgerrits
January 10, 2017

Regarding the suggestion to homogenize all the separators with the StringPairReplacer before the AttributeSplitter: I'm learning every day, so please be gentle. How would you solve this? The rows with multiple separators will end up with more list elements (columns), and I can't see how to handle that other than manually assigning the correct column names.

david_r
January 10, 2017

I just had another look at your data. If your CSV contains column names on the first row, I think the easiest would be to read it using the CSV reader, then split up any "double" values (e.g. coordinate pairs) using the AttributeSplitter.
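
As a rough illustration of that last idea outside FME, the sketch below reads pipe-delimited text with a header row and then splits only the "double" values, addressing them by column name instead of by position. The column names and the shortened sample rows are hypothetical, since the real header of the dataset is not shown in this thread.

```python
import csv
import io

# Hypothetical header and shortened rows; the real column names are not known.
SAMPLE = (
    "ID|LOCALITY|LATITUDE|LONGITUDE|POINT_TYPE|DATUM\n"
    "2185996|CHERMSIDE|-27.38682393|153.03160112|PC|GDA94\n"
    "2388577|CHERMSIDE|-27.38253041,-27.380533|153.02547625,153.027633|PC,BC|GDA94\n"
)

MULTI_COLUMNS = ("LATITUDE", "LONGITUDE", "POINT_TYPE")


def explode_rows(text, multi_columns=MULTI_COLUMNS):
    """Read pipe-delimited text and emit one row per value in the multi-valued columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    rows = []
    for row in reader:
        lists = {name: row[name].split(",") for name in multi_columns}
        counts = {len(values) for values in lists.values()}
        if len(counts) != 1:
            raise ValueError("columns differ in value count for ID " + row["ID"])
        for n in range(counts.pop()):
            new_row = dict(row)              # copy the untouched columns
            for name in multi_columns:
                new_row[name] = lists[name][n]
            rows.append(new_row)
    return rows


for r in explode_rows(SAMPLE):
    print(r["ID"], r["LATITUDE"], r["LONGITUDE"], r["POINT_TYPE"])
```

Because the columns are addressed by name, the extra values never shift the other columns, which avoids the manual column assignment worried about above.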