Skip to main content
Question

splitting records with multiple delimiters?

  • January 10, 2017
  • 8 replies
  • 201 views

nicholas
Contributor
Forum|alt.badge.img+10

Hello,

I have a CSV dataset that is mostly records like this, with a "|" delimiter

2185996|SP213604|1|C|P|UK|||||857||||GYMPIE|RD||CHERMSIDE|BRISBANE CITY|1000|-27.38682393|153.03160112|PC|GDA94

However, some of the records also have a "," delimiter

2388577|SP246762|6|C|P|UK||||WHELLER ON THE PARK|950||||GYMPIE|RD||CHERMSIDE|BRISBANE CITY|1000|-27.38253041,-27.380533|153.02547625,153.027633|PC,BC|GDA94

This is a dataset of street addresses and property points. Most points are "parcel centre" or PC. But some also have a "building centre" BC. These records have two lats, two longs and two types, separated by commas.

Is there a transformer in FME that will split each of these records into two records?

8 replies

nielsgerrits
VIP
Forum|alt.badge.img+53

You could use the AttributeSplitter to solve this. "Splits a selected attribute into a list attribute. Each item in the list will contain a single token split from the list." Then you can duplicate the record and overwrite the coordinate columns with input from the list.

01-csv-ffs.fmwt

You could / should expand this with checks.

 

- Do the lon lat lists contain the same number of elements within the record. (ListElementCounter, Tester)

 

- Are the new lon lat values numeric? (Tester, Type is Numeric.)

Workbench and data added as template.


david_r
Evangelist
  • January 10, 2017

If you have a lot of different separators, you can simplify your workspace a bit by using the StringPairReplacer to homogenize all the separators into a single one before the AttributeSplitter.


Forum|alt.badge.img+7
david_r wrote:

If you have a lot of different separators, you can simplify your workspace a bit by using the StringPairReplacer to homogenize all the separators into a single one before the AttributeSplitter.

I was about to give the same advise but right before typing my answer, I saw your comment appear. Damn you're fast ;)

 

 


david_r
Evangelist
  • January 10, 2017
jeroenstiers wrote:
I was about to give the same advise but right before typing my answer, I saw your comment appear. Damn you're fast ;)

 

 

Can't sleep around here! ;-)

 


itay
Supporter
Forum|alt.badge.img+17
  • Supporter
  • January 10, 2017

I would also opt for unifying the delimiters and then using a single AttributeSplitter.


nielsgerrits
VIP
Forum|alt.badge.img+53
david_r wrote:

If you have a lot of different separators, you can simplify your workspace a bit by using the StringPairReplacer to homogenize all the separators into a single one before the AttributeSplitter.

I'm learning every day so please be gentle. I would like to know how you would solve this? The rows with multiple separators will get more list elements / columns. I can't get my head around it how to do it other than manual assign the correct column names?

david_r
Evangelist
  • January 10, 2017
nielsgerrits wrote:
I'm learning every day so please be gentle. I would like to know how you would solve this? The rows with multiple separators will get more list elements / columns. I can't get my head around it how to do it other than manual assign the correct column names?
I just had another look at your data. If your CSV contains column names on the first row, I think the easiest would be to read it using the CSV reader, then split up any "double" values (e.g. coordinate pairs) using the AttributeSplitter.

takashi
Contributor
  • January 10, 2017

I think the MultiAttributeSplitter from the FME Hub can be used effectively in conjunction with the CSV reader.

Result from the two "pipe separated values" sample lines. The second sample line has been separated into two records- one for PC, another for BC.


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings