Question

Isolate first line of text using RegEx

Forum|Forum|6 years ago
April 24, 2019
4 replies
479 views

tlabelleacc

Hi @takashi, do you have any ideas on how I could isolate the first line of text in each 'cluster' of text? I have about 3000 to isolate. The cluster formats are all the same, but the first line of text varies in length.

This information is currently stored in a text file, and each cluster is separated by one carriage return.

WGS84-YORK_LANDING-CZAC

DESC_NM WGS84 YORK_LANDING CZAC

DT_NAME WGS84

PROJ LM

UNIT INCH

WGS84-CASTLEGAR_-CCT3

DESC_NM WGS84 CASTLEGAR_ CCT3

DT_NAME WGS84

PROJ LM

UNIT INCH

lars_de_vries
April 24, 2019

@tlabelleacc,

It is not clear to me whether the so-called clusters are in separate files or all within one file.

If it is the latter, I think I would first break up the text by searching for a double newline. This can be done using an AttributeSplitter and a ListExploder.

Second, when I look at the texts that are marked bold, I see no white spaces or other similar characters. So you could use a RegEx like ^(.*)?\s to get the first line or, if you don't want to use RegEx, just repeat the previous step, search for a single newline character, and put the first list item in a new attribute. That would probably do the trick as well.
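For readers who want to try the two-step idea outside FME, here is a minimal Python sketch of it: split the text on blank lines, then keep the first line of each piece. The sample string and the function name `first_lines` are made up for illustration, and the sketch assumes that the lines of a cluster are consecutive and that clusters are separated by one blank line, as the question describes.

```python
import re

# Hypothetical sample, assuming clusters are separated by one blank line.
SAMPLE = (
    "WGS84-YORK_LANDING-CZAC\n"
    "DESC_NM WGS84 YORK_LANDING CZAC\n"
    "DT_NAME WGS84\n"
    "PROJ LM\n"
    "UNIT INCH\n"
    "\n"
    "WGS84-CASTLEGAR_-CCT3\n"
    "DESC_NM WGS84 CASTLEGAR_ CCT3\n"
    "DT_NAME WGS84\n"
    "PROJ LM\n"
    "UNIT INCH\n"
)


def first_lines(text):
    """Return the first line of every blank-line-separated cluster."""
    # Normalize Windows line endings so the split pattern stays simple.
    text = text.replace("\r\n", "\n").strip()
    # Step 1: split on one or more blank lines (the "double newline").
    clusters = re.split(r"\n\s*\n", text)
    # Step 2: split each cluster on a single newline and keep the first item.
    return [c.strip().split("\n", 1)[0].strip() for c in clusters if c.strip()]


print(first_lines(SAMPLE))
# ['WGS84-YORK_LANDING-CZAC', 'WGS84-CASTLEGAR_-CCT3']
```

This only mirrors the logic; in a workspace the same two steps would be done with the transformers named above.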

Though I'm not @Takashi (I can only wish to become a grandmaster like him), I do hope this helps you move forward.

tlabelleacc
Author
April 24, 2019

All the clusters of text are in one TXT File.

The AttributeSplitter would create more than 6000 lists, as my text file has over 3000 clusters. Unfortunately, the RegEx did not return anything through the RegAttributeSplitter.

ebygomm
Influencer
April 24, 2019

Can you provide a sample of the text file itself? My first thought would be to read the text file line by line and use adjacent attribute handling to isolate the lines you want.

keep_after_blank.fmwt
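The line-by-line idea can be sketched in Python as a rough stand-in for comparing each line with the one before it. This is not the content of the attached workspace, which is not shown here. The function name `keep_after_blank` is borrowed from the attachment's file name purely for illustration, and the sample lines assume one blank line between clusters.

```python
# Hypothetical sample lines, assuming one blank line between clusters.
SAMPLE_LINES = [
    "WGS84-YORK_LANDING-CZAC",
    "DESC_NM WGS84 YORK_LANDING CZAC",
    "DT_NAME WGS84",
    "PROJ LM",
    "UNIT INCH",
    "",
    "WGS84-CASTLEGAR_-CCT3",
    "DESC_NM WGS84 CASTLEGAR_ CCT3",
    "DT_NAME WGS84",
    "PROJ LM",
    "UNIT INCH",
]


def keep_after_blank(lines):
    """Yield each non-blank line whose preceding line is blank or missing."""
    previous = ""  # treat the start of the input as if a blank line came before
    for raw in lines:
        line = raw.strip()
        # Keep the line only when it has text and the prior line had none.
        if line and not previous:
            yield line
        previous = line


print(list(keep_after_blank(SAMPLE_LINES)))
# ['WGS84-YORK_LANDING-CZAC', 'WGS84-CASTLEGAR_-CCT3']
```

Because the function accepts any iterable of strings, an open file object can be passed in directly, so the whole file never has to be split into thousands of lists.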

takashi
April 25, 2019

If the first line always starts with 'WGS84-' and the other lines don't, you can simply use a Tester with the "Begins With" operator to isolate the first line.
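The same prefix test is easy to check outside FME. The Python sketch below is only an illustration of the "Begins With" logic, not of the Tester itself; the function name `lines_with_prefix` and the sample lines are hypothetical, and it relies on the stated assumption that only the first line of a cluster starts with 'WGS84-'.

```python
# Hypothetical sample lines taken from the clusters in the question.
SAMPLE_LINES = [
    "WGS84-YORK_LANDING-CZAC",
    "DESC_NM WGS84 YORK_LANDING CZAC",
    "DT_NAME WGS84",
    "PROJ LM",
    "UNIT INCH",
    "",
    "WGS84-CASTLEGAR_-CCT3",
    "DESC_NM WGS84 CASTLEGAR_ CCT3",
    "DT_NAME WGS84",
    "PROJ LM",
    "UNIT INCH",
]


def lines_with_prefix(lines, prefix="WGS84-"):
    """Return the lines that begin with the given prefix."""
    # Strip surrounding whitespace first so a stray newline or indent
    # does not hide a matching line.
    return [line.strip() for line in lines if line.strip().startswith(prefix)]


print(lines_with_prefix(SAMPLE_LINES))
# ['WGS84-YORK_LANDING-CZAC', 'WGS84-CASTLEGAR_-CCT3']
```

Lines such as "DESC_NM WGS84 ..." contain the datum name but do not begin with the hyphenated prefix, so they are not selected.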

Why not inspect features with Visual/Data Preview and Feature/Record Information before writing them into a destination dataset?
