Solved

Extract text strings into different attributes


thiru

Hi FME experts,

I have a number of text files that look like the one attached. I would like to extract the numbers and store them in separate attributes for further processing.

A 468832.92042276752

B 7175381.1634021383

C 2829

D 388

What is the best way to approach this?

So far I have tried the AttributeSplitter and SubstringExtractor without success.

I have also tried wrestling with regex. For instance, the following expression returns everything between the first ( and the last comma:

\((.*)\,

How can I extract each string separately?

Many thanks in advance.

!table
!version 300
!charset WindowsLatin1

Definition Table
  File "a2.jpg"
  Type "RASTER"
  (468832.92042276752,7175381.1634021383) (2829,388) Label "Pt 1",
  (468991.81894034828,7175225.3923977558) (4127,1684) Label "Pt 2",
  (468917.96947360213,7175142.166946603) (3520,2370) Label "Pt 3",
  (468878.98707272188,7175056.807626294) (3195,3063) Label "Pt 4",
  (468691.66407155019,7175209.742227288) (1660,1784) Label "Pt 5",
  (468758.97137093107,7175297.6046385625) (2221,1071) Label "Pt 6"
  CoordSys Earth Projection 8, 116, "m", 153, 0, 0.9996, 500000, 10000000
  Units "m"


7 replies

takashi
Influencer
  • Best Answer
  • January 6, 2017

Hi @thiru, I think the StringSearcher with this regular expression can extract the number parts from each text line.

\((.+?),(.+?)\)

The workflow looks like this.

0684Q00000ArLlzQAF.png
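For readers who want to try the expression outside FME, here is a minimal Python sketch that applies the same pattern to the first control-point line of the posted file. It is only an illustration of the regex, not of how the StringSearcher works internally; the variable names a, b, c, d simply mirror the attributes A to D from the question.

```python
import re

# Same pattern as in the answer: a lazy match for "(first,second)".
PAIR = re.compile(r"\((.+?),(.+?)\)")

# First control-point line from the posted file.
line = '(468832.92042276752,7175381.1634021383) (2829,388) Label "Pt 1",'

# findall returns one (first, second) tuple per parenthesised pair.
(a, b), (c, d) = PAIR.findall(line)

print(a, b, c, d)
```

The lazy quantifier `.+?` stops at the nearest comma or closing parenthesis, which is why each pair is matched on its own; the greedy `.*` in the question runs on to the last comma in the line.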


thiru
  • Author
  • January 6, 2017

Thanks @takashi for your suggestion. That worked perfectly.


thiru
  • Author
  • January 6, 2017

Hi @takashi, how do I read the second line into E, F, G and H? I need to repeat this until all the number strings are exhausted. Should I read the text file one line at a time and apply your suggestion to each line?

 

0684Q00000ArMqDQAV.png

 


takashi
Influencer
  • January 6, 2017

This workflow does the trick. The NumToAlphaConverter custom transformer is published in the FME Hub.

 

 

See also the attached demo workspace (FME 2016.1.3): regex-example-2.fmwt

 

 

[Edit] Updated the workspace so that the result will be written into an Excel spreadsheet: regex-example-3.fmwt
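The same idea can be sketched in Python for anyone following along without the workspace. This is an assumption-laden illustration, not a port of the attached workspace: `num_to_alpha` and `extract_all` are hypothetical helper names, and the spreadsheet-style lettering (A to Z, then AA, AB, ...) is my guess at a sensible naming scheme rather than a documented description of what the NumToAlphaConverter does.

```python
import re

PAIR = re.compile(r"\((.+?),(.+?)\)")


def num_to_alpha(index: int) -> str:
    """Hypothetical helper: 0 -> A, 25 -> Z, 26 -> AA (spreadsheet style)."""
    name = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        name = chr(ord("A") + rem) + name
    return name


def extract_all(text: str) -> dict:
    """Collect every number inside parentheses and name them A, B, C, ..."""
    values = [v for pair in PAIR.findall(text) for v in pair]
    return {num_to_alpha(i): v for i, v in enumerate(values)}


# First two control-point lines from the posted file.
sample = (
    '(468832.92042276752,7175381.1634021383) (2829,388) Label "Pt 1",\n'
    '(468991.81894034828,7175225.3923977558) (4127,1684) Label "Pt 2",\n'
)

attrs = extract_all(sample)
for name, value in attrs.items():
    print(name, value)
```

With the two sample lines this yields attributes A to H; the six lines of the full file would give 24 values, A to X.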

rt_gis
  • November 21, 2018

Hello, I have a similar case and am having trouble finding the right regular expression. I receive coordinate lists from various programs, and their formatting differs slightly.

Here are two examples of text lines from different files:

list.txt

 

From these lines I have to read X and Y, which are additionally separated by one or more spaces.

Can someone please give me a hint?

Many thanks in advance.


takashi
Influencer
  • November 21, 2018

Hi @rt_gis, if each value always has decimal places, this expression matches it.

\d+\s+\d+\.\d+

For example, the StringSearcher below populates the two values from a text line into a list attribute "_list{}.match".

0684Q00000ArNETQA3.png
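A rough Python equivalent is sketched below. The contents of list.txt are not reproduced in the thread, so the sample line here is invented purely for illustration: it assumes each value is written as two digit groups separated by spaces, with a decimal part at the end. Real files may look different, and the pattern only fits values that follow this shape.

```python
import re

# Same pattern as above: digits, whitespace, digits, a dot, digits.
VALUE = re.compile(r"\d+\s+\d+\.\d+")

# Invented sample line (assumption, not taken from list.txt).
line = "X  468 832.92   Y  175 381.16"

# All matches in the line, comparable to the _list{}.match list attribute.
matches = VALUE.findall(line)

# Drop the inner whitespace so the strings can be converted to numbers.
x, y = (float(re.sub(r"\s+", "", m)) for m in matches)

print(matches, x, y)
```

Note that a value split into three or more groups (for example "7 175 381.16") would only be matched from its second group onwards, so the pattern would need adjusting for such data.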


rt_gis
  • November 21, 2018

Many thanks. You helped me again.
