Solved

Extract text strings into different attributes

  • 6 January 2017
  • 7 replies
  • 135 views

Hi FME experts,

I have a number of text files that look like this (attached). I would like to extract the numbers and store them in separate attributes for further processing.

A 468832.92042276752

B 7175381.1634021383

C 2829

D 388

What is the best way to approach this?

So far, I have tried the AttributeSplitter and the SubstringExtractor without success.

I have also tried wrestling with regex. For instance, the following expression

\\((.*)\\,

returns everything between the first ( and the last , but how can I extract each string separately?

Many thanks in advance.

!table !version 300 !charset WindowsLatin1

Definition Table File "a2.jpg" Type "RASTER" (468832.92042276752,7175381.1634021383) (2829,388) Label "Pt 1", (468991.81894034828,7175225.3923977558) (4127,1684) Label "Pt 2", (468917.96947360213,7175142.166946603) (3520,2370) Label "Pt 3", (468878.98707272188,7175056.807626294) (3195,3063) Label "Pt 4", (468691.66407155019,7175209.742227288) (1660,1784) Label "Pt 5", (468758.97137093107,7175297.6046385625) (2221,1071) Label "Pt 6" CoordSys Earth Projection 8, 116, "m", 153, 0, 0.9996, 500000, 10000000 Units "m"

Best answer by takashi 6 January 2017, 02:29

Hi @thiru, I think the StringSearcher with this regular expression can extract the number parts from each text line.

\((.+?),(.+?)\)

The workflow looks like this.

0684Q00000ArLlzQAF.png
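
For readers who want to check the pattern outside FME, here is a minimal Python sketch of the same idea. It assumes that Python's re module treats this pattern the same way the StringSearcher does. The sample string is the start of the definition line from the question, and the names A to D follow the question's naming.

```python
import re

# Start of the definition line from the question (first control point only).
line = '(468832.92042276752,7175381.1634021383) (2829,388) Label "Pt 1",'

# Same pattern as above: lazy matches for the two values inside each pair of brackets.
pattern = re.compile(r"\((.+?),(.+?)\)")

# findall returns one (first, second) tuple of strings per bracket pair.
pairs = pattern.findall(line)

# Assign the first two pairs to the four attributes named in the question.
A, B = pairs[0]
C, D = pairs[1]
print(A, B, C, D)
```

The values stay as strings here, so no digits are lost to floating-point rounding; convert them only where a numeric type is needed.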

Thanks @takashi for your suggestion. That worked perfectly.

Hi @takashi, how do I read the second line into E, F, G and H, and so on until I have exhausted all the number strings?

Should I read one line of text from the text file and use your suggestion?

0684Q00000ArMqDQAV.png

 

This workflow does the trick. The NumToAlphaConverter custom transformer is published in the FME Hub.

 

 

See also the attached demo workspace (FME 2016.1.3): regex-example-2.fmwt

 

 

[Edit] Updated the workspace so that the result will be written into an Excel spreadsheet: regex-example-3.fmwt
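
Outside FME, the same "repeat until all numbers are exhausted" logic can be sketched in Python. This is only an illustration under assumptions, not a reproduction of the workspace. The helper num_to_alpha is hypothetical and merely imitates sequential letter naming (A, B, ..., Z, AA, ...); it is not claimed to match how the NumToAlphaConverter works internally. The function name extract_attributes is likewise made up for this example.

```python
import re


def num_to_alpha(index):
    """Hypothetical helper: 0 -> A, 1 -> B, ..., 25 -> Z, 26 -> AA."""
    name = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        name = chr(ord("A") + rem) + name
    return name


def extract_attributes(text):
    """Collect every bracketed pair and name the values A, B, C, ... in order."""
    values = []
    for first, second in re.findall(r"\((.+?),(.+?)\)", text):
        values.extend([first, second])
    return {num_to_alpha(i): value for i, value in enumerate(values)}


# First two control points from the definition line in the question.
text = ('(468832.92042276752,7175381.1634021383) (2829,388) Label "Pt 1", '
        '(468991.81894034828,7175225.3923977558) (4127,1684) Label "Pt 2"')

attrs = extract_attributes(text)
for name, value in attrs.items():
    print(name, value)
```

Because the pattern is applied to the whole text with findall, there is no need to read the file one bracket pair at a time; every pair is picked up in a single pass.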

Hello, I have a similar case and am having trouble finding the right regular expression. I receive coordinate lists from various programs, and their formatting differs slightly.

Here are two examples of text lines from different files:

list.txt

 

From these lines I have to read X and Y, which are additionally broken up by one or more spaces.

Can someone please give me a hint?

Many thanks in advance.

Hi @rt_gis, if the value always has decimal places, this expression matches that.

\d+\s+\d+\.\d+

For example, the StringSearcher below populates the two values from a text line into a list attribute "_list{}.match".

0684Q00000ArNETQA3.png
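
The behaviour of this expression can be tried out in Python as a rough sketch. The sample line below is hypothetical, since the contents of list.txt are not shown in this thread, and stripping the inner spaces before converting to a number is an assumption about what the data means.

```python
import re

# Hypothetical input line; the real list.txt is not reproduced in the thread.
line = "PT1   45 123.456   12 987.654"

# Same expression as above: digits, whitespace, then digits with decimal places.
matches = re.findall(r"\d+\s+\d+\.\d+", line)

# Assumption: the spaces inside each value are only separators and can be dropped.
x, y = (float(re.sub(r"\s+", "", m)) for m in matches)
print(matches)
print(x, y)
```

Note that the expression expects exactly one whitespace gap inside each value; a value split into three or more groups of digits would only be matched from its last two groups onward.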

Many Thanks. You helped me again.
