Solved

Extract text from string into attribute


I have assembled a table in SpatiaLite holding 30 million depth sounding point observations originating from a number of vessels, each with a large number of files.

 

To keep some metadata about where each point originates from, I've kept the file path and name in an attribute called "filepath".

 

From the attribute "filepath", I need to extract the specific part that holds the name of the vessel. I guess this calls for some kind of regex?

 

 

Example of the attribute's content:

\folder\VESSEL_Lola\folder2\file1.shp
\folder\VESSEL_Maria\folder1\file2.shp
\folder\VESSEL_Lily\folder4\file3.shp
\folder\VESSEL_Christine\folder1\file4.shp
\folder\VESSEL_ClaudiaMaria\folder2\file5.shp
\folder\VESSEL_Maria\folder3\file6.shp

and so on.

 

 

I need to extract the "VESSEL_Maria" etc. from the attribute and map it to a more explanatory value in e.g. AttributeValueMapper. There are only 12 different "VESSEL_YY" categories, but a lot of different subfolder and file names in the filepath attribute.

 

How should I construct the Source Value parameter in AttributeValueMapper or similar?


3 replies

david_r
Celebrity
  • Best Answer
  • August 4, 2015
Hi

 

 

you can use a StringSearcher like this:

 

 

 

 

You will then get the following new attributes, e.g.:

 

 

`_matched_characters' has value `VESSEL_Maria'

 

`_matched_parts{0}' has value `Maria'
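
The StringSearcher settings themselves are not shown in the text above. Judging from the two result attributes and from the later discussion of (), \w and *, the expression was presumably close to VESSEL_(\w*); treat that pattern as an assumption. Below is a minimal Python sketch of the same match, with a plain dictionary standing in for AttributeValueMapper. The descriptive labels in VESSEL_LABELS and the function names are made up for illustration.

```python
import re

# Assumed pattern, inferred from the results quoted above; not confirmed by the thread.
VESSEL_PATTERN = re.compile(r"VESSEL_(\w*)")

# Hypothetical lookup table playing the role of AttributeValueMapper.
VESSEL_LABELS = {
    "Maria": "Survey vessel Maria",
    "Lola": "Survey vessel Lola",
}


def extract_vessel(filepath):
    """Return (whole match, captured name), or (None, None) if nothing matches."""
    match = VESSEL_PATTERN.search(filepath)
    if match is None:
        return None, None
    return match.group(0), match.group(1)


def describe_vessel(filepath, default="Unknown vessel"):
    """Map the captured vessel name to a descriptive label."""
    _, name = extract_vessel(filepath)
    return VESSEL_LABELS.get(name, default)


matched, name = extract_vessel(r"\folder\VESSEL_Maria\folder1\file2.shp")
print(matched)  # VESSEL_Maria
print(name)     # Maria
```

Here match.group(0) plays the role of `_matched_characters' and match.group(1) the role of `_matched_parts{0}' as quoted above.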

 

 

David

 

 

 

Hi David,

 

Thank you, works fine.

 

Regex is very useful but a bit hard to grasp and construct for me.

 

More on regex in the docs for the next readers:

 

http://docs.safe.com/fme/html/FME_Workbench/FME_Workbench.htm#Workbench/Regular_Expressions.htm

 

 

As far as I understand it:

 

() - matches an empty string - look for something in a text.

 

\w - looks for alphanumeric characters (letters and numbers)

 

* - Indicates zero or more characters.

 

 

I am not really sure how the backslashes in my strings are avoided.

david_r
Celebrity
  • August 4, 2015
Hi

 

 

Yes, regular expressions can be incredibly powerful, but they're anything but user friendly :-)

 

 

We avoid the backslashes since they're not included in the word character class "\w" (letters, digits and the underscore), so the match stops at the first backslash after the vessel name.
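
A small Python sketch of that behaviour, assuming the expression is VESSEL_(\w*) (the exact StringSearcher expression is not quoted in this thread) and using Python's re module rather than FME:

```python
import re

path = r"\folder\VESSEL_ClaudiaMaria\folder2\file5.shp"

# \w matches letters, digits and the underscore, but not the backslash.
print(re.fullmatch(r"\w", "\\"))  # None

# So \w* consumes "ClaudiaMaria" and stops at the following backslash.
match = re.search(r"VESSEL_(\w*)", path)
print(match.group(1))     # ClaudiaMaria
print(path[match.end()])  # the character right after the match: \
```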

 

 

The parentheses (...) form a numbered capturing group: whatever the pattern inside them matches is captured, so that you can reference it later using the "_matched_parts{}" list attribute.
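
To illustrate the numbering, here is a hedged Python sketch with two groups. The second group, which captures the folder following the vessel name, is a hypothetical extension and not part of the original answer. Note that Python numbers captured groups from 1 and reserves 0 for the whole match, whereas the list attribute quoted in this thread starts at index 0.

```python
import re

path = r"\folder\VESSEL_Lily\folder4\file3.shp"

# Group 1: the vessel name. Group 2: the folder after it (hypothetical extension).
# The \\ between the groups matches one literal backslash.
pattern = re.compile(r"VESSEL_(\w*)\\(\w*)")

m = pattern.search(path)
print(m.group(0))  # VESSEL_Lily\folder4
print(m.group(1))  # Lily
print(m.group(2))  # folder4
print(m.groups())  # ('Lily', 'folder4')
```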

 

 

If you want to learn more about regular expressions, I can recommend this tutorial: http://www.regular-expressions.info/tutorial.html

 

 

David
