I'm very new to Python and to the PythonCaller, but we all need to start somewhere. I have wrestled with this problem basically all day and I still can't get it to work.
I have a text string "A=0,5m, B = Bredd 0 m, C= Reduktionstal". From this string I want to extract the value between "Bredd " and " m". In this case it is 0, but it can be 10, or it can be 10.2.
I'm attaching an image of my progress so far... And the PythonCaller script so far:
I have often come across this problem and I really want to learn how to solve it. Please help!
Peter
Best answer by takashi
Hi Peter,
I would also use the StringSearcher in this case. However, if you want to learn Python regex operations, this is good practice, of course. There are several possible implementations; here is one example.
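import fme
import fmeobjects
import re

def FeatureProcessor(feature):
    # Capture the number between "Bredd" and "m", e.g. 0, 10 or 10.2
    m = re.search(r'Bredd\s*(\d+\.?\d*)\s*m', feature.getAttribute('text'))
    if m:
        # group(1) is the text matched by the parenthesised part of the pattern
        feature.setAttribute('substr', m.group(1))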
The forum editor may not display the backslash before the dot in the regex. Make sure the pattern has a single backslash before the dot:
Bredd\s*(\d+\.?\d*)\s*m
Note that "re.search" returns a match object (or None when nothing matches), not the matched substring. Call m.group(1) to get the captured text.
See here to learn more about regex operations with Python.
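To see the difference between the match object and the captured text outside FME, a minimal sketch in plain Python could look like this. The helper name extract_width is made up for illustration; the pattern is the one from the answer above.

```python
import re

def extract_width(text):
    """Return the number between 'Bredd' and 'm' as a string, or None."""
    m = re.search(r'Bredd\s*(\d+\.?\d*)\s*m', text)
    # m is a match object (or None), not a string; group(1) is the captured part
    return m.group(1) if m else None

print(extract_width('A=0,5m, B = Bredd 0 m, C= Reduktionstal'))     # 0
print(extract_width('A=0,5m, B = Bredd 10.2 m, C= Reduktionstal'))  # 10.2
print(extract_width('no width here'))                                # None
```

The result is a string; wrap it in float() if a number is needed downstream.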
def FeatureProcessor(feature):
    # Get the text attribute
    att = feature.getAttribute('text')
    # Everything to the right of "Bredd"
    right = att.split('Bredd')[1]
    # Everything to the left of the next "m"
    left = right.split('m')[0]
    # Trim extra spaces
    substr = left.strip()
    feature.setAttribute('substr', substr)
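One thing to watch with the split version: it raises an IndexError when "Bredd" is missing, and it returns everything after "Bredd" when no "m" follows. A possible guarded variant, sketched here with str.partition and a hypothetical helper name (extract_width_split), might look like this:

```python
def extract_width_split(text):
    """Return the text between 'Bredd' and the next 'm', or None."""
    # partition returns (before, separator, after); separator is '' if not found
    _, found, after = text.partition('Bredd')
    if not found:
        return None
    value, found, _ = after.partition('m')
    if not found:
        return None
    return value.strip()

print(extract_width_split('A=0,5m, B = Bredd 0 m, C= Reduktionstal'))  # 0
print(extract_width_split('A=0,5m, B = no width'))                     # None
```

Unlike the regex, this does not check that the extracted text is actually a number.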
Larry
Larry, thank you very much for this enlightening answer. Very much appreciated. Peter
Takashi, thank you very much. This was perfect and educational for me. I will use this and solve all my string-related problems in a flash. Much appreciated. Peter
I measured the performance of all proposed solutions, and here are the numbers:
Test case                   | Nb. of features | Run 1 | Run 2 | Run 3 | Average
PythonCaller + string.find  | 1 000 000       | 20.5  | 20.3  | 20.1  | 20.3
PythonCaller + string.split | 1 000 000       | 20.5  | 20.4  | 20.9  | 20.6
PythonCaller + regex        | 1 000 000       | 23.3  | 21.8  | 23.6  | 22.9
StringSearcher              | 1 000 000       | 32.7  | 32.0  | 32.4  | 32.4
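The code behind the string.find test case does not appear in this thread. As a rough idea of what such a variant might look like, here is a sketch using str.find; the function name and its marker parameters are assumptions for illustration, not the code that was actually measured.

```python
def extract_width_find(text, start_marker='Bredd', end_marker='m'):
    """Return the text between start_marker and the next end_marker, or None."""
    start = text.find(start_marker)
    if start == -1:
        return None
    # Move past the start marker itself
    start += len(start_marker)
    # Look for the end marker only after the start marker
    end = text.find(end_marker, start)
    if end == -1:
        return None
    return text[start:end].strip()

print(extract_width_find('A=0,5m, B = Bredd 0 m, C= Reduktionstal'))  # 0
```

Inside a PythonCaller, the return value would be passed to feature.setAttribute in the same way as in the other solutions.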
Larry
Nice analysis, very helpful. I suspect that the StringSearcher will be faster starting with FME 2016, as they've replaced the Tcl regex engine with something (hopefully) more efficient. As you can see, the StringSearcher currently calls the Tcl engine for each feature that enters, leading to quite a bit of overhead.
For the sake of performance, you could also use a pre-compiled regex. It shaves off a couple of seconds:
import re

# Compile once at module level, not once per feature
precomp_regex = re.compile(r'Bredd\s*(\d+\.?\d*)\s*m')

def FeatureProcessor(feature):
    m = precomp_regex.search(feature.getAttribute('text'))
    if m:
        feature.setAttribute('substr', m.group(1))
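To try the pre-compiled version outside FME, something like the following sketch may help. FakeFeature is a hypothetical stand-in written for this example, not part of the FME API; it only mimics getAttribute and setAttribute. The None check assumes a missing attribute comes back as None.

```python
import re

class FakeFeature:
    """Hypothetical stand-in for an FME feature, for local testing only."""
    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def getAttribute(self, name):
        return self._attrs.get(name)

    def setAttribute(self, name, value):
        self._attrs[name] = value

# Compiled once, reused for every feature
WIDTH_RE = re.compile(r'Bredd\s*(\d+\.?\d*)\s*m')

def FeatureProcessor(feature):
    text = feature.getAttribute('text')
    if text is None:
        # Assumption: a missing attribute yields None; skip such features
        return
    m = WIDTH_RE.search(text)
    if m:
        feature.setAttribute('substr', m.group(1))

f = FakeFeature({'text': 'A=0,5m, B = Bredd 10.2 m, C= Reduktionstal'})
FeatureProcessor(f)
print(f.getAttribute('substr'))  # 10.2
```

Features without a match simply keep no 'substr' attribute, the same as in the versions above.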