Solved

Regex to match lines with coordinates

  • December 20, 2019
  • 2 replies
  • 94 views

dustin
Influencer

From the attribute above containing newlines, I'm trying to extract each line of coordinates into a list using a StringSearcher with regex: \\n.+°.+°.+\\n

I don't want lines of text to be included in the list that are not coordinates, so the first line starting with "Buoy" would not be captured. It should also be noted that the lines that are not coordinates may contain a '°' symbol, such as "Range line at 186°". Because of this, my methodology consisted of matching any set of characters containing 2 '°' symbols that fall between two newlines.

My example regex seems to be close, but it is also capturing the first 2 numbers of the next string. How would I need to format my regex to avoid this?
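One pitfall of matching "between two newlines" can be sketched outside FME. The example below uses Python's `re` module, not the StringSearcher's regex engine, so the exact symptom may differ from the one described above. The sample text is made up for illustration. It shows that a pattern which consumes the newline on both sides cannot match two adjacent coordinate lines, because the newline that ends one line is the same one that would have to begin the next. Anchoring with `^` and `$` in multiline mode leaves the newlines unconsumed.

```python
import re

# Hypothetical sample text (not the poster's actual attribute value).
text = (
    "Buoy A, red nun\n"
    "44°17'N 67°41W\n"
    "43°57'N 67°45W\n"
    "Range line at 186°\n"
)

# The question's idea: two '°' symbols between two newlines.
# Both newlines are part of the match, so they are consumed.
consuming = re.findall(r"\n.+°.+°.+\n", text)

# Same idea, but ^ and $ only assert line boundaries (re.MULTILINE)
# and do not consume the newline characters.
anchored = re.findall(r"^.+°.+°.+$", text, flags=re.MULTILINE)

print(consuming)  # only the first coordinate line, with its newlines
print(anchored)   # both coordinate lines, without newlines
```

In this sketch the line "Range line at 186°" is skipped by both patterns because it contains only one '°'.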

Best answer by markatsafe

@cartoscro In the StringSearcher you can use a regex something like:

([0-9]+°[0-9]+'[NS] [0-9]+°[0-9]+[EW])

The () designate a substring. Under the Advanced panel you can create a list of all the substrings that are matched.

The result will look something like:

_coords{0}.part         44°17'N 67°41W
_coords{0}.startIndex   36
_coords{1}.part         43°57'N 67°45W
_coords{1}.startIndex   51

I've attached a sample workspace (2019.2): regexwithsubstring.fmw
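For readers who want to try the pattern outside FME, here is a minimal sketch in Python. It assumes Python's `re` module treats this particular pattern the same way the StringSearcher does, which is not guaranteed for every regex. The function name `find_coords` and the sample text are hypothetical; the `part` and `startIndex` keys only mimic the list attributes shown above, and the Python offsets are 0-based, so they need not equal the values FME reports.

```python
import re

# The regex from the answer: degrees, minutes with a trailing ' and N/S,
# a space, then degrees, minutes and E/W (no ' before E/W, as in the sample).
COORD_RE = re.compile(r"([0-9]+°[0-9]+'[NS] [0-9]+°[0-9]+[EW])")


def find_coords(text):
    """Return one dict per match, holding the matched text and its offset."""
    return [
        {"part": m.group(1), "startIndex": m.start(1)}
        for m in COORD_RE.finditer(text)
    ]


# Hypothetical multi-line attribute value.
sample = (
    "Buoy marking shoal\n"
    "44°17'N 67°41W\n"
    "43°57'N 67°45W\n"
    "Range line at 186°\n"
)

for item in find_coords(sample):
    print(item["part"], item["startIndex"])
```

Because the pattern spells out the digits and hemisphere letters explicitly, a line such as "Range line at 186°" is not matched even though it contains a '°'.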


2 replies


arnold_bijlsma
Enthusiast
  • Enthusiast
  • 126 replies
  • December 20, 2019

Let's take a step back: What are you actually trying to do? Could you not read in the file as a text file, throw away the lines you don't want, and then use an AttributeSplitter transformer to get the coordinates into a list?
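The line-by-line idea can be sketched in Python as a stand-in for the FME workflow (this is not the AttributeSplitter itself). The helper name `coordinate_lines` and the sample text are hypothetical, and the line test reuses the coordinate pattern from the best answer as one possible way to decide which lines to throw away.

```python
import re

# A line is kept only if the whole line is a coordinate pair,
# e.g. 44°17'N 67°41W (pattern borrowed from the best answer).
COORD_LINE = re.compile(r"[0-9]+°[0-9]+'[NS] [0-9]+°[0-9]+[EW]")


def coordinate_lines(text):
    """Split text into lines and keep only the coordinate lines."""
    kept = []
    for line in text.splitlines():
        candidate = line.strip()
        if COORD_LINE.fullmatch(candidate):
            kept.append(candidate)
    return kept


# Hypothetical file contents.
sample_file = (
    "Buoy marking shoal\n"
    "44°17'N 67°41W\n"
    "43°57'N 67°45W\n"
    "Range line at 186°\n"
)

print(coordinate_lines(sample_file))
```

Splitting first and testing each line separately sidesteps the question of how the regex engine handles newlines.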

http://docs.safe.com/fme/2019.0/html/FME_Desktop_Documentation/FME_Transformers/Transformers/attributesplitter.htm