Solved

Regex to match lines with coordinates

  • 20 December 2019

From the attribute above containing newlines, I'm trying to extract each line of coordinates into a list using a StringSearcher with regex: \\n.+°.+°.+\\n

I don't want lines of text to be included in the list that are not coordinates, so the first line starting with "Buoy" would not be captured. It should also be noted that the lines that are not coordinates may contain a '°' symbol, such as "Range line at 186°". Because of this, my methodology consisted of matching any set of characters containing 2 '°' symbols that fall between two newlines.

My example regex seems to be close, but it is also capturing the first 2 numbers of the next string. How would I need to format my regex to avoid this?
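For illustration, the "exactly two '°' symbols on one line" idea can be sketched in Python's `re` module with line anchors instead of literal newlines, so the match never consumes the line breaks around it. This is only a sketch under assumptions: the sample text is made up (the original attribute value is not shown in the thread), and the StringSearcher's regex engine may behave differently from Python's.

```python
import re

# Hypothetical sample text standing in for the attribute value.
text = "Buoy moved\nRange line at 186°\n44°17'N 67°41W\n43°57'N 67°45W\n"

# ^ and $ match at line boundaries in MULTILINE mode without consuming
# the newline itself. [^\n°]* keeps each match on one line and forces
# the line to contain exactly two degree symbols.
TWO_DEGREES = re.compile(r"^[^\n°]*°[^\n°]*°[^\n°]*$", re.MULTILINE)

coordinate_lines = TWO_DEGREES.findall(text)

for line in coordinate_lines:
    print(line)
```

Because the newlines are not part of the match, two coordinate lines in a row can both be matched, and a line such as "Range line at 186°" is skipped because it has only one '°'.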

Best answer by markatsafe 20 December 2019, 17:05

2 replies

@cartoscro In the StringSearcher you can use a regex something like:

([0-9]+°[0-9]+'[NS] [0-9]+°[0-9]+[EW])

The parentheses () designate a substring (a capture group). Under the Advanced panel you can create a list of all the substrings that are matched.

The result will look something like:

_coords{0}.part 44°17'N 67°41W
_coords{0}.startIndex 36
_coords{1}.part 43°57'N 67°45W
_coords{1}.startIndex 51
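To try the same pattern outside FME, here is a minimal Python sketch that builds a comparable list of matched parts and start indices. The sample text and the dictionary keys are assumptions chosen to mirror the list attributes above, so the indices will not be the same as in the real data.

```python
import re

# Hypothetical sample text; the real attribute value is not shown in the thread.
text = "Buoy moved\nRange line at 186°\n44°17'N 67°41W\n43°57'N 67°45W\n"

# Same pattern as in the reply: degrees, minutes, hemisphere for latitude,
# then degrees, minutes, hemisphere for longitude.
COORD = re.compile(r"([0-9]+°[0-9]+'[NS] [0-9]+°[0-9]+[EW])")

# One entry per match, holding the captured substring and where it starts.
coords = [
    {"part": m.group(1), "startIndex": m.start(1)}
    for m in COORD.finditer(text)
]

for i, entry in enumerate(coords):
    print(f"_coords{{{i}}}.part {entry['part']}")
    print(f"_coords{{{i}}}.startIndex {entry['startIndex']}")
```

Since the pattern describes the coordinate itself rather than the surrounding newlines, text lines that merely contain a '°' are not picked up.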

I've attached a sample workspace (2019.2): regexwithsubstring.fmw

Let's take a step back: What are you actually trying to do? Could you not read in the file as a text file, throw away the lines you don't want, and then use an AttributeSplitter transformer to get the coordinates into a list?

http://docs.safe.com/fme/2019.0/html/FME_Desktop_Documentation/FME_Transformers/Transformers/attributesplitter.htm
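As a rough illustration of that line-by-line approach, here is a Python sketch rather than an FME workspace. The helper names `coordinate_lines` and `split_lat_lon` are hypothetical, the sample text is invented, and the line filter reuses the coordinate pattern from the other reply as one possible way to decide which lines to throw away.

```python
import re

# Coordinate pattern borrowed from the other reply in this thread.
COORD_LINE = re.compile(r"[0-9]+°[0-9]+'[NS] [0-9]+°[0-9]+[EW]")


def coordinate_lines(text):
    """Keep only the lines that consist entirely of a coordinate pair."""
    stripped = (line.strip() for line in text.splitlines())
    return [line for line in stripped if COORD_LINE.fullmatch(line)]


def split_lat_lon(line):
    """Split one coordinate line on the space, like splitting an attribute."""
    lat, lon = line.split(" ")
    return lat, lon


# Hypothetical sample text standing in for the file contents.
sample = "Buoy moved\nRange line at 186°\n44°17'N 67°41W\n43°57'N 67°45W\n"

pairs = [split_lat_lon(line) for line in coordinate_lines(sample)]
print(pairs)
```

Filtering whole lines first keeps the regex simple, and the split step then only ever sees lines already known to be coordinates.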
