Solved

How to type STX control character to use as delimiter


aaron

I have a CSV file where one of the fields contains XY coordinate pairs. I want to use the AttributeSplitter to separate them so they can be mapped. Easy enough, no? Well, not so much if the coordinates are separated by an STX control character (ASCII 2). I've tried several different ways of entering it in the Delimiter or Format String parameter of the AttributeSplitter, but no luck. Does anyone know how I could type it?

Thanks,

Aaron Allen


7 replies

erik_jan
  • March 23, 2016

If the X and Y values have a fixed length you could use the SubstringExtractor to get first the X and then the Y values in two steps.


aaron
  • Author
  • March 23, 2016

That's a good idea @erik_jan but unfortunately the values are not a fixed length.


erik_jan
  • March 23, 2016

Or you can use the StringReplacer with the regular expression \D (= anything but a digit) and replace with ";". Then you could use the AttributeSplitter on the ";". This will not work if decimals are involved, because the decimal point is also a non-digit and would be replaced too.


erik_jan
  • March 23, 2016
If you use the regex [^\d.] instead, all characters except digits and points (decimals) will be replaced.
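To see the difference between the two expressions outside FME, here is a minimal Python sketch. The sample string is made up for illustration, and Python's `re` module stands in for the StringReplacer, so treat it as a way to check the regex logic rather than as a description of how the transformer behaves.

```python
import re

STX = "\x02"  # the STX control character (ASCII 2)
sample = f"12.5{STX}34.75"  # hypothetical coordinate pair, not real data

# \D replaces every non-digit, so the decimal points are lost as well
digits_only = re.sub(r"\D", ";", sample).split(";")

# [^\d.] keeps digits and decimal points and replaces everything else
with_decimals = re.sub(r"[^\d.]", ";", sample).split(";")

print(digits_only)    # ['12', '5', '34', '75']
print(with_decimals)  # ['12.5', '34.75']
```

Neither expression keeps a leading minus sign, so negative coordinates would need the sign added to the character class.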


aaron
  • Author
  • Best Answer
  • March 23, 2016

Using the StringReplacer worked @erik_jan, although I had to replace anything that can't be part of a positive or negative number with any number of decimal digits, so the regexp I used is [^-0-9.]

Note the above works for my data but it's not perfect in all cases: it keeps every "-" and "." it finds, even ones that aren't part of a valid number. You can see what I mean if you try the expression at http://www.regexr.com/
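The caveat can be reproduced with a short Python sketch. The helper name `split_coords` and both input strings are hypothetical, and Python's `re` is used here only to exercise the expression; it is not a claim about the StringReplacer's regex engine.

```python
import re

# Anything that is not a minus sign, an ASCII digit, or a period
NOT_NUMERIC = re.compile(r"[^-0-9.]")

def split_coords(raw, sep=";"):
    """Replace every non-numeric character with sep, then split on sep."""
    return NOT_NUMERIC.sub(sep, raw).split(sep)

# Well-formed input: the STX separator is the only character replaced
good = split_coords("-123.45\x0267.8")
print(good)  # ['-123.45', '67.8']

# Malformed input: stray '-' and '.' characters survive untouched
odd = split_coords("1-2\x023.4.5")
print(odd)   # ['1-2', '3.4.5']
```

The second call shows why the expression is only safe when the data is known to contain clean numbers: it filters characters, it does not validate numbers.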


mark2atsafe
  • March 24, 2016

As I understand it (though I haven't tried it) you could type \cB, the regex escape for Ctrl+B, to identify an STX character. It may even work directly in the AttributeSplitter.

See: http://www.regular-expressions.info/nonprint.html
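For reference, STX is code point 2 (Ctrl+B), which is why \cB is one way to write it in regex flavours that support control-character escapes. The sketch below shows the same idea in Python, where the hexadecimal escape \x02 is used because, as far as I know, Python's `re` module does not accept the \cB form. The sample value is invented, and this says nothing about which escapes the AttributeSplitter itself accepts.

```python
import re

STX = chr(2)           # STX is ASCII code point 2 (Ctrl+B)
assert STX == "\x02"   # the same character written as a hex escape

raw = "-123.45" + STX + "67.8"  # hypothetical coordinate pair

# Plain string split directly on the control character
x, y = raw.split(STX)

# Equivalent split using a regular expression with a hex escape
parts = re.split(r"\x02", raw)

print(x, y)   # -123.45 67.8
print(parts)  # ['-123.45', '67.8']
```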


aaron
  • Author
  • March 24, 2016

@Mark2AtSafe I tried \cB and several others but no joy. I even tried copying and pasting the STX character from Notepad++. (I can do a find and replace on it in Notepad++, BTW.) The StringReplacer will work in this case so I'll go with that. Both [^-0-9.] and [^-\d.] as @erik_jan offered work.

