Solved

How to type STX control character to use as delimiter

  • March 23, 2016
  • 7 replies
  • 212 views

aaron
Contributor

I have a CSV file where one of the fields contains XY coordinate pairs. I want to use the AttributeSplitter to separate them so they can be mapped. Easy enough, no? Well, not so much if the coordinates are separated by an STX control character. I've tried several different ways in the Delimiter or Format String parameter of the AttributeSplitter but no luck. Does anyone know how I could?

Thanks,

Aaron Allen



erik_jan
Contributor
  • Contributor
  • March 23, 2016

If the X and Y values have a fixed length you could use the SubstringExtractor to get first the X and then the Y values in two steps.


aaron
Contributor
  • Author
  • Contributor
  • March 23, 2016

That's a good idea @erik_jan but unfortunately the values are not a fixed length.


erik_jan
Contributor
  • Contributor
  • March 23, 2016

Or you can use the StringReplacer with the regular expression \D (anything but a digit), replacing each match with ";". Then you can use the AttributeSplitter on the ";". This will not work if decimals are involved.

If you use the regex [^\d.] instead, every character except digits and points (decimal separators) will be replaced.
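To see the difference between the two expressions outside FME, here is a minimal Python sketch. The sample value and variable names are made up for illustration, and Python's `re.sub` only stands in for what the StringReplacer does in regular-expression mode.

```python
import re

STX = "\x02"  # ASCII "Start of Text" control character (code point 2)

# Hypothetical field value: one X/Y pair separated by STX.
sample = "12.5" + STX + "48.25"

# \D replaces every non-digit, so the decimal points are lost as well.
digits_only = re.sub(r"\D", ";", sample)

# [^\d.] keeps digits and points, so only the STX is replaced.
digits_and_points = re.sub(r"[^\d.]", ";", sample)

print(digits_only)        # 12;5;48;25
print(digits_and_points)  # 12.5;48.25
```

Neither pattern keeps a leading minus sign, so negative coordinates would still lose their sign.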


aaron
Contributor
  • Author
  • Contributor
  • Best Answer
  • March 23, 2016

Using the StringReplacer worked, @erik_jan, although I had to search for anything other than a positive or negative number with any number of decimal digits, so the regexp I used is [^-0-9.]

Note the above works for my data but it's not perfect in all cases. You can see what I mean if you use the expression at http://www.regexr.com/
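As a rough illustration of this replace-then-split approach, the Python sketch below mimics the StringReplacer followed by the AttributeSplitter. The function name, the ";" placeholder and the sample values are assumptions, not part of the original workspace. The last example is only a guess at the kind of input the pattern handles badly: a value in exponent notation gets split at the "e".

```python
import re

STX = "\x02"  # ASCII "Start of Text" control character


def split_coordinates(raw, placeholder=";"):
    """Replace everything except '-', digits and '.' and then split.

    Hypothetical helper: re.sub plays the role of the StringReplacer,
    str.split the role of the AttributeSplitter.
    """
    cleaned = re.sub(r"[^-0-9.]", placeholder, raw)
    return cleaned.split(placeholder)


# Made-up sample: two XY pairs, every value separated by STX.
sample = STX.join(["-105.25", "39.75", "-104.5", "40.0"])
parts = split_coordinates(sample)
print(parts)  # ['-105.25', '39.75', '-104.5', '40.0']

# Assumed problem case: the 'e' of an exponent is treated as a separator.
odd = split_coordinates("1.5e-3" + STX + "2.0")
print(odd)  # ['1.5', '-3', '2.0']
```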


mark2atsafe
Safer
  • Safer
  • March 24, 2016

As I understand it (though I haven't tried it), you could type \cB to identify an STX character. It may even work directly in the AttributeSplitter.

See: http://www.regular-expressions.info/nonprint.html


aaron
Contributor
  • Author
  • Contributor
  • March 24, 2016

@Mark2AtSafe I tried \cB and several others but no joy. I even tried copying and pasting the STX character from Notepad++. (I can do a find and replace in Notepad++, BTW.) The StringReplacer will work in this case, so I'll go with that. Both [^-0-9.] and [^-\d.], as @erik_jan offered, work.
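If pre-processing the file outside FME is an option (in the spirit of the Notepad++ find and replace), the STX character can be addressed by its code point instead of being typed. The Python sketch below is only an illustration: the function names and the ";" delimiter are assumptions. Python's `re` module does not support the \cB notation, so the equivalent escape \x02 is used.

```python
import re

STX = chr(2)  # same character as "\x02", ASCII "Start of Text"


def replace_stx(text, delimiter=";"):
    """Swap every STX for a delimiter that is easy to type later on."""
    return text.replace(STX, delimiter)


def split_on_stx(value):
    """Split directly on STX, written as the regex escape \\x02."""
    return re.split(r"\x02", value)


# Made-up field value with one STX-separated pair.
field = "10.5" + STX + "20.25"
print(replace_stx(field))   # 10.5;20.25
print(split_on_stx(field))  # ['10.5', '20.25']
```

The rewritten text could then be split on ";" in the AttributeSplitter without any control characters involved.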