Question

filter alphabetic characters

  • 22 March 2013
  • 8 replies
  • 114 views

Hello,

 

I've got an Excel file with two attributes (a serial ID and a text field) that should be joined to an FGDB feature class by sid. Up to here, no problem.

 

But the text field contains parcel numbers as well as owners. My question: is there a way to filter the alphabetic characters so that only the parcel owners remain in this field?

 

 

best regards
Hi,

 

 

take a look at the AttributeClassifier.

 

 

David
I tried the AttributeClassifier before asking here, but nothing passed when filtering with "alphabetic" as the Classification To Test.

 

 

The text field looks something like this:

 

099933-000001,135,TEXTABC,0/0,0002.00.00.00.00,0/0,,,TEXTABC,,,,,,,,

 


Hi,

 

A StringSearcher with the regular expression ([a-zA-Z]) will match alphabetic characters only.

 

hope this helps
Hi again,

 

 

that looks like a comma-separated text string. If that is the case, you can use an AttributeSplitter on the comma to get the individual values into a list.

 

 

David
Just to elaborate on Itay's answer: you will have to use

 

 

(\w*)

 

for your regexp if your data contains characters beyond the English alphabet, e.g. special characters and accented characters.
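To illustrate the difference between the two patterns, here is a minimal sketch using Python's `re` module rather than the StringSearcher itself. The sample value `Müller_2` is made up for illustration, and FME's regex engine may not behave identically to Python's, so treat this only as a rough guide. Note that `\w` also matches digits and the underscore, not just letters.

```python
import re

# ASCII letters only, as in the earlier suggestion.
ascii_letters = re.compile(r"[a-zA-Z]+")
# "Word" characters: letters (including accented ones in Python 3),
# digits and the underscore.
word_chars = re.compile(r"\w+")

# Hypothetical sample value, not taken from the original data.
sample = "Müller_2,TEXTABC"

print(ascii_letters.findall(sample))  # the accented name is split at "ü"
print(word_chars.findall(sample))     # the accented name stays in one piece
```

If the owner names can contain digits or underscores that should be excluded, `\w` is probably too broad and a more specific character class would be needed.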

 

 

David
Thanks up to here!

 

 

The StringSearcher seems to be the right tool for me, because the text field sometimes has a different syntax, like this:

 

 

099933-000001,135,TEXTABC,0/0,0002.00.00.00.00,0/0,,,TEXTDEF,,,,,,,,

 

077733-000001,9999,TEXTGHI,0/0,0009.00.00.00.00,0/0,,,TEXTJKL,,,,,,,,

 

 

---

 

 

But when I use the StringSearcher with ([a-zA-Z])(\w*) I get:

 

TEXTABC

 

 

How can I get something like: TEXTABC TEXTDEF ... ?
Hi,

 

 

it seems like your text values always appear as items 2 and 8 in your comma-separated text string.

 

 

First use an AttributeSplitter with a comma as the delimiter. Then use a ListIndexer on items 2 and 8 to get them into regular attributes.
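The split-and-index idea can be sketched in plain Python as follows. This is not FME code: the function name `pick_items` is made up for illustration, and the positions 2 and 8 (counted from zero) are assumed from the sample lines posted in this thread.

```python
def pick_items(text, indexes=(2, 8), delimiter=","):
    """Split text on the delimiter and return the elements at the given
    zero-based positions, skipping positions beyond the end of the list."""
    parts = text.split(delimiter)
    return [parts[i] for i in indexes if i < len(parts)]


# Sample line taken from the thread.
line = "099933-000001,135,TEXTABC,0/0,0002.00.00.00.00,0/0,,,TEXTDEF,,,,,,,,"
owners = pick_items(line)
print(" ".join(owners))  # prints "TEXTABC TEXTDEF"
```

This only works as long as the owner values really do sit at fixed positions in every record.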

 

 

This works fine here, so there is no need for the StringSearcher, which will just lump everything into one big string. (Btw, do not use both "([a-zA-Z])" and "(\w*)". Use either one or the other.)

 

 

David
Hi,

 

 

If the alphabetic string elements always appear in the middle of a text line, exactly twice, the following regular expression might also work:

^.*,([a-zA-Z]+).*,([a-zA-Z]+).*$

If not, the approach is a little more complicated. For example:

1) Give a unique ID to each feature with a Counter.
2) Split the text line at the commas with an AttributeSplitter; all split elements are saved in a list attribute.
3) Create copies of the feature with a ListExploder; each created feature retains one element of the list as an attribute.
4) Use a StringSearcher to keep only the features whose element is alphabetic.
5) Aggregate the filtered features with an AttributeAccumulator (group by ID); the alphabetic elements are saved in a list.
6) Concatenate the elements in the list with a ListConcatenator.
7) Give the concatenated string back to the original feature with a FeatureMerger, using the ID as the join attribute.
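Both approaches can be sketched in plain Python to show what they are expected to return on the sample lines from this thread. This is only an illustration, not FME code: the function names are made up, Python's `re` module stands in for the StringSearcher, and a simple list filter stands in for the explode, filter, aggregate and merge steps.

```python
import re

# The two-group pattern: each group is a run of ASCII letters
# that directly follows a comma.
TWO_ALPHA = re.compile(r"^.*,([a-zA-Z]+).*,([a-zA-Z]+).*$")
# Used with fullmatch() to test whether a whole element is alphabetic.
ALPHA_ONLY = re.compile(r"[a-zA-Z]+")


def two_groups(text):
    """Return the two captured alphabetic elements, or [] if no match."""
    m = TWO_ALPHA.match(text)
    return list(m.groups()) if m else []


def alphabetic_elements(text, delimiter=",", joiner=" "):
    """General approach: split, keep purely alphabetic elements,
    then concatenate them. Works for any number of such elements."""
    parts = text.split(delimiter)
    kept = [p for p in parts if ALPHA_ONLY.fullmatch(p)]
    return joiner.join(kept)


lines = [
    "099933-000001,135,TEXTABC,0/0,0002.00.00.00.00,0/0,,,TEXTDEF,,,,,,,,",
    "077733-000001,9999,TEXTGHI,0/0,0009.00.00.00.00,0/0,,,TEXTJKL,,,,,,,,",
]
for line in lines:
    print(two_groups(line), "|", alphabetic_elements(line))
```

The second function does not depend on the number or position of the alphabetic elements, which is the point of the longer workflow.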

 

Takashi
