We have a workbench with close to 50 transformers that does a .dgn to GIS ETL.  The ETL creates building room polygons in GIS with a unique id for each room.  Part of the workbench extracts text (the unique id) from a dgn level.  Some of the text on this level is preceded by a room type (e.g. stairs_202M).  My question is: which transformer should I use to select a subset of text like this?  For the example above, I would search for text that contains "stairs", and then which transformer would I use to extract just the "202M" and leave out the "stairs_"?  Is there a transformer that would either replace "stairs_" with a blank or omit the first 7 characters (stairs_)?

 

 

Thank you
A StringReplacer or StringSearcher with a published parameter might help you.
Hi,

 

 

I'd use StringReplacer with a regular expression in this case, e.g.:
Text to Find: ^stairs_(.+)$
Replacement Text: \1
Use Regular Expressions: yes
If there are other prefix types and you want to delete all of them, the following expression could be effective:
^[^_]+_(.+)$
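For anyone who wants to check these two expressions outside FME, here is a minimal Python sketch. It uses Python's re module only to illustrate the patterns; it is not how StringReplacer is implemented, and FME's regex engine may differ in details. The function name strip_prefix and the sample values other than stairs_202M are made up for illustration.

```python
import re

# Hypothetical helper: mimics "Text to Find" / "Replacement Text: \1"
# with Python's re module (an assumption, not FME's own engine).
def strip_prefix(text, pattern=r"^[^_]+_(.+)$"):
    # If the pattern does not match, re.sub returns the text unchanged.
    return re.sub(pattern, r"\1", text)

# Pattern limited to the "stairs_" prefix
print(strip_prefix("stairs_202M", r"^stairs_(.+)$"))
# Generic pattern: removes any prefix up to the first underscore
print(strip_prefix("elev_101A"))
# No underscore at all: the value passes through untouched
print(strip_prefix("202M"))
```

Text that has no matching prefix is left as it is, so room ids without a room type would not be damaged by the replacement.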

I don't know whether the text on this level has a fixed format, i.e. whether there will always be a '_' or some other delimiter character, with the part you want to extract on the right (or left) of it.  If that is the case, I think AttributeSplitter may also be a solution.
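The splitting idea can be sketched in Python as below. This is only an illustration of splitting on a delimiter, not of AttributeSplitter itself (which produces a list attribute in FME); the function name split_room_text is hypothetical.

```python
# Hypothetical helper: split a text value at the first delimiter.
def split_room_text(text, delimiter="_"):
    prefix, sep, rest = text.partition(delimiter)
    if not sep:
        # No delimiter found: treat the whole value as the room id.
        return ("", text)
    return (prefix, rest)

room_type, room_id = split_room_text("stairs_202M")
print(room_type, room_id)
```

This only works reliably if the delimiter never appears inside the room id itself.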


You can use the StringSearcher transformer to filter out strings that contain any of the following common special characters:

 

["."|"("|")"|","|"-"|" "|"&"|"/"]
When filtering out special characters with a regular expression, escape sequences might be better than double quotation marks.
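As a rough illustration of the escaping point, here is a Python sketch of a character class for the same characters without the quotation marks. Inside square brackets, the quote and pipe characters are ordinary literals, so leaving them out keeps the class limited to the intended characters. Python's re syntax is assumed here and FME's engine may treat some escapes differently; the name has_special is made up.

```python
import re

# Character class for . ( ) , - space & /
# The hyphen is escaped so it is not read as a range.
SPECIAL = re.compile(r"[.(),\- &/]")

# Hypothetical helper: True if the text contains any listed character.
def has_special(text):
    return SPECIAL.search(text) is not None

for value in ["202M", "Room 101", "A-1", "stairs_202M"]:
    print(value, has_special(value))
```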
Thank you for all the responses.  Based on your feedback I used StringSearcher to search for "stairs_" and StringReplacer to replace it with a blank.  In going through the data, I found some unique situations that I did not realize initially, so I want to slightly modify my initial post: how can I remove a set number of characters from the beginning of a string, instead of removing "stairs_" specifically?  Would I still use StringReplacer for that?  I would still use StringSearcher to find strings that contain "stairs_", and then I want to drop the first 6 or 7 characters of the string and output the rest to the GIS attribute table.

 

 

Thank you
Using a combination of a StringSearcher and a SubstringExtractor should accomplish what you need. This depends on whether you can rely on the underscore being the separator between the two halves of the string. Locate the position of the underscore using the StringSearcher, then extract from that index plus one through the last string index (use -1 to count from the end), and it should return the part you need.
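The two approaches discussed here (dropping a fixed number of leading characters versus extracting everything after the underscore) can be compared with a small Python sketch. It only illustrates the index arithmetic and is not FME code; both function names and the elev_101A sample are hypothetical.

```python
# Hypothetical helper: drop a fixed number of leading characters.
def drop_leading(text, count):
    return text[count:]

# Hypothetical helper: find the separator, then take everything
# from the position after it to the end of the string.
def text_after_separator(text, separator="_"):
    index = text.find(separator)
    if index == -1:
        # Separator missing: return the value unchanged.
        return text
    return text[index + 1:]

print(drop_leading("stairs_202M", 7))
print(text_after_separator("stairs_202M"))
# With a shorter prefix, a fixed count of 7 cuts into the room id,
# while the separator-based version still returns the full id.
print(drop_leading("elev_101A", 7))
print(text_after_separator("elev_101A"))
```

If the prefixes vary in length, locating the separator is the safer of the two, as long as every prefixed value really contains the underscore.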
Hi Steve,

 

Sorry, I'm not sure why StringReplacer with the regular expression ^stairs_(.+)$ or ^[^_]+_(.+)$ didn't work. Could you explain a little more concretely the unique situations in which StringReplacer failed? Once the real situations are clear, I may be able to provide a more suitable regular expression or another approach.
