Question

Detect and remove extra spaces with other characters

December 18, 2018

bokaj

I am attempting to remove extra spaces and other characters from attribute values. I found a small script that does the job, but it is in Python.

For Example:

point_NAME (Before Processing)

A B C Bank , DreamTown

point_NAME (After Processing)

ABC Bank, DreamTown

I can detect the points that have these extra spaces, and I am trying to pass them on to a PythonCaller transformer.

I want to use this syntax in the PythonCaller. Can someone help me out?

import re
string4 = "POI_Name"
print(re.sub(' +', ' ', string4))

Thanks for all the help in advance!! :)

@takashi @david_r @Mark2AtSafe @MarkAtSafe
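
For reference, here is a minimal sketch of what the pattern in the question does when run outside FME. Note that the snippet in the question applies the pattern to the literal text "POI_Name" rather than to an attribute value. The sketch assumes the sample value from the post as a plain Python string, and the variable names are illustrative only.

```python
import re

# Sample value from the question, used here as a plain string (assumption:
# in FME the value would come from the feature's attribute, not a literal).
before = "A B C Bank , DreamTown"

# ' +' matches each run of one or more spaces and replaces it with a single
# space, so a value that only contains single spaces comes back unchanged.
collapsed = re.sub(' +', ' ', before)
print(collapsed == before)  # True

# The pattern only has a visible effect where spaces are repeated.
squeezed = re.sub(' +', ' ', "A  B   C")
print(squeezed)  # A B C
```

So the pattern collapses repeated spaces, but it cannot join "A B C" into "ABC" or remove the space before the comma.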

ebygomm
Influencer
December 18, 2018

A StringReplacer using regex is probably a more straightforward way to do the replacement, although I don't think your regex does what you want it to.

If you want to remove the spaces between the single letters but not between the words, you will need something like:

((?<=[A-Z])\s(?=[A-Z]\b))|(\s(?=\,))

This removes spaces that are preceded by a capital letter and followed by a single capital letter, as well as spaces that come directly before a comma. It works for your single example, but depending on other values in your data you may need something slightly different.

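If you really wanted to use a PythonCaller (*note the raw string, so the regex itself stays the same):

import fme
import fmeobjects
import re

def processFeature(feature):
    # Read the original name from the feature
    string4 = feature.getAttribute('POINT_NAME')
    # Raw string, so the backslashes reach the regex engine unchanged
    newpointname = re.sub(r'((?<=[A-Z])\s(?=[A-Z]\b))|(\s(?=\,))', '', string4)
    # Store the cleaned name in a new attribute
    feature.setAttribute('NEW_POINT_NAME', newpointname)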
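
The pattern can be checked outside FME before wiring it into a transformer. The sketch below runs it on the sample value from the question and on one made-up value that shows where it may go too far; `clean` and the "PLAN B Cafe" value are hypothetical and not from the original data.

```python
import re

# Same pattern as in the reply, compiled from a raw string.
PATTERN = re.compile(r'((?<=[A-Z])\s(?=[A-Z]\b))|(\s(?=\,))')

def clean(value):
    """Apply the pattern from the reply and return the cleaned text."""
    return PATTERN.sub('', value)

# Sample value from the question.
print(clean("A B C Bank , DreamTown"))  # ABC Bank, DreamTown

# The lookbehind checks only one character, so a word in capitals followed
# by a single capital letter is joined too (hypothetical value).
print(clean("PLAN B Cafe"))  # PLANB Cafe
```

The second call illustrates why other values in the data may need a slightly different pattern.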

bokaj
Author
December 19, 2018

Thanks @egomm. I will test this.

Considering the different cases, they could be as below.

Test case                  Expected output
text Space Comma           text comma
text Space comma Space     text comma space
text Comma Space Comma     text comma space
text Double comma          text comma
text comma dot text        text comma text
text hyphen comma          text comma

Thanks again!

Cheers,

BokaJ
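
The cases listed above could be handled with a few chained substitutions. This is only a sketch: the literal inputs are one reading of the descriptions ("text Space Comma" taken as "text ,", and so on), and `normalize_commas` is a hypothetical helper, not something from the thread.

```python
import re

def normalize_commas(value):
    """Tidy punctuation around commas (one interpretation of the cases)."""
    # A comma followed by optional spaces and more commas: keep the first
    # comma and the spaces, drop the extra commas.
    value = re.sub(r',(\s*),+', r',\1', value)
    # Spaces or hyphens directly before a comma are removed.
    value = re.sub(r'[\s\-]+(?=,)', '', value)
    # A dot directly after a comma is removed.
    value = re.sub(r',\.', ',', value)
    return value

# Assumed literal forms of the described test cases.
cases = {
    "text ,": "text,",
    "text , ": "text, ",
    "text, ,": "text, ",
    "text,,": "text,",
    "text,.text": "text,text",
    "text-,": "text,",
}

for raw, expected in cases.items():
    print(repr(raw), "->", repr(normalize_commas(raw)), "expected", repr(expected))
```

If the real data uses different combinations, the order of the substitutions may need to change.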

bokaj
Author
December 19, 2018

Hi @egomm,

I tried the PythonCaller statements and they gave me this error:

"Python Exception <TypeError>: expected string or buffer"

I edited the code to pass a string instead of a list object, and now it works fine.

import fme
import fmeobjects
import re

def processFeature(feature):
    string4 = feature.getAttribute('POINT_NAME')
    # re.sub raises the TypeError above for a missing (None) or list value,
    # so leave those features untouched.
    if string4 is None or isinstance(string4, list):
        return
    newpointname = re.sub(r'((?<=[A-Z])\s(?=[A-Z]\b))|(\s(?=\,))', '', string4)
    feature.setAttribute('Corrected_POINT_NAME', newpointname)

Thanks again for the foundation!

Cheers,

BokaJ
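
The TypeError can be reproduced and tested outside FME with a small stand-in for the feature object. In this sketch `FakeFeature` and `process_feature` are hypothetical names, the stand-in assumes that a missing attribute comes back as None, and the type check assumes Python 3 strings.

```python
import re

PATTERN = r'((?<=[A-Z])\s(?=[A-Z]\b))|(\s(?=\,))'

class FakeFeature:
    """Hypothetical stand-in for an FME feature, for testing outside FME."""

    def __init__(self, attributes):
        self._attributes = dict(attributes)

    def getAttribute(self, name):
        # Assumption: a missing attribute is returned as None.
        return self._attributes.get(name)

    def setAttribute(self, name, value):
        self._attributes[name] = value

def process_feature(feature):
    value = feature.getAttribute('POINT_NAME')
    # Skip anything that is not text (None, a list, a number).
    if not isinstance(value, str):
        return
    feature.setAttribute('Corrected_POINT_NAME', re.sub(PATTERN, '', value))

# Passing a non-string straight to re.sub raises a TypeError.
try:
    re.sub(PATTERN, '', FakeFeature({}).getAttribute('POINT_NAME'))
except TypeError as exc:
    print("TypeError:", exc)

good = FakeFeature({'POINT_NAME': 'A B C Bank , DreamTown'})
process_feature(good)
print(good.getAttribute('Corrected_POINT_NAME'))  # ABC Bank, DreamTown
```

Features without a usable POINT_NAME pass through unchanged instead of stopping the translation.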
