Question

Removing blank spaces

  • 23 September 2016
  • 12 replies
  • 300 views

Badge

I have this simple problem that I just cant find a solution to. I have a xls-file with x and y coordinates. Both include some blank spaces in the beginning and in the end. It seems incredibly simple to find the blank spaces and replace them with nothing, but it doesnt work. First I tried the find and replace -function in excel. Didnt work. Then I tried the stringreplacer in FME: replace blank space with nothing. I also tried the regular expression [ ]+ but the blank spaces remained. Same result with the attributetrimmer.

Could anyone please explain to me why I cant get rid of the blank spaces?

Thanks...

//Tobias

 


12 replies

Userlevel 1
Badge +21

An attributetrimmer should work here.

If you wanted to use an string searcher with a regular expression \\s should work

It might be that your fields actually contain non-printing characters rather than spaces, in which case it might be easier to pick out the digits with a string searcher using regex such as [0-9\\.]*

Badge +16

Hi @tobiasp Just an idea, replace missing, empty and null with -9999 (NullAttributeMapper) and replace -9999 with nothing (StringReplacer)

You can actually do all this and what @egomm suggests about regular expressions in the NullAttributeMapper, the only down side is that you cannot map it to nothing

Userlevel 2
Badge +17

Hi @tobiasp, sounds strange. The AttributeTrimmer should work to trim both leading and trailing whitespaces.

Could you please post an Excel file containing some rows that reproduces the same situation?

Badge

Hey!

Thanks for your efforts!

 

I managed to get around the problem using built-in excel-functions, but it still annoys me that I cannot get FME to do it for me... :)

Here is the excel-file I am trying to clean from whitespaces.

 

example.xlsx

 

//Tobias

 

Badge

@zzupljanin is right. The invisible character was the "non-breaking space". The AttributeTrimer also works if you copy and paste the character into the "Trim Characters" parameter.

Hey!

Thanks for your efforts!

 

I managed to get around the problem using built-in excel-functions, but it still annoys me that I cannot get FME to do it for me... :)

Here is the excel-file I am trying to clean from whitespaces.

 

example.xlsx

 

//Tobias

 

Badge +1

This might be stupid question. But here it goes. Is it possible that it's not actual space sign but some other unicode whitespace character?

Badge +1

This might be stupid question. But here it goes. Is it possible that it's not actual space sign but some other unicode whitespace character?

Yes, that is the problem. If you type space key on keyboard you get U+0020 unicode sign (regular space). What you have in your Excell file is No Break-Space (unicode sign U+00A0). Just copy and place that sign into StringReplacer

 

Userlevel 1
Badge +21

Replace everything that's not a number or a comma with nothing, e.g.

Badge

Hi @tobiasp, sounds strange. The AttributeTrimmer should work to trim both leading and trailing whitespaces.

Could you please post an Excel file containing some rows that reproduces the same situation?

Beautiful!

There are apparently several kinds of blank spaces... :)

 

Again: thanks to you all and have a nice weekend!

//Tobias

Userlevel 2
Badge +17

@zzupljanin is right. The invisible character was the "non-breaking space". The AttributeTrimer also works if you copy and paste the character into the "Trim Characters" parameter.

Userlevel 2
Badge +17

I think that copy and paste is the quickest way, but found other interesting ways. For what it's worth...

1.
The character code in hexadecimal of non-breaking space is C2A0 in the UTF-8 encoding. This workflow trims both leading and trailing non-breaking spaces.

0684Q00000ArJoEQAV.png

2. Python "str.strip()" method is available to trim white-space characters including the non-breaking space. A PythonCaller with this script does the trick.
def trimSpaces(feature):
    x = feature.getAttribute('x')
    y = feature.getAttribute('y')
    feature.setAttribute('x', x.strip())
    feature.setAttribute('y', y.strip())
Badge +7

I've just finally resolved a similar problem. In my case I was trying to figure out why the concatenation of two attributes and adding some text wasn't working (@Value(REF) @Value(COMMENTS).jpg).

Figuring it out was made harder by the fact that when I added an Inspector, this showed the result I was expecting, but what was written to the FGDB layer was missing the .jpg bit at the end of the value. No warnings or errors were recorded in the Translation Log.

It turned out that there is a null character (Char 0 - you learn something new every day!) at the end of the value in one of the attributes in the source CSV file. This doesn't show up in Excel and in TextPad appears as a space. I only started to figure it out when I tried to copy a row from the CSV to make a test file and TextPad said "Cannot cut, copy, or drag and drop text containing null (code = 0) characters"

This topic was useful because it put me on the track of looking for funny character, and using StringReplacer and Regular Expressions to fix it. A quick Google gave me the Regular Expression (\\0 see https://stackoverflow.com/questions/7578632/what-is-the-difference-between-char0-and-0-in-c) which I replace with nothing. I'm still struggling with the metaphysical aspect of replacing null with nothing but my translation now works! :-)

AttributeTrimmer doesn't seem to accept Regular Expressions. Is there another transformer that will do this? If not I'll submit an idea for AttributeTrimmer to allow Regular Expressions.

Reply