Solved

How do I remove a non-breaking space from an attribute?

  • December 16, 2015
  • 2 replies
  • 443 views


Morning, I've had some feedback that an XML file produced using FME (the file can be downloaded here) has caused a problem while being parsed by lxml. The error returned is "Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration." I've been told that this refers to a non-breaking space character.

The XML file is a list of recent planning applications, and for planning case 15/0779 the <casetext> element seems to contain the offending character after the full stop at the end of the description.

I've tried the AttributeTrimmer to remove it, but with no luck. Has anyone come across this before and worked out how to remove these characters?

Thanks

Best answer by james_rutter

Actually, I worked out my own answer to this one.

Using a StringReplacer, I did a regex search for the Unicode escape \u00A0, which is the code for a no-break space character. I left the replacement text field blank, so the character is simply removed rather than replaced with anything.
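
For anyone doing the same clean-up outside FME, here is a minimal Python sketch of the equivalent replacement. It is only an illustration of the idea, not what the StringReplacer does internally; the function name strip_nbsp and the sample casetext value are made up for the example.

```python
import re

# U+00A0 NO-BREAK SPACE, the character targeted by the regex above
NBSP = "\u00A0"


def strip_nbsp(text: str) -> str:
    """Remove every no-break space, like a replace with an empty value."""
    return re.sub(NBSP, "", text)


# Hypothetical sample value: a description ending in a full stop
# followed by a stray no-break space.
casetext = "Erection of a two storey extension.\u00A0"
cleaned = strip_nbsp(casetext)
print(repr(cleaned))
```

If the space should be kept as an ordinary space instead of being dropped, replacing with " " rather than "" would be the obvious variation.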


2 replies

takashi
Celebrity
  • December 16, 2015

Hi, I think this link is helpful >> Python unicode strings
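
The error text quoted in the question asks for bytes input, so on the parsing side it may also be worth checking whether the XML reaches the parser as an already-decoded string or as raw bytes. The sketch below uses the standard library's xml.etree.ElementTree so that it runs without extra packages; lxml.etree.fromstring also accepts bytes in the same way. The <case> wrapper element and the sample text are assumptions made for the example, not taken from the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a decoded (str) XML text that still carries an
# encoding declaration, with a no-break space after the full stop.
xml_text = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<case><casetext>Example.\u00A0</casetext></case>"
)

# Encode back to bytes so the declaration and the actual data agree,
# then hand the bytes to the parser.
xml_bytes = xml_text.encode("utf-8")
root = ET.fromstring(xml_bytes)

# The no-break space is ordinary character data and survives parsing.
casetext = root.findtext("casetext")
print(repr(casetext))
```

When reading from disk, opening the file in binary mode ("rb") gives bytes directly and avoids the round trip.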