Question

Unicode conversion error

  • 16 September 2015

I am reading from a shapefile into a Python caller and get this error message:

Attribute value of 'DESCRIPT' could not be converted to Unicode using attribute's encoding of 'utf-8'

This error occurs at this line:

feature.getAttribute("DESCRIPT").strip()

I have used the Python shapefile module in a standalone Python script to isolate the offending character. It is the hyphen in this string:

200 Block of W Georgia - 

In my standalone Python script I can fix the hyphen by mapping its character code to a plain hyphen, like this:

0x96:'-'
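
For anyone following along, the mapping idea might look roughly like the sketch below. It is not the original script: it uses Python 3 syntax, the names REPLACEMENTS and clean are made up for illustration, and it assumes the raw attribute bytes are available and that decoding them as latin-1 first is acceptable.

```python
# Only the 0x96 entry comes from the post; a real table might list more characters.
REPLACEMENTS = {0x96: "-"}


def clean(raw_bytes):
    """Replace mapped characters in a raw byte string (illustrative helper)."""
    # latin-1 maps every byte to the code point with the same number,
    # so this decode never fails and byte 0x96 becomes code point 0x96.
    text = raw_bytes.decode("latin-1")
    # str.translate accepts a dict of code point -> replacement string.
    return text.translate(REPLACEMENTS)


print(clean(b"200 Block of W Georgia \x96 "))
```

Any other byte above 0x7F is passed through as its latin-1 character, which may or may not be what you want.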

However, the exception is raised in the FME Python Caller in the line above before it gets to my conversion function.

This happens with FME 2015, not FME 2011. Is there any workaround?

Thanks

2 replies

Hi

What does the FME Logger specify as the character encoding if you output this feature before passing it on to the PythonCaller? Example:

Attribute(encoded: utf-8) : `test' has value `Français'

The solution will depend on how FME has recognized the character set of the attribute.
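
To illustrate why the recognized character set matters, here is a small standalone sketch (Python 3 syntax, outside FME). It assumes the shapefile text was really written in Windows-1252, which is only a guess based on the 0x96 byte mentioned above; try_decode is a made-up helper name.

```python
# The sample string from the question, with the problem byte written out.
raw = b"200 Block of W Georgia \x96 "


def try_decode(data, encoding):
    """Return the decoded text, or a short failure note if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as err:
        return "failed: %s" % err.reason


# 0x96 on its own is not valid UTF-8, so this attempt fails.
print(try_decode(raw, "utf-8"))
# In Windows-1252 the same byte is an en dash (U+2013), so this one succeeds.
print(repr(try_decode(raw, "cp1252")))
```

If the second call gives the text you expect, the data is probably not UTF-8 and the reader's character set is the thing to look at.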

Also remember that you can specify the character set when creating the Shape reader, and it might be worth trying to set it to UTF-8.

David

This was a good suggestion, but it didn't work for all of the strings in my input file. As well, I have DWG inputs which do not allow you to set the character encoding.

The solution that works is to use Python's encode() method, as in my code at the end of this post. The call:

encode("ascii","ignore")

...tells Python to convert the string to ASCII and to silently drop any character that cannot be encoded, instead of raising an error.
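
As a quick standalone illustration of what that call does (a sketch outside FME; the sample text assumes the problem character arrives as an en dash, U+2013):

```python
# Sample value with a non-ASCII dash, written as an escape to keep this file ASCII.
text = u"200 Block of W Georgia \u2013 "

# "ignore" drops the dash; strip() then removes the leftover trailing spaces.
cleaned = text.encode("ascii", "ignore").strip()
print(cleaned)
```

Note that the dash is removed, not replaced with a hyphen. Under Python 3 the result is a bytes object, so add .decode("ascii") if you need a string back.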

Here's the code that works for me:

def input(self, feature):
    lstAttributeName = feature.getAllAttributeNames()
    for attribute in lstAttributeName:
        try:
            # drop any character that cannot be encoded as ASCII
            value = feature.getAttribute(attribute).encode("ascii", "ignore").strip()
        except Exception:
            # value is not a string (e.g. a number), so convert it to a string
            value = str(feature.getAttribute(attribute))
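
If you want to try the same logic outside FME, something like the sketch below should do. FakeFeature and clean_attributes are hypothetical names: FakeFeature is only a stand-in that mimics the two feature methods used above, it is not part of the FME API, and LANES is a made-up attribute.

```python
class FakeFeature(object):
    """Hypothetical stand-in for an FME feature, for testing outside FME."""

    def __init__(self, attributes):
        self._attributes = dict(attributes)

    def getAllAttributeNames(self):
        return list(self._attributes)

    def getAttribute(self, name):
        return self._attributes.get(name)


def clean_attributes(feature):
    """Return a dict of attribute values with non-ASCII characters dropped."""
    cleaned = {}
    for name in feature.getAllAttributeNames():
        raw = feature.getAttribute(name)
        try:
            cleaned[name] = raw.encode("ascii", "ignore").strip()
        except AttributeError:
            # Numbers (and missing values) have no encode() method.
            cleaned[name] = str(raw)
    return cleaned


result = clean_attributes(
    FakeFeature({"DESCRIPT": u"200 Block of W Georgia \u2013 ", "LANES": 4})
)
print(result)
```

Catching AttributeError instead of every Exception is a design choice in this sketch, so that unrelated errors are not hidden.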
