Question

Attribute encoding check?


Hey,

 

I need to check the string encoding (UTF-8), but I couldn't find a transformer for that.

 

Do you have any tips for testing the encoding?

 

Thanks for your help.

 

Alexy

 

15 replies

pratap
Contributor
  • Contributor
  • November 12, 2015
Hi,

 

 

Could you explain in a bit more detail?

 

 

Pratap

  • Author
  • November 12, 2015
Of course,

 

I need to make sure that the attribute type is "string" and check that its encoding is UTF-8.

 

I cannot find any transformers that perform these checks.

 

Thanks in advance

 


takashi
Influencer
  • November 12, 2015
Hi,

 

 

If you want to convert the encoding of the attribute(s) to UTF-8 unconditionally, the AttributeEncoder might help you.

 

However, there isn't a transformer that checks whether the type is "string". If you have to check that, consider using a Python script (PythonCaller). The "FMEFeature.getAttributeType" method returns an integer identifier indicating the internal attribute type of the specified attribute.

 

 

Takashi
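
A minimal sketch of the type check described above, written so that it also runs outside FME. The helper name `is_string_attribute` and the `FakeFeature` stand-in class are hypothetical, added only for illustration. The value 11 for FME_ATTR_STRING is the one quoted later in this thread; inside a PythonCaller it is safer to compare against the constant from the fmeobjects module than against a hard-coded number.

```python
# Value of FME_ATTR_STRING as quoted in this thread (assumption for running
# outside FME). Inside FME, use fmeobjects.FME_ATTR_STRING instead.
FME_ATTR_STRING = 11


def is_string_attribute(feature, name):
    """Return True if the feature reports the internal type 'string' for `name`."""
    return feature.getAttributeType(name) == FME_ATTR_STRING


class FakeFeature:
    """Hypothetical stand-in for an FME feature, only for trying the helper."""

    def __init__(self, types):
        self._types = types

    def getAttributeType(self, name):
        # In this stub, 0 simply means "no such attribute".
        return self._types.get(name, 0)


demo = FakeFeature({"MyString": FME_ATTR_STRING, "MyNumber": 6})
print(is_string_attribute(demo, "MyString"))
print(is_string_attribute(demo, "MyNumber"))
```

Note that this only answers the "is it a string?" half of the question; it says nothing about the encoding.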

david_r
Celebrity
  • November 12, 2015
Hi

 


 


If you need to know if a particular attribute is unicode or not, you can do the following in a PythonCaller:

 


 



import fme
import fmeobjects

# Python 2 only: str is a byte string, unicode is a Unicode string.
def FeatureProcessor(feature):
    s = feature.getAttribute('MyString')  # Change attribute name as needed
    if isinstance(s, str):
        print "ordinary string"
    elif isinstance(s, unicode):
        print "unicode string"
    else:
        print "not a string"

 


David
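
The snippet above depends on the Python 2 distinction between `str` and `unicode`. In Python 3 every `str` is already Unicode, so that `isinstance` test no longer tells the two apart. If the raw bytes are available, one possible approach is a strict decode, as sketched below. The function name `is_valid_utf8` is hypothetical, and the sketch assumes you already have the value as a `bytes` object; how to obtain raw bytes from an FME attribute is not covered here.

```python
def is_valid_utf8(data):
    """Return True if `data` (a bytes object) is well-formed UTF-8."""
    try:
        data.decode("utf-8")  # strict mode: raises on any invalid sequence
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8("é".encode("utf-8")))
print(is_valid_utf8("é".encode("cp1252")))
```

Keep in mind that pure ASCII text passes this check too, since ASCII is a subset of UTF-8.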

takashi
Influencer
  • November 12, 2015
@david_r, as an aside, how did you paste the script into the code block? I tried a few times in other Q&A threads, but the script never stays inside the code block correctly. See here. I gave up and reported it to Safe...

https://knowledge.safe.com/questions/19701/point-to-line-to-point-retain-original-attributes.html#answer-19789


david_r
Celebrity
  • November 12, 2015
Yeah, that code block thingy isn't very good... Basically, you can't have any blank lines, or it'll mess it up.


david_r
Celebrity
  • November 12, 2015
I found a workaround for the blank lines. You can toggle into HTML tag mode when editing your post and replace the blank lines with a <br> tag. Then you get two blank lines for the price of one ;-)


takashi
Influencer
  • November 12, 2015
Great, thanks. I'll try it at the next chance. Hope the editor will be fixed as soon as possible. @mitahajirakar, @dewetvannieker


  • Author
  • November 12, 2015
Thank you.

 

Takashi, do you know whether a table explaining the values returned by "feature.getAttributeType" is available somewhere?

takashi
Influencer
  • November 12, 2015
You can find the required information in the API reference. Go to:

Knowledge Center home > FME Documentation > Python FME Objects API Reference


david_r
Celebrity
  • November 12, 2015
Link for the lazy: getAttributeType

But as you can tell, it won't help you tell the difference between a Unicode string and a regular string. You will have to use the Python code I posted above for that.


takashi
Influencer
  • November 13, 2015
Thanks for the link for the lazy :-)

I thought that FME_ATTR_STRING (=11), returned by getAttributeType, indicates "string". Was I wrong?


david_r
Celebrity
  • November 13, 2015
@takashi, you're right about FME_ATTR_STRING, of course. My point was that it won't tell you if the attribute is a string encoded to the current locale (e.g. cp1252 here in western Europe) or if it is in unicode (utf-8 or utf-16).
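
To illustrate the difference between a locale encoding and UTF-8, here is a small Python 3 sketch. The same character produces different bytes in cp1252 and in UTF-8, and a strict decode can be used to guess which one you are looking at. The function name `guess_encoding` and the candidate list are assumptions made for this example, not part of FME.

```python
def guess_encoding(data, candidates=("utf-8", "cp1252")):
    """Return the first candidate that decodes `data` without error, else None."""
    for name in candidates:
        try:
            data.decode(name)  # strict decoding raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return name
    return None


text = "é"
utf8_bytes = text.encode("utf-8")     # two bytes: 0xC3 0xA9
cp1252_bytes = text.encode("cp1252")  # one byte: 0xE9

print(guess_encoding(utf8_bytes))
print(guess_encoding(cp1252_bytes))
```

This is only a heuristic. Most UTF-8 byte sequences also decode without error as cp1252 (just to the wrong characters), which is why UTF-8 is tried first, and plain ASCII bytes are valid in both.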


takashi
Influencer
  • November 14, 2015
@david_r, thanks for the clarification. Yes, of course it's necessary to add some code to check the encoding.


takashi
Influencer
  • November 14, 2015
david_r wrote:

I found a workaround for the blank lines. You can toggle into HTML tag mode when editing your post and replace the blank lines with a <br> tag. Then you get two blank lines for the price of one ;-)

@mitahajirakar, thanks for your efforts. Regards.

