Skip to main content
Solved

How to find truncated values

  • August 30, 2017
  • 15 replies
  • 208 views

canerakin
Contributor
Forum|alt.badge.img+6

I am trying to export data from SQL to CMDM Geodatabase. 200 values got truncated. I changed some of the attribute widths in CMDM, and truncated values decreased to 4. Instead of trying to check each attribute manually, is there any way to find these values directly?

Best answer by gio

@canerakin111

If you want to know longest attribute value of a set of attributes.

Attribute Exploder (default setting) followed by a attribute creator using the arithmetic editor :

New Attribute = @StringLength(@Value(_attr_value))

Then StatisticsCalculator max on that value (no group by attribute required).

If you only require this max you can use Summary output from the

Statistics Calculator

.
This post is closed to further activity.
It may be an old question, an answered question, an implemented idea, or a notification-only post.
Please check post dates before relying on any information in a question or answer.
For follow-up or related questions, please post a new question or idea.
If there is a genuine update to be made, please contact us and request that the post is reopened.

15 replies

danilo_fme
Celebrity
Forum|alt.badge.img+51
  • Celebrity
  • 2077 replies
  • August 30, 2017

Hi @canerakin111,

Are you use the Reader SQL or the transformer SQLCreator?

Thanks,

Danilo


erik_jan
Contributor
Forum|alt.badge.img+22
  • Contributor
  • 2179 replies
  • August 30, 2017

Have you tried using the StringLengthCalculator transformer to calculate the required length of the attributes?


canerakin
Contributor
Forum|alt.badge.img+6
  • Author
  • Contributor
  • 51 replies
  • August 30, 2017

Hi @canerakin111,

Are you use the Reader SQL or the transformer SQLCreator?

Thanks,

Danilo

 

I am using the Reader SQL (MSSQL Spatial)

canerakin
Contributor
Forum|alt.badge.img+6
  • Author
  • Contributor
  • 51 replies
  • August 30, 2017

 

I am using the Reader SQL (MSSQL Spatial)
by using formulas in Excel, I can find the longest word and check the length of the word. Instead of doing that, if there is a way in FME to do it, it could be better for me

 

 


canerakin
Contributor
Forum|alt.badge.img+6
  • Author
  • Contributor
  • 51 replies
  • August 30, 2017

Have you tried using the StringLengthCalculator transformer to calculate the required length of the attributes?

I tried it right now, but I have more than 20 different attribute columns. StringLengthCalculator can check only one. maybe I need to write a python script to make a loop to check every attribute. This is going to be so complex though

 

 


ebygomm
Influencer
Forum|alt.badge.img+44
  • Influencer
  • 3427 replies
  • August 30, 2017

You can use the attributevalidator to check for string length and can test this across multiple attributes


gio
Contributor
Forum|alt.badge.img+15
  • Contributor
  • 2252 replies
  • Best Answer
  • August 30, 2017

@canerakin111

If you want to know longest attribute value of a set of attributes.

Attribute Exploder (default setting) followed by a attribute creator using the arithmetic editor :

New Attribute = @StringLength(@Value(_attr_value))

Then StatisticsCalculator max on that value (no group by attribute required).

If you only require this max you can use Summary output from the

Statistics Calculator

.

danilo_fme
Celebrity
Forum|alt.badge.img+51
  • Celebrity
  • 2077 replies
  • August 30, 2017
I tried it right now, but I have more than 20 different attribute columns. StringLengthCalculator can check only one. maybe I need to write a python script to make a loop to check every attribute. This is going to be so complex though

 

 

With the transformer AttributeManager its possible to create a new attribute to check the lengh for each Attribute your Database:

 

Thanks,

 

Danilo

erik_jan
Contributor
Forum|alt.badge.img+22
  • Contributor
  • 2179 replies
  • August 30, 2017
I tried it right now, but I have more than 20 different attribute columns. StringLengthCalculator can check only one. maybe I need to write a python script to make a loop to check every attribute. This is going to be so complex though

 

 

If you can use SQL this might help too:

 

Select max(length(attr1)) as max_attr1, max(length(attr2) as max_attr2 etc from table;

 

That would return all maximum lengths for the attributes.

 

 


gio
Contributor
Forum|alt.badge.img+15
  • Contributor
  • 2252 replies
  • August 30, 2017

@canerakin111

If you want to know longest attribute value of a set of attributes.

Attribute Exploder (default setting) followed by a attribute creator using the arithmetic editor :

New Attribute = @StringLength(@Value(_attr_value))

Then StatisticsCalculator max on that value (no group by attribute required).

If you only require this max you can use Summary output from the

Statistics Calculator

.
don't forget to filter out the fme standard attributes before counting btw. Filter out using _attr_name not like fme%

 

 


gio
Contributor
Forum|alt.badge.img+15
  • Contributor
  • 2252 replies
  • August 30, 2017

@canerakin111

If you want to know longest attribute value of a set of attributes.

Attribute Exploder (default setting) followed by a attribute creator using the arithmetic editor :

New Attribute = @StringLength(@Value(_attr_value))

Then StatisticsCalculator max on that value (no group by attribute required).

If you only require this max you can use Summary output from the

Statistics Calculator

.

 

here is a pic (I also filtered oracle% as i used data from a oracle database)

 

 

 


tim_wood
Contributor
Forum|alt.badge.img+8
  • Contributor
  • 311 replies
  • July 4, 2018

@canerakin111

If you want to know longest attribute value of a set of attributes.

Attribute Exploder (default setting) followed by a attribute creator using the arithmetic editor :

New Attribute = @StringLength(@Value(_attr_value))

Then StatisticsCalculator max on that value (no group by attribute required).

If you only require this max you can use Summary output from the

Statistics Calculator

.
@canerakin111 @gio @egomm @erik_jan @danilo_fme

 

I think this is the best way of doing it, particularly if you want to find out the maximum length of the values in each field, for which I added Group By = _attr_name to the StatisticsCalculator. I needed to do this after getting the warning "257 text values were truncated during write" in a translation. Which values in which fields? Now I can compare the Inspector results with my Writer and update the field lengths accordingly.


tim_wood
Contributor
Forum|alt.badge.img+8
  • Contributor
  • 311 replies
  • June 17, 2019

bwn
Evangelist
Forum|alt.badge.img+26
  • Evangelist
  • 562 replies
  • December 20, 2019

@canerakin111

If you want to know longest attribute value of a set of attributes.

Attribute Exploder (default setting) followed by a attribute creator using the arithmetic editor :

New Attribute = @StringLength(@Value(_attr_value))

Then StatisticsCalculator max on that value (no group by attribute required).

If you only require this max you can use Summary output from the

Statistics Calculator

.

I found this solution quite useful for same problem.

A way to speed it up further though potentially is I replaced the StatisticsCalculator, which can be quite slow with large datasets instead to something that should execute much faster:

  • Send the output of the String Length AttributeCreator to Sorter->Sort by _attr_name (Ascending), StrLength (Numeric Descending)
  • Then pass that into DuplicateFilter->Key Attribute = _attr_name with "Input is Ordered" = "Yes"

This is kind of like the "SELECT TOP...ORDER BY..." or "SELECT .... ORDER BY... LIMIT 1" SQL methodologies commonly used to solve this problem to find the highest or lowest value in a group in situations where using GROUP BY becomes too expensive/slow to execute: You order/sort the results by the highest value first within each Key value group to make it the "Top/First" result, and then limit the selection to the first item in each group which is essentially what a Sorter+DuplicateFilter combination does. Similarly, both Sorter and DuplicateFilter are Transformers that execute very fast in the Workspace.


tim_wood
Contributor
Forum|alt.badge.img+8
  • Contributor
  • 311 replies
  • December 23, 2019

I found this solution quite useful for same problem.

A way to speed it up further though potentially is I replaced the StatisticsCalculator, which can be quite slow with large datasets instead to something that should execute much faster:

  • Send the output of the String Length AttributeCreator to Sorter->Sort by _attr_name (Ascending), StrLength (Numeric Descending)
  • Then pass that into DuplicateFilter->Key Attribute = _attr_name with "Input is Ordered" = "Yes"

This is kind of like the "SELECT TOP...ORDER BY..." or "SELECT .... ORDER BY... LIMIT 1" SQL methodologies commonly used to solve this problem to find the highest or lowest value in a group in situations where using GROUP BY becomes too expensive/slow to execute: You order/sort the results by the highest value first within each Key value group to make it the "Top/First" result, and then limit the selection to the first item in each group which is essentially what a Sorter+DuplicateFilter combination does. Similarly, both Sorter and DuplicateFilter are Transformers that execute very fast in the Workspace.

Like it! I'll try that next time I need to do this.