I am trying to export data from SQL to a CMDM geodatabase. 200 values got truncated. I changed some of the attribute widths in CMDM, and the number of truncated values decreased to 4. Instead of checking each attribute manually, is there any way to find these values directly?

Hi @canerakin111,

Are you using the SQL reader or the SQLCreator transformer?

Thanks,

Danilo


Have you tried using the StringLengthCalculator transformer to calculate the required length of the attributes?



I am using the SQL reader (MSSQL Spatial).

 

By using formulas in Excel, I can find the longest value and check its length. If there is a way to do that directly in FME, it would be better for me.

 

 



I tried it just now, but I have more than 20 different attribute columns, and the StringLengthCalculator can only check one at a time. Maybe I need to write a Python script to loop through every attribute, but that would get quite complex.
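For what it's worth, the loop described above wouldn't need to be complex. Here is a minimal sketch of the idea (the `rows` list of dicts stands in for features; in FME this logic would live in a PythonCaller reading each feature's attribute names):

```python
# Sketch: track the maximum value length per attribute across all rows.
def max_lengths(rows):
    maxima = {}
    for row in rows:
        for name, value in row.items():
            if value is None:
                continue  # skip missing values; they have no length
            maxima[name] = max(maxima.get(name, 0), len(str(value)))
    return maxima

# Invented sample data for illustration:
rows = [
    {"city": "Istanbul", "code": "34"},
    {"city": "Izmir", "code": "35000"},
]
print(max_lengths(rows))  # {'city': 8, 'code': 5}
```

The same dictionary-accumulator pattern scales to any number of columns without listing them explicitly.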

 

 


You can use the AttributeValidator to check string length, and it can test this across multiple attributes.


@canerakin111

If you want to know the longest attribute value of a set of attributes:

AttributeExploder (default settings) followed by an AttributeCreator using the arithmetic editor:

New Attribute = @StringLength(@Value(_attr_value))

Then a StatisticsCalculator taking the max of that value (no group-by attribute required).

If you only require this max, you can use the Summary output of the StatisticsCalculator.
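For reference, the pipeline above can be sketched in plain Python (the feature dicts below are invented for illustration): explode each feature into _attr_name/_attr_value records, compute the string length of each value, and take the overall maximum.

```python
# Sketch: emulate AttributeExploder + @StringLength + StatisticsCalculator (max).
features = [
    {"city": "Istanbul", "code": "34"},
    {"city": "Izmir", "code": "35000"},
]

# AttributeExploder: one record per attribute of each feature.
exploded = [
    {"_attr_name": k, "_attr_value": v}
    for feat in features
    for k, v in feat.items()
]

# StatisticsCalculator (max) over @StringLength(@Value(_attr_value)).
max_length = max(len(str(r["_attr_value"])) for r in exploded)
print(max_length)  # 8  (length of "Istanbul")
```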

With the AttributeManager transformer it's possible to create a new attribute that checks the length of each attribute in your database:

Thanks,

Danilo

If you can use SQL, this might help too:

Select max(length(attr1)) as max_attr1, max(length(attr2)) as max_attr2, etc. from table;

 

That would return all maximum lengths for the attributes.
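With more than 20 columns, writing that statement by hand gets tedious; a few lines of Python can generate it from a column list. This is only a sketch: the column and table names are placeholders, and LEN() is the SQL Server spelling of LENGTH() (the poster is on MSSQL).

```python
# Sketch: build a "max length per column" query from a list of column names.
# Column and table names are placeholders for your own schema.
columns = ["attr1", "attr2", "attr3"]
table = "my_table"

select_list = ", ".join(
    f"MAX(LEN({col})) AS max_{col}"  # LEN() on SQL Server; LENGTH() on Oracle/PostgreSQL
    for col in columns
)
query = f"SELECT {select_list} FROM {table};"
print(query)
# SELECT MAX(LEN(attr1)) AS max_attr1, MAX(LEN(attr2)) AS max_attr2, MAX(LEN(attr3)) AS max_attr3 FROM my_table;
```

The generated statement could then be run through a SQLCreator or SQLExecutor.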

 

 


By the way, don't forget to filter out the standard FME attributes before counting. Filter them out using _attr_name not like fme%.

 

 



 

Here is a pic (I also filtered oracle%, as I used data from an Oracle database):

 

 

 


@canerakin111 @gio @egomm @erik_jan @danilo_fme

 

I think this is the best way of doing it, particularly if you want to find out the maximum length of the values in each field, for which I added Group By = _attr_name to the StatisticsCalculator. I needed to do this after getting the warning "257 text values were truncated during write" in a translation. Which values in which fields? Now I can compare the Inspector results with my Writer and update the field lengths accordingly.


Vote for this idea!

https://knowledge.safe.com/content/idea/75676/enhanced-logging-of-data-truncation-warnings.html



I found this solution quite useful for the same problem.

One way to potentially speed it up further: I replaced the StatisticsCalculator, which can be quite slow with large datasets, with something that should execute much faster:

  • Send the output of the string-length AttributeCreator to a Sorter -> sort by _attr_name (Ascending), StrLength (Numeric Descending)
  • Then pass that into a DuplicateFilter -> Key Attribute = _attr_name, with "Input is Ordered" = "Yes"

This is similar to the "SELECT TOP ... ORDER BY ..." or "SELECT ... ORDER BY ... LIMIT 1" SQL methodologies commonly used to find the highest or lowest value in a group when GROUP BY becomes too expensive to execute: you sort the results so the highest value comes first within each key group, then limit the selection to the first item per group, which is essentially what the Sorter + DuplicateFilter combination does. Both the Sorter and the DuplicateFilter execute very quickly in a workspace.
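The sort-then-take-first pattern described above can be sketched in a few lines of Python (the exploded records here are invented for illustration):

```python
# Sketch: emulate Sorter + DuplicateFilter — sort by key ascending and
# length descending, then keep only the first record per key.
records = [
    {"_attr_name": "city", "_str_length": 5},
    {"_attr_name": "city", "_str_length": 8},
    {"_attr_name": "code", "_str_length": 2},
]

# Sorter: _attr_name ascending, _str_length numeric descending.
records.sort(key=lambda r: (r["_attr_name"], -r["_str_length"]))

# DuplicateFilter on _attr_name with "Input is Ordered" = Yes:
# the first record seen per key is the maximum for that key.
seen = set()
maxima = []
for rec in records:
    if rec["_attr_name"] not in seen:
        seen.add(rec["_attr_name"])
        maxima.append(rec)

print(maxima)
# [{'_attr_name': 'city', '_str_length': 8}, {'_attr_name': 'code', '_str_length': 2}]
```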



Like it! I'll try that next time I need to do this.

