I am trying to export data from SQL to a CMDM geodatabase. 200 values got truncated. I changed some of the attribute widths in CMDM, and the number of truncated values decreased to 4. Instead of checking each attribute manually, is there any way to find these values directly?
Hi @canerakin111,
Are you using the SQL reader or the SQLCreator transformer?
Thanks,
Danilo
Have you tried using the StringLengthCalculator transformer to calculate the required length of the attributes?
I am using the Reader SQL (MSSQL Spatial)
You can use the AttributeValidator to check for string length, and you can test this across multiple attributes.
@canerakin111
If you want to know the longest attribute value of a set of attributes:
AttributeExploder (default settings) followed by an AttributeCreator using the arithmetic editor:
New Attribute = @StringLength(@Value(_attr_value))
Then a StatisticsCalculator taking the max of that value (no group-by attribute required).
If you only require this max, you can use the Summary output from the StatisticsCalculator.
Danilo
Select max(length(attr1)) as max_attr1, max(length(attr2)) as max_attr2, ... from table;
That would return the maximum length for each attribute.
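If you also want to see which values would actually be truncated, rather than just the maximum lengths, a variation of that query can list them. This is only a sketch with a hypothetical table (MyTable), column (ATTR1) and target width (50); note that SQL Server (the MSSQL Spatial reader) uses LEN() rather than LENGTH():

-- Hypothetical names and width; lists every value longer than the target field width of 50.
SELECT ATTR1, LEN(ATTR1) AS attr1_length
FROM MyTable
WHERE LEN(ATTR1) > 50
ORDER BY attr1_length DESC;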
Here is a pic (I also filtered oracle%, as I used data from an Oracle database):
I think this is the best way of doing it, particularly if you want to find the maximum length of the values in each field, for which I added Group By = _attr_name to the StatisticsCalculator. I needed to do this after getting the warning "257 text values were truncated during write" in a translation. Which values in which fields? Now I can compare the Inspector results with my Writer and update the field lengths accordingly (a rough SQL equivalent is sketched below).
Vote for this idea!
https://knowledge.safe.com/content/idea/75676/enhanced-logging-of-data-truncation-warnings.html
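For reference, roughly the same per-field maximum can also be obtained directly in SQL Server with an unpivot-style query, which mirrors the AttributeExploder + StatisticsCalculator (Group By = _attr_name) approach. Just a sketch, with hypothetical table and column names (MyTable, ATTR1, ATTR2):

-- One output row per column, with the length of its longest value.
SELECT v.attr_name, MAX(LEN(v.attr_value)) AS max_length
FROM MyTable AS t
CROSS APPLY (VALUES
    ('ATTR1', CAST(t.ATTR1 AS nvarchar(max))),
    ('ATTR2', CAST(t.ATTR2 AS nvarchar(max)))
) AS v(attr_name, attr_value)
GROUP BY v.attr_name;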
I found this solution quite useful for the same problem.
A way to potentially speed it up further: I replaced the StatisticsCalculator, which can be quite slow with large datasets, with something that should execute much faster:
- Send the output of the String Length AttributeCreator to Sorter->Sort by _attr_name (Ascending), StrLength (Numeric Descending)
- Then pass that into DuplicateFilter->Key Attribute = _attr_name with "Input is Ordered" = "Yes"
This is similar to the "SELECT TOP ... ORDER BY ..." or "SELECT ... ORDER BY ... LIMIT 1" SQL patterns commonly used to find the highest or lowest value in a group when GROUP BY becomes too expensive or slow to execute: you sort the results so the highest value comes first within each key group, then limit the selection to the first item in each group, which is essentially what the Sorter + DuplicateFilter combination does. Both the Sorter and the DuplicateFilter execute very quickly in the workspace.
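To make that analogy concrete, here is a sketch of the per-group "first row wins" pattern in SQL, written with a window function instead of TOP but following the same sort-then-keep-the-first-row idea. The exploded_attributes table and its _attr_name/_attr_value columns are hypothetical stand-ins for the AttributeExploder output:

-- Rank values within each attribute by descending length and keep only the first,
-- analogous to the Sorter + DuplicateFilter combination described above.
SELECT attr_name, attr_value, str_length
FROM (
    SELECT _attr_name AS attr_name,
           _attr_value AS attr_value,
           LEN(_attr_value) AS str_length,
           ROW_NUMBER() OVER (PARTITION BY _attr_name
                              ORDER BY LEN(_attr_value) DESC) AS rn
    FROM exploded_attributes
) AS ranked
WHERE rn = 1;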
Like it! I'll try that next time I need to do this.