Question

How to find the best attribute type and string length?

  • 10 December 2018
  • 9 replies
  • 43 views

Badge +2

I have around 35 gml-files, all with many attributes (25 attributes per xml). I am writing the files to an Oracle ArcSDE database.

How can I find and set the best attribute type?

In the writer all attributes are set to: type char | width 200 or 2048.

Some of these attribute are dates or integers. Somehow FME is not defining the correct attribute type. Can this be done?

 

Other attributes are indeed char but not with a width of 200 or 2048. Is there a way to find the string length for all attributes in once?

I could set a StringLengthCalculator for each attribute, but that is an enormous amount of work.

 

 


9 replies

Userlevel 4

If it's mostly about string lengths, why not simply leave them as-is? SDE will usually store string fields as VARCHARs in Oracle, which means that only the actual string lengths are allocated in database table.

Badge +3

I have the same question: "Is there a way to find the string length for all attributes in once?"

 

 

My issue is that my MapInfo .tab files are very large, so I want to trim the fat from all the text fields.
Userlevel 4

You could use an AttributeExploder followed by a StringLengthCalculator. Then finally a StatisticsCalculator to find the max of all the values of _length with a group by _attr_name from the AttributeExploder.

For example:

Badge +3

You could use an AttributeExploder followed by a StringLengthCalculator. Then finally a StatisticsCalculator to find the max of all the values of _length with a group by _attr_name from the AttributeExploder.

For example:

@david_r: that worked perfectly! The only downside is that it's very slow for large datasets.

maxattributelengthcalculator.fmw

 

Badge +2

I'm using a FeatureReader to read a wfs service and getting strange failures.

The following error is being logged

HTTP transfer error: 'Failure when receiving data from the peer'
XML Parser error: 'Error in input dataset:'https://data.linz.govt.nz:443/services;key=xxxxxxxxxxxxxx/wfs/layer-53382?SERVICE=WFS&VERSION=2.0.0&REQUEST=GetFeature&TYPENAMES=data.linz.govt.nz%3Alayer-53382&NAMESPACES=xmlns%28data.linz.govt.nz%2Chttp%3A%2F%2Fdata.linz.govt.nz%29&BBOX=-36.939824852066955%2C174.70954193086246%2C-36.91494916200109%2C174.76090622708065' line:1 column:121683 message:unexpected end of input'
The XML Module halted on error, see the logfile for details
<WFS> An error occurred while reading a GetFeature response, to ignore the error and avoid losing the previously read features, set the WFS "Ignore GetFeature Error" parameter.  Total features downloaded: '0'
A fatal error has occurred. Check the logfile above for details

Accessing the link and checking the file manually it is valid xml and does not have an unexpected end or any special characters at the logged position.

Stranger still, if I run my request via Fiddler proxy, it does not truncate the response.

Reader settings:

  • WFS Version: 2.0.0
  • &OUTPUTFORMAT=GML2
  • Use Old Reader for GML v3.1.1 and v2.1.2 Documents

While running fiddler is fine for me, I want to share this workbench with other users.

Userlevel 4

@david_r: that worked perfectly! The only downside is that it's very slow for large datasets.

maxattributelengthcalculator.fmw

 

Yes, I can confirm that using the AttributeExploder on large datasets can be (very) slow, in particular in combination with tables having lots of attributes. In fact the slowdown is related to <number of records> * <number of attributes per record>.

It's possible to do it faster using Python, however, since the bottleneck in the above solution lies in all the features it creates (creating new features is fairly "expensive" in FME, which is why I try to avoid all the *Exploder transformers whenever possible). Using Python, it's possible to iterate over all the attribute values without creating any extra features.

Userlevel 4

Thanks for the explanation this is really helpful

Glad I could help, even though it's been a while since your original question.

Badge +16

How about reading the xml/gml schema to define your output data type? Or generating a workspace? That should map the input data type to your output.

Badge +2

For anyone that has a need for a better solution to this workflow, there is an idea suggested to potentially make this easier to do: https://knowledge.safe.com/idea/90983/provide-a-convinient-way-to-dynamically-set-writer.html? Please add your vote and comment to let us know your interest! Thank you!

Reply