Question

Why does the CSV reader classify a column full of integers as time?

  • 15 February 2022

I have a CSV file that looks like this:

 

ID,X_COR,Y_COR
187347635,335375,369977
185565488,282016,438601
185768788,281568,438584

 

FME has classified the X_COR and Y_COR columns as time. I lost hours of work because I didn't spot this and kept getting mysterious geocoding errors.

 

I can't see an easy way to override this in the Reader, and classifying these columns as time doesn't make sense to me.

 

 


3 replies

Userlevel 4

By default, FME tries to guess the CSV column datatypes, but you can also override this when creating the reader:

[Image: CSV Parameters]

Either set the correct datatypes, or simply set everything to 'string' if you want to have the most control.
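For anyone who wants to sanity-check the file outside FME, here is a minimal Python sketch of the same "read everything as string, then convert explicitly" idea. It uses only the standard library and the sample rows from the question; the function name `read_coordinates` is made up for illustration and has nothing to do with FME's own reader.

```python
import csv
import io

# Sample data copied from the question above.
SAMPLE = """ID,X_COR,Y_COR
187347635,335375,369977
185565488,282016,438601
185768788,281568,438584
"""


def read_coordinates(text):
    """Read every field as a string, then convert the coordinate columns explicitly."""
    rows = []
    # csv.DictReader never guesses types: every value arrives as a str.
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            "ID": record["ID"],              # keep the identifier as text
            "X_COR": int(record["X_COR"]),   # explicit conversion, no guessing
            "Y_COR": int(record["Y_COR"]),
        })
    return rows


rows = read_coordinates(SAMPLE)
print(rows[0])
```

Because the conversion is explicit, a value that is not a plain integer raises a `ValueError` instead of being silently reinterpreted.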

Badge +3

I also recently encountered this. I found the option to change the data type definitions, as mentioned by david_r above, and I get that automatic detection can't guarantee success 100% of the time, but it still seems odd to me that FME (sometimes?) automatically identifies numeric/integer values as time.

 

I think it would be more logical if a time type is something a user manually switches to in the CSV reader, and the CSV reader never guesses this by default.

 

In my case, because the data type was guessed as 'time', the values for that attribute were also altered, and I don't think that's something you want a reader to do by default.

 

Furthermore, I recently upgraded to a newer version of FME (now running FME 2021.2.2.0 (build 21806 - win64)). I have worked with these CSV datasets in the past, and I don't recall encountering this behaviour then. Can someone with an earlier version of FME verify whether they see the same behaviour?

 

See the screenshots and sample (space-separated) CSV below for the automatic behaviour I currently encounter in FME 2021.2.2.0 (build 21806 - win64):

image 

 

Badge +2

@davebarter This has been fixed in FME 2022. The issue is that the default date format for the CSV schema scan was the FME format, so pretty much any integer could be picked up as a date, even though a CSV date is unlikely to be in FME format. In 2022 the default was changed to an ISO date format. To prevent this in FME 2021 and earlier, select ISO (auto detect) as the CSV Date Input Format (under the Schema Generation disclosure group of the CSV reader parameters), then save that as the default preset so you don't have to worry about this happening again. See the image below:

[Image: dialog]
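To illustrate why a delimiter-free format is so easy to confuse with plain integers, here is a rough Python sketch. It is not FME's detection logic. It assumes only that the FME format is a compact, digits-only layout (for example six digits for a time) and that the ISO layout requires separators. The two patterns and the helper `looks_like_time` are invented for this example.

```python
import re

# Assumed, simplified patterns (hypothetical, not taken from FME):
COMPACT_TIME = re.compile(r"\d{6}")            # digits only, e.g. HHMMSS
ISO_TIME = re.compile(r"\d{2}:\d{2}:\d{2}")    # separators required, e.g. HH:MM:SS


def looks_like_time(value, pattern):
    """Return True when the whole string matches the given layout."""
    return pattern.fullmatch(value) is not None


# X_COR values from the question.
values = ["335375", "282016", "281568"]

compact_hits = [v for v in values if looks_like_time(v, COMPACT_TIME)]
iso_hits = [v for v in values if looks_like_time(v, ISO_TIME)]

print("match compact layout:", compact_hits)
print("match ISO layout:", iso_hits)
```

Every coordinate matches the digits-only layout and none matches the layout with separators, which is consistent with the explanation that switching the scan to an ISO format avoids the misclassification.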
