Solved

Trouble in reading a csv file

  • 23 March 2021
  • 2 replies
  • 56 views

Badge +4

I have a CSV file with 259200 rows. The first column, CELL_ID, holds consecutive integers from 1 to 259200. When I read this file with the CSV reader, the column's data type is automatically set to uint16, which caps its maximum value at 65535, so every row after 65535 is read as NULL, and the data type can't be changed. I've noticed other issues too. For example, one column starts with many rows whose value is 0, so the CSV reader sets its data type to uint8; but later rows in that column contain doubles, and those also end up being read as NULL. It would be nice if all columns could be read as strings so nothing is lost. I'd say the default settings are wrong in most cases. I've used the CSV reader for years, and this is the first time I've noticed the bad defaults.
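A minimal sketch (plain Python, not FME internals) of why inferring a column type from only the head of the file misreads a sorted ID column — `infer_uint_type` and the 10,000-row sample size are assumptions for illustration:

```python
def infer_uint_type(sample):
    """Pick the smallest unsigned integer type that fits every sampled value."""
    hi = max(sample)
    for bits in (8, 16, 32, 64):
        if hi < 2 ** bits:
            return f"uint{bits}"
    return "string"

cell_ids = list(range(1, 259201))          # CELL_ID: 1 .. 259200, sorted ascending

# Scanning only the first 10,000 rows sees a maximum of 10,000 -> uint16,
# so every value above 65535 later in the file becomes NULL.
print(infer_uint_type(cell_ids[:10000]))   # uint16

# Scanning the whole column finds the true maximum.
print(infer_uint_type(cell_ids))           # uint32
```

The same sampling logic explains the uint8 column: a long run of leading zeros fits uint8, and the doubles further down no longer fit the chosen type.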

Best answer by fhilding 24 March 2021, 14:33

2 replies

Badge +1

Hi! I also encounter this in files where the values are sorted. Two things that you can do to fix this:

 

1) Click Parameters after you've selected the file in the CSV reader to see more about the feature type you're about to read. In there, you can set the reader to scan more rows when assessing the schema.

 

2) You can also switch the schema to Manual and set the type to uint64, which is almost always what it should be for numeric ID columns.

 

Sometimes I wish the schema scanner checked the first X rows as well as the _last_ row, just to catch the "oh, this looks like an ordered file" case. But until FME magically does that for me, I'll have to remind myself to use one of the two options above.
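The head-plus-last-row heuristic suggested above can be sketched like this (plain Python, not FME; `scan_schema` and the sample size are made up for illustration):

```python
def smallest_uint(values):
    """Smallest unsigned integer type that fits every value seen."""
    hi = max(values)
    for bits in (8, 16, 32, 64):
        if hi < 2 ** bits:
            return f"uint{bits}"
    return "string"

def scan_schema(column, sample_size=10000):
    # Scan the head of the file, plus the final row, so a sorted
    # ("ordered file") column can't hide its true maximum.
    sample = column[:sample_size] + column[-1:]
    return smallest_uint(sample)

cell_ids = list(range(1, 259201))   # CELL_ID: 1 .. 259200
print(scan_schema(cell_ids))        # uint32, where a head-only scan would pick uint16
```

One extra row per column is essentially free to read, and it covers the common case of a file sorted by its ID column.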

Badge +4

@fhilding, thanks. I also found that the data types can be changed via a manual setting in the Parameters when setting up the reader. All columns are read correctly now.
