Solved

Columns recognised but dropped when reading a CSV file

  • November 25, 2020
  • 7 replies
  • 114 views

I have CSVs with over 2 million rows. For some reason, the Reader appears to be dropping some columns, and yet I know the Reader recognised that they existed because their names are missing from the column name sequence. For example, when columns col3 and col6 are dropped, the auto-generated attribute names are col0, col1, col2, col4, col5, col7, col8, col9. I have tried various options in the Reader but clearly not the right one. There is plenty of RAM and disk space. In some files the missing columns are blank, but I have also seen this happen in files where the missing columns are not blank. Some columns have strings enclosed in double quotes. Any pointers, please?

7 replies

redgeographics
Celebrity

That's really odd. Would you be able to post a sample file so we can take a closer look? Just the header line and a few lines of data should be enough


Here you go:

23,"I",405612,10008752428,"4620X029950346","3998563000",,"7666VN",2012-02-21,,2016-02-10,2012-12-13
23,"I",405692,37005945,"1725X042609307","10080803000",,"7666VN",2016-01-04,,2016-02-07,2015-12-02
23,"I",405753,10012134708,"1720X016961597","244731175",,"7666VN",2012-06-12,,2016-02-10,2012-12-13

Col6 and Col9 are skipped. I am using 2020.1.0.0 (20200707 - Build 20594 - WIN64)

Thanks.
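For anyone wanting to confirm independently of FME how many columns the sample rows really contain, and which ones are blank, here is a minimal sketch using Python's standard csv module. The function name blank_columns is made up for illustration, and the zero-based col0, col1, ... naming is an assumption that mirrors the auto-generated attribute names described in the question.

```python
import csv
import io

# The three sample rows posted above, unchanged.
SAMPLE = '''23,"I",405612,10008752428,"4620X029950346","3998563000",,"7666VN",2012-02-21,,2016-02-10,2012-12-13
23,"I",405692,37005945,"1725X042609307","10080803000",,"7666VN",2016-01-04,,2016-02-07,2015-12-02
23,"I",405753,10012134708,"1720X016961597","244731175",,"7666VN",2012-06-12,,2016-02-10,2012-12-13
'''


def blank_columns(text):
    """Return (all column names, names of columns that are empty in every row).

    Hypothetical helper: names are generated as col0, col1, ... because the
    file has no header line.
    """
    # Skip completely empty lines, which csv.reader yields as [].
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    width = max(len(row) for row in rows)
    names = [f"col{i}" for i in range(width)]
    blank = [
        names[i]
        for i in range(width)
        if all(i >= len(row) or row[i] == "" for row in rows)
    ]
    return names, blank


names, blank = blank_columns(SAMPLE)
print(len(names), "columns; blank in every row:", blank)
```

If the count printed here differs from the number of attributes the Reader exposes, the columns are being lost on read rather than being absent from the file.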


ebygomm
Influencer
  • Best Answer
  • November 25, 2020

This occurs if Scan for Types is set to Yes (I'm not sure whether 2020 now defaults to this behaviour). It doesn't look like the intended outcome of this setting in any case, as it's not just skipping columns but also putting some data in the wrong place and losing other bits.

Changing Scan for Types to No reads all the columns correctly.


  • November 25, 2020

@fme_superuser This is a known issue and has been fixed for FME 2021.0. It occurs when the CSV file has no headers and there are blank columns: the column numbering gets offset. Issue FMEENGINE-66612.


ebygomm
Influencer
  • November 25, 2020

Do you know what Scan for Types defaults to in 2020? The sample file above is OS AddressBase data, so it might crop up quite a bit.


  • November 26, 2020

@ebygomm @fme_superuser Scan for Types defaults to Yes in 2020. You can change the default under Presets.

The problem only occurs if Scan for Types = Yes.


  • November 26, 2020

@fme_superuser @ebygomm We have ported this fix back into an FME 2020.2 patch. It should be in the next FME 2020.2 release available on our downloads page (any build after 20804).
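To check an installed version against that build number, a rough sketch along these lines may help. It assumes the version string has the same shape as the one quoted earlier in the thread ("2020.1.0.0 (20200707 - Build 20594 - WIN64)"). The helper names are hypothetical, and comparing on the build number alone is a simplification of "any build after 20804".

```python
import re

# Threshold taken from the reply above: "any build after 20804".
FIXED_AFTER_BUILD = 20804


def build_number(version_string):
    """Extract the integer after 'Build' from a version string, or None."""
    match = re.search(r"Build\s+(\d+)", version_string)
    return int(match.group(1)) if match else None


def likely_has_fix(version_string):
    """True if the build number is greater than the threshold (assumption:
    build numbers increase monotonically across releases)."""
    build = build_number(version_string)
    return build is not None and build > FIXED_AFTER_BUILD


reported = "2020.1.0.0 (20200707 - Build 20594 - WIN64)"
print(build_number(reported), likely_has_fix(reported))
```

For builds at or below the threshold, setting Scan for Types to No remains the workaround described in the thread.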