Solved

Columns recognised but dropped when reading a CSV file

  • November 25, 2020
  • 7 replies
  • 67 views

I have CSVs with over 2 million rows. For some reason, the Reader appears to be dropping some columns, and yet I know the Reader recognised that they existed because their names are missing from the column name sequence. For example, when columns col3 and col6 are dropped, the auto-generated attribute names are col0, col1, col2, col4, col5, col7, col8, col9. I have tried various options in the Reader, but clearly not the right one. There is plenty of RAM and disk space. In some files the missing columns are blank, but I have also seen this happen in files where the missing columns are not blank. Some columns have strings enclosed in double quotes. Any pointers, please?

Best answer by ebygomm

This occurs if Scan for Types is set to Yes (I'm not sure if 2020 now defaults to this behaviour). It doesn't look like the desired outcome of this setting in any case, as it's not just skipping columns but also putting some data in the wrong place and losing other bits.

Changing Scan for Types to No reads all the columns correctly.



redgeographics
Celebrity

That's really odd. Would you be able to post a sample file so we can take a closer look? Just the header line and a few lines of data should be enough


redgeographics wrote:

That's really odd. Would you be able to post a sample file so we can take a closer look? Just the header line and a few lines of data should be enough

here you go:

23,"I",405612,10008752428,"4620X029950346","3998563000",,"7666VN",2012-02-21,,2016-02-10,2012-12-13

23,"I",405692,37005945,"1725X042609307","10080803000",,"7666VN",2016-01-04,,2016-02-07,2015-12-02

23,"I",405753,10012134708,"1720X016961597","244731175",,"7666VN",2012-06-12,,2016-02-10,2012-12-13

 

Col6 and Col9 are skipped. I am using 2020.1.0.0 (20200707 - Build 20594 - WIN64)

Thanks.
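
For anyone who wants to check a headerless file before blaming the Reader, here is a minimal sketch in plain Python (not FME) that parses the three sample rows above with the standard `csv` module, counts the fields in each row, and lists the column positions that are blank in every row. The names `SAMPLE` and `inspect_rows` are made up for this illustration.

```python
import csv
import io

# The three sample rows posted above, copied verbatim.
SAMPLE = '''23,"I",405612,10008752428,"4620X029950346","3998563000",,"7666VN",2012-02-21,,2016-02-10,2012-12-13
23,"I",405692,37005945,"1725X042609307","10080803000",,"7666VN",2016-01-04,,2016-02-07,2015-12-02
23,"I",405753,10012134708,"1720X016961597","244731175",,"7666VN",2012-06-12,,2016-02-10,2012-12-13
'''


def inspect_rows(text):
    """Return (field count per row, zero-based indexes of all-blank columns)."""
    rows = list(csv.reader(io.StringIO(text)))
    field_counts = [len(row) for row in rows]
    width = max(field_counts)
    # A column counts as blank only if it is empty (or absent) in every row.
    blank_columns = [
        i for i in range(width)
        if all(i >= len(row) or row[i] == "" for row in rows)
    ]
    return field_counts, blank_columns


field_counts, blank_columns = inspect_rows(SAMPLE)
print("fields per row:", field_counts)
print("blank columns (zero-based):", blank_columns)
```

Every row parses to 12 fields, and the blank positions are 6 and 9, which are the same two columns reported as skipped.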


ebygomm
Influencer
  • Influencer
  • Best Answer
  • November 25, 2020

This occurs if Scan for Types is set to Yes (I'm not sure if 2020 now defaults to this behaviour). It doesn't look like the desired outcome of this setting in any case, as it's not just skipping columns but also putting some data in the wrong place and losing other bits.

Changing Scan for Types to No reads all the columns correctly.


markatsafe
  • November 25, 2020

@fme_superuser This is a known issue and has been fixed for FME 2021.0. It occurs when the CSV file has no headers and contains blank columns: the column numbering gets offset. The issue is tracked as FMEENGINE-66612.


ebygomm
Influencer
  • Influencer
  • November 25, 2020

Do you know what Scan for Types defaults to in 2020? The sample file above is OS AddressBase data, so this might crop up quite a bit.


markatsafe
  • November 26, 2020

@ebygomm​  @fme_superuser​  Scan for Types defaults to Yes in 2020. You can change the default under Presets.

The problem only occurs if Scan for Types = Yes.


markatsafe
  • November 26, 2020

@fme_superuser​ @ebygomm​ We have ported this fix back into an FME 2020.2 patch. It should be in the next FME 2020.2 release that is available on our downloads page (any build after 20804).
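
To check whether an install is past that build, a rough sketch like the following pulls the build number out of a version string in the format quoted earlier in the thread and compares it with 20804. This is plain Python, not an FME API. The helper names are made up for illustration, and the comparison assumes that build numbers only ever increase; 2020.0 or 2020.1 builds below the threshold will simply report as not fixed.

```python
import re

# Threshold quoted in the reply above: the fix is in builds after 20804.
FIXED_AFTER_BUILD = 20804


def build_number(version_string):
    """Extract the integer after 'Build' from a version string."""
    match = re.search(r"Build\s+(\d+)", version_string)
    if match is None:
        raise ValueError(f"no build number found in {version_string!r}")
    return int(match.group(1))


def has_backported_fix(version_string):
    """True if the build is later than the threshold (assumes increasing builds)."""
    return build_number(version_string) > FIXED_AFTER_BUILD


# Version string reported by the original poster.
reported = "2020.1.0.0 (20200707 - Build 20594 - WIN64)"
print(build_number(reported), has_backported_fix(reported))
```

For the version reported in the question (build 20594) this returns False, so on that install the fallback is still setting Scan for Types to No.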

