I’m using FeatureWriter to write data using dynamic schemas to multiple worksheets of an Excel file.
In the particular case where I detected the issue, the data meant for the last worksheet contain just one row and value in just one of the columns. This value is ‘No’.
As you can see, the native_data_type of the respective schema attribute for this value is ‘string’.
In the result excel file, however, the ‘No’ value is changed to FALSE.
I can’t figure out why this happens.
The Writer writes other ‘No’ values to other worksheets and all of them appear to written correctly as ‘No’.
This is the FeatureWriter setup:
Have I made a mistake somewhere, or is it a bug in the Excel Writer?
Page 1 / 1
I wonder what would happen if you defined it as Char(2) or 3 or whatever length the max word would be? Does it still come out as boolean? It looks to me like it is being automatically changed to boolean in the writer. Hope that helps!
I wonder what would happen if you defined it as Char(2) or 3 or whatever length the max word would be? Does it still come out as boolean? It looks to me like it is being automatically changed to boolean in the writer. Hope that helps!
I’ve tried that already. The result is, unfortunately, the same.
@mst you could try to expose the attribute 'A21_regional_concern_cooperation' ? I have no direct clue, but maybe FME gives unexposed attributes a different treatment
@mst you could try to expose the attribute 'A21_regional_concern_cooperation' ? I have no direct clue, but maybe FME gives unexposed attributes a different treatment
The workspace is a generic solution which needs to work dynamically without exposing the attributes.
However, I tried to do it just to see what happens. The result was the same. Even after exposing the attribute the ‘No’ value is still converted to FALSE in the Excel file.
But as I said, it’s the only value for which it happens. The quasi-boolean value in the other worksheets are written correctly.
I wonder if this happens because the affected value is on the last worksheet of the file and/or because it’s the only value in the only row on that sheet.
I’m going to check what happens if I change the order of the sheets, or if I input additional values or rows to that sheet.
@mst okay thats interesting. How about you try the value 'Yes’ instead of 'No’ for this attribute? I would expect that to turn into True then.Thats is not to be expected by the way as your other quasi booleans were correctly exported.
Sometimes it might help to restart FME or Purge al your temp files and rerun the process
@mst okay thats interesting. How about you try the value 'Yes’ instead of 'No’ for this attribute? I would expect that to turn into True then.Thats is not to be expected by the way as your other quasi booleans were correctly exported.
Ok, things are getting weirder.
If I use an AttributeManager and change the value in the exposed ‘A21_regional_concern_cooperation' attribute to ‘Yes’, in the result it’s shown as Yes. If I ‘change’ the value ‘No’, it’s shown as No in Excel! It doesn’t seem to make sense.
In case it helps to understand what’s going on, the original data comes from zipped csv files downloaded using HTTPCaller request. Files are read using a FeatureReader, where the path to the zip file is provide as a variable and the attributes are not exposed because all the csv files have different structure. The data is merged with corresponding schemas that are also created dynamically from information stored in a database.
Why unexposed, or exposed but unmodified ‘No’ value from this particular csv table, is then written to Excel file as FALSE, but when I replace it with a ‘No’ value created in AttributeManager (or AttributeCreator), it’s written correctly as ‘No’?
How the respective feature looks when the attribute is unexposed is shown in the description.
This is how it looks after the attribute is exposed:
And this is after the AttributeManager change:
Aside from the varchar length, I see nothing that would explain the different results. And the data type change is irrelevant. When I’ve set the datatype of the ‘A21_regional_concern_cooperation' in the AttributeExposer to varchar(3), it still ended as FALSE in the Excel file.
Sometimes it might help to restart FME or Purge al your temp files and rerun the process
The purge didn’t help.
I assume that you want the “YES” and “NO” values to appear as texts in Excel ?
Unfortunately, there is some hidden magic in FME that handle boolean values. The reason is that different systems have different ways to express booleans, and FME probably tries to standardize it somewhat. Normally this is a good thing.
E.g. SQL Server uses 0/1 to express bit values false/true, and they end up as no/yes texts when read into FME. Other systems use false/true, and these also end up as no/yes.
So most likely the real question is how to make FME _not_ handle them as boolean values when writing to Excel, and just leave them as texts. Regrettably I don’t have the answer to that.
Hi @mst ,
I guess that the cause of the problem could be related to how you have defined the destination schema. Could you please clarify how you have configured the shema definition - the attribute{} list, fme_feature_type and fme_feature_type_name?
Hi @mst ,
I guess that the cause of the problem could be related to how you have defined the destination schema. Could you please clarify how you have configured the shema definition - the attribute{} list, fme_feature_type and fme_feature_type_name?
Hello, @takashi.
I’ve been creating the attribute{} list by using an AttributeCreator and an Aggregator. All attributes data types are set the same, because there’s no need to set different ones for writing this data to Excel.
Regarding the fme_feature_type and fme_feature_type_name, I weren’t setting them. The fme_feature_type of all features was still CSV from the FeatureReader. The features didn’t have the fme_feature_type_name attribute, as far as I couls see.
After reading your post I’ve added additional attributes to the AttributeCreator, according to my best understanding
I added the new fields to the Group By: parameter of the Aggregator
In the FeaureMerger, where I merge the schema with the data read from the csv files, I’ve set the ConflictResolution to use data from the Supplier, which is the schema input.
This is how the problematic feature looks before it enters the Excel FeatureWriter
I don’t know if I did it incorrectly, but it still didn’t solve the issue. That single ‘No’ value in the last sheet is still converted to FALSE. And as before, the other quasi-boolean values in the other sheets are still correctly shown as No/Yes.
I think the problem may be the value itself.
I’ve added a PythonCaller to the workspace which iterates over over all attribute values (except date, datetime and numeric), checks if the value == ‘No’, and then replaces it with ‘Nooooo’.
The ‘No’ values from the other csv files were replaced. But the value in this cursed one wasn’t!
This ‘No’ value is the first value of the row, unlike the ‘No’ values in other files. The value looks like ‘No’, but I’m wondering if the CSV Reader doesn’t add some hidden, zero-length characters to the first values of the row. This could explain why python doesn’t recognize it as ‘No’, and also why the Excel writer converts it to FALSE.
Any idea if this theory makes sense, how to prove it, and how to ‘sanitize’ the affected values?
@mst , in my observation, the problem - 'Yes' and 'No' will be replaced with 'TRUE' and 'FALSE' in destination Excel sheet seems to appear when CSV reader considers the column data type as 'boolean'. In that case, Python FMEFeature.getAttribute method returns a boolean value - True or False, not a string value - 'Yes' or 'No'.
I don't know the exact condition where CSV reader assumes a column data type as 'boolean', but you can stop the type scanning by setting 'No' to the Scan for Types parameter in the CSV reader, so that data type of every column would be treated as 'string' type always and the problem could be resolved.
Try this parameter setting.
@mst , in my observation, the problem - 'Yes' and 'No' will be replaced with 'TRUE' and 'FALSE' in destination Excel sheet seems to appear when CSV reader considers the column data type as 'boolean'. In that case, Python FMEFeature.getAttribute method returns a boolean value - True or False, not a string value - 'Yes' or 'No'.
I don't know the exact condition where CSV reader assumes a column data type as 'boolean', but you can stop the type scanning by setting 'No' to the Scan for Types parameter in the CSV reader, so that data type of every column would be treated as 'string' type always and the problem could be resolved.
Try this parameter setting.
Yes! That fixed it.
I was not aware about this setting. It’s quite well hidden .
Thank you a lot, @takashi!
Good tip! Thank you @takashi ! I will file this away for later use. Always learning.