We've stumbled across what seems to be a bug (or perhaps a design decision that we haven't understood!) in the shapefile reader and writer.
The DBF format defines the type of a field in its header, but there is nothing physically preventing a field that is numeric from having text data stored in it (because internally, numbers are stored as text in the DBF format). Hence, it is up to any implementation of the DBF format to make sure that the correct types of values are written to a field, and to behave sensibly if the wrong types are read from the field.
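To illustrate the point, here is a minimal Python sketch of how a careless writer could fill a numeric field. It is not FME's code. It assumes, as far as I understand the DBF layout, that numeric values are held as right-justified ASCII text padded with spaces, and `pack_field_lax` is a made-up name for the example.

```python
def pack_field_lax(value, width):
    """Right-justify whatever text we are given in a fixed-width field.

    Assumption: a DBF numeric field is plain ASCII, right-justified and
    space-padded. No type check is made here, so non-numeric text goes
    in unchanged.
    """
    text = str(value)
    if len(text) > width:
        raise ValueError(f"{text!r} does not fit in {width} characters")
    return text.rjust(width).encode("ascii")


print(pack_field_lax(3030, 8))     # a genuine number
print(pack_field_lax("A3030", 8))  # text, stored just as readily
```

Both calls succeed and produce eight bytes of ASCII, so the format itself will not catch the mistake; the checking has to come from the software doing the writing and reading.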
FME doesn't seem to do this. We translated some data to shapefiles. The data have a column entitled "Number" which contains road numbers. We set up the shapefile writer to store this data in a numeric field, without properly inspecting all the data. However, it turns out that the data occasionally contain text values such as "A3030" (reasonably enough, because that's how British roads are numbered).
I would argue that the expected or correct behaviour here would be for FME to realise that this was non-numeric data, and either to throw an error (as it would with a geodatabase data type) or to silently try to parse the data into a number. Even the CSV writer complains in this situation!
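The two behaviours I have in mind could be sketched roughly as follows in Python. This is only an illustration of the idea, not how FME or any other writer is implemented; the function name `to_numeric` and the `on_error` option are hypothetical, and writing a null when parsing fails is my own assumption about what "silently" might mean.

```python
def to_numeric(value, on_error="raise"):
    """Convert value to a float before it is written to a numeric field.

    on_error="raise": reject non-numeric input with a ValueError.
    on_error="null":  quietly return None so the field is written empty
                      (an assumed policy, chosen for this example).
    """
    try:
        return float(value)
    except (TypeError, ValueError):
        if on_error == "null":
            return None
        raise ValueError(f"{value!r} is not valid for a numeric field") from None


print(to_numeric("3030"))                    # parses cleanly
print(to_numeric("A3030", on_error="null"))  # unparseable, becomes a null
```

With the default setting, `to_numeric("A3030")` raises a `ValueError`, which is the kind of complaint I would have expected.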
Instead the data are written as-is to the DBF file, so the numeric field ends up containing text. If you look at the DBF in a hex editor you can see "A3030" sitting there.
If you then open the shapefile in a GIS program, you do not see the values: the program looks at the header, expects a number, finds text, and concludes that there is a problem. However, FME, on re-reading the shapefile, will read "A3030" out of the numeric field and carry on happily (until some transformer further down the line, expecting a number from a numeric field, breaks).
This reader behaviour is perhaps forgivable as FME being "robust" - as alluded to here:
http://fmepedia.safe.com/articles/Error_Unexpected_Behavior/FME-seems-to-misread-the-width-of-Number-fields-when-reading-a-Shape-dataset
- but I don't agree, because the user should be able to expect that if the reader has a numeric field, it will output numeric data in that field.
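What I would expect from a reader is something along these lines, again as a rough Python sketch rather than a description of any real implementation. `read_numeric_field` is a hypothetical helper, and returning a null for unreadable content is just one possible policy (raising an error would be another).

```python
def read_numeric_field(raw):
    """Decode the raw bytes of a numeric field, returning a float or None.

    Blank fields and non-numeric text both come back as None, so callers
    never receive a string from a field that the header declares numeric.
    """
    text = raw.decode("ascii", errors="replace").strip()
    try:
        return float(text)
    except ValueError:
        return None


print(read_numeric_field(b"    3030"))  # a number comes back as a number
print(read_numeric_field(b"   A3030"))  # text is not passed through
```

Either way, whatever comes out of a numeric field is a number or nothing, and a transformer further down the line never has to cope with "A3030".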
But I cannot see any justification for the writer behaviour. Could anyone explain whether this is a bug, or whether it is in some sense intended behaviour?
Harry