I'm not sure if this helps you or not, but you can use a Recorder transformer to create a temp file. This is essentially the tool that was used before feature caching came along - I'm pretty sure it's all the FeatureCache is anyway. It means you don't need to worry about the schema, but it really doesn't solve the problem.
The AttributeManager and AttributeValidator can only ever act on attributes which are exposed. The SchemaScanner is a way to define/extract the schema from the attributes based on their values where the attributes are not exposed (I guess in much the same way the CSV reader works).
Being able to set the type in the AttributeManager is really handy but is kind of pointless like you say when you don't know what the schema is anyway. Where it's helpful is when you do know what the schema is and you want to set the attribute type without having to manually change the output FeatureType schema (you can keep the schema definition as automatic) - the manual work gets old fast when you have several FeatureTypes, especially when you want to add more attributes to an existing workspace.
If you have a mixture of Defined Schema and that from a SchemaFeature it will prefer the defined definition over the one in the SchemaFeature (or at least it did for the few formats I tested).
Here I have only one attribute exposed and I've set the width to be 3 (in an AttributeManager earlier in the workflow). The SchemaScanner has it as "fme_data_type: fme_varchar(7)".
When I read the data back into FME I can see the defined type has kept the length of 3 and has indeed truncated my string.
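If you want to check exactly what the SchemaScanner produced, you can inspect its schema feature with a PythonCaller. Here's a minimal sketch, assuming the output follows the documented attribute{n}.name / attribute{n}.fme_data_type schema-feature representation (the class name here is just whatever you point the PythonCaller at):

import fmeobjects

class SchemaLogger(object):
    def input(self, feature):
        # Walk the attribute{} list on the schema feature and log each
        # attribute name together with its scanned fme_data_type.
        logger = fmeobjects.FMELogFile()
        i = 0
        while True:
            name = feature.getAttribute("attribute{%d}.name" % i)
            if name is None:
                break
            data_type = feature.getAttribute("attribute{%d}.fme_data_type" % i)
            logger.logMessageString("%s -> %s" % (name, data_type), fmeobjects.FME_INFORM)
            i += 1
        self.pyoutput(feature)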
I'm not sure if this helps or not for your case.
> Being able to set the type in the AttributeManager is really handy but is kind of pointless like you say when you don't know what the schema is anyway.
Well, this is for a dynamic custom transformer, so I wouldn't know what the schema is on the inside (since it may change between instances), but I would know what it is on the outside. I would expect an AttributeMapper upstream from (i.e. outside) my transformer to be enough to set up the types for columns that shouldn't default to text, and to have them be readable in some way from inside the transformer, but it doesn't seem like that's the case.
> I'm not sure if this helps you or not, but you can use a Recorder transformer to create a temp file. This is essentially the tool that was used before feature caching came along - I'm pretty sure it's all the FeatureCache is anyway. It means you don't need to worry about the schema, but it really doesn't solve the problem.
That would actually be massively helpful in my case if FME Feature Storage/FME Feature Table files were supported outside of FME, so that Python/C/Java/system tools could extract, process and save the feature data without having to deal with feature objects (which, in my experience, are very slow).
> If you have a mixture of Defined Schema and that from a SchemaFeature it will prefer the defined definition over the one in the SchemaFeature (or at least it did for the few formats I tested)
Unfortunately, you can't have "Feature Type definition" as a parameter type for custom transformers, and you can't have reader or writer nodes in them either, so that's not useful in my case.
> I'm not sure if this helps or not for your case
It's almost what I would need, but unfortunately not quite. Your input is much appreciated, though.
This video gives a good insight into the whole thing: https://www.youtube.com/watch?v=_MoalhW8zlA - it explains some of the decisions.
Ah, I see, that clears up a lot of things. Thank you, Matt!
What I would be looking for, then, is some hypothetical AutomaticSchemaExtractor transformer that derives the feature schema at design time, the same way writer nodes do, and outputs it on a separate port (with an attribute{} list) like a FeatureReader or a SchemaScanner. This is the missing piece for nodes that need a schema fed to them dynamically at runtime (like a generic custom transformer, possibly with an intermediate FeatureWriter, or maybe a schema-aware but semi-generic PythonCaller) while still working on features of known types. In my use case, I could probably just have a Creator + AttributeManager build the schema feature manually, but that would be very tedious to create (one translation I would be interested in using this on has 106 columns), and I would then need to be sure to keep it up to date with the data format from the upstream nodes if the schema evolves.
So I guess the best I could manage in current FME would be to get something like this working, which tries to extract the internal FME attribute types as exposed to Python and generate a dynamic schema feature at the beginning (like a FeatureReader does) that the FeatureWriter could then pick up and use to define the columns dynamically. Besides not working for a reason I don't quite understand (I'm getting a "feature does not contain schema information" error), I'm not sure how I would get rid of format attributes like fme_type or multi_reader_id.
import fme
import fmeobjects
from fmeobjects import FMEFeature
from fmegeneral.plugins import FMEEnhancedTransformer
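
# Maps the fmeobjects attribute type constants (as returned by
# FMEFeature.getAttributeType) to fme_data_type names for the schema feature.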
fme_attribute_type_map = {
    fmeobjects.FME_ATTR_BOOLEAN: "fme_boolean",
    fmeobjects.FME_ATTR_INT8: "fme_int8",
    fmeobjects.FME_ATTR_UINT8: "fme_uint8",
    fmeobjects.FME_ATTR_INT16: "fme_int16",
    fmeobjects.FME_ATTR_UINT16: "fme_uint16",
    fmeobjects.FME_ATTR_INT32: "fme_int32",
    fmeobjects.FME_ATTR_UINT32: "fme_uint32",
    fmeobjects.FME_ATTR_INT64: "fme_int64",
    fmeobjects.FME_ATTR_UINT64: "fme_uint64",
    fmeobjects.FME_ATTR_REAL32: "fme_real32",
    fmeobjects.FME_ATTR_REAL64: "fme_real64",
    fmeobjects.FME_ATTR_STRING: "fme_buffer",  # System-encoded strings, like windows-1252
    fmeobjects.FME_ATTR_ENCODED_STRING: "fme_buffer",  # UTF-8 and US-ASCII strings
    # No info on these
    fmeobjects.FME_ATTR_REAL80: "fme_real64",
    fmeobjects.FME_ATTR_UNDEFINED: "fme_buffer",
}


class FeatureProcessor(FMEEnhancedTransformer):
    def setup(self, first_feature: FMEFeature):
        """
        This method is only called for the first input feature.
        Implement this method to perform any necessary setup operations,
        such as getting constant parameters.
        """
        schema_feature = FMEFeature()
        schema_feature.setAttribute("fme_schema_handling", "schema_only")
        # https://docs.safe.com/fme/html/FME-Form-Documentation/FME-ReadersWriters/schema_from_table/Feature_Representation.htm
        # https://community.safe.com/s/article/dynamic-workflow-tutorial-destination-schema-is-de-2
        schema_feature.setAttribute("fme_feature_type_name", first_feature.getFeatureType())
        for count, attr in enumerate(first_feature.getAllAttributeNames()):
            attr_fme_type_int = first_feature.getAttributeType(attr)
            attr_fme_type = fme_attribute_type_map.get(attr_fme_type_int, "fme_buffer")
            schema_feature.setAttribute(f"attribute{{{count}}}.name", attr)
            schema_feature.setAttribute(f"attribute{{{count}}}.fme_data_type", attr_fme_type)
        self.pyoutput(schema_feature)

    def receive_feature(self, feature: FMEFeature):
        """
        Override this method instead of :meth:`input`.
        This method receives all input features, including the first one
        that's also passed to :meth:`setup`.
        """
        self.pyoutput(feature)

    def finish(self) -> None:
        """Override this instead of :meth:`close`."""
        pass
If that did work, it would be enough to resolve my issue.
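As for the format attributes, one possible (untested) tweak would be to filter the attribute names before building the schema feature. The prefix list below is an assumption based on the usual fme_* and multi_reader_*/multi_writer_* naming; it would need extending per format, and it doesn't explain the schema error above:

# Hypothetical filter: skip well-known format-attribute prefixes when
# building the schema feature (the prefix list is an assumption, not exhaustive).
FORMAT_ATTR_PREFIXES = ("fme_", "multi_reader_", "multi_writer_")


def is_user_attribute(name: str) -> bool:
    """Return True for attributes that should appear in the generated schema."""
    return not name.startswith(FORMAT_ATTR_PREFIXES)


# The loop in setup() would then enumerate only the filtered names:
#   user_attrs = filter(is_user_attribute, first_feature.getAllAttributeNames())
#   for count, attr in enumerate(user_attrs):
#       ...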