Solved

Is there a way to expose attributes based on type (DateTime)?


Badge

I have a dynamic workspace that works with hundreds of datasets and needs to expose fields when they are DateTime. Given the sheer number of datasets and fields, I don't really want to hard-code this into the workspace. I was hoping there was a way to expose fields based on their data type, in this case DateTime. I need to expose them because JSON cannot write out DateTime fields correctly; they need to be converted to strings using the DateFormatter. The workspace is dynamic and may open hundreds of datasets depending on the incoming parameters. I don't want to hard-code anything, but I am thinking I might need to expose all the relevant fields within the FeatureReader. I am using FME 2015.1.

The DateFormatter works fine, but the attributes must be exposed before they can be selected within the transformer.

Thanks in advance!


Best answer by fmelizard 24 March 2016, 07:53


19 replies

Badge +3

You could use a schema reader and select the date-type attributes with a Tester.

Then use a merger on the data to select the attributes to be reformatted.

An AttributeExploder could also be used to do this.

Userlevel 4

You can type-cast your attributes to string inside the JSONTemplater, if that helps you:


    "updated_at" : xs:string( fme:get-attribute("my_date_attribute") ) 

Otherwise you could use a PythonCaller to loop over all the incoming attributes and re-create them as strings.

David
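
A minimal sketch of the PythonCaller idea above, written for Python 3 and not tested inside FME. It assumes the feature object offers `getAllAttributeNames()`, `getAttribute()` and `setAttribute()`, as `fmeobjects.FMEFeature` does. The `FakeFeature` class is a hypothetical stand-in so the loop can be run outside FME.

```python
class FakeFeature:
    """Hypothetical stand-in for an FME feature, for running outside FME."""

    def __init__(self, attributes):
        self._attributes = dict(attributes)

    def getAllAttributeNames(self):
        return list(self._attributes)

    def getAttribute(self, name):
        return self._attributes.get(name)

    def setAttribute(self, name, value):
        self._attributes[name] = value


def stringify_attributes(feature):
    """Re-create every non-string attribute on the feature as a string."""
    for name in feature.getAllAttributeNames():
        value = feature.getAttribute(name)
        # Leave missing values and existing strings untouched.
        if value is None or isinstance(value, str):
            continue
        feature.setAttribute(name, str(value))
    return feature


feature = FakeFeature({"id": 42, "name": "pipe", "length": 12.5})
stringify_attributes(feature)
print(feature.getAttribute("id"), feature.getAttribute("length"))
```

Because the loop never names an attribute, nothing has to be exposed in the workspace for it to run. It converts everything, though, so it does not by itself tell date attributes apart from the rest.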

Userlevel 4
Badge +25

I don't see how you could do this without a script of some sort. The AttributeExploder and a ListConcatenator might give you the list of attribute names. Then copy/paste your DateFormatter into a text editor and you'll see the underlying scripting which you could recreate in a TCLCaller and adapt to your needs. That's how I would start to go about this. Hope it helps!

Mark

Userlevel 2
Badge +12

Could you create an idea from this topic:

I would like a BulkAttributeExposer with options to use a wildcard or expose by data type.

Badge

There are already a few ideas for a bulk attribute exposer, which I think would be very useful. Having to use the Schema port from the FeatureReader can sometimes be limiting; an expose-all transformer would be amazing!

Badge

I will look into this, thank you.

Badge

Thanks for the suggestion. It's a hugely complicated workspace that has to run quickly, and I worry that adding a schema reader will complicate things further. Many datasets flow through a FeatureReader, so another merger may slow things down even more. I will investigate, though.

Badge

Thanks Mark. I tried exploding the attributes back when the FeatureReader didn't have a schema port, and it gave me a headache! So I was very happy when they added the port in 2015.1. I think I will try the JSONTemplater first, before wrestling with lists and exploders!

Badge

Thanks for all the suggestions and help. I will start by trying to change the JSON with the JSONTemplater, thanks @david_r.

Seems like a bulk attribute exposer would be really helpful and there are a number of ideas already on this topic.

Badge

I've realized that this is no different from exposing the attributes in the FeatureReader and then using the DateFormatter in the chain, as I would need to know the field name to replace "my_date_attribute". There are probably close to hundreds of date fields across all the datasets, all with different names. Thanks for the suggestion. I'm starting to think the only way to do this is to expose each attribute by 'hardcoding' it into the workspace. It's not the best solution, but it's the only one I think will work without getting into list-explosion headaches.

Badge +2

@marko did you get this resolved? If not I might have done something similar this week that could help.

It doesn't expose the fields but creates a parameter that I can use in a DateFormatter transformer. The issue for you might be that it could get unwieldy if the input files are in different formats. Also, in my example I could detect the Date attribute because it had a known data type. If you're reading a CSV, for example, this wouldn't help, as the data type is unknown.

See https://knowledge.safe.com/questions/24743/format-...

Badge

Hi @mark_1spatial, thanks for the reply and tip. It looks like a great solution at first glance. Are you using a Python startup script or a caller? I'm not sure whether a startup script will work with a FeatureReader in generic mode.

Thanks for the reply, I will look into it. At present I am just exposing the fields that I know to be SQL date fields; a quick SQL query of the sys tables showed me which ones. Once they are exposed in the FeatureReader I can select them in the DateFormatter. It's not very elegant, but it works!
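
The manual lookup described above could be scripted. This is a rough sketch, assuming the column metadata has already been fetched as (table, column, data type) rows, for example from SQL Server's `INFORMATION_SCHEMA.COLUMNS` view. The sample rows, the function name, and the set of type names treated as dates are illustrative assumptions.

```python
from collections import defaultdict

# Type names treated as date-like; adjust for the database in use (assumption).
DATE_TYPES = {"date", "datetime", "datetime2", "smalldatetime"}


def date_columns_by_table(rows):
    """Group date-like column names by table from (table, column, type) rows."""
    grouped = defaultdict(list)
    for table, column, data_type in rows:
        if data_type.lower() in DATE_TYPES:
            grouped[table].append(column)
    return dict(grouped)


# Hypothetical metadata rows, shaped like TABLE_NAME, COLUMN_NAME, DATA_TYPE.
rows = [
    ("roads", "id", "int"),
    ("roads", "surveyed_on", "datetime"),
    ("parcels", "created", "datetime2"),
    ("parcels", "owner", "nvarchar"),
]
print(date_columns_by_table(rows))
```

The resulting per-table lists are the names that would still have to be exposed by hand, so this only shortens the lookup; it does not make the workspace dynamic.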

Badge +2

I'm using it in a Python scripted private parameter with a standard reader, not a FeatureReader, but it should work if you configure it to read the same data. If the data is coming from a database, it should be doable.

Badge

I'm not certain a normal reader will work, as the tables are huge. There is no way to limit the search area on the way in because the infrastructure doesn't provide the extents of the search, so the reader would have to read the entire table of UK-wide data! Features are generated before the FeatureReader to limit the data search to a specified area using lookups.

Unless the Python will work from a schema reader? The other issue is that one workspace may open hundreds of different datasets depending on the parameters sent in, so it gets very complicated. I think the hard-coding way may be dull, but it might actually be the easier option in the long run.

There are several questions and ideas relating to schemas and the FeatureReader in the knowledge centre. I hope Safe brings out a more advanced expose transformer that allows exposing all attributes, or exposing by type, etc.

Userlevel 4
Badge +13

Very interesting problem.

At the root of it is...what is known when we design the workspace, and what is known when we run it.

When we design, if it is meant to be a dynamic workspace that knows not of schema, then by definition, we know not the schema. So attribute exposing isn't really possible. The best we could do is give you an "import" option on the AttributeExposer where you'd feed it all the possible input you'd ever imagine you'd encounter, and then from that you'd pick the ones you wanted "exposed" so you could work with them. But alas you'd now no longer have a dynamic workspace, because any previously unseen data would potentially not have exposed what you wanted.

So let's explore how to solve this nicely with FME 2016.

I'll attach a workspace -- feed any data that has schema that has DateTimes by configuring the FeatureReader appropriately. And we'll "mangle" those date time fields, very efficiently, by the end of the workflow. Leaving all other attributes alone. And never knowing anything about the input data "statically" in the workspace.

The trick is that a) we know the Schema always comes out first from the FeatureReader (this was by design, for things like this). b) We have "GlobalVariables" that we can set and reset, which can communicate from one "stream" to another. c) We have an AttributeExploder to blow features out into one-feature-per-attribute, so we can then work with attributes generically. d) We can reassemble things with the Aggregator.

I included a Decelerator at the end only to prove that no blocking is going on.

Enjoy. I look forward to feedback. @Mark2AtSafe -- we may want a knowledge type article on this technique. @mark_1spatial @marko @takashi @erik_jan @david_r Your comments particularly welcomed.

dynamicdatereformat.fmw
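
The schema-first idea can be sketched outside FME in plain Python. This is an illustration under stated assumptions, not the attached workspace: it assumes the schema arrives first as (name, type) pairs, that date-like types are labelled `fme_datetime` and `fme_date`, and that values use the `YYYYMMDDhhmmss` and `YYYYMMDD` string forms. The class and method names are hypothetical, and a plain loop over the remembered names stands in for the AttributeExploder and Aggregator steps.

```python
from datetime import datetime


class DateReformatter:
    """Remember date attribute names from the schema, then reformat data."""

    # Assumed type labels mapped to the assumed input string formats.
    DATE_TYPES = {"fme_datetime": "%Y%m%d%H%M%S", "fme_date": "%Y%m%d"}

    def __init__(self):
        # Plays the role of the global variable shared between streams.
        self.date_attributes = {}

    def read_schema(self, schema):
        """schema: iterable of (attribute_name, data_type) pairs."""
        self.date_attributes = {
            name: self.DATE_TYPES[data_type]
            for name, data_type in schema
            if data_type in self.DATE_TYPES
        }

    def reformat(self, attributes):
        """Return a copy of the attribute dict with dates as ISO 8601 text."""
        result = dict(attributes)
        for name, in_format in self.date_attributes.items():
            value = result.get(name)
            if not value:
                continue  # attribute missing or empty on this feature
            parsed = datetime.strptime(value, in_format)
            if in_format == "%Y%m%d":
                result[name] = parsed.date().isoformat()
            else:
                result[name] = parsed.isoformat()
        return result


reformatter = DateReformatter()
reformatter.read_schema([("id", "fme_int32"), ("updated_at", "fme_datetime")])
print(reformatter.reformat({"id": 7, "updated_at": "20160324075300"}))
```

The important ordering is that `read_schema` runs before any data feature is handled, which mirrors the schema feature leaving the FeatureReader first. All other attributes pass through unchanged.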

Badge +2

I was thinking of something along those lines yesterday afternoon and just tried my version, but couldn't get the value into the DateFormatter's 'Date Attributes' parameter. Looks like @daleatsafe's approach of exploding the data is the way. We need the option to fetch a variable directly into the transformer! ;-)

Badge

Many thanks for this, @daleatsafe. Sadly we are still on 2015.1, as we had some errors reading SQL Server on 2016.0.1. I will test again when 2016.1 arrives.

I will look into this and see if I can incorporate it into my workbench.

Thanks.

Userlevel 2
Badge +17

Hi @daleatsafe, thanks for sharing the technique. I think it is generally applicable to manipulating attributes depending on their native data type, not only for dates. However, the AttributeExploder could degrade performance, especially if the number of feature attributes is very large.

I was thinking of a solution with Tcl for @mark_1spatial's question, like this: b15575-dynamicdatereformat-tcl.zip

The concept is the same as Dale's workspace. I didn't post it because I found that Mark's Python solution was more generic and flexible.

 

I think it would become easier to implement this concept if these abilities were added to FME.

  • A transformer that extracts attribute names matching a user-specified condition (data type, regex, prefix etc.) from the schema feature, as a comma-separated string or list.
  • A parameter type that accepts an attribute storing attribute names (CSV or list). If transformers such as the DateFormatter could configure target attributes dynamically via this type of parameter, the implementation would be much simpler.
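
To make the first wish concrete, here is a hypothetical sketch of such a selector in Python. No such transformer is described in this thread; the function name, its parameters, the type labels, and the (name, type) schema shape are all assumptions made for illustration.

```python
import re


def select_attribute_names(schema, data_types=None, pattern=None, prefix=None):
    """Return a comma-separated string of names matching every given condition.

    schema: iterable of (attribute_name, data_type) pairs.
    """
    regex = re.compile(pattern) if pattern else None
    names = []
    for name, data_type in schema:
        if data_types is not None and data_type not in data_types:
            continue
        if regex is not None and not regex.search(name):
            continue
        if prefix is not None and not name.startswith(prefix):
            continue
        names.append(name)
    return ",".join(names)


# Hypothetical schema used only to exercise the function.
schema = [
    ("id", "fme_int32"),
    ("created_at", "fme_datetime"),
    ("updated_at", "fme_datetime"),
    ("created_by", "fme_varchar(50)"),
]
print(select_attribute_names(schema, data_types={"fme_datetime"}))
print(select_attribute_names(schema, prefix="created_"))
```

A comma-separated string is returned because that is the form the second wish would consume: a parameter that reads its list of target attribute names from an attribute value.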

@marko, this one: b15575-dynamicdatereformat.zip works with 2015.1.3+. It's different from Dale's but the concept is the same. FYI

Userlevel 4
Badge +13

Very impressive. We're going to discuss ways of making this easier to accomplish and I like @takashi 's suggestions here.
