Question

Summing similar attributes without transposing

Forum|Forum|11 years ago
January 19, 2015
6 replies
35 views

edc

Hi,

I have a CSV containing >250,000 features and around 50 attributes, which I want to sum based on a common property (e.g. a suffix/prefix in the attribute name). I've done this before on smaller datasets by using AttributeExploder to transpose the attributes, adding the common property I want to base the sum on using AttributeValueMapper, then aggregating them and pivoting the output.

However, I think with this many attributes it will be quite memory hungry. Is there an alternative approach I could take, possibly using SchemaMapper or something?

Thanks,

Ed



gio
Contributor
January 19, 2015

You could use an AttributeExploder and a regexp to select the similar _attr_name values, then create a cumulative attribute and push _attr_value to a variable using a VariableSetter.

(You could create a list of the _attr_name values and use a fuzzy comparer to try to match similar names automatically.)

A VariableRetriever is then called whenever the regexp finds a subsequent similar attribute.

You would be progressively summing attributes.


edc
Author
January 19, 2015

Hi Gio,

Thanks for your response.

I was hoping to avoid using AttributeExploder, as with >250,000 features it'll create >10,000,000 features. I was hoping to be able to map a prefix to each attribute name using SchemaMapper and then aggregate those with the same prefix, without having to transpose the attributes (exploding).

I'm sure there must be a way that doesn't require generating unnecessary features.

Ed
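As a rough illustration of the idea above (summing per feature by prefix, with no transposing), here is a minimal sketch in plain Python. It is not FME transformer logic, though something like it could probably be adapted to a PythonCaller. The column names (`id`, `pop_2010`, etc.), the `_` separator and the `sum_by_prefix` helper are all made up for the example.

```python
import csv
import io
from collections import defaultdict


def sum_by_prefix(row, separator="_"):
    """Sum the numeric values of one record, grouped by attribute-name prefix."""
    totals = defaultdict(float)
    for name, value in row.items():
        prefix, found, _ = name.partition(separator)
        if not found:
            continue  # no separator in the name: not a column we want to sum
        try:
            totals[prefix] += float(value)
        except (TypeError, ValueError):
            pass  # skip blanks and non-numeric values
    return dict(totals)


# Hypothetical sample data standing in for the real CSV.
sample = io.StringIO(
    "id,pop_2010,pop_2011,area_north,area_south\n"
    "a,10,12,1.5,2.5\n"
    "b,20,,3,4\n"
)

results = []
for row in csv.DictReader(sample):  # one record at a time, never exploded
    feature_id = row.pop("id")
    results.append((feature_id, sum_by_prefix(row)))
```

Each record is read, summed and released in turn, so the number of records stays at the number of input features rather than features times attributes.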


gio
Contributor
January 19, 2015

I see.

Well, it seems to me you can make a schema using a SchemaMapper, then sum the attributes with an Aggregator, grouping by schema name.

Have you tried it yet?

50 attributes is not that many, but you could create the schema map using a regular expression search.

Alternatively, maybe you can use a BulkAttributeRenamer with Regular Expression Replace. But then you would need as many BulkAttributeRenamers as there are groups.
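To show what building the schema map with a regular expression could look like, here is a small sketch in plain Python. The attribute names and the pattern (letters followed by an underscore) are assumptions for illustration only, and `build_schema_map` is a hypothetical helper, not an FME function.

```python
import re


def build_schema_map(attribute_names, pattern=r"^([A-Za-z]+)_"):
    """Map each attribute name to its group name (first regex capture group)."""
    group_re = re.compile(pattern)
    schema_map = {}
    for name in attribute_names:
        match = group_re.match(name)
        if match:
            schema_map[name] = match.group(1)
    return schema_map


# Hypothetical attribute names; "id" has no prefix and is left unmapped.
names = ["id", "pop_2010", "pop_2011", "area_north", "area_south"]
schema_map = build_schema_map(names)
groups = sorted(set(schema_map.values()))
```

The resulting mapping could be written out as a two-column lookup table (original name, group name), which is roughly the shape of input a schema mapping step would need.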


edc
Author
January 20, 2015

Yeah, I've got it working with smaller datasets (as explained in the question). The problem with that, though, is that you can't aggregate the attributes without transposing them (as far as I can tell, anyway).


gio
Contributor
January 20, 2015

Hi,

Another possibility might be to use a ListPopulator followed by a ListIndexer and sum the items at the same index (basically the same record).

You create a common prefix, and the populator creates a list based on this prefix.

Of course, you would then have to create a list, which you also might not want.
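The list idea can be pictured like this in plain Python: for a single feature, gather the values whose attribute names share a prefix into a list, then sum the list. This is only a sketch of the concept, not of how the FME list transformers work internally; the feature dictionary and `prefixed_list` are invented for the example.

```python
def prefixed_list(feature, prefix):
    """Collect the values of all attributes whose name starts with prefix."""
    return [value for name, value in feature.items() if name.startswith(prefix)]


# Hypothetical feature with two "pop_" attributes and one "area_" attribute.
feature = {"id": "a", "pop_2010": 10, "pop_2011": 12, "area_north": 1.5}

pop_values = prefixed_list(feature, "pop_")
pop_total = sum(pop_values)
```

The list lives on the feature itself, so the feature count does not grow, although each feature carries a little extra data until the list is dropped.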


edc
Author
January 20, 2015

That sounds like a good idea, cheers Gio
