Solved

Passing a table to the AttributeExploder but keeping specific attributes unexploded

March 7, 2018

adriano

I am trying to create a workflow that normalizes a table with potentially thousands of columns while maintaining the id column. This would be easy if the AttributeExploder allowed you to specify a column (or columns) to keep unexploded.

Since the transformer does not allow this, my workaround is to use the AttributeExploder with its option to "keep all attributes" and then use an AttributeKeeper to keep only the "id", "_attr_name", and "_attr_value" columns. This generally solves the problem, but because there are thousands of columns, enabling the "keep all attributes" option makes the workflow run far slower (it goes from taking 2 hours to complete to taking 40 hours). Is there a workaround that solves this problem within a more reasonable processing time?


fmelizard
Contributor
March 7, 2018

Can you use the AttributeKeeper before the exploder?


takashi
Contributor
Best Answer
March 7, 2018

Hi @adriano_n90, if you need to explode all of the thousands of columns into individual features while keeping only the attribute "id", this procedure could perform better.

AttributeExploder (Exploding Type: List, Keep Attributes: Yes)
AttributeKeeper (Attributes to Keep: id, Lists to Keep: _attr_list{}._attr_name, _attr_list{}._attr_value)
ListExploder (List Attribute: _attr_list{})

If you are familiar with Python scripting, the PythonCaller could be a better alternative.
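
To illustrate the Python route, here is a minimal sketch of the unpivot logic itself, written as plain Python so it can be run outside FME. The function name explode_row and the sample row are hypothetical, made up for illustration; only "id", "_attr_name" and "_attr_value" come from this thread.

```python
from typing import Any, Dict, Iterable, Iterator, Mapping


def explode_row(
    row: Mapping[str, Any], keep: Iterable[str] = ("id",)
) -> Iterator[Dict[str, Any]]:
    """Yield one long-format record per column that is not in `keep`.

    Hypothetical helper: each record carries the kept columns plus
    `_attr_name` and `_attr_value`, mirroring the AttributeExploder labels.
    """
    # Copy the kept columns once; every output record starts from this.
    kept = {name: row[name] for name in keep if name in row}
    for name, value in row.items():
        if name in kept:
            continue  # kept columns are carried along, not exploded
        record = dict(kept)
        record["_attr_name"] = name
        record["_attr_value"] = value
        yield record


# Illustrative wide row (made-up data): one id plus two value columns.
wide_row = {"id": 7, "height": 1.8, "width": 0.4}
long_rows = list(explode_row(wide_row))
for r in long_rows:
    print(r)
```

Inside a PythonCaller, the same loop would presumably read the names with feature.getAllAttributeNames() and the values with feature.getAttribute(name), then set the three attributes on a new feature and emit it with self.pyoutput(). Treat that wiring as an assumption to verify against your FME version; format attributes such as fme_* would likely need to be filtered out as well.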


takashi
Contributor
March 7, 2018

That said, "exploder" transformers generally do not perform well. If performance is critical, it may be better to redesign the entire workflow so that it uses neither the AttributeExploder nor the ListExploder.

adriano
Author
March 8, 2018

Thank you so much takashi! This solution worked flawlessly and the performance was 10x better than the workflow I posted above.

adriano
Author
March 8, 2018

fmelizard wrote:

Can you use the AttributeKeeper before the exploder?

Hi Matt, no, because I need to maintain the relationship between every column and the id it belongs to. So I would need to keep all the attributes.
