Solved

Passing a table to the attributeExploder but keeping specific attributes unexploded

  • March 7, 2018
  • 5 replies
  • 146 views

adriano

I am trying to create a workflow to normalize a table with potentially thousands of columns while maintaining the id column. This would be an easy solution if the AttributeExploder allowed you to specify a column (or columns) to keep unexploded.

Since this transformer does not allow this, the workaround I have come up with is to use the AttributeExploder with its "keep all attributes" option and then an AttributeKeeper to keep only the "id", "_attr_name", and "_attr_value" columns. This works in principle, but because there are thousands of columns, the "keep all attributes" option makes the workbench dramatically slower (it goes from taking 2 hours to complete to taking 40 hours). Is there a workaround that solves this problem within a more reasonable processing time?

 

Best answer by takashi

Hi @adriano_n90, if you need to explode all of the thousands of columns into individual features but keep only the "id" attribute, this procedure may perform better.

  1. AttributeExploder (Exploding Type: List, Keep Attributes: Yes)
  2. AttributeKeeper (Attributes to Keep: id, Lists to Keep: _attr_list{}._attr_name, _attr_list{}._attr_value)
  3. ListExploder (List Attribute: _attr_list{})

If you are familiar with Python scripting, the PythonCaller could be a better alternative.
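
For illustration only, here is a minimal sketch of the unpivot logic that such a script would need. It is plain Python working on dicts rather than FME features, so it is not the actual PythonCaller code. The function name `explode_row` and its parameters are hypothetical; the output names "_attr_name" and "_attr_value" are taken from the question above. In a PythonCaller, the dict would be built from the incoming feature's attributes and each yielded row would be written to a new output feature; that wiring is omitted here.

```python
from typing import Any, Dict, Iterator, Sequence


def explode_row(
    row: Dict[str, Any],
    keep: Sequence[str] = ("id",),
    name_label: str = "_attr_name",
    value_label: str = "_attr_value",
) -> Iterator[Dict[str, Any]]:
    """Yield one small record per non-kept column of `row`.

    Each output record carries only the kept columns plus a
    name/value pair, so no record ever holds all of the
    original columns.
    """
    # Copy the key columns once; they are repeated on every output record.
    kept = {k: row[k] for k in keep if k in row}
    for name, value in row.items():
        if name in kept:
            continue  # do not explode the key columns themselves
        out = dict(kept)
        out[name_label] = name
        out[value_label] = value
        yield out


if __name__ == "__main__":
    # Hypothetical sample row with an id and two data columns.
    sample = {"id": 1, "a": 10, "b": 20}
    for record in explode_row(sample):
        print(record)
```

The point of the design is that each output record holds only the id plus one name/value pair, which is the same result the three transformers above produce.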


5 replies

fmelizard
Safer
  • 3719 replies
  • March 7, 2018

Can you use the AttributeKeeper before the exploder?


takashi
Celebrity
  • 7843 replies
  • March 7, 2018

Anyway, "exploder" transformers generally perform poorly. If performance is critical, it might be better to redesign the entire workflow so that it uses neither the AttributeExploder nor the ListExploder.

 


adriano
  • Author
  • 28 replies
  • March 8, 2018

Thank you so much takashi! This solution worked flawlessly and the performance was 10x better than the workflow I posted above.

 

 


adriano
  • Author
  • 28 replies
  • March 8, 2018

Hi Matt, no, using the AttributeKeeper before the exploder would not work, because I need to maintain the relationship between each column and the id it belongs to. So I would still need to keep all the attributes going into the exploder.