Skip to main content
Question

Getting numbers out of cells with text and producing a list with all these numbers

  • October 25, 2016
  • 12 replies
  • 71 views

Forum|alt.badge.img

I have a column given to me in a spreadsheet which looks like that:

What I need is to get all the references out, the ones in square brackets, to provide with the full list to a user:

An then get a full list of all references, as follows:

Does anyone have an idea of how I can do this using FME?

This post is closed to further activity.
It may be an old question, an answered question, an implemented idea, or a notification-only post.
Please check post dates before relying on any information in a question or answer.
For follow-up or related questions, please post a new question or idea.
If there is a genuine update to be made, please contact us and request that the post is reopened.

12 replies

david_r
Celebrity
  • 8391 replies
  • October 25, 2016

referencesfinder.fmwtGiven an input attribute "text", this sequence of transformers will give you one feature per reference with the attribute "Reference_item" containing a reference number.

You could then consider using a DuplicateFilter (DuplicateRemover before FME 2016) and/or a Sorter before proceeding.

Sample workspace is attached.


ebygomm
Influencer
Forum|alt.badge.img+44
  • Influencer
  • 3422 replies
  • October 25, 2016

A stringsearcher with regex such as [0-9]{4} (this assumes they are always 4 digits) to build a list of matched characters, then a list exploder to create the separate features

 

 

 


david_r
Celebrity
  • 8391 replies
  • October 25, 2016

A stringsearcher with regex such as [0-9]{4} (this assumes they are always 4 digits) to build a list of matched characters, then a list exploder to create the separate features

 

 

 

But you'd also end up with numbers outside of the brackets, if there are any...

 

 


ebygomm
Influencer
Forum|alt.badge.img+44
  • Influencer
  • 3422 replies
  • October 25, 2016
But you'd also end up with numbers outside of the brackets, if there are any...

 

 

Yes, does depend on how sure of the data format you are

 

 


gio
Contributor
Forum|alt.badge.img+15
  • Contributor
  • 2252 replies
  • October 25, 2016

@darius

Connect the spreadsheet to a SQL-Executor. (i used a texfile here)

Then Create following script.

It creates attributes from all hits of a regular expression.( By assigning them levels)

The purple bits you can edit as you wish/need.

A shame the InlineQuerier does nto recognise regexp_substr etc.. Then yu would not need a database connection for this. AS we simple use dual here.


gio
Contributor
Forum|alt.badge.img+15
  • Contributor
  • 2252 replies
  • October 25, 2016

into attributes.. (i created random text and random typed numbers after the first couple b.t.w.)

@darius


david_r
Celebrity
  • 8391 replies
  • October 26, 2016

@darius

Connect the spreadsheet to a SQL-Executor. (i used a texfile here)

Then Create following script.

It creates attributes from all hits of a regular expression.( By assigning them levels)

The purple bits you can edit as you wish/need.

A shame the InlineQuerier does nto recognise regexp_substr etc.. Then yu would not need a database connection for this. AS we simple use dual here.

That's a cool solution. Just be aware, as for the other one, that this will also return any numbers occurring outside of the bracketed list.

Forum|alt.badge.img
  • Author
  • 21 replies
  • October 26, 2016

referencesfinder.fmwtGiven an input attribute "text", this sequence of transformers will give you one feature per reference with the attribute "Reference_item" containing a reference number.

You could then consider using a DuplicateFilter (DuplicateRemover before FME 2016) and/or a Sorter before proceeding.

Sample workspace is attached.

Thanks for this doing this so throughly. It worked really well!

 


Forum|alt.badge.img
  • Author
  • 21 replies
  • October 26, 2016

A stringsearcher with regex such as [0-9]{4} (this assumes they are always 4 digits) to build a list of matched characters, then a list exploder to create the separate features

 

 

 

Not sure if there are numbers outside the square brackets. The spreadsheet is quite large, so numbers that are not references will be extracted. References are always in square brackets. However, I will use the regex expression from @david_r, which works very well.

 


Forum|alt.badge.img
  • Author
  • 21 replies
  • October 26, 2016

@darius

Connect the spreadsheet to a SQL-Executor. (i used a texfile here)

Then Create following script.

It creates attributes from all hits of a regular expression.( By assigning them levels)

The purple bits you can edit as you wish/need.

A shame the InlineQuerier does nto recognise regexp_substr etc.. Then yu would not need a database connection for this. AS we simple use dual here.

Thanks gio. As @david_r said, this is a clever solution. However, I am not sure if numbers ocurr outside the square brackets.

 


gio
Contributor
Forum|alt.badge.img+15
  • Contributor
  • 2252 replies
  • October 26, 2016

@darius

Of course, it would. This is just to show the technique.

It is not to show how to create regular expressions.

And example text posted does not show any numbers outside brackets.

Anywaýs':

(?=[;\\[]*)\\b\\d+\\b(?=[;\\]])

all numbers between brackets and or semicolons.


ebygomm
Influencer
Forum|alt.badge.img+44
  • Influencer
  • 3422 replies
  • October 26, 2016

@darius

Of course, it would. This is just to show the technique.

It is not to show how to create regular expressions.

And example text posted does not show any numbers outside brackets.

Anywaýs':

(?=[;\\[]*)\\b\\d+\\b(?=[;\\]])

all numbers between brackets and or semicolons.

 

This regex would also work within the stringsearcher