Solved

Find and replace words in a text with words from a list


michiedem
Contributor

I have a text in which numbers appear inconsistently, sometimes as digits and sometimes written out as words. So the text can be:

 

20 trees will be planted of which ten will be oaks.

 

I want to replace every number that is written out as a word with its digits, so I get:

 

20 trees will be planted of which 10 will be oaks.

 

I already have a CSV (listing each number both as digits and as a word) that I could use as a lookup. I was thinking about putting every single word of the text in a list and then matching each word against the CSV in a loop, but this will be very data heavy.

 

Any suggestions to do this in a clean way?

 

 


11 replies

egge
Contributor
  • Contributor
  • December 12, 2022

Maybe a PythonCaller? Recently, we have been playing around with the opposite, i.e. converting numbers to words and we have found the num2words · PyPI library to get this done.

For your case you might have a look at the word2number · PyPI lib instead.

An alternative Python solution -  using loop + join() + split() - is given on this page: Python | Convert numeric words to numbers.
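
The loop + split() + join() idea can be sketched in a few lines of Python, for example inside a PythonCaller. This is only a minimal sketch: the `NUMBER_WORDS` dictionary and the `replace_number_words` function are hypothetical names, and the English entries stand in for whatever the CSV from the question contains (Dutch words would work the same way).

```python
# Hypothetical lookup table; in practice it could be loaded from the CSV
# mentioned in the question (number word -> digits).
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "ten": "10", "twenty": "20"}


def replace_number_words(text, lookup):
    """Replace whitespace-separated tokens that exactly match a lookup key."""
    converted = []
    for token in text.split():
        # Case-insensitive match; tokens without an entry pass through unchanged.
        converted.append(lookup.get(token.lower(), token))
    return " ".join(converted)


result = replace_number_words(
    "20 trees will be planted of which ten will be oaks.", NUMBER_WORDS
)
print(result)  # 20 trees will be planted of which 10 will be oaks.
```

Because each token is looked up in a dictionary, the cost grows with the length of the text rather than with the size of the list. Note that a token with punctuation attached (such as "ten,") does not match in this simple version, and runs of whitespace are collapsed to single spaces.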


nielsgerrits
VIP
  • Best Answer
  • December 12, 2022

The StringPairReplacer is built for just this.


michiedem
Contributor
  • Author
  • Contributor
  • December 12, 2022
nielsgerrits wrote:

The StringPairReplacer is built for just this.

As far as I know, the StringReplacer can only replace one value with another, for instance ten with 10. I have a whole list of values (1 to 100 written out as words) that need to be replaced with values from a list (1 to 100 as digits). I do not know beforehand which of the values will have to be replaced in each feature, so I think the StringReplacer won't work?


michiedem
Contributor
  • Author
  • Contributor
  • December 12, 2022
egge wrote:

Maybe a PythonCaller? Recently, we have been playing around with the opposite, i.e. converting numbers to words and we have found the num2words · PyPI library to get this done.

For your case you might have a look at the word2number · PyPI lib instead.

An alternative Python solution -  using loop + join() + split() - is given on this page: Python | Convert numeric words to numbers.

I think the first solution should work for texts in English; mine is in Dutch, unfortunately. The second suggestion might work!


nielsgerrits
VIP
michiedem wrote:

As far as I know the stringreplacer can be used to replace one value with another. So only ten with 10 for instance. I have a whole list of values (1 to 100 written in letters) that need to be replaced with values from a list (1 - 100 as numerals). I do not know beforehand which of the values will have to be replaced per feature, so I think stringreplacer won't work?

Please check the documentation: the StringPairReplacer uses key-value pairs.

 

For example, if the source attribute’s value was:

bobby

and the replacement pairs were:

b s o a

the result will contain:

sassy

 

So your input should be

one 1 two 2 three 3 etc...
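
To make the pair mechanism concrete, here is a rough Python imitation of it. It is not the transformer's actual implementation: `apply_pairs` is a hypothetical helper, and applying the pairs one after another in the order given is an assumption made for this sketch.

```python
def apply_pairs(value, pairs_text):
    """Apply space-separated 'find replacement' pairs to value, one pair at a time."""
    items = pairs_text.split()
    if len(items) % 2 != 0:
        raise ValueError("pairs must come as find/replacement couples")
    # items[0::2] are the search strings, items[1::2] their replacements.
    for find, replacement in zip(items[0::2], items[1::2]):
        value = value.replace(find, replacement)
    return value


print(apply_pairs("bobby", "b s o a"))  # sassy
print(apply_pairs("ten oaks and two elms", "one 1 two 2 ten 10"))  # 10 oaks and 2 elms
```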

 


michiedem
Contributor
  • Author
  • Contributor
  • December 12, 2022

You are right! I missed that, thank you very much!


nielsgerrits
VIP

Cheers!


ebygomm
Influencer
  • Influencer
  • December 12, 2022

The only thing to be aware of with the StringPairReplacer is that it will match partial words, so if you are replacing the word ten with 10 and the text also contains a word such as often, that will become of10.
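
The partial-match effect is easy to reproduce outside FME. The sketch below uses plain Python (not the transformer itself) to contrast a substring replacement with a regular expression anchored on word boundaries; the sample sentence is made up for illustration.

```python
import re

text = "We often plant ten oaks."

# Plain substring replacement also hits the "ten" inside "often".
naive = text.replace("ten", "10")

# \b anchors the match to word boundaries, so only the standalone word changes.
whole_word = re.sub(r"\bten\b", "10", text)

print(naive)       # We of10 plant 10 oaks.
print(whole_word)  # We often plant 10 oaks.
```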


nielsgerrits
VIP

Sharp.


michiedem
Contributor
  • Author
  • Contributor
  • December 12, 2022

I thought about that (since it will also happen with twentyone, for example, which will become 201). I can use the StringReplacer to replace all spaces with an underscore, so:

 

_ten_trees_will_be_replaced_by_twentyone_bushes

 

_ten_ 10 _eleven_ 11 will be my input


geomancer
Evangelist
  • Evangelist
  • December 13, 2022

Won't you miss words at the start of the line, and just before punctuation marks?

Three_shall_be_the_number_thou_shalt_count,_and_the_number_of_the_counting_shall_be_three.

Four_shalt_thou_not_count,_neither_count_thou_two,_excepting_that_thou_then_proceed_to_three.

Five_is_right_out.
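
A small Python experiment illustrates the concern. Both functions below are hypothetical sketches, not FME transformers: `underscore_approach` imitates the underscore-delimited pairs, and `words_to_digits` is one possible alternative that matches whole words with a regular expression, ignoring case, so that words at the start of a line or next to punctuation are also caught. The `LOOKUP` table is made up for the example.

```python
import re

# Hypothetical lookup table (number word -> digits).
LOOKUP = {"two": "2", "three": "3", "four": "4", "five": "5"}

# One alternation of all number words, anchored on word boundaries.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in LOOKUP) + r")\b",
    re.IGNORECASE,
)


def underscore_approach(text):
    """Imitate the '_word_' pairs: only words with an underscore on both sides match."""
    padded = text.replace(" ", "_")
    for word, digits in LOOKUP.items():
        padded = padded.replace(f"_{word}_", f"_{digits}_")
    return padded.replace("_", " ")


def words_to_digits(text):
    """Replace whole number words, including at line start and before punctuation."""
    return PATTERN.sub(lambda match: LOOKUP[match.group(1).lower()], text)


line = "Three shall be the number thou shalt count, and the number of the counting shall be three."

print(underscore_approach(line))  # unchanged: both occurrences are missed
print(words_to_digits(line))
# 3 shall be the number thou shalt count, and the number of the counting shall be 3.
```

The regular expression is applied to the original text with spaces; on the underscored text `\b` would not help, because Python's `re` treats the underscore as a word character.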

