Solved

Find and replace words in a text with words from a list

  • 12 December 2022
  • 11 replies
  • 143 views

Badge +3

My text mixes digits and spelled-out numbers. For example:

20 trees will be planted of which ten will be oaks.

I want to replace every spelled-out number with digits, so that I get:

20 trees will be planted of which 10 will be oaks.

I already have a CSV (with each number both as digits and written out) that I could use for the lookup. I was thinking of splitting the text into a list of words and then matching every word against the CSV in a loop, but that would be very data-heavy.

Any suggestions for doing this in a clean way?

 

 


Best answer by nielsgerrits 12 December 2022, 10:34

Userlevel 1
Badge +11

Maybe a PythonCaller? We recently worked on the opposite problem, converting numbers to words, and found that the num2words library on PyPI does the job.

For your case, you might look at the word2number library on PyPI instead.

An alternative Python solution, using a loop with split() and join(), is described on the page "Python | Convert numeric words to numbers".
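
A minimal sketch of that split/join idea, in case it helps. The lookup dictionary and the function name are made up for illustration; in practice the dictionary would be filled from your CSV. It only matches words separated by single spaces, so a word followed directly by punctuation (for example "ten,") would be left untouched.

```python
# Hypothetical lookup table; in practice it would be loaded from the CSV.
WORD_TO_DIGITS = {"one": "1", "two": "2", "ten": "10", "twenty": "20"}


def replace_number_words(text, lookup=WORD_TO_DIGITS):
    """Replace space-separated words that appear in the lookup."""
    words = text.split(" ")
    # Look each word up in lower case; keep the original word if not found.
    replaced = [lookup.get(word.lower(), word) for word in words]
    return " ".join(replaced)


print(replace_number_words("20 trees will be planted of which ten will be oaks."))
# 20 trees will be planted of which 10 will be oaks.
```

Because each word is a single dictionary lookup, the cost grows with the number of words in the text, not with the size of the list, so this should not be as data-heavy as a nested loop.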

Userlevel 6
Badge +33

The StringPairReplacer is built for just this.

Badge +3

As far as I know, the StringReplacer can only replace one value with another, for instance only ten with 10. I have a whole list of values (1 to 100 written out as words) that need to be replaced with values from a list (1 to 100 as numerals). I do not know beforehand which values will need replacing per feature, so I think the StringReplacer won't work?

Badge +3

I think the first solution would work for English text, but unfortunately mine is in Dutch. The second suggestion might work!

Userlevel 6
Badge +33

Please check the documentation: the StringPairReplacer uses key-value pairs.

 

For example, if the source attribute’s value was:

bobby

and the replacement pairs were:

b s o a

the result will contain:

sassy

 

So your input should be

one 1 two 2 three 3 etc...
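
If it helps, here is a rough Python sketch of how such a pair string could be built from a CSV, plus a small emulation of pair replacement to check the result. The column names `word` and `number` are an assumption about the CSV, and `apply_pairs` is only a plain-Python stand-in that applies the pairs one after another as substring replacements; it is not how the transformer is implemented.

```python
import csv
import io

# Hypothetical CSV content; the real file's column names are an assumption.
CSV_TEXT = "word,number\none,1\ntwo,2\nthree,3\n"


def build_pair_string(csv_file):
    """Return 'word1 number1 word2 number2 ...' from a two-column CSV."""
    reader = csv.DictReader(csv_file)
    parts = []
    for row in reader:
        parts.extend([row["word"], row["number"]])
    return " ".join(parts)


def apply_pairs(text, pair_string):
    """Emulate pair replacement with plain substring replaces, in order."""
    tokens = pair_string.split()
    # Even positions are the search strings, odd positions the replacements.
    for find, replace_with in zip(tokens[0::2], tokens[1::2]):
        text = text.replace(find, replace_with)
    return text


pairs = build_pair_string(io.StringIO(CSV_TEXT))
print(pairs)                            # one 1 two 2 three 3
print(apply_pairs("bobby", "b s o a"))  # sassy
```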

 

Badge +3

You are right! I missed that, thank you very much!

Userlevel 6
Badge +33

Cheers!

Userlevel 1
Badge +21

One thing to be aware of with the StringPairReplacer is that it also matches partial words. If you replace the word ten with 10 and the text also contains a word such as often, that word will become of10.
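
The effect is easy to reproduce outside FME. The snippet below is a plain-Python illustration, not the transformer itself: a substring replace shows the partial-match problem, and a regular expression with `\b` word boundaries is one possible way to restrict the match to whole words. The sample sentence is made up.

```python
import re

text = "We often plant ten trees."

# Plain substring replacement also hits the "ten" inside "often".
naive = text.replace("ten", "10")
print(naive)       # We of10 plant 10 trees.

# \b matches a word boundary, so only the standalone word is replaced.
whole_word = re.sub(r"\bten\b", "10", text)
print(whole_word)  # We often plant 10 trees.
```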

Userlevel 6
Badge +33

Sharp.

Badge +3

I thought about that, since it will also happen with twentyone, for example, which would become 201. I can use the StringReplacer to replace all spaces with underscores first, so:

 

_ten_trees_will_be_replaced_by_twentyone_bushes

 

My replacement pairs will then be _ten_ 10 _eleven_ 11, and so on.

Userlevel 4
Badge +36

Won't you miss words at the start of the line, and just before punctuation marks?

Three_shall_be_the_number_thou_shalt_count,_and_the_number_of_the_counting_shall_be_three.

Four_shalt_thou_not_count,_neither_count_thou_two,_excepting_that_thou_then_proceed_to_three.

Five_is_right_out.
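
To make the concern concrete, here is a small Python comparison. `underscore_replace` is my own reading of the underscore trick (pad the text, then replace `_word_`), and `regex_replace` is one possible alternative for a PythonCaller that uses word boundaries and ignores case. The lookup table and both function names are hypothetical.

```python
import re

# Hypothetical lookup; the real one would come from the CSV.
LOOKUP = {"two": "2", "three": "3", "four": "4", "five": "5"}


def underscore_replace(text, lookup=LOOKUP):
    """Delimiter trick: pad with underscores, then replace _word_."""
    padded = "_" + text.replace(" ", "_") + "_"
    for word, digits in lookup.items():
        padded = padded.replace(f"_{word}_", f"_{digits}_")
    # Turn the underscores back into spaces.
    return padded.strip("_").replace("_", " ")


def regex_replace(text, lookup=LOOKUP):
    """Replace whole words only, regardless of case or nearby punctuation."""
    alternatives = "|".join(re.escape(word) for word in lookup)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda match: lookup[match.group(1).lower()], text)


line = ("Three shall be the number thou shalt count, "
        "and the number of the counting shall be three.")

# "Three" is capitalised and "three." ends in a full stop, so neither matches.
print(underscore_replace(line) == line)  # True
print(regex_replace(line))
# 3 shall be the number thou shalt count, and the number of the counting shall be 3.
```

The regex version replaces a capitalised word with plain digits, which is probably what you want here, but it is a design choice worth checking against your data.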
