
I have a dataset with a messy 'comments' column. I want to extract numbers from these comments, but only when they come before a certain word. For instance, a comment could be:

 

on januari 21, 16 boats passed

or

21 boats passed on januari 12 2017

 

I only want the numbers when they come directly before the word 'boats'. Is there a way to do that?

StringSearcher with regex should do this.


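Thanks. I was struggling with the regex, but found it: \d+(?= boats)

The (?= boats) part is a lookahead: it requires " boats" to follow the digits without including it in the match, so only the number is returned.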
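For anyone who wants to check the pattern outside of StringSearcher, here is a minimal sketch in Python using the two example comments from the question. The variable names are just illustrative, and Python's `re` module is only a stand-in for testing; the regex engine behind StringSearcher may differ in details.

```python
import re

# One or more digits, only if immediately followed by " boats".
# The lookahead (?= boats) checks for the word without consuming it.
pattern = re.compile(r"\d+(?= boats)")

# The two example comments from the question.
comments = [
    "on januari 21, 16 boats passed",
    "21 boats passed on januari 12 2017",
]

# findall returns every match per comment as a list of strings.
counts = [pattern.findall(comment) for comment in comments]

for comment, found in zip(comments, counts):
    print(comment, "->", found)
```

The dates (21, 12, 2017) are skipped because they are not directly followed by " boats". If the comments can also contain the singular "boat", a lookahead like `(?= boats?\b)` would probably cover both forms.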

