Solved

How to extract house number?

  • 3 March 2021
  • 8 replies
  • 31 views

Badge +5

Hi all,

I have a street attribute with values like the examples below. I want to extract the house number from it:

14 Le Thanh Nghi >> 14

12B Gia Quat >> 12B

3 Ngach 6 Ngo 12 >> 3

Tran Hung Dao >> Null (has no number)

Could you tell me how to do this with a regex function?

Thank you

Best answer by markatsafe 3 March 2021, 19:36

Userlevel 5
Badge +25

If it's always the first part until the space then a regex matching the first word should work:

^\w+

 



Badge +5

Thank you @Hans van der Maarel.

When the street name is "45/11 Tran Hung Dao", it does not get the number (Null). Could you also tell me what the inverse of ^\w+ is? With it I get the numbers 14 and 12B.

Userlevel 5
Badge +25

The \w stands for "word character", i.e. letters, digits and the underscore, but not spaces or, as it turns out, a /

 

An alternative would be to use an AttributeSplitter with the space as the delimiter character. It'll create a list, and _list{0} would be your house number, assuming the data is clean of course. For example, "12 B Gia Quat" would result in 12.
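To see both behaviours outside FME, here is a minimal sketch in Python. Python's re module is used only for illustration and may differ in details from FME's regex engine. The helper names first_word and first_token are made up for this example.

```python
import re

def first_word(text):
    """Return the leading run of word characters (^\\w+), or None if there is none."""
    m = re.match(r"^\w+", text)
    return m.group(0) if m else None

def first_token(text):
    """Return the part before the first space, similar to _list{0} after splitting on a space."""
    return text.split(" ")[0]

# Sample values taken from the thread
for street in ["14 Le Thanh Nghi", "45/11 Tran Hung Dao", "12 B Gia Quat", "Tran Hung Dao"]:
    print(street, "|", first_word(street), "|", first_token(street))
```

Note that neither approach returns Null for "Tran Hung Dao": both simply return the first word, so a check that the value starts with a digit is still needed.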

Badge +2

@hoa_le In a StringSearcher, use a regular expression something like ^([0-9]{1}[0-9a-zA-Z\w/-]*)\s.*$

To see how this works, go to RegExr.com. The () is a subexpression, so in the StringSearcher, under Advanced, set a Subexpression Matches List Name. "Tran Hung Dao" will come out as Not Matched and you can set the house number to <null> in an AttributeCreator.

I've attached a small example workspace (FME 2020.2)
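As a rough illustration of what the subexpression captures, the same pattern can be tried in Python. This is only a sketch: the attached workspace is not reproduced here, the function name house_number is hypothetical, and returning None stands in for the Not Matched case.

```python
import re

# Pattern from the post above; group 1 is the subexpression in parentheses
PATTERN = re.compile(r"^([0-9]{1}[0-9a-zA-Z\w/-]*)\s.*$")

def house_number(street):
    """Return the captured house number, or None when the pattern does not match."""
    m = PATTERN.match(street)
    return m.group(1) if m else None

for street in ["14 Le Thanh Nghi", "12B Gia Quat", "45/11 Tran Hung Dao", "Tran Hung Dao"]:
    print(street, ">>", house_number(street))
```

Because the pattern requires whitespace after the number, a value consisting only of a number (for example "14") would not match either.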

Userlevel 4

Try this one:

^\d[\d,\w,/]*

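It expects the house number to be at the very beginning of the line, and it must be composed of a digit followed by zero or more repetitions of either a digit, a letter or a forward slash. Note that the commas inside the brackets are not separators: they are matched literally, and \w already covers digits, so ^\d[\w/]* does the same job without also accepting commas.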

Alternative:

^\d[\S]*

Will return the first "word" of the address, but only if it starts with a digit. Any character is allowed, as long as it's not a space.
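The difference between the two expressions can be sketched in Python (used only for illustration; FME's regex engine may differ in details). The value "12-14 Gia Quat" is a made-up example, not from the original data, and the names STRICT, LOOSE and extract are hypothetical.

```python
import re

STRICT = re.compile(r"^\d[\d,\w,/]*")  # digit, then digits/letters/slashes (and commas)
LOOSE = re.compile(r"^\d\S*")          # digit, then anything up to the first whitespace

def extract(pattern, street):
    """Return the match at the start of the string, or None if there is no match."""
    m = pattern.match(street)
    return m.group(0) if m else None

for street in ["45/11 Tran Hung Dao", "12-14 Gia Quat", "Tran Hung Dao"]:
    print(street, "|", extract(STRICT, street), "|", extract(LOOSE, street))
```

The stricter pattern stops at the hyphen in "12-14", while the looser one keeps the whole first word. Both return nothing for an address without a leading digit.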

Badge +5

This works perfectly for the examples above.

I would also like to extract the street name without the number:

14 Le Thanh Nghi >> Le Thanh Nghi

12B Gia Quat >> Gia Quat

3 Ngach 6 Ngo 12 >> Ngach 6 Ngo 12

Tran Hung Dao >> Tran Hung Dao

How can I do that?

Thank you

Badge +20

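Use a StringReplacer in RegEx mode, with ^\d[\d,\w,/]* or ^\d[\S]* as Text to Replace and an empty string as Replacement Text. That way you just get rid of the number at the beginning. Add \s* to the end of the expression if you also want to remove the space that follows the number.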
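The replacement can be sketched in Python as follows. This is only an illustration of the idea, not the StringReplacer itself. The trailing \s* is an addition so that the space after the number is removed as well, and the name street_name is hypothetical.

```python
import re

# Leading house number (a digit followed by non-space characters) plus any following whitespace
LEADING_NUMBER = re.compile(r"^\d\S*\s*")

def street_name(street):
    """Remove a leading house number; strings without one are returned unchanged."""
    return LEADING_NUMBER.sub("", street)

for street in ["14 Le Thanh Nghi", "12B Gia Quat", "3 Ngach 6 Ngo 12", "Tran Hung Dao"]:
    print(street, ">>", street_name(street))
```

Since the expression is anchored with ^, only the number at the start is removed; the "6" and "12" inside "3 Ngach 6 Ngo 12" are left alone.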
