Solved

How to extract house number?


hoa_le
Contributor

Hi all,

I have a street address attribute, as in the examples below, and I want to extract the house number from it:

14 Le Thanh Nghi >> 14

12B Gia Quat >> 12B

3 Ngach 6 Ngo 12 >> 3

Tran Hung Dao >> Null (no house number)

Could you tell me how to do this with a regular expression?

Thank you

Best answer by markatsafe


8 replies

redgeographics
Celebrity

If it's always the first part until the space then a regex matching the first word should work:

^\w+




hoa_le
Contributor
  • Author
  • March 3, 2021

Thank you @Hans van der Maarel.

When the street is "45/11 Tran Hung Dao", it does not return the house number (I get Null). Could you also tell me what the inverse of ^\w+ would be? For the other examples I do get the numbers 14 and 12B.


redgeographics
Celebrity

The \w stands for "word character", i.e. letters, digits and the underscore, but not spaces or, as it turns out, a /.

An alternative would be to use an AttributeSplitter with a space as the delimiter character. It creates a list, and _list{0} would be your house number, assuming the data is clean of course: "12 B Gia Quat" would result in 12.
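To make the difference between the two approaches concrete, here is a minimal sketch in Python rather than FME. It assumes Python's re module behaves closely enough to the regex engine in the transformers, which may not hold in every detail, and the function names are made up for illustration.

```python
import re

def first_word_match(address):
    # What ^\w+ matches: the leading run of letters, digits and underscores.
    m = re.match(r"^\w+", address)
    return m.group(0) if m else None

def first_token(address):
    # Rough equivalent of splitting on a space and taking list element 0.
    return address.split(" ")[0]

for address in ["14 Le Thanh Nghi", "45/11 Tran Hung Dao", "12 B Gia Quat", "Tran Hung Dao"]:
    print(address, "|", first_word_match(address), "|", first_token(address))
```

In this sketch the regex stops at the slash in "45/11", while the split keeps "45/11" whole. Neither approach checks that the first part is actually a number, so "Tran Hung Dao" yields "Tran" in both cases.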


markatsafe
  • Best Answer
  • March 3, 2021

@hoa_le In StringSearcher, use a regular expression something like ^([0-9]{1}[0-9a-zA-Z\w/-]*)\s.*$

To see how this works, go to RegExr.com. The () is a subexpression, so in StringSearcher, under Advanced, set a Subexpression Matches List Name. "Tran Hung Dao" will come out as Not Matched and you can set the house number to <null> in an AttributeCreator.

I've attached a small example workspace (FME 2020.2).
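The expression can also be tried outside FME. The following is a rough Python sketch, assuming Python's re module treats this pattern the same way StringSearcher does. The function name is hypothetical, and returning None stands in for the Not Matched case.

```python
import re

# The expression from the answer above; group 1 is the subexpression in ().
HOUSE_NUMBER = re.compile(r"^([0-9]{1}[0-9a-zA-Z\w/-]*)\s.*$")

def extract_house_number(address):
    # Return the captured house number, or None when the address does not
    # start with a digit (the "Not Matched" case).
    m = HOUSE_NUMBER.match(address)
    return m.group(1) if m else None

for address in ["14 Le Thanh Nghi", "12B Gia Quat", "3 Ngach 6 Ngo 12",
                "45/11 Tran Hung Dao", "Tran Hung Dao"]:
    print(address, ">>", extract_house_number(address))
```

Because the pattern requires whitespace after the number, an address consisting of a number only (for example just "14") would not match in this sketch.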


david_r
Celebrity
  • March 4, 2021

Try this one:

^\d[\d,\w,/]*

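It expects the house number to be at the very beginning of the line, and it must consist of a digit followed by zero or more digits, word characters (letters, digits, underscore), commas or forward slashes. The commas inside the brackets are matched literally rather than acting as separators, so ^\d[\w/]* is a tighter equivalent if commas should not be accepted.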


david_r
Celebrity
  • March 4, 2021

Alternative:

^\d[\S]*

This returns the first "word" of the address, but only if it starts with a digit. Any character is allowed after that, as long as it is not whitespace.
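The two expressions only differ on characters outside the bracketed set. A small Python sketch of that difference follows, assuming Python's re module is a fair stand-in for FME's regex engine. The address "12-14 Gia Quat" is a made-up example that is not in the original data, and the names STRICT, LOOSE and leading_match are illustrative only.

```python
import re

STRICT = re.compile(r"^\d[\d,\w,/]*")  # digit, then word characters, commas or slashes
LOOSE = re.compile(r"^\d[\S]*")        # digit, then anything that is not whitespace

def leading_match(pattern, address):
    # Return the text matched at the start of the address, or None.
    m = pattern.match(address)
    return m.group(0) if m else None

print(leading_match(STRICT, "45/11 Tran Hung Dao"))  # 45/11
print(leading_match(LOOSE, "45/11 Tran Hung Dao"))   # 45/11
print(leading_match(STRICT, "12-14 Gia Quat"))       # 12 (hypothetical input; stops at the hyphen)
print(leading_match(LOOSE, "12-14 Gia Quat"))        # 12-14
print(leading_match(STRICT, "Tran Hung Dao"))        # None
```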


hoa_le
Contributor
  • Author
  • March 4, 2021

@david_r Your ^\d[\d,\w,/]* expression works perfectly for the examples above.

I would also like to extract the rest of the address, without the house number:

14 Le Thanh Nghi >> Le Thanh Nghi

12B Gia Quat >> Gia Quat

3 Ngach 6 Ngo 12 >> Ngach 6 Ngo 12

Tran Hung Dao >> Tran Hung Dao

How can I do that?

Thank you


caracadrian
Contributor
  • March 4, 2021

Use StringReplacer in regular expression mode, with ^\d[\d,\w,/]* or ^\d[\S]* as the Text To Replace and an empty string as the Replacement Text. That way you just get rid of the number at the beginning.
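The same replacement can be sketched in Python with re.sub, assuming it behaves like StringReplacer for these patterns. One detail shows up in the sketch: replacing only the number leaves the space that followed it, so the hypothetical helper below appends \s* to the pattern, which is an addition and not part of the suggestion above.

```python
import re

def strip_house_number(address):
    # Remove a leading house number plus any whitespace that follows it.
    # The trailing \s* is an assumption added here to avoid a leading space.
    return re.sub(r"^\d\S*\s*", "", address)

# With the pattern exactly as suggested, the leading space remains:
print(repr(re.sub(r"^\d[\S]*", "", "14 Le Thanh Nghi")))  # ' Le Thanh Nghi'

for address in ["14 Le Thanh Nghi", "12B Gia Quat", "3 Ngach 6 Ngo 12", "Tran Hung Dao"]:
    print(address, ">>", strip_house_number(address))
```

Addresses that do not start with a digit, such as "Tran Hung Dao", pass through unchanged because the pattern finds nothing to replace.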

