Question

Aggregate string value by common base words

10 days ago
July 4, 2025
4 replies
53 views

slustado
Contributor
2 replies

Hello! I’m trying to figure out the best way to aggregate values by the most common words. I’ve found a few threads and documents but not quite what I was looking for.

For example, I have a list of building names and numbers where each entry can be a variation of a building name and number:

“1000 The Coolest Building Ever”

“1000 Coolest Building”

“Coolest Building”

“100 Coolest Building Dr.”

I would like the output to be “Coolest Building”, as those are the base words common to all features. Is this possible?

Bonus points if variations such as “Bldg.” or “Bldg” can be handled too. Any advice/guidance is appreciated!

+13

alexbiz
Enthusiast
83 replies
8 days ago
July 6, 2025

Hm, I think you could use AI for this kind of fuzzy matching, provided the number of features or distinct values is not too large.

+33

crutledge
Influencer
219 replies
8 days ago
July 6, 2025

Yeah, there is no easy way on this one. You would first have to build your list of values using something like the video “Normalize Data Using FME Desktop” (YouTube) or, as @alexbiz said, maybe AI.

Then have an attribute mapper for short forms such as bldg = Building or st = Street.

The first thing, I think, would be to strip the “extras” such as the numbers out of the attributes and get down to a pure line of text: no numbers, no special characters. Then remove things like Building, Bldg, St, and Street to get down to a “Name”, and then normalize. Keep track of this list of values for an attribute mapper.
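The clean-up steps described here could be prototyped outside FME roughly as follows. This is only a sketch under assumptions: the `ABBREVIATIONS` map, the helper names `normalize` and `common_base_words`, and the fifth sample value (“Coolest Bldg.”) are made up for illustration, and it assumes all the names passed in already belong to the same building.

```python
import re

# Hypothetical short-form map (an assumption); extend it for your own data.
ABBREVIATIONS = {"bldg": "building", "st": "street", "dr": "drive"}


def normalize(name):
    """Lowercase, drop numbers and special characters, expand short forms."""
    words = re.findall(r"[a-z]+", name.lower())
    return [ABBREVIATIONS.get(w, w) for w in words]


def common_base_words(names):
    """Return the words found in every name, in the order of the first name."""
    token_lists = [normalize(n) for n in names]
    shared = set(token_lists[0]).intersection(*token_lists[1:])
    return " ".join(w for w in token_lists[0] if w in shared).title()


names = [
    "1000 The Coolest Building Ever",
    "1000 Coolest Building",
    "Coolest Building",
    "100 Coolest Building Dr.",
    "Coolest Bldg.",  # added here only to exercise the short-form map
]
print(common_base_words(names))
```

Splitting the raw values into groups that refer to the same building is a separate problem that this sketch does not attempt. If you stay inside FME, the same logic might fit in a PythonCaller, but that is untested here.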

This is a challenge. Let us know how it goes. Hope that helps.

At your rest service ^B

takashi
7715 replies
7 days ago
July 7, 2025

Hi @slustado ,

If your examples cover every string pattern that could appear, I think you can use a StringSearcher with this regular expression to extract the part representing the building name (“Coolest” in the examples):
(\d*\s+)?(The\s+)?(.+)\s+(Building|Bldg)
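One quick way to sanity-check the expression against the sample values is to run it through Python's `re` module, as in the sketch below. The helper name `extract_name` is hypothetical, and the StringSearcher's regex engine may not behave identically to Python's in every detail, so treat this as a rough check rather than a guarantee.

```python
import re

# The expression as posted; capture group 3 holds the building name.
PATTERN = re.compile(r"(\d*\s+)?(The\s+)?(.+)\s+(Building|Bldg)")


def extract_name(value):
    """Return the text captured by group 3, or None when nothing matches."""
    match = PATTERN.search(value)
    return match.group(3) if match else None


samples = [
    "1000 The Coolest Building Ever",
    "1000 Coolest Building",
    "Coolest Building",
    "100 Coolest Building Dr.",
]
for value in samples:
    print(value, "->", extract_name(value))
```

All four sample values give “Coolest” in group 3. A value with no “Building” or “Bldg” in it gives no match at all, so those features would need separate handling.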

+39

virtualcitymatt
Celebrity
1897 replies
7 days ago
July 7, 2025

I also think this is an easy problem for an AI to solve.
