Solved

Comparing Strings and returning unmatched characters?

12 years ago
March 5, 2013
17 replies
73 views

kam
18 replies

Hi , I hope this is a simple query.

I have a dataset which has a field that contains the organisation name and address seperated by spaces (see example)

I also have a gazetteer that should match the address in the first dataset without organisation name.

I want to be able to compare the two strings and extract the organisation name into a new field.

Example:

Dataset 1: Kam's Made Up Org 15 Test Street Testingham TE23 2TE

Dataset 2: 15 Test Street Testingham TE23 2TE

Desirable new field: Kam's Made Up Org

p.s. Both datasets have a unique identifier also to match them together.

Regards,

Best answer by mark_f

Here - try this Workbench - works with your second example.

Click Here

View original

Did this help you find an answer to your question?

This post is closed to further activity.
It may be a question with a best answer, an implemented idea, or just a post needing no comment.
If you have a follow-up or related question, please post a new question or idea.
If there is a genuine update to be made, please contact us and request that the post is reopened.

+38

ebygomm
Influencer
3283 replies
12 years ago
March 5, 2013

Use a string replacer

Join the two datasets then search the string in dataset 1for the string in dataset 2 and replace it with nothing

kam
Author
18 replies
12 years ago
March 5, 2013

Is there another way without having to join them as my Dataset 2 has over 180k records whereas my dataset 1 has 4k. It would unecessarily look through tens of thousands of records?

Also, if i was to do it that way, what do I enter inside the 'Text to find'?

+38

ebygomm
Influencer
3283 replies
12 years ago
March 5, 2013

You choose the attribute that contains the string from dataset 2 in the text to find, you search the attribute that contains the string from dataset 1

It won't be searching through thousands of records as what you are doing is joining the two datasets together into one 'table' and then comparing the value in one 'field' with the value in another 'field'

david_r
8337 replies
12 years ago
March 5, 2013

Hi,

a couple of questions:

are you able to join the two datasets on something else than a partial string (address), or is the address part the only thing they have in common?
are you 100% sure that the address part is always written exactly(!) the same in the two datasets?

David

kam
Author
18 replies
12 years ago
March 5, 2013

Hi EGomm and David R,

EGomm: I have joined using a joiner so now my DATASET 1 includes both string fields, one with organisation names and one without. I have then pushed it into a string replacer with String from Dataset 1 as the Attribute and string from Dataset 2 as the Text to Find, then put a blank space as the replacement text. This does not seem to do anything to the string field from Dataset 1....

What am i doing wrong?

David R: Yes I am able to join the two datasets with a unique identifier that I have mentioned previously, and yes I am sure the latter part of the string contains the same address as the one from dataset 2 as the system uses the address from dataset 2 and just enters an organisation name before it in the same field.

mark_f
325 replies
12 years ago
March 5, 2013

You said "Both datasets have a unique identifier also to match them together."

Do you mean you could join on these ID that but it's a many to one relationship? You could use a Joiner, FeatureMerger or if you're famailliar if SQL the InlineQuerier. And I wouldn't worry about joining 180k records.

+38

ebygomm
Influencer
3283 replies
12 years ago
March 5, 2013

If nothing is happening it suggests that there is no match. The process works in theory with the sample I have just created.

david_r
8337 replies
12 years ago
March 5, 2013

Hi again,

sorry, just realized that you had answered question 1 in your original question :-)

Two alternative suggestions:

1: Classic FME solution

FeatureMerger to join the two datasets
StringReplacer, like EGomm suggested

2: Using an InlineQuerier with the two datasets as input (Dataset 1= "ORG" and Dataset 2 = "ADDR" in this example), you can use this SQL query that will give you a new attribute "MY_ORG_STRING" that contains only part of the string that is not matched:

select ORG.ID, ORG_STR, ADDR_STR, rtrim(substr(ORG_STR, 1, length(ORG_STR)-length(ADDR_STR))) as MY_ORG_STRING from ORG join ADDR on (ORG.ID = ADDR.ID)

The above assumes that the address part of the organization string has been written EXACTLY the same. If this is not the case you will have to do fuzzy comparisons, but that is another game entirely...

David

mark_f
325 replies
12 years ago
March 5, 2013

StingReplacer Parameters:

Attibute: Dataset 1 Value

Text to Find: Dataset 2 Value (Use the Set to Attribute Value option)

Replace:

mark_f
325 replies
12 years ago
March 5, 2013

Clearly that is StringReplacer

(Edit function needed here! :) )

kam
Author
18 replies
12 years ago
March 5, 2013

Haha what a pickle. Ok let's make this even simpler.

I now have 1 Dataset with the fields as follows:

Address : 15 Temp Street

OrgAddress : Organisation 15 Temp Street

Now what I want is a new field with the word 'Organisation' stripped from it

Also, if the OrgAddress was : Organisation 15 Temp Street Tempingham

Then the new field should strip: Organisation Tempingham..

kam
Author
18 replies
12 years ago
March 5, 2013

I am not doubting that your methods are correct, but it is not working for me, maybe I am doing something wrong.

Also I assumed the StringReplacer would have done it, this was the first thing I tried...

I can do this through python with a simple string function but I just wondered how easy it was in FME

david_r
8337 replies
12 years ago
March 5, 2013

I just tried with your specific example:

Before:

Attribute(encoded: utf-8): `Address' has value `15 Temp Street' Attribute(encoded: utf-8): `OrgAddress' has value `Organisation 15 Temp Street'

After:

Attribute(encoded: utf-8): `Address' has value `15 Temp Street' Attribute(encoded: utf-8): `OrgAddress' has value `Organisation '

Looks halfway right to me?

David

mark_f
325 replies
Best Answer
12 years ago
March 5, 2013

Here - try this Workbench - works with your second example.

Click Here

kam
Author
18 replies
12 years ago
March 5, 2013

Thank you - works a treat

+38

ebygomm
Influencer
3283 replies
12 years ago
March 5, 2013

Did you figure out the problem then?

The example is just a stringreplacer?

kam
Author
18 replies
12 years ago
March 5, 2013

Yes, for some reason the string replacer worked when I had merged the two attributes into one dataset rather than pull them both into one workspace, join them then push through a string replacer.

Odd.. but It works now

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

Cookie settings

We use 3 different kinds of cookies. You can choose which cookies you want to accept. We need basic cookies to make this site work, therefore these are the minimum you can select. Learn more about our cookies.

Basic
Functional

Normal
Functional + analytics

Complete
Functional + analytics + social media + embedded videos + marketing

Comparing Strings and returning unmatched characters?

17 replies

Helpful Members This Week

Recently Solved Questions

How to use a WMTS to extract heights from a DTM service?

Clipping strategy for donut polygons when some features are missing

Joining geometry and attributes after JSONfragmenters

How to report on compression method used in a GeoTiff?

AGOL Hosted Feature Layer Update Question

Community Stats

Latest FME

Cookie policy

Cookie settings

Related Topics

Wanting to replace data in multiple attributes to multiple strings of text. Is there a transformer that I can use for this?icon

RegEx help - separate road numbers from road namesicon

Is there a transformer that can use generic expression (using string functions) for a group of attributes?icon

Transformer for equivalent of Update Column in MapInfo with 2 polygon filesicon

HTTPCaller and NextCloud configurationicon

Helpful Members This Week

Recently Solved Questions

How to use a WMTS to extract heights from a DTM service?

Clipping strategy for donut polygons when some features are missing

Joining geometry and attributes after JSONfragmenters

How to report on compression method used in a GeoTiff?

AGOL Hosted Feature Layer Update Question

Popular Tags

Community Stats

Latest FME

Sign up

An FME Account is required to contribute

Login to the community

An FME Account is required to contribute

Scanning file for viruses.

This file cannot be downloaded

Cookie policy

Cookie settings