Question

SQL question: exact match of a word in a textstring

  • 16 December 2013
  • 1 reply
  • 3 views

Hi all,

 

 

I have the following problem when running a query in the inlinequerier: I have a table with textstrings (consisting of articles) in which a country name is provided. I also have a shapefile with all the countries and their names. Mr. Takashi has helped me a lot with following query:

 

 

select  b.ARTICLE,  a.COUNTRY_WORLD   from Source as b cross join World as a   where b.ARTICLElike '%'||a.COUNTRY_WORLD||'%'

 

 

In this query however, 'Nigeria' is for example also matched to 'Niger'. I was wondering if there was a way to query for an exact match of the country table inside the article?

 

 

I was thinking about using a REGEX instead of LIKE operator, but I do not seem to get this to work in FME. 

 

Any suggestions on a new query or on how to use regex in FME?

 

 

Kind regards

1 reply

Userlevel 2
Badge +17
Hi,

 

 

In the example, I think you got 2 result features for the same article. One matches with Niger, another matches with Nigeria.   I think this would be one of possible workarounds (not tested): Modify SQL statement. 1) Add a unique ID attribute (e.g. ARTICLE_ID) to the article features and select it by the SQL statement. 2) Add order clause to the SQL to sort output features by country name descending. ----- select  b.ARTICLE_ID, b.ARTICLE,  a.COUNTRY_WORLD from Source as b cross join World as a where b.ARTICLE like '%'||a.COUNTRY_WORLD||'%' order by a.COUNTRY_WORLD desc ----- Additionally, if there is always a white space before every country name in articles, it might be better to add a white space after the first % in the where clause, to perform stricter selecting. where b.ARTICLE like '% '||a.COUNTRY_WORLD||'%'   Connect a DuplicateRemover to output port of the InlineQuerier. Key Attributes: ARTICLE_ID And, send only features which come from its UNIQUE port to the following processing.   If an article matches with 2 country names e.g. Niger and Nigeria, the order clause would bring the result outputting Nigeria (i.e. longer name)  feature before Niger feature. Then, the DuplicateRemover can select Nigeria feature.

 

 

Takashi

 

P.S. Please just call me Takashi.

Reply