Skip to main content
Question

Word boundaries inside Inline Querier


Hi all,

 

I have this SQL query in InlineQuerier that helps me join tables on specific keywords(table_b.id):

 

 

SELECT *

 

FROM table_a

 

JOIN table_b

 

ON table_a.id

 

LIKE '%' || table_b.id || '%'

 

but I would like to take it to the next level and also add word boundaries to my keywords, otherwise the expression is too greedy.

 

Can anybody help out with the syntax? (I was trying to add REGEXP and [[:<:]], [[:>:]], but so far I'm only getting syntax errors).

 

 

Thanks,

Linda

8 replies

takashi
Contributor
Forum|alt.badge.img+21
  • Contributor
  • July 17, 2018

I don't think SQL for the InlineQuerier (i.e. standard components of SQLite) supports the REGEXP operator. You will have to think of other (or additional) approach. What pattern should "table_a.id" match?


  • Author
  • July 17, 2018
takashi wrote:

I don't think SQL for the InlineQuerier (i.e. standard components of SQLite) supports the REGEXP operator. You will have to think of other (or additional) approach. What pattern should "table_a.id" match?

There are two tables, one containing the keyword and ID, other one full names and I'd like to assign an ID based on the keyword, like this.

 

 

Table 1:IDKeyword111Starbucks321Golden Tulip555ZZTable 2:

 

Place NameAmsterdam Golden Tulip HotelStarbucks AirportGolden Plazza HotelResulting table should be: Place NameIDAmsterdam Golden Tulip Hotel321Starbucks Airport111Golden Plazza Hotel- In this example "Golden Plazza Hotel" shouldn't get any ID as it doesn't have any of the keywords, but when using this SQL statement: SELECT * FROM table_a JOIN table_b ON table_a.id LIKE '%' || table_b.id || '%' it picks up ID 555 as the Place name "Golden Plazza Hotel" contains 'zz'. That's why I was thinking of word boundaries that could help me to make the expression less greedy and solve this issue. If REGEXP is not an option, is there anything else I could try?

takashi
Contributor
Forum|alt.badge.img+21
  • Contributor
  • July 17, 2018
takashi wrote:

I don't think SQL for the InlineQuerier (i.e. standard components of SQLite) supports the REGEXP operator. You will have to think of other (or additional) approach. What pattern should "table_a.id" match?

An easy workaround is to filter out undesired matching with the StringSearcher.

 

SQL Statement in the InlineQuerier

 

select a."Place Name", b."ID", b."Keyword"
from "Table2" as a
left outer join "Table1" as b
on a."Place Name" like '%'||b."Keyword"||'%' 
Regular Expression in the StringSearcher

 

[[:<:]]@Value(Keyword)[[:>:]]

0684Q00000ArMpjQAF.png


  • Author
  • July 17, 2018
takashi wrote:
An easy workaround is to filter out undesired matching with the StringSearcher.

 

SQL Statement in the InlineQuerier

 

select a."Place Name", b."ID", b."Keyword"
from "Table2" as a
left outer join "Table1" as b
on a."Place Name" like '%'||b."Keyword"||'%' 
Regular Expression in the StringSearcher

 

[[:<:]]@Value(Keyword)[[:>:]]

0684Q00000ArMpjQAF.png

Great, exactly what I was looking for. Thank you @takashi

 


takashi
Contributor
Forum|alt.badge.img+21
  • Contributor
  • July 17, 2018
takashi wrote:

I don't think SQL for the InlineQuerier (i.e. standard components of SQLite) supports the REGEXP operator. You will have to think of other (or additional) approach. What pattern should "table_a.id" match?

Good to hear. The workflow was able to be simpler a bit.

 

 


  • Author
  • July 17, 2018
takashi wrote:
Good to hear. The workflow was able to be simpler a bit.

 

 

Nice!

 

I have a follow-up question: While running my workbench I noticed that in those cases when the Place name can have two possible keywords, in the resulting table it will create a duplicate (it will take the same place twice each time assigning different ID).

 

For example,

 

The place name is: Best Western Premier Budapest

 

Keywords:

 

Best Western id 54

 

Best Western Premier id 654

 

 

For now the new place will be taken twice with each of the id this way creating a duplicate:

 

Best Western Premier Budapest id 54

 

Best Western Premier Budapest id 654

 

Is there any way to prevent it, some way to make it select the longest keyword or even just the first one that is found?

 

 

Thanks!

 

 


takashi
Contributor
Forum|alt.badge.img+21
  • Contributor
  • July 17, 2018
takashi wrote:

I don't think SQL for the InlineQuerier (i.e. standard components of SQLite) supports the REGEXP operator. You will have to think of other (or additional) approach. What pattern should "table_a.id" match?

A possible way is to sort the features by length (number of characters) of Keyword descending and then filter out duplicate Place Name(s).

 


  • Author
  • July 17, 2018
takashi wrote:
A possible way is to sort the features by length (number of characters) of Keyword descending and then filter out duplicate Place Name(s).

 

Perfect! Now everything is just as it should.

 

Thanks again @takashi!

 


Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings