Question

Filter and join two tables where a number from one table is between the min and max values in the other table

  • 16 February 2023
  • 7 replies
  • 5 views

Badge

I had a solution to this problem using a FeatureJoiner and a Test Filter that worked for smaller data sets (previous solution below). But my datasets have grown and this process is too slow now. I tried InlineQuerier but couldn't achieve success. I suggestion was to write a python program to do this but I am not experienced enough with python.

 

Is the a way to join to tables as described in the title of the message?

 

https://community.safe.com/s/question/0D5Dm000006cuXDKAY/categorizing-data-based-on-external-table-with-unique-lookup-minmax-values


7 replies

Userlevel 4
Badge +36

Possible SQL statement in InlineQuerier:

select i.*, r.*
from "InputData" i, "Ranges" r
where i."sys_time" >= r."low" and i."sys_time" < r."high"

CategorizeI'm not sure about the speed with your data set though.

 

Badge +2

@gcarmich​ SQLite support a BETWEEN operator that might simplify your query in the InlineQuerier. If you include a small sample dataset someonemight be able to illustrate the query

Badge

Mark, Geomancer did submit an SQL InlineQuerier solution and it worked for my needs. It just took me a while to understand the syntax. I have a workflow with the InlineQuerier and numeric values that work and the Feature Value which does not. I'll post the solution and the files here for future reference:

image 

Userlevel 4
Badge +36

@gcarmich​ SQLite support a BETWEEN operator that might simplify your query in the InlineQuerier. If you include a small sample dataset someonemight be able to illustrate the query

Unfortunately, the BETWEEN operator is always inclusive: it tests for greater than or equal to the lower value, and less than or equal to the higher value.

In this case an exclusive range was needed, so the greater than (>) and less than operators (<) have to be used.

Userlevel 4
Badge +36

Mark, Geomancer did submit an SQL InlineQuerier solution and it worked for my needs. It just took me a while to understand the syntax. I have a workflow with the InlineQuerier and numeric values that work and the Feature Value which does not. I'll post the solution and the files here for future reference:

image 

Hi @gcarmich​, good to see you could use the InlineQuerier, and that it is fast enough for your data.

Badge +2

@gcarmich​ @geomancer​ Just a quick FYI note on performance in the InlineQuerier . 

FME creates an fme_feature_content attribute that contains the entire feature on the input (i.e. all the input attributes) . 

In the Columns section of the Inputs: only include the attributes that you'll explicitly use in your SQL queries  (see the section Configuration -  Inputs of the InLineQueir docs). So, in the example below,  if you're creating a simple JOIN using RECORDID, remove all the other attributes. They'll be preserved in the fme_feature_content and you'll have a much lighter SQLite in memory database.  

imageand your SQL might look something like:

SELECT "CellTower".*, "CellTowerDetail",* FROM "CellTower"
INNER JOIN "CellTowerDetail" 
ON "CellTower"."RECORDID" = "CellTowerDetail"."RECORDID";

In your example above, you'd have to include the "sys_time" , low" and "high" in the Columns

Why clean-up Columns...?

  • You're creating tables that contain many more attributes than you need which will take longer to load and create a larger in memory SQLite database than you need
  • All Columns attributes are indexed.  So for larger data datasets with many attributes that's a lot of indexing 

 

 

Userlevel 4
Badge +36

Thank you @Mark Stoakes​, very interesting, I didn't know this. This can really speed up the InlineQuerier.

I learned something new today, and its only 7:25 AM 😁

Reply