You could try the custom transformer FuzzyStringCompareFrom2Datasets
If you just want to join with a like query then the InLineQuerier will allow you to do that.
You could try the custom transformer FuzzyStringCompareFrom2Datasets
If you just want to join with a like query then the InLineQuerier will allow you to do that.
Thanks for the reply, I was actually trying the InlineQuerier, but maybe I have something off I am getting no results, and it stalls for like 30 mins at one point?
I'm not sure the SQLite Query Optimiser used in InlineQuerier would perform well with a LIKE being used like this in a JOIN ... ON condition. The way it works is it would possibly test the WHERE condition twice, once to establish the Cartesian product of the JOIN which evaluates to TRUE (there is extra processing done when using "ON" in a JOIN expression), and then an additional test of the WHERE clause to filter this result, even though it is the same WHERE clause.
Also, don't use "+" to concatenate. In most SQL syntaxes it is a double-pipe || to concatenate strings. This will be the most likely reason that no results are returned, "+" is a literal maths/arithmetic operator in SQLite, and on non-numeric strings it will return basically in your current statement the equivalent of "... LIKE NULL", which will return no results as no string can be LIKE NULL.
Additionally, to significantly decrease the run time of InlineQuerier when dealing with lots of records, then it is highly recommended to limit the data passed to InlineQuerier to just:
- An Identifier in Table A;
- An Identifier in Table B; and
- The 2 comparison data fields.
Do this with an AttributeManager in front of the InlineQuerier to remove all unneeded Attributes before passing to InlineQuerier.
Similarly, limit the Attributes being returned in the SQL statement. When you are dealing with a comparison result of potentially up to 3,000 x 1 million = 3 billion records in this case, this will make a substantial performance difference in the Transformer as well as reducing the SQLite engine's run-time.
Try instead like this:
SELECT A.IdentifierField, B.IdentifierField
FROM CustomerData_22020 A, Sheet1 B
WHERE A.account_owner_salesrep LIKE '%'||B.LastName||'%'
You could try the custom transformer FuzzyStringCompareFrom2Datasets
If you just want to join with a like query then the InLineQuerier will allow you to do that.
@morijlumos @cdalessandro has updated his FuzzyStringCompareFrom2Datasets trasnformer to be python 3 compliant.