Question

Using/selecting multiple tables in Joiner


Hello, 

 

 

I am reading in a lot of features and would like to join to them a whole load of other features, based upon an ID. (they are .dbf at the moment).

 

 

These are yearly comparisons so data is read in by years defined in parameters. 

 

 

Originally, I read in all data required then used FeatureMerger to merge on the ID but after 24 hours it had read the features and done only 6% of the merge and looked like taking forever.

 

 

Now i'm trying to use the Joiner - just read in one year of data and join another year to it - and selecting the dataset folder is fine but i want to use ALL the tables within the folder but the selection menu only allows you to use one. (the data is tiled due to size). Ive tried to use the * wildcard in the Table field but no joy. 

 

 

Any suggestions out there?

3 replies

Userlevel 5
Hi,

 

 

I don't think this is possible. You should consider using the InlineQuerier, however, it can be very efficient for these scenarios.

 

 

Alternatively, there is the SQLExecutor too.

 

 

On a side note: if you have huge amounts of data, I would highly recommend using a proper database, e.g. PosgreSQL. You will then be able to pre-join your data in the databas, which can make an important difference in the performance.

 

 

David
Hi, 

 

 

yes i think a pre-join in PosgreSQl would be wise. I dont actually know how to use it but it may be the best way. I've also looked at the InlineQuerier but i think it would still be too time/memory consuming.

 

 

cheers
Userlevel 5
Hi,

 

 

it should be relatively simple:
  • Load data from the .dbf files into similar table structures in e.g. PostgreSQL (the FME wizard will do most of the work for you here, just make sure that you verify the data types that the wizard suggests for your output tables)
  • Create indexes on all primary and foreign keys (use a database admin interface, such as PgAdmin for this)
  • Create a SELECT statement that joins your tables. If necessary, analyze performance. Check the SELECT statement in your query tool of choice, for PostgreSQL this could be PgAdmin.
  • In your workspace insert a SQLCreator and copy the SELECT statement into it. Make sure that you expose the attributes from the SELECT statement to make them visible in the workspace.
That should be about it. Best of luck!

 

 

David

Reply