A few things to be aware of:
1) My preference for this kind of task is to use the WorkspaceRunner. It's simpler to use and will allow you to process up to 7 DWG files concurrently.
2) An initial load to a set of staging tables will allow your process to be more efficient, without the need for blocking transformers.
3) Fanout will also need to be handled at the dataset level when writing to DWG.
Consider the following approach (though the exact workflow depends on the volume of GZs you're processing):
- Workspace 1: load all GZs for your region to a database: PostGIS, SQLite or even FME's FFS. Use this process to tag all features with the 1250 tiles they are inside. Read the grid in as the first reader, then use the Clipper with 'Clippers first' set to cut each feature and assign the grid. Apply any necessary styling and any other values to the OSMM features.
- Workspace 2: set up to read your grid file tile names and form a SQL statement or value, on a per-tile basis, to populate your child workspace parameters and pass to the WorkspaceRunner. This will allow each child workspace to read in ONLY 1 tile's worth of data at a time and write to DWG... but you'll be able to run up to 7 at once.
- Workspace 3: the child workspace, with the database reader and a DWG writer. Set the writer to write out in dynamic mode so all necessary layers are written to each DWG. Then set Fanout on the dataset and use the tilename attribute so that you get 1 DWG per tile.
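As a rough illustration of the "SQL statement or value per tile" idea in Workspace 2, here is a minimal Python sketch. It is not FME code: the `tilename` column, the `TILE_WHERE` parameter name and the example tile names are all assumptions made up for illustration, so swap in whatever your staging table and child workspace actually use.

```python
def tile_where_clause(tile_name, column="tilename"):
    """Build a WHERE clause that selects the features of a single tile.

    'column' is a hypothetical attribute name for the tile tag written
    in Workspace 1.
    """
    # Double any single quotes so the tile name is a valid SQL string literal.
    safe = tile_name.replace("'", "''")
    return f"{column} = '{safe}'"


def child_parameters(tile_names):
    """Return one parameter set per tile, one per child workspace run.

    'TILE_WHERE' is a hypothetical published parameter on the child workspace.
    """
    return [
        {"TILE": name, "TILE_WHERE": tile_where_clause(name)}
        for name in tile_names
    ]


if __name__ == "__main__":
    # Made-up tile names, purely for demonstration.
    for params in child_parameters(["TQ3080NW", "TQ3080NE"]):
        print(params)
```

Each dictionary stands in for the set of values the parent workspace would hand to the WorkspaceRunner for one child run, so each child only ever queries a single tile.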
Please note there are both a GML Reader and an OS (GB) MasterMap Reader in FME. The former reads the GML in its raw state; the latter does some schema mapping behind the scenes and also deals with some of the geometry, for example doing some work on CartoText for you.
Many thanks for your advice @1spatialdave. I have managed to achieve what I wanted to with this. I ended up calling FME from a .bat file and specifying the --SourceDataset and --DestDataset for each OS GML chunk. The resultant .bat is long and probably a bit clumsy, but it gets the job done! I have taken your comments on board for future workspaces. Thanks!
No problem, your approach is absolutely acceptable and a good way to manage memory use. One tweak to your current method that may increase your speed (hardware dependent) is to create several batch files and split the jobs across them. Then run them all at the same time.
In FME Desktop you can launch 8 concurrent fme.exe instances before you hit the license cap. So let's say you have a 6-core machine: you could comfortably build 4 batch files, split the jobs across them and run them all at once. You'll need to see how your RAM is handled, but in theory a process that took 8 hours could now take around 2.
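If you'd rather not split the .bat by hand, a small script can do it. The following is only a sketch in Python, assuming each job is one fme.exe line using the --SourceDataset and --DestDataset parameters you mentioned; the workspace name, file paths and run_N.bat file names are hypothetical.

```python
from pathlib import Path


def fme_command(workspace, source, dest):
    """Build one fme.exe command line for a single GML chunk."""
    return f'fme.exe "{workspace}" --SourceDataset "{source}" --DestDataset "{dest}"'


def split_round_robin(jobs, n_batches):
    """Deal the jobs out across n_batches lists, one at a time."""
    batches = [[] for _ in range(n_batches)]
    for i, job in enumerate(jobs):
        batches[i % n_batches].append(job)
    return batches


def write_batch_files(commands, n_batches, out_dir):
    """Write run_1.bat .. run_N.bat into out_dir and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for idx, batch in enumerate(split_round_robin(commands, n_batches), start=1):
        path = out_dir / f"run_{idx}.bat"
        path.write_text("\n".join(batch) + "\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical workspace and chunk names, for illustration only.
    commands = [
        fme_command("osmm_to_dwg.fmw", f"C:/osmm/chunk_{n}.gz", f"C:/dwg/chunk_{n}")
        for n in range(1, 11)
    ]
    for bat in write_batch_files(commands, 4, "C:/osmm/batches"):
        print(bat)
```

Dealing the jobs out round-robin keeps the batch files roughly the same length, so with similar-sized chunks they should finish at about the same time. You would then start each .bat in its own command window.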
Hope that helps.