Question

Proper use of the OSM reader?

  • 15 November 2012
  • 5 replies
  • 17 views

I'm trying to load OpenStreetMap data using FME. This works great with small datasets in 2012. However, as soon as I start working with large datasets (millions of features), I hit the problem that the FME OSM reader acts as a grouper: it reads the entire dataset in before it actually tries to translate anything.

Does this have to happen? Is there a way to disable it, or is this a limitation of the XML reader? It's massively slowing everything down because, even with 32GB of RAM, FME still fills it up and then writes to temp disk space (despite the original dataset only being about 7GB uncompressed).

 

 

Using 2012 SP4 64bit.

 

2013 13206 is 5.5 times slower on the small dataset, so I'm not using that, despite the changes in #13082 to make "it better with large datasets" (I'm not sure what that means).

I think that this is due to the XML reading and it having to go through the entire document to find what you need. I see similar results when I use the AIXM reader and set the Feature Types to Read to a single feature type. It still reads all the features but only passes the one I chose through to the workbench.

 

 

-Sean
Thanks Sean. I gathered that much, but it doesn't make sense because FME has already done that when I first added the reader (so I pointed the reader at a smaller dataset). Furthermore, why would it need to store all of the features rather than aggregating a list of them and testing against that?
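To illustrate the kind of streaming read being asked about here, below is a minimal Python sketch that walks an OSM XML file one element at a time and discards each element once it has been handled, so memory use stays roughly flat regardless of file size. This is not how the FME OSM reader works internally (that isn't known from this thread); the function name `iter_osm_nodes` is made up for the example, and it only handles `<node>` elements.

```python
import io
import xml.etree.ElementTree as ET


def iter_osm_nodes(source):
    """Yield (id, lat, lon, tags) for each <node>, one at a time.

    `source` is a filename or a binary file-like object containing OSM XML.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the <osm> root
    for event, elem in context:
        if event != "end" or elem.tag not in ("node", "way", "relation"):
            continue
        if elem.tag == "node":
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            yield (int(elem.get("id")),
                   float(elem.get("lat")),
                   float(elem.get("lon")),
                   tags)
        # Drop every finished top-level element so nothing accumulates.
        root.clear()


if __name__ == "__main__":
    sample = io.BytesIO(
        b'<osm version="0.6">'
        b'<node id="1" lat="51.5" lon="-0.1"><tag k="name" v="A"/></node>'
        b'<node id="2" lat="52.0" lon="0.2"/>'
        b'</osm>'
    )
    for node in iter_osm_nodes(sample):
        print(node)
```

Building ways still needs node coordinates to be looked up by id, which may be one reason a reader ends up caching a lot of data; that part is a guess and is not covered by the sketch.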

 

 

It gets even worse. My log contains this:

 

 

2012-11-15 18:32:12|11859.9|  0.5|INFORM|XML Reader mapped feature # 35110000
2012-11-15 18:32:12|11860.4|  0.5|INFORM|XML Reader mapped feature # 35112000
2012-11-15 18:32:20|11862.6|  2.2|INFORM|XML Reader mapped feature # 35114000
2012-11-15 18:32:32|11865.8|  3.2|INFORM|XML Reader mapped feature # 35116000
2012-11-15 18:32:43|11869.7|  3.8|INFORM|XML Reader mapped feature # 35118000
2012-11-15 18:32:53|11873.3|  3.6|INFORM|XML Reader mapped feature # 35120000
2012-11-15 18:33:06|11877.9|  4.6|INFORM|XML Reader mapped feature # 35122000
2012-11-15 18:33:21|11884.6|  6.8|INFORM|XML Reader mapped feature # 35124000

 

 

Note how the loading time shoots up from 0.5 seconds per 2,000 features to 6.8 seconds. It never got that quick again.

 

When I got in this morning, I was seeing this:

 

 

2012-11-16 08:50:36|62778.9|173.2|INFORM|XML Reader mapped feature # 36588000
2012-11-16 08:56:14|63116.4|337.4|INFORM|XML Reader mapped feature # 36590000
2012-11-16 09:04:55|63637.1|520.7|INFORM|XML Reader mapped feature # 36592000
2012-11-16 09:13:09|64130.8|493.7|INFORM|XML Reader mapped feature # 36594000

 

 

That's 493 seconds to read 2000 features! Looking at the performance chart it seems that FME is now constantly reading my HDD (FME_TEMP has 12GB of data, my RAM has 17GB).
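For anyone wanting to track this slowdown across a whole log rather than eyeballing it, here is a rough Python sketch that turns "XML Reader mapped feature #" lines into features-per-second figures. It only assumes the pipe-separated layout seen in the excerpts above (timestamp first, message fifth) and works from the wall-clock timestamps; the function name `mapping_rates` is invented for the example.

```python
from datetime import datetime


def mapping_rates(log_lines):
    """Return [(feature_number, features_per_second), ...] between
    consecutive 'XML Reader mapped feature #' log lines."""
    points = []
    for line in log_lines:
        fields = line.strip().split("|")
        if len(fields) < 5 or "XML Reader mapped feature #" not in fields[4]:
            continue  # not one of the progress lines we care about
        stamp = datetime.strptime(fields[0], "%Y-%m-%d %H:%M:%S")
        count = int(fields[4].rsplit("#", 1)[1])
        points.append((stamp, count))

    rates = []
    for (t0, n0), (t1, n1) in zip(points, points[1:]):
        seconds = (t1 - t0).total_seconds()
        if seconds > 0:  # timestamps only have one-second resolution
            rates.append((n1, (n1 - n0) / seconds))
    return rates


if __name__ == "__main__":
    excerpt = [
        "2012-11-16 08:50:36|62778.9|173.2|INFORM|XML Reader mapped feature # 36588000",
        "2012-11-16 08:56:14|63116.4|337.4|INFORM|XML Reader mapped feature # 36590000",
        "2012-11-16 09:04:55|63637.1|520.7|INFORM|XML Reader mapped feature # 36592000",
        "2012-11-16 09:13:09|64130.8|493.7|INFORM|XML Reader mapped feature # 36594000",
    ]
    for feature_number, rate in mapping_rates(excerpt):
        print(f"{feature_number}: {rate:.1f} features/s")
```

On the morning excerpt this works out to only a handful of features per second, which matches the roughly 494 seconds per 2,000 features quoted above.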

 

 

It appears that the OSM reader in 2012 is utterly unusable for large datasets, even on a very powerful machine.

 

Does anyone know if these issues are resolved in 2013?
Ok, so I bit the bullet and went with 2013. It doesn't do the XML reader feature mapping (guess that's the #13082 change) but still has its own issues:

 

 

a) It seems to be much slower than 2012.

 

b) It's not using much of my hardware capacity. On average it's using about 30-40% of a single core. FME typically uses 100% of a core. In fact, I've never seen a processing task use less except in specific circumstances. RAM usage is a steady 100MB (good thing) and HDD reading/writing is a steady ~1-2MB. So there's no obvious hardware reason for this as the machine is otherwise unused.

 

 

On the upside, at least it looks like it might work.
Howdy,

 

Depending on the complexity of the job you are doing, it might be worthwhile converting OSM to a PostGIS database in advance of further processing.

 

 

This is commonly done to implement tile renderers but may also be useful for analytical or other purposes.

 

 

Links to tools that do this can be found on the OSM wiki: http://wiki.openstreetmap.org/wiki/PostGIS
Hi Richo,

 

Safe have confirmed that the 2013 reader is slower (only 2.5 times slower for them, 5.5 for me). They've given it PR41669.

 

 

I've tested even further. If I use my incredibly simple workspace and write to a SQLite writer, I get the aforementioned not-using-full-CPU problem. But if I write to "NULL", it does use 100% of a core.

 

 

PostGIS isn't an option; this is just temporary storage so I was storing to SQLite. The final destination will be Oracle.

 

Cheers,

 

Jonathan

 

