Proper use of the OSM reader?

I'm trying to load OpenStreetMap data using FME. This works great with small datasets in 2012; however, as soon as I start working with large datasets (millions of features), I hit the problem that the FME OSM reader acts as a grouper: it reads the entire dataset in before it actually tries to translate anything.

Does this have to happen? Is there a way to disable it or is this a limitation of the XML reader? It's massively slowing down everything because even with 32GB of RAM, FME is still filling it up and then writing to temp disk space (despite the original dataset only being about 7GB uncompressed).

Using 2012 SP4 64bit.

2013 13206 is 5.5 times slower on the small dataset, so not using that despite changes in #13082 to make "it better with large datasets" - not sure what that means.


I think that this is due to the XML reading and it having to go through the entire document to find what you need. I see similar results when I use the AIXM reader and set the Feature Types to Read to one feature type. It still reads all the features but only passes the one I chose through to the workbench.

-Sean

Thanks Sean. I gathered that much, but it doesn't make sense because FME has already done that when I first added the reader (so I pointed the reader at a smaller dataset). Furthermore, why would it need to store all of the features rather than aggregating a list of them and testing against that?
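To illustrate why I don't think XML itself forces everything to be held in memory, here is a minimal sketch of streaming an OSM XML file with Python's standard library. It is not how the FME reader works internally (I don't know that); the function name `count_osm_elements` and the tiny sample document are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

def count_osm_elements(source):
    """Stream an OSM XML document and count nodes, ways and relations."""
    counts = {"node": 0, "way": 0, "relation": 0}
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the <osm> root
    for event, elem in context:
        if event == "end" and elem.tag in counts:
            counts[elem.tag] += 1
            # Drop the children parsed so far so memory use stays roughly flat.
            root.clear()
    return counts

# Hypothetical three-element sample standing in for a multi-gigabyte file.
sample = io.BytesIO(b"""<osm version="0.6">
  <node id="1" lat="51.5" lon="-0.1"/>
  <node id="2" lat="51.6" lon="-0.2"><tag k="name" v="x"/></node>
  <way id="10"><nd ref="1"/><nd ref="2"/><tag k="highway" v="residential"/></way>
</osm>""")
print(count_osm_elements(sample))
```

The same loop could hand each finished element to a writer instead of counting it. Ways only carry node references, so building geometry would still need some kind of node lookup, which may be part of why a reader caches so much.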

It gets even worse. My log contains this:

2012-11-15 18:32:12|11859.9| 0.5|INFORM|XML Reader mapped feature # 35110000

2012-11-15 18:32:12|11860.4| 0.5|INFORM|XML Reader mapped feature # 35112000

2012-11-15 18:32:20|11862.6| 2.2|INFORM|XML Reader mapped feature # 35114000

2012-11-15 18:32:32|11865.8| 3.2|INFORM|XML Reader mapped feature # 35116000

2012-11-15 18:32:43|11869.7| 3.8|INFORM|XML Reader mapped feature # 35118000

2012-11-15 18:32:53|11873.3| 3.6|INFORM|XML Reader mapped feature # 35120000

2012-11-15 18:33:06|11877.9| 4.6|INFORM|XML Reader mapped feature # 35122000

2012-11-15 18:33:21|11884.6| 6.8|INFORM|XML Reader mapped feature # 35124000

Note how the loading times shoot up from 0.5 seconds per 2,000 features to 6.8 seconds. It never got that quick again.

As of when I got in this morning, I'm seeing this:

2012-11-16 08:50:36|62778.9|173.2|INFORM|XML Reader mapped feature # 36588000

2012-11-16 08:56:14|63116.4|337.4|INFORM|XML Reader mapped feature # 36590000

2012-11-16 09:04:55|63637.1|520.7|INFORM|XML Reader mapped feature # 36592000

2012-11-16 09:13:09|64130.8|493.7|INFORM|XML Reader mapped feature # 36594000

That's 493 seconds to read 2000 features! Looking at the performance chart it seems that FME is now constantly reading my HDD (FME_TEMP has 12GB of data, my RAM has 17GB).
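To put numbers on the slowdown, here is a rough sketch that turns those log lines into features per second. It assumes the third pipe-delimited field is the interval since the previous line, which is how I have been reading it; the helper names `parse_line` and `throughput` are just for this example.

```python
def parse_line(line):
    """Return (delta_seconds, feature_number) from one pipe-delimited log line."""
    fields = [f.strip() for f in line.split("|")]
    delta_seconds = float(fields[2])  # assumed: seconds since the previous line
    feature_number = int(fields[4].rsplit("#", 1)[1])
    return delta_seconds, feature_number

def throughput(lines):
    """Yield (feature_number, features_per_second) for each line after the first."""
    previous = None
    for line in lines:
        delta, feature = parse_line(line)
        if previous is not None and delta > 0:
            yield feature, (feature - previous) / delta
        previous = feature

# Two consecutive lines from each excerpt of my log.
early = [
    "2012-11-15 18:32:12|11859.9| 0.5|INFORM|XML Reader mapped feature # 35110000",
    "2012-11-15 18:32:12|11860.4| 0.5|INFORM|XML Reader mapped feature # 35112000",
]
late = [
    "2012-11-16 09:04:55|63637.1|520.7|INFORM|XML Reader mapped feature # 36592000",
    "2012-11-16 09:13:09|64130.8|493.7|INFORM|XML Reader mapped feature # 36594000",
]

for label, excerpt in (("early", early), ("late", late)):
    for feature, rate in throughput(excerpt):
        print(f"{label}: feature {feature} at {rate:.2f} features/s")
```

On that reading, the rate falls from thousands of features per second to single digits.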

It appears that the OSM reader in 2012 is utterly unusable for large datasets, even on a very powerful machine.

Does anyone know if these issues are resolved in 2013?

Ok, so I bit the bullet and went with 2013. It doesn't do the XML reader feature mapping (guess that's the #13082 change) but still has its own issues:

a) It seems to be much slower than 2012.

b) It's not using much of my hardware capacity. On average it's using about 30-40% of a single core. FME typically uses 100% of a core. In fact, I've never seen a processing task use less except in specific circumstances. RAM usage is a steady 100MB (good thing) and HDD reading/writing is a steady ~1-2MB. So there's no obvious hardware reason for this as the machine is otherwise unused.

On the upside, at least it looks like it might work.

Howdy,

Depending on the complexity of the job you are doing it might be worthwhile converting OSM to a Postgis database in advance of further processing.

This is commonly done to implement tile renderers but may also be useful for analytical or other purposes.

Links to tools that do this can be found on the OSM wiki: http://wiki.openstreetmap.org/wiki/PostGIS

Hi Richo,

Safe have confirmed that the 2013 reader is slower (only 2.5 times slower for them, 5.5 for me). They've given it PR41669.

Testing even further. If I use my incredibly simple workspace and write to a SQLite writer, I get the aforementioned not-using-full-CPU problem. But if I write to "NULL", it does use 100% of a core.

PostGIS isn't an option; this is just temporary storage so I was storing to SQLite. The final destination will be Oracle.

Cheers,

Jonathan

I know this thread is over a decade old, but I’m still experiencing the same slow reading of large OSM files in FME 2023 build 23332. Has there been any progress or revelations on getting large OSM files to read faster? I'm currently trying to read a single OSM file that is over 22 GB in size.
