Question

Optimize reading a 500 MB GeoJSON file


Badge +21

Is there any way to optimize reading a 500 MB GeoJSON file? I'm already using 64-bit Windows with 10 GB RAM and am trying to squeeze down the time it takes to read it.

 


12 replies

Userlevel 4
Badge +13
Maybe downloading the file locally might help, but if you are interested in dynamic data, then it might not be what you are looking for.
Badge +21
It's already downloaded locally (and there is also an option in the reader settings to keep it locally).
Userlevel 4
Hi,

 

 

Unfortunately, I don't have any tricks for GeoJSON files that large, but here is an alternative strategy if speed is crucial:

 

 

Read the file inside a PythonCreator, using the geojson module to deserialize the features into FMEFeature objects. I'd be surprised if that wasn't a fair bit quicker.
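
A minimal sketch of the parsing half of that idea, assuming the file is a single FeatureCollection. It uses the standard-library json module in place of the geojson package so that it runs without extra installs; the helper name iter_features is hypothetical, and turning each result into an FMEFeature inside the PythonCreator is not shown.

```python
import json


def iter_features(path):
    """Yield (geometry, properties) for each feature in a GeoJSON FeatureCollection."""
    # json.load reads the whole document into memory at once, so a large
    # file can need considerably more RAM than its size on disk.
    with open(path, encoding="utf-8") as handle:
        collection = json.load(handle)
    for feature in collection.get("features", []):
        # "properties" may be null in GeoJSON; normalise it to an empty dict.
        yield feature.get("geometry"), feature.get("properties") or {}
```

Each yielded pair holds plain dicts, so the per-feature conversion step can be timed separately from the JSON parsing.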

 

 

Good luck

 

 

David
Userlevel 4
Badge +13
How about a GeoJSON-to-FFS conversion, and then using the FFS?
Badge +21
Itay: That would be cheating in the benchmark test, which is GeoJSON -> SQLite.

 

 

It takes 12 minutes to read the GeoJSON with 64-bit, 10 GB RAM, and an SSD, and 4 minutes to write the SQLite. Not a long time, but I just wanted to see if it was possible to cut it even more.
Userlevel 4
Badge +13
Ha :) You're just testing... 4 minutes is not a lot of time; I've seen worse cases.
Userlevel 4
I agree with Itay; 12 minutes to parse 500 MB of text into the internal feature representations is actually quite impressive.

 

 

I am curious about the reasoning behind these tests.

 

 

David
Userlevel 4
Badge +13
My guess: the unfortunate human desire for faster and more... :)
Badge +21
Just comparing to Arc software and other open-source tools, to brand FME as the fastest :)
Userlevel 4
Let me know if you want some competition from a pure Python solution using the geojson and sqlite3 modules ;-)

 

 

But if user-friendliness enters as a parameter in the tests, there is no question that FME will win hands-down, regardless!
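
For reference, a rough sketch of what such a pure Python GeoJSON -> SQLite run could look like. It is only an illustration under stated assumptions: it uses the standard-library json module rather than the geojson package, stores geometry and properties as JSON text instead of a real spatial type, and the function name and table layout are hypothetical.

```python
import json
import sqlite3


def geojson_to_sqlite(geojson_path, sqlite_path, table="features"):
    """Copy each feature into a SQLite table as two JSON text columns."""
    with open(geojson_path, encoding="utf-8") as handle:
        features = json.load(handle).get("features", [])

    rows = (
        (json.dumps(f.get("geometry")), json.dumps(f.get("properties") or {}))
        for f in features
    )
    conn = sqlite3.connect(sqlite_path)
    try:
        # A single transaction for all inserts; committing per row is much slower.
        # The table name is interpolated directly, so it must be a trusted value.
        with conn:
            conn.execute(
                f'CREATE TABLE IF NOT EXISTS "{table}" (geometry TEXT, properties TEXT)'
            )
            conn.executemany(
                f'INSERT INTO "{table}" (geometry, properties) VALUES (?, ?)', rows
            )
    finally:
        conn.close()
    return len(features)
```

The function returns the number of features written, which makes it easy to wrap in a timer for a like-for-like comparison.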

 

 

David
Badge +21
Since we are in competitive mode, have a look at this thread with comments regarding ArcPy and Dissolve. 4.1 seconds with FME :)

 

 

http://www.mindland.com/wp/solving-the-arcpy-dissolve/

 

 

Userlevel 4
Yeah, I saw that post, very interesting.

 

 

But comparing 4.1 (FME) vs. 4.5 (Python Shapely) vs. 1.5 (JEQL) seconds is a bit moot when everybody is sitting on wildly different hardware ;-)

 

 

Still, fascinating discussion.

 

 

David
