Question

Optimize 500 MB GeoJSON read


sigtill
Contributor

Is there any way to optimize reading a 500 MB GeoJSON file? I'm already using 64-bit Windows with 10 GB RAM and am trying to squeeze down the time it takes to read it.

 

12 replies

fmelizard
Contributor
  • Contributor
  • March 22, 2013
Maybe downloading the file locally might help, but if you are interested in dynamic data, then it might not be what you are looking for.

sigtill
Contributor
  • Author
  • Contributor
  • March 22, 2013
It's already downloaded locally (and there is also an option in the reader settings to keep it locally).

david_r
Evangelist
  • March 22, 2013
Hi,

Unfortunately, I don't have any tricks for such huge GeoJSON files, but here is an alternative strategy if speed is crucial:

Read the file inside a PythonCreator, using the geojson module to convert the features into FMEFeature objects. I'd be surprised if that wasn't a fair bit quicker.
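
A minimal sketch of the parsing half of this idea, assuming the file is a standard GeoJSON FeatureCollection. It swaps in the standard-library `json` module for the `geojson` package so that it runs without extra installs, and it yields plain tuples. Inside a PythonCreator each tuple would still have to be turned into an FMEFeature, which is not shown here. The function name `iter_features` and the sample data are hypothetical.

```python
import json


def iter_features(text):
    """Yield (geometry_type, coordinates, properties) for each feature.

    Assumes `text` holds one GeoJSON FeatureCollection. The whole document
    is parsed in memory, so memory use can be several times the file size.
    """
    collection = json.loads(text)
    for feature in collection.get("features", []):
        geometry = feature.get("geometry") or {}  # geometry may be null
        yield (
            geometry.get("type"),
            geometry.get("coordinates"),
            feature.get("properties") or {},
        )


# Hypothetical sample standing in for the real 500 MB file.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [10.0, 59.9]},
            "properties": {"name": "Oslo"},
        },
        {"type": "Feature", "geometry": None, "properties": None},
    ],
})

records = list(iter_features(sample))
```

For a real file, the text would come from `open(path, encoding="utf-8").read()`; whether this beats the native reader would have to be measured.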

 

 

Good luck,

David

fmelizard
Contributor
  • Contributor
  • March 22, 2013
How about a GeoJSON-to-FFS conversion, and then using the FFS?

sigtill
Contributor
  • Author
  • Contributor
  • March 22, 2013
Itay: That's cheating, since the benchmark test is GeoJSON -> SQLite.

It takes 12 minutes to read the GeoJSON with 64-bit, 10 GB RAM and an SSD, and 4 minutes to write the SQLite. Not a long time, but I just wanted to see if it was possible to cut it even more.

fmelizard
Contributor
  • Contributor
  • March 22, 2013
Ha :) you're just testing... 4 min is not a lot of time; I've seen worse cases.

david_r
Evangelist
  • March 22, 2013
Agree with Itay: 12 minutes to parse 500 MB of text into the internal feature representations is actually quite impressive.

I'm curious about the reasoning behind these tests.

David

fmelizard
Contributor
  • Contributor
  • March 22, 2013
My guess is: the unfortunate human desire for faster and more... :)

sigtill
Contributor
  • Author
  • Contributor
  • March 22, 2013
Just comparing with Arc software and open-source tools, to brand FME as the fastest :)

david_r
Evangelist
  • March 22, 2013
Let me know if you want some competition from a pure Python solution using the geojson and sqlite3 modules ;-)
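
A rough sketch of what such a pure Python GeoJSON -> SQLite run could look like, not a benchmarked solution. It uses the standard-library `json` module in place of the `geojson` package, and the function name `geojson_to_sqlite`, the table name and the two-column schema are all hypothetical choices made for illustration.

```python
import json
import sqlite3


def geojson_to_sqlite(text, conn, table="features"):
    """Load a GeoJSON FeatureCollection into a two-column SQLite table.

    Geometry and properties are stored as JSON text. This is an
    illustrative schema, not what FME's SQLite writer produces.
    """
    features = json.loads(text).get("features", [])
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" (geometry TEXT, properties TEXT)'
    )
    rows = (
        (json.dumps(f.get("geometry")), json.dumps(f.get("properties")))
        for f in features
    )
    # One transaction for all rows avoids a commit per insert.
    with conn:
        conn.executemany(f'INSERT INTO "{table}" VALUES (?, ?)', rows)
    return len(features)


# Hypothetical sample standing in for the real 500 MB file.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [10.0, 59.9]},
            "properties": {"name": "Oslo"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [5.3, 60.4]},
            "properties": {"name": "Bergen"},
        },
    ],
})

conn = sqlite3.connect(":memory:")
inserted = geojson_to_sqlite(sample, conn)
row_count = conn.execute('SELECT COUNT(*) FROM "features"').fetchone()[0]
```

Batching the inserts through `executemany` in a single transaction is usually the main lever on the write side; the read side is still dominated by parsing the JSON text.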

 

 

But if user-friendliness enters as a parameter to the tests, there is no question that FME will win hands down, regardless!

David

sigtill
Contributor
  • Author
  • Contributor
  • March 22, 2013
Since we are in competitive mode, have a look at this thread with comments regarding ArcPy and Dissolve. 4.1 seconds with FME :)

http://www.mindland.com/wp/solving-the-arcpy-dissolve/

 

 


david_r
Evangelist
  • March 22, 2013
Yeah, I saw that post, very interesting.

But comparing 4.1 (FME) vs 4.5 (Python shapely) vs 1.5 (JEQL) seconds is a bit moot when everybody is sitting on wildly different hardware ;-)

Still, fascinating discussion.

David
