Question

SHP TO SQLITE Spatial Very Slow

  • 10 February 2020
  • 12 replies
  • 10 views

Badge

have a few shape files that I'm merging together using multiple transformers including LineOnLineOverlayer and Line Combiner and writing to SQLITE Spatial format. The file processing takes a few minutes but the SQLite Spatial writer takes up to half an hour to write 130,000 line features. It must find something disagreeable with the processed features that is over my head. Once the features are in SQLite format, and doing any additional processing, the SQLITE writer has no more issues. I have tried it figuring this out from all angles, but its time to reach out to the pros. I have attached a sample line feature right before and after SQLite Writer. If you need more details let me know.

Thanks in Advance.

I


12 replies

Badge +3

It can be indices that do this, or the file system/media being written to is slow.

For checking the impact of indices, compare by writing to an entirely new SQLite Spatial database with no index creation options. Similarly the impact of the downstream file system can be compared by instead writing it to a fast local storage.

Also note that the index creation Yes\\No dropdown on the field parameters in the Writer create the indices before writing the features instead of afterwards. It is instead recommended to create indices in the Writers by using CREATE INDEX statements in the SQL to Run after Write parameter.

Badge

@bwn

It is not the media written to because if I switch to SHP writer it is super fast. And once the file is made, with any additional processing the SQL writer goes fast as well.

I'm using the SqLite Spatial(FDO) writer. This does not have the index creation option. I need to use this writer because of file compatibility.

Badge +3

@bwn

It is not the media written to because if I switch to SHP writer it is super fast. And once the file is made, with any additional processing the SQL writer goes fast as well.

I'm using the SqLite Spatial(FDO) writer. This does not have the index creation option. I need to use this writer because of file compatibility.

The SQLite Spatial FDO Writer in latest FME does have index creation options.

On my PC, just tested the Writer. It wrote 750,000 complex polygons (Cadastre with a very large number of attributes) to a Spatial FDO database on my local SSD in 59 seconds. So the underlying Writer itself looks to be high performance.

Can you screenshot how you've configured the Writer? I suspect it is something in there? The data schema you've shown doesn't look complex and shouldn't be the issue.

Badge

@bwn

EDIT: If I merge the features into an existing SQLite dataset the writer is fast as well.

I updated to FME Desktop 2019.2.3(latest version) and same results. Here are my writer settingsfme settings.PNGfme settings2.PNG

Badge +3

@bwn

EDIT: If I merge the features into an existing SQLite dataset the writer is fast as well.

I updated to FME Desktop 2019.2.3(latest version) and same results. Here are my writer settingsfme settings.PNGfme settings2.PNG

Thanks, a couple of questions:

  • What were the settings in the User Attributes Tab?
  • What is the full path of the SQLite database being written? Is this is a local drive or a network drive?

Apart from that, you don't need "Overwrite Table" set to "Yes" when the "Overwrite Existing Database" option is turned on, but that shouldn't make any difference.

Possibly to dig down a bit further is looking over the Translation Log to see at what steps the time is being consumed. It seems a little strange that creating a new SQLite database vs writing to an existing one would create any difference. The process to create a new DB with SQLite takes virtually no time at all, unless something like an overzealous background virus protection program takes a lot of time with the new file before deciding to allow it to be accessed further.

If the Translation Log "Options" button is clicked, there is a option to "Log Timestamp Information". This should show the relative times between creating the database file, when FME begins to write to the file, and how long it takes.

Badge

fme user attr.PNG

@bwn

The full file path is a local drive.

I'll do a few translations with timestamp on.

But what I have discovered, when I bypass the line modifying transformers the SQLite Writer is very happy and goes fast. The Line combiner is what really slows it down. And these line features are 2D, as simple as they get. Writing the same files to Shape file format to same directory goes lightning fast. I'm about ready to believe there's a bug in the writer.

Badge +3

fme user attr.PNG

@bwn

The full file path is a local drive.

I'll do a few translations with timestamp on.

But what I have discovered, when I bypass the line modifying transformers the SQLite Writer is very happy and goes fast. The Line combiner is what really slows it down. And these line features are 2D, as simple as they get. Writing the same files to Shape file format to same directory goes lightning fast. I'm about ready to believe there's a bug in the writer.

Perhaps it is something as silly as the Writer not maintaining an active connection to the SQLite database when the Writer has to "wait" on upstream transformers and it keeps disconnecting and re-connecting on every few features, causing a cumulative time overhead in having to perform extra connection/disconnection steps numerous times (SQLite writing is by doing database transactions: Anything that wants to write data to an SQLite file first has to connect to it and this step takes a little bit of processing time).

A workaround may be to insert a FeatureHolder immediately upstream of the Writer to prevent FME from writing any features to SQLite until all the features are ready to be written. FeatureHolder will stop the Writer from doing anything until everything like LineCombiner is finished first.

Another potential problem is that the geometry coming out of LineCombiner may not be "simple" enough to directly write to fdo_line format in SQLite spatial format, and the Writer is churning in having to convert the geometry from one form to another, even though in Inspector the geometries "look" like simple 2D lines. I've recently experienced a little bit of craziness in FME in that some of the geometries coming out of LineCombiner seem to have some hidden geometry traits that upset other downstream transformers.

The "hack" I used that seemed to work was post-processing all the geometries with GeometryRefiner as a final step, which in theory shouldn't have done anything, but for unexplained reasons allowed the downstream transformers to function properly again, even though Inspector was indicating that it didn't change anything on any geometry properties or traits that were exposed.

Badge

@bwn

The features all load into the writer before it starts outputting, so FeatureHolder did not have any affect.

I was sure GeometryRefiner would do the trick, but unfortunately it had no effect either.

If I would have a way of privately sharing this project with you I would gladly do so.

Badge +3

@bwn

The features all load into the writer before it starts outputting, so FeatureHolder did not have any affect.

I was sure GeometryRefiner would do the trick, but unfortunately it had no effect either.

If I would have a way of privately sharing this project with you I would gladly do so.

The behaviour seems somewhat inconsistent, so yes that certainly makes it difficult to diagnose.

So just to check whether missed anything:

  1. The Workflow Writes OK if the Writer does not create a new SQLite file, but instead writes to an existing SQLite file (or was this instead, append to an existing table?)
  2. The Workflow Writes OK if the Writer does not use the output from LineCombiner
  3. The Workflow Write OK if instead use the SHP file Writer

That is indeed difficult to pin down. 1 and 2 don't seem to have anything in common.

If it is to do with the Writer struggling with the FME feature geometry in some way, then perhaps try switching the geometry type to "fdo_null", and seeing if that makes any difference to narrow down possibilities. It there is no difference then it suggests it is not to do with the geometries themselves.

Otherwise, really clutching at straws, but perhaps FME and SQLite are struggling with SQLite's dynamic data typing. Each value in SQLite has its own data type, the setting in the Table definition is just a "loose" data typing, it actually doesn't get decided on a per value level until something attempts to commit the data in a transaction, but the application writing the data can optionally provide "hints" to SQLite via the API in order to help it decide if it is a string, a numeric etc. Whether the FME Writer uses this functionality for or not is unknown. LineCombiner might upset the carrying forward of the workspace schema into the Writer and that as a result the Writer keeps trying to hint at data types that SQLite disagrees with and overrules causing time delay. But that doesn't really explain why Workaround 1 seems to be successful, and I don't think I've seen a real-world instance of this happening for applications writing to SQLite despite it being a possibility. Other things that can be tried in this space is setting the Table Schema in the Writer Manually and providing your own hints through setting the Field data types in the drop downs.

Perhaps trialling yet another variation of Workaround 1 , this can be emulated by setting the Writer to instead use blank SQLite database file with a zero record table already setup as a Template file in the Writer Parameters which effectively adds a step of copying an existing SQLite file to a new file path, and then opening this and writing to it. This overrides the Writer's default behaviour of getting the API to generate a new SQLite Spatial database from scratch. I can't see why this should work or make any difference, but if that is what your testing suggests then who knows!

Also, I'm not sure how flexible you are with downstream format, but SpatiaLite is somewhat functionally equivalent and tends to be jointly supported by GIS packages that can read alternately either SQLite Spatial or SpatiaLite. The geometry schema and schema metadata tracking is a little different, but at their core they store data basically the same way, although the support for one format vs the other can vary depending on what application are needing to use it in.

Badge

@bwn

Thanks for the detailed reply.

I should clarify your statement 1.

I'm doing my data processing in two main steps The main data set consists of three combined subsets. I run the subset processes individually, which consists of reading the original shape files, processing them to my data standards, then writing them as SQLite Files. This is where the writer is very slow.

The second process, is basically integrating the SQLite subsets into the main data set. At this point the writer is very fast, I can do any additional line modifications, like LineCombiner and LineOnLineOverlayer, the writer is unnoticeable y affected. That tells me once the data is structured for SQLite the writer is very happy..

Setting the geometry to "fdo_null" does not speed up the writer, so this is not geometry related.

The reason I need SQLite, I need the final files to work for AutoCAD Map and QGIS, and I haven't been able to get the Spatialite files to work with AutoCAD Map. The SpatialLite writer does not have those slowness issues though.

Badge

The best solution I've come up with, is to use AutoCAD Map or QGIS to convert the original shapefiles to SQLite, then do my FME file processing. Using this method, the SQLite writer has the normal speed. Its an extra step, but for the half hour + I save its not a big deal.

Badge +2

@djcad With workspaces, data and additional information supplied by @djcad of-line, it looks like this degradation in performance for the SQLITE3FDO writer (and possibly SDF writer) is caused by a conflict between the FDO 3.8 libraries used by FME (that correspond to AutoCAD 2018 IIRC) and an install of Autodesk 2018. If you do not have Autodesk installed, or if you have another version of Autodesk, then you don't see this issue.

We'll try and track down the cause of the problem

Reply