
Hello Forum,

 

I have several workspaces where I take a series of CSV exports from our other applications and convert them to usable GIS formats such as tab, shp or gpkg. In this example I am consuming a CSV export from our GMS and publishing it as .tab to location A and as a geopackage to location B.

 

The problem I’m seeing is about the size of the output files. For instance, take this part…

 

[Screenshot: swk1]

 

The orange circled writer creates this tab…

 

[Screenshot: swk2]

 

Which has a combined file size of 56.9 MB…

[Screenshot: swk3]

The blue circled writer creates this gpkg, which is 1.3 GB in size!…

 

[Screenshot: swk4]

 

I know gpkg adds a populated 'id' field, where tab doesn’t. But surely that alone can’t explain the vast difference between the two outputs (tab: 56 MB, gpkg: 1.3 GB). The same issue shows up in the other outputs of this workspace, and in other workspaces where I publish to gpkg.

 

I also thought it might have something to do with indexing, so I set the FeatureWriter to not create the index. But this made no difference to the output size.

[Screenshot: swk5]

 Is there a setting in the workspace causing such huge differences?

 

Note that the tab and geopackage output (for LLPG-postally addressable commercial properties) both have 12,965 records.

 

Thanks,

Stuart

If you read the Geopackage back into FME, does it contain only 12965 records, and is that the only table present in the geopackage? Also, did you define indexes on any of the attributes? Perhaps try manually deleting the geopackage file and re-running the workspace before comparing again.


Thanks for that, david_r,

The gpkg contains 12965 records, the same as the tab/shp.

I did not knowingly define indexes on any of the attributes.

Thanks,

Not a clue, but what happens if you use QGIS to do the same conversion?


Hi,

If I open the tab in QGIS and use ‘Export>Save Features As’ to write a gpkg, the output gpkg is 3,796 KB. And to confirm, the output contents are all correct and present.

Thanks,
Did you try manually deleting the geopackage file, then regenerating the contents with FME? Is the file size still the same?

Also, note that Geopackage is based on SQLite, meaning that you can open the gpkg file with e.g. DB Browser for SQLite to verify the exact contents, indexes, etc.
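
If you would rather script that check than open a GUI, here is a minimal sketch using Python's built-in `sqlite3` module. The function name `summarize_gpkg` and the shape of the returned dictionary are my own choices for illustration, not anything from FME or the GeoPackage spec. It lists every table with its row count, every index, and the page statistics SQLite reports for the file.

```python
import sqlite3


def summarize_gpkg(path):
    """Summarize tables, indexes and page usage of a GeoPackage/SQLite file.

    Hypothetical helper for illustration; 'path' is the .gpkg file to inspect.
    """
    con = sqlite3.connect(path)
    try:
        cur = con.cursor()
        objects = cur.execute(
            "SELECT type, name FROM sqlite_master "
            "WHERE type IN ('table', 'index') ORDER BY name"
        ).fetchall()

        tables = {}
        indexes = []
        for obj_type, name in objects:
            if obj_type == "index":
                indexes.append(name)
                continue
            quoted = '"' + name.replace('"', '""') + '"'
            try:
                count = cur.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
            except sqlite3.Error:
                # Some virtual tables may not be readable in every SQLite build.
                count = None
            tables[name] = count

        page_size = cur.execute("PRAGMA page_size").fetchone()[0]
        page_count = cur.execute("PRAGMA page_count").fetchone()[0]
        freelist_pages = cur.execute("PRAGMA freelist_count").fetchone()[0]
    finally:
        con.close()

    return {
        "tables": tables,
        "indexes": indexes,
        "page_size": page_size,
        "page_count": page_count,
        "freelist_pages": freelist_pages,
        "size_bytes": page_size * page_count,
    }
```

Call it as `summarize_gpkg("output.gpkg")` (the file name is a placeholder). Unexpected extra tables or indexes would show up directly. A large `freelist_pages` value would suggest the file holds unused pages, which can happen when a SQLite file is overwritten in place rather than deleted and recreated; whether that is what is happening here is only a guess.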


My guess is that it's somehow coming from the way the columns are defined. Can you check that the output schema definition looks OK? Are there any text fields with really big widths or something weird like that, or any indexes defined? The MapInfo file also looks to have pretty big DAT file sizes (at least compared to the 3-4 MB file from QGIS).

 

I wasn't able to reproduce what you're seeing myself, but my test data was really small. Defining an index on the output data certainly made the file jump in size, though.
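
One way to test the column-width idea is to total the bytes actually stored in each column of the gpkg table. This is a rough sketch, again with Python's built-in `sqlite3`; the function name `column_byte_totals` is hypothetical, and the totals are approximate because numeric values are measured via their text form. A text column padded out to a fixed width would stand out near the top of the list.

```python
import sqlite3


def column_byte_totals(path, table):
    """Return (column, declared_type, total_bytes) sorted by total_bytes, largest first.

    Hypothetical helper for illustration; totals are approximate for numeric columns.
    """
    def quote(ident):
        # Quote an identifier so unusual table/column names are handled safely.
        return '"' + ident.replace('"', '""') + '"'

    con = sqlite3.connect(path)
    try:
        columns = con.execute(f"PRAGMA table_info({quote(table)})").fetchall()
        totals = []
        for _cid, name, declared_type, *_rest in columns:
            # LENGTH of a BLOB is its size in bytes; NULLs contribute nothing.
            total = con.execute(
                f"SELECT COALESCE(SUM(LENGTH(CAST({quote(name)} AS BLOB))), 0) "
                f"FROM {quote(table)}"
            ).fetchone()[0]
            totals.append((name, declared_type, total))
    finally:
        con.close()

    return sorted(totals, key=lambda item: item[2], reverse=True)
```

If one attribute accounts for most of the bytes, that points at the data being written (for example padded strings) rather than at indexes.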

