
Hey Team!

 

I'm scratching my head over a problem that I'm sure one of the gurus on here can help with.

 

I have some very large spatial files that I need to store in a Snowflake environment.

Snowflake has a few big issues when it comes to spatial data. It has very strict acceptance criteria for anything submitted as geometry, so we have decided to store all of the spatial information as a binary field to extract later (using the GeometryExtractor).

However, when writing this binary attribute to Snowflake there is a size limit of 16 MB... and when some of your features have over a million vertices, this is not really going to work.

We are looking into storing the FME binary across a number of attributes (_geom1, _geom2, etc.) to stitch back together when needed, but can't get any string-based functions to work because the value is now in binary format. I have tried SubstringExtractors and attribute compressors without much luck.

Any Idea?

Here's some Python code to split up an FME binary (e.g. from the GeometryExtractor) into chunks of a pre-defined size (1,600 bytes in this small example; for Snowflake's 16 MB limit you probably want to set it to 16777216, or one less):

def chunked(size, source):
    # Yield successive slices of the binary, each at most `size` bytes.
    for i in range(0, len(source), size):
        yield source[i:i + size]

def FMEBufferSplitter(feature):
    chunk_size = 1600  # change as needed
    geom = feature.getAttribute('_geometry')
    parts = chunked(chunk_size, geom)
    # Write the chunks to a list attribute: _geometry_parts{0}, _geometry_parts{1}, ...
    feature.setAttribute('_geometry_parts{}', list(parts))

 

And conversely, here's how to join them back together:

def FMEBufferJoiner(feature):
    # Read the list attribute back; default to an empty list if it's missing.
    parts = feature.getAttribute('_geometry_parts{}') or []
    joined = b''.join(parts)
    feature.setAttribute('_geometry', joined)
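Outside of FME, the split/join round trip is easy to sanity-check with plain Python: `chunked` followed by `b''.join` should reproduce the original bytes exactly, with every chunk at or under the size limit. (The byte string below is just a stand-in for a real geometry blob.)

```python
def chunked(size, source):
    # Yield successive slices of the binary, each at most `size` bytes.
    for i in range(0, len(source), size):
        yield source[i:i + size]

# Simulate a large binary geometry blob (25,600 bytes).
blob = bytes(range(256)) * 100

parts = list(chunked(1600, blob))
assert all(len(p) <= 1600 for p in parts)  # every chunk fits the limit
assert b''.join(parts) == blob             # lossless round trip
```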

I've also attached a demo workspace (2022.2) with sample data from the FME training dataset.

If possible, I would rather consider writing the geometries to something like PostGIS and then using a GUID in Snowflake as a foreign key to the table holding the actual geometry.
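For that alternative, a minimal sketch of the PostGIS side might look like this. The table and column names are assumptions, not from the post, and in production you would use a parameterized query through a driver such as psycopg2 rather than string formatting:

```python
import uuid

def postgis_insert_for(wkb_hex):
    # Build a GUID and a matching INSERT for a hypothetical PostGIS table
    # 'geometries' (table name, columns, and SRID are assumptions).
    guid = str(uuid.uuid4())
    sql = (
        "INSERT INTO geometries (id, geom) "
        f"VALUES ('{guid}', ST_GeomFromWKB(decode('{wkb_hex}', 'hex'), 4326))"
    )
    return guid, sql

guid, sql = postgis_insert_for("0101000000000000000000f03f000000000000f03f")
# The same GUID is then written to the Snowflake row as a plain VARCHAR key,
# so Snowflake never has to hold the oversized geometry itself.
```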



This worked perfectly thanks!

