Archived

Parallel Reads and Writes

Related products: FME Form

8 years ago
March 30, 2017
4 replies
8 views

jerrodstutzman
3 replies

Many times, I have a workflow that reads multiple tables from one database, performs a few simple tasks on each one individually, and writes to another database. If there are 10 tables, it completes one before moving on to the next. This can add up to long processing times. However, if I split that job into 10 separate workbenches and run them concurrently on FME Server, the total processing time is drastically reduced because they run in parallel. Unfortunately, that method creates a data management nightmare.

My suggestion is to allow parallel reads and writes within one workbench (when those reads/writes don't depend on the other reads/writes in the same workbench).

Obviously this wouldn't apply if there are table joins or any transformers that hold features.

Example: The screenshot below is a job that takes nearly 24 hours to run due to large amounts of data. However, no single table takes more than a few hours. But since FME runs in "serial" mode, all those hours are added together in the end.
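To put rough numbers on the serial-versus-parallel difference, here is a minimal sketch. The per-table durations are hypothetical (chosen only so they sum to about a day) and are not taken from the job described above; `serial_hours` and `parallel_hours` are illustrative helper names, and the parallel figure is a simple greedy estimate rather than anything FME computes.

```python
import heapq


def serial_hours(durations):
    """Wall-clock time when tables are processed one after another."""
    return sum(durations)


def parallel_hours(durations, workers):
    """Estimated wall-clock time when up to `workers` tables run at once.

    Greedy estimate: the longest remaining table always goes to the
    worker that becomes free first.
    """
    finish_times = [0.0] * workers
    heapq.heapify(finish_times)
    for hours in sorted(durations, reverse=True):
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + hours)
    return max(finish_times)


# Hypothetical per-table durations in hours (assumed, not measured).
table_hours = [3.0, 2.5, 2.5, 2.0, 2.0, 2.0, 3.0, 2.5, 2.0, 2.5]

print("serial:", serial_hours(table_hours))
print("10 at once:", parallel_hours(table_hours, 10))
print("4 at once:", parallel_hours(table_hours, 4))
```

With unlimited parallelism the run is bounded by the slowest table rather than by the sum of all tables, which is the gain this idea is after.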


+13

rylanatsafe
Safer
627 replies
7 years ago
September 12, 2017

Try replacing all your "native" FME Writers with FeatureWriter Transformers! This should prevent holding data in memory while one writer writes at a time...

+17

bruceharold
Contributor
338 replies
7 years ago
September 12, 2017

Another pattern is to use WorkspaceRunner with no wait for completion. For example, if your data are in a directory, use the Path reader, process the path_windows inputs, and write with dataset fanout @Value(fme_basename).

One caveat for Data Interoperability users: the process limit code appears not to be implemented and you get as many processes as you have inputs, which is exciting!
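For anyone who wants the same fire-and-forget pattern with a process cap enforced outside the workspace, a rough Python sketch follows. It is not the WorkspaceRunner itself: it assumes the `fme` executable is on the PATH, that the child workspace exposes a published parameter named `TABLE_NAME` (a hypothetical name), and that `copy_table.fmw` in the usage note is a placeholder workspace.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(workspace, table, fme_exe="fme"):
    # Command-line form: fme <workspace> --<PARAMETER> <value>.
    # TABLE_NAME is a hypothetical published parameter of the workspace.
    return [fme_exe, workspace, "--TABLE_NAME", table]


def run_one(command):
    # Blocks until the child process ends and returns its exit code.
    return subprocess.run(command).returncode


def run_all(workspace, tables, max_processes=4, runner=run_one):
    """Run one child process per table, at most `max_processes` at a time."""
    commands = [build_command(workspace, table) for table in tables]
    with ThreadPoolExecutor(max_workers=max_processes) as pool:
        exit_codes = list(pool.map(runner, commands))
    return dict(zip(tables, exit_codes))
```

Calling `run_all("copy_table.fmw", ["parcels", "roads", "owners"], max_processes=2)` would start at most two translations at once and return a table-to-exit-code mapping. The `runner` argument is only there so the launcher can be exercised without FME installed.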

+19

fmelizard
Safer
3725 replies
7 years ago
September 14, 2017

Hi @jerrodstutzman -- the idea of analyzing the workspace to look for independent flows and then running them in parallel is a very good one. As @RylanAtSafe mentions, in the meantime, using FeatureWriters will provide at least some efficiency boost if your outputs were going to different writers entirely. However, we'll keep thinking about the idea of doing graph analysis and splitting (if the workspace author agrees) chunks to be run in parallel.
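As an illustration of what "graph analysis" could mean here, the sketch below treats a workspace as a set of connections and groups objects that share no links into separate flows. This is only one possible reading, not a description of how FME works internally; the function name `independent_flows` and the reader, writer, and transformer labels in the sample data are all hypothetical.

```python
from collections import defaultdict


def independent_flows(connections):
    """Group workspace objects into flows that share no connections.

    `connections` is a list of (source, target) pairs. Each returned flow
    is a connected component of the undirected connection graph.
    """
    neighbours = defaultdict(set)
    for source, target in connections:
        neighbours[source].add(target)
        neighbours[target].add(source)

    seen = set()
    flows = []
    for start in list(neighbours):
        if start in seen:
            continue
        stack, flow = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            flow.add(node)
            stack.extend(neighbours[node] - seen)
        flows.append(sorted(flow))
    return flows


# Hypothetical workspace: two simple copy flows and one flow with a join.
connections = [
    ("read_parcels", "AttributeManager_1"),
    ("AttributeManager_1", "write_parcels"),
    ("read_roads", "write_roads"),
    ("read_owners", "FeatureMerger"),
    ("read_permits", "FeatureMerger"),
    ("FeatureMerger", "write_permits"),
]

for flow in independent_flows(connections):
    print(flow)
```

The two tables joined through the FeatureMerger stay in one flow, while the other two flows have nothing in common and are candidates for running side by side.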

rbell
Contributor
2 replies
3 years ago
April 4, 2022

I like this idea. This would resemble what SSIS does, being able to link dependencies graphically. If there are no dependencies, the activity can start in parallel as the software best determines.
