Survived another week? Same. And, this TGIF, we’ve got Nia, one of our FME Flow Technical Specialists, here to help you make sure your FME Flow deployment keeps surviving and thriving, too.
If you need a highly available FME Flow that can survive server maintenance windows, unexpected crashes, cyber attacks, or even natural disasters like floods and wildfires, then stay tuned while we explore two essential concepts: fault tolerance and disaster recovery in FME Flow.
Fault tolerance vs disaster recovery: What’s the difference?
These two architecture types are often confused, so let’s clear up the difference.
Fault tolerance keeps FME Flow up and running if one of its servers goes down. A fault tolerant FME Flow deployment is hosted in a single data center.*
Disaster recovery keeps FME Flow going if an entire data center goes down and all servers in an FME Flow installation are lost. FME Flow with disaster recovery spans two or more data centers.
In other words, fault tolerance ensures redundancy across components in a single installation, while disaster recovery ensures redundancy across installations in multiple data centers.
Fault tolerant FME Flow: What is it?
A fault tolerant FME Flow deployment is a high-availability installation type. It’s a lot like a distributed installation, where FME Flow’s components are spread across different machines at a single data center.*
The key is that each of FME Flow’s components is redundant: there are two (or more) web application server nodes and core nodes, and often two or more engine hosts as well. In addition, the database and the system share can also be made highly available.

[Image: from Planning for Fault Tolerance in the FME Flow administrator’s guide.]
All components’ host machines are running and communicating with each other at the same time. Another name for this is an active-active setup. It ensures that if one machine in the FME Flow installation goes down, FME Flow can rely on the redundant component’s host machine to continue running. That switchover is called a failover event.
You can select fault tolerant when running the FME Flow installer to deploy this architecture. Please note that you will have to provide your own load balancer to route traffic to the redundant web application servers.
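To make the load balancer's role concrete, here is a minimal sketch of round robin routing that skips a web application server once it has been marked down. It is a toy model, not how any particular load balancer product works, and the host names and the `RoundRobinBalancer` class are made up for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotate requests across web application servers, skipping unhealthy ones."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._rotation = cycle(self.nodes)

    def mark_down(self, node):
        # Called when a health check (not modelled here) reports a failure.
        self.healthy.discard(node)

    def mark_up(self, node):
        if node in self.nodes:
            self.healthy.add(node)

    def next_node(self):
        # Try each node at most once per call so we never loop forever.
        for _ in range(len(self.nodes)):
            node = next(self._rotation)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy web application server available")


# Hypothetical host names for two redundant web application servers.
balancer = RoundRobinBalancer(["flow-web-1", "flow-web-2"])
first_two = [balancer.next_node() for _ in range(2)]

balancer.mark_down("flow-web-1")  # simulate a host failure
after_failure = [balancer.next_node() for _ in range(3)]
```

While both hosts are healthy, requests alternate between them. Once `flow-web-1` is marked down, every request lands on `flow-web-2`, which is the behaviour you want from whichever load balancer you put in front of a fault tolerant installation.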
Fault tolerant FME Flow: When do I need it?
A fault tolerant FME Flow installation is particularly useful in the following situations:
- SLAs that require high availability. Failover event handling will ensure your installation survives component failures.
- Extremely high engine counts, or maximum headroom for additional engines. The additional engine hosts will have more than one core to connect to.
- Extremely high web traffic. Round robin load balancing can evenly distribute traffic across web application servers.
Disaster recovery and FME Flow: What is it?
Disaster recovery with FME Flow is all about recovering from a catastrophic event resulting in the loss of a data center. It requires redundant FME Flow installations across multiple data centers. If one installation is lost, another hosted in a separate data center can carry on instead.

[Image: example from Planning for Disaster Recovery in the FME Flow administrator’s guide.]
These redundant installations are completely separate from each other. They don’t communicate with each other, either. Only one installation is used at a time, so the other installation(s) do not need to be running until they are needed. Another name for this is an active-passive setup.
To make sure that all installations are in sync with each other, routine backups and restores must be performed. And, if a data center goes down, all traffic must be rerouted to the other installation.
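Two planning questions fall out of this: where should traffic go right now, and how stale could the standby be when you switch? The sketch below models both under simple assumptions. The `Site` type, the site names, and the intervals are hypothetical, and the staleness bound assumes backups are taken at a fixed interval and restored to the standby after a fixed delay.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Site:
    name: str        # hypothetical data center label
    reachable: bool  # result of whatever monitoring you already run


def pick_active_site(sites):
    """Return the first reachable site in priority order (primary first)."""
    for site in sites:
        if site.reachable:
            return site
    raise RuntimeError("no data center is reachable")


def worst_case_staleness(backup_interval: timedelta,
                         restore_delay: timedelta) -> timedelta:
    """Upper bound on how far the standby can lag the primary.

    Just before a new backup finishes restoring, the standby still reflects
    the previous backup, which is one full interval plus the restore delay old.
    """
    return backup_interval + restore_delay


# Simulate losing the primary data center.
sites = [Site("primary-dc", reachable=False), Site("standby-dc", reachable=True)]
active = pick_active_site(sites)

# Assumed schedule: nightly backup, restored to the standby within an hour.
staleness = worst_case_staleness(timedelta(hours=24), timedelta(hours=1))
```

With a nightly backup and a one hour restore, the standby could be up to 25 hours behind at the moment of failover. If that is more than your business continuity plan allows, the lever to pull is the backup and restore frequency.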
Disaster recovery and FME Flow: When do I need it?
Choose disaster recovery with FME Flow if your business continuity plans require it.
Can I have both fault tolerance and disaster recovery?
Yes! Disaster recovery works with any FME Flow installation type. You can have redundant express, distributed, or fault tolerant installations. So, if you want a bullet-proof FME Flow that can survive anything, you may consider implementing both.
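If it helps to see the combination spelled out, here is a small sketch that maps a deployment's choices to the failure types it is designed to handle, following the definitions in this article. The `Failure` enum and `covered_failures` function are illustrative names, not part of FME Flow.

```python
from enum import Enum


class Failure(Enum):
    HOST = "a host machine in the installation is lost"
    DATA_CENTER = "an entire data center is lost"


def covered_failures(fault_tolerant: bool, disaster_recovery: bool) -> set:
    """Failure types a deployment is designed to handle, per this article."""
    covered = set()
    if fault_tolerant:
        # Redundant components within one installation absorb a host failure.
        covered.add(Failure.HOST)
    if disaster_recovery:
        # A redundant installation in another data center takes over.
        covered.add(Failure.DATA_CENTER)
    return covered


both = covered_failures(fault_tolerant=True, disaster_recovery=True)
only_ft = covered_failures(fault_tolerant=True, disaster_recovery=False)
```

Here `both` contains the two failure types, while `only_ft` contains just `Failure.HOST`.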
Resources
| Resource Type | Description | Link/Placeholder |
|---|---|---|
| Article | A Guide to Choosing Your FME Flow Deployment Architecture | |
| Document | Planning for Fault Tolerance | |
| Document | Planning for Disaster Recovery | |
| Webinar | FME Flow Fundamentals for Admins: Setup & Configuration FAQs | |
| Article | Using a Load Balancer with a Fault-Tolerant FME Flow | |
Conclusion
Use a fault tolerant deployment to recover from a host machine failure, and use disaster recovery to recover from data center loss. Or, use both!
What other questions do you have about FME Flow’s architecture, and what planning challenges are you facing? Let us know, and we might include them in a future edition of TGIF.
Until then, have an excellent weekend. We’ll see you again next week!
* This does not consider remote engines, which may be hosted in other data centers, closer to the data they work with.

