To build redundancy, we first needed to answer a few crucial questions.
Where are the customers? Knowing this would let us position the server instances correctly.
What are the risks? A local, national or international collapse?
How do we minimize the blast radius in the event of a failure?
The Decision
We decided to host with Hetzner in Germany because it had a footprint across Europe and the USA.
Building redundancy was straightforward. We created three data nodes and three Kubernetes nodes, and positioned one Kubernetes node and data node pair in each of the USA, Germany and Frankfurt.
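To make the blast-radius question concrete, here is a minimal Python sketch of the layout described above. It is only an illustration: the location labels are copied from the text, while the role names and the helper functions `survivors` and `blast_radius` are hypothetical and not part of the actual setup.

```python
from collections import Counter

# One (Kubernetes node, data node) pair per location, as described in the text.
# The role names are illustrative labels, not real hostnames.
SITES = ["USA", "Germany", "Frankfurt"]
NODES = [(site, role) for site in SITES for role in ("kubernetes", "data")]


def survivors(failed_site):
    """Count the nodes of each role still running if one whole site is lost."""
    return Counter(role for site, role in NODES if site != failed_site)


def blast_radius(failed_site):
    """Fraction of all nodes taken out by the loss of a single site."""
    lost = sum(1 for site, _ in NODES if site == failed_site)
    return lost / len(NODES)


for site in SITES:
    print(site, dict(survivors(site)), f"{blast_radius(site):.0%} of nodes lost")
```

Whichever single location fails, two Kubernetes nodes and two data nodes remain, so the loss is limited to a third of the nodes.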
The data services (Elasticsearch, MongoDB and ActiveMQ) were configured to replicate automatically across the data centres, while the Kubernetes nodes replicated containers across the three locations.
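The text does not say how replication was configured, so the following is a sketch under one stated assumption: that the data services elect a leader by majority vote, as MongoDB replica sets do. Under that assumption, the arithmetic shows why three locations survive the loss of one while two would not. The function names are hypothetical.

```python
def majority(members):
    """Smallest number of votes that is more than half of the members."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be lost while a majority is still reachable."""
    return members - majority(members)


# Assumption: one voting member per location.
for members in (2, 3):
    print(members, "members: majority", majority(members),
          "tolerates", tolerated_failures(members), "failure(s)")
```

With two members a majority needs both, so no failure is tolerated. With three members a majority needs two, so one whole location can go dark and the remaining pair can still agree on a leader. Other services may use different rules, so treat this as a rule of thumb and not as a description of the actual configuration.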
Everything Works Together
In December 2022, one of the data centres suffered an outage. The services recovered automatically and our customers experienced zero downtime.