Multi-Cluster Georedundancy
ICE Data Replication Between Independent Clusters
The second option is to build two or more entirely separate Kubernetes clusters and configure Instant Connect to replicate data and messages between them, forming a single, logical instance of ICE. This is the mechanism Instant Connect officially supports, although only in a two-cluster deployment.
Clusters communicate with one another through a virtual private network (VPN) using WireGuard.
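The document does not specify the exact tunnel layout, but a WireGuard site-to-site link is typically defined by a small interface configuration on each cluster's gateway. The sketch below is illustrative only; the addresses, hostnames, and key placeholders are hypothetical.

```
# /etc/wireguard/wg0.conf on the Site A gateway (hypothetical values)
[Interface]
PrivateKey = <site-a-private-key>
Address = 10.10.0.1/24          # tunnel-internal address for Site A
ListenPort = 51820

[Peer]
# Site B cluster gateway
PublicKey = <site-b-public-key>
Endpoint = siteb.example.com:51820
AllowedIPs = 10.10.0.2/32       # route only Site B's tunnel address here
PersistentKeepalive = 25        # keep NAT mappings alive across quiet periods
```

A mirror-image configuration on the Site B gateway completes the tunnel; cluster-to-cluster replication traffic is then routed over the tunnel addresses.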
Instant Connect is unaware that it is operating across clusters. In much the same way that ICE does not know, monitor, or care how many physical nodes its software components may be orchestrated across, it has no knowledge of this dual-cluster 'plumbing.'
In this design, the network requirements are far less stringent than those for stretching a single cluster across sites. ICE requires only that two forms of data be replicated between clusters:
Kafka Messages
Kafka messages produced at one site must be relayed to the other site. For example, Bill is connected to Site A and tries to call Brenda, who is connected to Site B. Replicating Kafka messages allows the real-time call-flow signaling to reach both parties even though they are connected to different sites.
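The document does not state which relay mechanism Instant Connect uses; Kafka's own MirrorMaker 2 is one standard way to replicate topics between two independent clusters. The fragment below is a hedged sketch with hypothetical cluster aliases and bootstrap addresses.

```
# mm2.properties (illustrative only; aliases and hosts are hypothetical)
clusters = siteA, siteB
siteA.bootstrap.servers = kafka.site-a.internal:9092
siteB.bootstrap.servers = kafka.site-b.internal:9092

# Replicate all topics in both directions so signaling produced at
# either site is visible to consumers at the other.
siteA->siteB.enabled = true
siteA->siteB.topics = .*
siteB->siteA.enabled = true
siteB->siteA.topics = .*
```

Because the relay is asynchronous, a message produced at Site A appears at Site B after a short replication lag rather than instantaneously, which is consistent with the eventual-consistency behavior described below.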
Cassandra Data Persistence
Data persisted in the Cassandra database must be visible to users on other sites. Cassandra supports multi-node, multi-datacenter replication out of the box.
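Cassandra's multi-datacenter replication is configured per keyspace via NetworkTopologyStrategy, which assigns a replica count to each datacenter. The keyspace and datacenter names below are hypothetical, shown only to illustrate the mechanism.

```
-- Illustrative CQL: keep full replica sets in both datacenters
-- ('ice', 'site_a_dc', and 'site_b_dc' are placeholder names)
CREATE KEYSPACE IF NOT EXISTS ice
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'site_a_dc': 3,
    'site_b_dc': 3
  };
```

Clients at each site can then read and write against their local datacenter (for example, with a LOCAL_QUORUM consistency level) while Cassandra replicates writes to the remote datacenter in the background.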
These tools—Kafka and Cassandra—are designed to replicate over lower-bandwidth, higher-latency network links and can tolerate intermittent, short-term partitions (a property broadly referred to as 'eventual consistency'). Low-bandwidth and/or high-latency links between sites will produce a degraded user experience, but not a system outage.