Skip to main content

Cross-cluster replication with Aiven for Apache Cassandra® Limited availabilty

Cross-cluster replication (CCR) is a configuration of Apache Cassandra services on the Aiven platform that mirrors your data between different clouds and/or regions, providing increased durability and locality.

About cross-cluster replication

You can choose which region to replicate your data to. CCR deploys a single multi-datacenter Apache Cassandra cluster across two Aiven services. Apache Cassandra is configured to treat nodes from a single service as located in a single datacenter.

Why use CCR

  • Improved data availability: CCR improves the disaster recovery capability for your service. Even if one service (cloud provider or region) goes down, your data stays safe and available with the CCR peer, which is another service with a different cloud provider or region.
  • Improved performance: Enabling CCR on your service, you can set up your client to interact with the service that is geographically close. The data locality benefit translates into a lower latency and improved data processing performance.

Data flow architecture

When you enable CCR on your Aiven for Apache Cassandra service, you connect to another service so that the two services (CCR pair) can replicate data to each other. The CCR service pair constitutes a single Apache Cassandra cluster that comprises nodes from the two services. The services are located in different regions and the nodes of a single service comprise a single datacenter.

How it works

Replication strategy

Apache Cassandra allows specifying the replication strategy when creating a keyspace.

What is the keyspace?

It is a namespace defining a replication strategy and particular options for a group of tables.

With the replication strategy defined in the CREATE KEYSPACE query, every table in the keyspace is replicated within the cluster according to the specified strategy.

NetworkTopologyStrategy

The replication strategy that allows CCR in Aiven for Apache Cassandra is called NetworkTopologyStrategy.

NetworkTopologyStrategy can be set up as the replication strategy when creating a keyspace on the cluster. The same CREATE KEYSPACE query can be used to specify the replication factor and a datacenter that data can be replicated to.

  • Datacenter per service: For the CCR feature to work, the two services that cross-replicate need to be located in two different datacenters.

  • Replication factor: The replication factor is defined in the CREATE KEYSPACE query to indicate the number of copies of the data to be saved per cluster (in each datacenter).

For more details on the replication factor for Apache Cassandra, see NetworkTopologyStrategy in the Apache Cassandra documentation.

CCR setup

To make CCR work on your services, you need a cluster comprising two Apache Cassandra services with CCR enabled. On the cluster, issue the CREATE KEYSPACE request, specifying NetworkTopologyStrategy as a replication strategy along with desired replication factors.

CREATE KEYSPACE test WITH replication =  /
{ /
'class': 'NetworkTopologyStrategy', /
'service-1': 3, /
'service-2': 3 /
};

Where service-1 and service-2 are the names of Apache Cassandra datacenter, which you can find in the Aiven console.

CCR in action

With CCR enabled and configured, Apache Cassandra replicates each write in the keyspace to both services (datacenters) with an appropriate number of copies as per replication factor.

  • Active-active model: Apache Cassandra uses an active-active model: clients have the choice of reading/writing either from one service or the other.
  • Consistency level: The consistency level regulates how many nodes need to confirm they executed an operation for this operation to be considered successfully completed by the client. You can set up the consistency level to one of the allowed consistency level arguments depending on your needs.
Examples
  • LOCAL_QUORUM consistency level: The read is contained within the service you connect to (completes faster).
  • QUORUM consistency level: Replies from nodes of both services are required. The read produces more consistent results but fails if one of the regions is unavailable.

For more details on consistency levels for Apache Cassandra, see CONSISTENCY in the Apache Cassandra documentation.

Limitations

  • It is not possible to connect two existing services to become a CCR pair. But you still can:

    • Create a CCR pair from scratch or,
    • Add a new region to an existing service (create a new service that replicates from your existing service).
  • Enabling CCR on an existing service is only possible if this service has a keyspace that uses NetworkTopologyStrategy as a replication strategy.

  • Two CCR services need to use an identical service plan and the same amount of dynamic disk space.

  • Limited replication configuration

    • SimpleReplicationStrategy not supported
    • Unbalanced NetworkTopologyStrategy not supported (both CCR peer services need the same replication factor)
  • Value of the replication factor needs to be equal to or greater than 2.

    warning

    If you set the replication factor value below 2, you may have it changed automatically at any time.

  • Once a CCR service pair is split, the clusters cannot be reconnected.

What's next

More on CCR with Aiven