.. _Understanding Data Redundancy:

Understanding Data Redundancy
-----------------------------

Virtuozzo Infrastructure Platform protects every piece of data by making it redundant. It means that copies of each piece of data are stored across different storage nodes to ensure that the data is available even if some of the storage nodes are inaccessible.

Virtuozzo Infrastructure Platform automatically maintains the required number of copies within the cluster and ensures that all the copies are up-to-date. If a storage node becomes inaccessible, the copies from it are replaced by new ones that are distributed among healthy storage nodes. If a storage node becomes accessible again after downtime, the copies on it which are out-of-date are updated.

The redundancy is achieved by one of two methods: replication or erasure coding (explained in more detail in the next section). The chosen method affects the size of one piece of data and the number of its copies that will be maintained in the cluster. In general, replication offers better performance while erasure coding leaves more storage space available for data (see table).

Virtuozzo Infrastructure Platform supports a number of modes for each redundancy method. The following table illustrates data overhead of various redundancy modes. The first three lines are replication and the rest are erasure coding.

+-----------------+-------------------+-------------------------+---------------------+------------------------+
| Redundancy mode | Minimum number of | How many nodes can fail | Storage overhead, % | Raw space required     |
|                 | nodes required    | without data loss       |                     | to store 100GB of data |
+=================+===================+=========================+=====================+========================+
| 1 replica       | 1                 | 0                       | 0                   | 100GB                  |
| (no redundancy) |                   |                         |                     |                        |
+-----------------+-------------------+-------------------------+---------------------+------------------------+
| 2 replicas      | 2                 | 1                       | 100                 | 200GB                  |
+-----------------+-------------------+-------------------------+---------------------+------------------------+
| 3 replicas      | 3                 | 2                       | 200                 | 300GB                  |
+-----------------+-------------------+-------------------------+---------------------+------------------------+
| Encoding 1+0    | 1                 | 0                       | 0                   | 100GB                  |
| (no redundancy) |                   |                         |                     |                        |
+-----------------+-------------------+-------------------------+---------------------+------------------------+
| Encoding 1+2    | 3                 | 2                       | 200                 | 300GB                  |
+-----------------+-------------------+-------------------------+---------------------+------------------------+
| Encoding 3+2    | 5                 | 2                       | 67                  | 167GB                  |
+-----------------+-------------------+-------------------------+---------------------+------------------------+
| Encoding 5+2    | 7                 | 2                       | 40                  | 140GB                  |
+-----------------+-------------------+-------------------------+---------------------+------------------------+
| Encoding 7+2    | 9                 | 2                       | 29                  | 129GB                  |
+-----------------+-------------------+-------------------------+---------------------+------------------------+
| Encoding 17+3   | 20                | 3                       | 18                  | 118GB                  |
+-----------------+-------------------+-------------------------+---------------------+------------------------+

.. note:: The 1+0 and 1+2 encoding modes are meant for small clusters that have insufficient nodes for other erasure coding modes but will grow in the future. As redundancy type cannot be changed once chosen (from replication to erasure coding or vice versa), this mode allows one to choose erasure coding even if their cluster is smaller than recommended. Once the cluster has grown, more beneficial redundancy modes can be chosen.

You choose a data redundancy mode when configuring storage access points and their volumes. In particular, when:

- creating LUNs for iSCSI storage access points,

- creating S3 clusters,

- configuring Acronis Backup Gateway storage access points.

No matter what redundancy mode you choose, it is highly recommended is to be protected against a simultaneous failure of two nodes as that happens often in real-life scenarios.

.. note:: All redundancy modes allow write operations when one storage node is inaccessible. If two storage nodes are inaccessible, write operations may be frozen until the cluster heals itself.

.. _Redundancy by Replication:

Redundancy by Replication
~~~~~~~~~~~~~~~~~~~~~~~~~

With replication, Virtuozzo Infrastructure Platform breaks the incoming data stream into 256MB chunks. Each chunk is replicated and replicas are stored on different storage nodes, so that each node has only one replica of a given chunk.

The following diagram illustrates the 2 replicas redundancy mode.

.. image:: ../../../images/stor_image8.png
   :align: center
   :class: align-center 

Replication in Virtuozzo Infrastructure Platform is similar to the RAID rebuild process but has two key differences:

- Replication in Virtuozzo Infrastructure Platform is much faster than that of a typical online RAID 1/5/10 rebuild. The reason is that Virtuozzo Infrastructure Platform replicates chunks in parallel, to multiple storage nodes.

- The more storage nodes are in a cluster, the faster the cluster will recover from a disk or node failure.

High replication performance minimizes the periods of reduced redundancy for the cluster. Replication performance is affected by:

- The number of available storage nodes. As replication runs in parallel, the more available replication sources and destinations there are, the faster it is.

- Performance of storage node disks.

- Network performance. All replicas are transferred between storage nodes over network. For example, 1 Gbps throughput can be a bottleneck (see :ref:`Per-Node Network Requirements`).

- Distribution of data in the cluster. Some storage nodes may have much more data to replicate than other and may become overloaded during replication.

- I/O activity in the cluster during replication.

.. _Redundancy by Erasure Coding:

Redundancy by Erasure Coding
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With erasure coding, Virtuozzo Infrastructure Platform breaks the incoming data stream into fragments of certain size, then splits each fragment into a certain number (M) of 1-megabyte pieces and creates a certain number (N) of parity pieces for redundancy. All pieces are distributed among M+N storage nodes, that is, one piece per node. On storage nodes, pieces are stored in regular chunks of 256MB but such chunks are not replicated as redundancy is already achieved. The cluster can survive failure of any N storage nodes without data loss.

The values of M and N are indicated in the names of erasure coding redundancy modes. For example, in the 5+2 mode, the incoming data is broken into 5MB fragments, each fragment is split into five 1MB pieces and two more 1MB parity pieces are added for redundancy. In addition, if N is 2, the data is encoded using the RAID6 scheme, and if N is greater than 2, erasure codes are used. 

The diagram below illustrates the 5+2 mode.

.. image:: ../../../images/stor_image9.png
   :align: center
   :class: align-center
 
.. _No Redundancy:

No Redundancy 
~~~~~~~~~~~~~

.. warning:: Danger of data loss!

Without redundancy, singular chunks are stored on storage nodes, one per node. If the node fails, the data may be lost. Having no redundancy is highly not recommended no matter the scenario, unless you only want to evaluate Virtuozzo Infrastructure Platform on a single server.

