2. Exporting Cluster Statistics in XML Format
You can use the vstorage stat -xml command to export current cluster statistics in XML format. The statistics include information about all storage components: cluster, MDSes, CSes, clients, and nodes.
The resulting XML file has the following main elements:
Cluster identification tags:
- cluster_id, the cluster identifier generated automatically during cluster creation.
- cluster_name, the string value with the cluster name specified during cluster creation.
- status, the current cluster status that may be one of the following:
  - healthy, all CSes are active.
  - unknown, insufficient information about the cluster state (e.g., because the master MDS server was elected a while ago).
  - degraded, some CSes are inactive.
  - failure, the cluster has too many inactive CSes; automatic replication is disabled.
  - SMART warning, one or more physical disks attached to cluster nodes reported a S.M.A.R.T. error.
Nested tags containing information about the cluster:
- space, information on total and logical storage space.
- rjournal, information on the master MDS election.
- license, license status.
- repl, redundancy settings.
- chunks, the number of chunks in every chunk state.
- replicated, redundancy type.
Nested tags containing information on MDSes, CSes, clients, and nodes:
- mds_list, a list of tags containing statistics on MDS.
- cs_list, a list of tags containing statistics on CS.
- clients_list, a list of tags containing information on clients.
- host_list, a list of tags containing information on hosts.
Additionally, some of the tags may contain statistics on the values of various counters at different time frames under the rate tag.
The following sections describe the nested tags and their values in detail.
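As a minimal sketch, the identification tags could be read with Python's standard xml.etree.ElementTree module. The sample document below is hypothetical: the root element name (stat) and the assumption that cluster_id, cluster_name, and status are direct child elements with text content are not taken from real output, so check them against an actual export before relying on this.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the root element name and the exact nesting are
# assumptions, not taken from real `vstorage stat -xml` output.
SAMPLE_XML = """<stat>
  <cluster_id>0123abcd</cluster_id>
  <cluster_name>stor1</cluster_name>
  <status>degraded</status>
</stat>"""


def read_cluster_identity(xml_text):
    """Return the identification tags as a dict (None for a missing tag)."""
    root = ET.fromstring(xml_text)
    identity = {}
    for tag in ("cluster_id", "cluster_name", "status"):
        # findtext() looks only at direct children of the root element.
        text = root.findtext(tag)
        identity[tag] = text.strip() if text else None
    return identity


identity = read_cluster_identity(SAMPLE_XML)
# Any status other than "healthy" deserves a closer look.
needs_attention = identity["status"] != "healthy"
print(identity, needs_attention)
```

If the real export nests these tags deeper, replacing findtext() with a search over root.iter() may be needed, keeping in mind that a status tag also appears in the MDS and CS entries.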
2.1. space
The space tag contains information on total and logical storage space in the following tags:
| Tag | Description |
|---|---|
| allocatable | Amount of logical disk space available to clients. Allocatable disk space is calculated on the basis of the current replication parameters and free disk space on chunk servers. It may also be limited by license. |
| effective_total | . |
| allocatable_raw | . |
| total_raw | . |
| total | Total physical space on all disks, in bytes. |
| free | Unused physical space, in bytes. |
| tiers | . |
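As an illustration of how the total and free values relate, the following sketch derives used physical space and its share of the total. It assumes, without confirmation from real output, that total and free are child elements of space holding plain byte counts; the sample fragment and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: child elements holding plain byte counts are an assumption.
SAMPLE_SPACE = """<space>
  <total>1000</total>
  <free>250</free>
  <allocatable>100</allocatable>
</space>"""


def physical_usage(space_xml):
    """Return (used_bytes, used_fraction) computed from the total and free tags."""
    space = ET.fromstring(space_xml)
    total = int(space.findtext("total"))
    free = int(space.findtext("free"))
    used = total - free
    # Avoid division by zero for an empty cluster.
    fraction = used / total if total else 0.0
    return used, fraction


used, fraction = physical_usage(SAMPLE_SPACE)
print(f"{used} bytes used ({fraction:.0%} of physical space)")
```

Note that allocatable is a logical figure that depends on replication parameters and the license, so it is deliberately left out of this physical-space calculation.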
2.2. rjournal
The rjournal tag contains information on the master MDS election in the following tags:
| Tag | Description |
|---|---|
| epoch | . |
| epoch_uptime | Time elapsed since the MDS master server election. |
| round | Number of the voting round. |
| master_id | Identifier of the master MDS. |
2.3. license
The license tag contains the status value that can be one of the following:
- trial license.
- active.
- .
2.4. repl
The repl tag contains replication settings and nests the following tags:
| Tag | Description |
|---|---|
| norm | Normal number of chunk replicas. |
| limit | Limit after which a chunk gets blocked until recovered. |
| max | . |
2.5. chunks
The chunks tag contains percentages of chunks with the statuses described below and a nested replicated tag.
| Status | Description |
|---|---|
| healthy | Percentage of chunks that have enough active replicas. The normal state of chunks. |
| replicating | Percentage of chunks which are being replicated. Write operations on such chunks are frozen until replication ends. |
| offline | Percentage of chunks all replicas of which are offline. Such chunks are completely inaccessible for the cluster and cannot be replicated, read from or written to. All requests to an offline chunk are frozen until a CS that stores that chunk’s replica goes online. Get offline chunk servers back online as fast as possible to avoid losing data. |
| void | Percentage of chunks that have been allocated but never used yet. Such chunks contain no data. It is normal to have some void chunks in the cluster. |
| pending | Percentage of chunks that must be replicated immediately. For a write request from client to a chunk to complete, the chunk must have at least the set minimum amount of replicas. If it does not, the chunk is blocked and the write request cannot be completed. As blocked chunks must be replicated as soon as possible, the cluster places them in a special high-priority replication queue and reports them as pending. |
| blocked | Percentage of chunks which have fewer active replicas than the set minimum amount. Write requests to a blocked chunk are frozen until it has at least the set minimum amount of replicas. Read requests to blocked chunks are allowed, however, as they still have some active replicas left. Blocked chunks have higher replication priority than degraded chunks. Having blocked chunks in the cluster increases the risk of losing data, so postpone any maintenance on working cluster nodes and get offline chunk servers back online as fast as possible. |
| degraded | Percentage of chunks with the number of active replicas lower than normal but equal to or higher than the set minimum. Such chunks can be read from and written to. However, in the latter case a degraded chunk becomes urgent. |
| urgent | Percentage of chunks which are degraded and have non-identical replicas. Replicas of a degraded chunk may become non-identical if some of them are not accessible during a write operation. As a result, some replicas happen to have the new data while some still have the old data. The latter are dropped by the cluster as fast as possible. Urgent chunks do not affect information integrity as the actual data is stored in at least the set minimum amount of replicas. |
| standby | Percentage of chunks that have one or more replicas in the standby state. A replica is marked standby if it has been inactive for no more than 5 minutes. |
| overcommitted | Percentage of chunks that have more replicas than normal. Usually these chunks appear after the normal number of replicas has been lowered or a lot of data has been deleted. Extra replicas are eventually dropped, however, this process may slow down during replication. |
| deleting | Percentage of chunks queued for deletion. |
| unique | Percentage of chunks that do not have replicas. |
The replicated tag contains statistics on the number of replicated chunks under the rate tag.
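The replica-count rules behind several of the statuses above can be restated as a small function. This is an illustrative reading of the table, not the cluster's actual algorithm: the function name and parameters are hypothetical, and statuses that depend on something other than the number of active replicas (void, replicating, pending, urgent, standby, deleting, unique) are not modeled.

```python
def classify_by_replicas(active, normal, minimum):
    """Map a chunk's active replica count to a status name from the table.

    active  -- number of currently active replicas (hypothetical input)
    normal  -- normal number of replicas
    minimum -- set minimum number of replicas
    """
    if active == 0:
        return "offline"        # all replicas are offline
    if active < minimum:
        return "blocked"        # writes are frozen, reads still work
    if active < normal:
        return "degraded"       # readable and writable, below normal
    if active == normal:
        return "healthy"
    return "overcommitted"      # more replicas than normal


for count in range(5):
    print(count, classify_by_replicas(count, normal=3, minimum=2))
```

The order of the checks mirrors the severity described in the table: offline and blocked chunks are tested first because they are the ones that put data at risk.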
2.6. fs_stat
The fs_stat tag contains statistics on files, inodes, file maps, chunks and chunk replicas.
| Tag | Description |
|---|---|
| used_size | Used logical/physical (???) space, in bytes. |
| files | Number of files in the filesystem. |
| inodes | Number of inodes in the filesystem. |
| file_maps | . |
| chunk_maps | . |
| chunk_nodes | . |
2.7. io_stat
The io_stat tag contains rate statistics on cluster IO activity excluding replication, replication IO activity, data synchronization, and the following values:
| Tag | Description |
|---|---|
| reads | Statistics on data reads, in bytes per second. |
| read_ops | Statistics on data reads, in operations per second. |
| writes | Statistics on data writes, in bytes per second. |
| write_ops | Statistics on data writes, in operations per second. |
| repl_reads | Statistics on data replication reads, in operations per second. |
| repl_writes | Statistics on data replication writes, in operations per second. |
| sync | Statistics on synchronization, in operations per second. |
| datasync | Statistics on data synchronization, in operations per second. |
| queue_aver | . |
| queue_max | . |
| hot_nodes | . |
| last_balanced | . |
| last_balance_uptime | . |
2.8. mds_list
The mds_list tag contains the list of MDSes wrapped in mds tags with the following structure:
| Tag | Description |
|---|---|
| id | Automatically generated global MDS identifier. |
| status | Current MDS status: |
| ctime | rate statistics on time spent writing to the local journal. |
| commits | rate statistics on local journal commits. |
| cpu_usage | MDS CPU usage rate statistics. |
| mem_usage | Number of pages the MDS has in physical memory. |
| uptime | Time elapsed since MDS startup. |
| host_info | Information on the host where the MDS runs; nests the following tags: |
| build_version | MDS build version. |
2.9. cs_list
The cs_list tag contains the list of CSes wrapped in cs tags with the following structure:
| Tag | Description |
|---|---|
| id | Automatically generated global CS identifier. |
| status | Current CS status: |
| space | Amount of physical space on CS in bytes: total, free, and available. |
| replcas | Number of chunk replicas stored on the CS. |
| available | Space available on the CS. |
| alloc_cost | Cost of allocating a chunk on this CS. |
| tier | Tier assigned to the CS. |
| adm_status | . |
| act_status | . |
| err_status | CS error status. If not none, the CS is not used for chunk allocation. |
| last_err | Previous CS error status. |
| last_err_uptime | Time elapsed since the previous CS error. |
| last_link_err | Last CS link error status. |
| last_link_err_uptime | Time elapsed since the previous CS link error. |
| rmw | rate statistics on the number of read-modify-write sequences due to unaligned IO. |
| jrmw | rate statistics on the number of read-modify-write sequences served from the SSD journal. |
| io_stat | Tag that nests IO statistics. |
| chunks | Number of chunks in various states. |
| latency | . |
| net_stat | . |
| host_info | Information on the host running the CS in the following tags: |
| features | . |
| build_version | CS build version. |
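Because a CS with an err_status other than none is not used for chunk allocation, it can be useful to list such CSes. The sketch below does this under stated assumptions: id and err_status are taken to be child elements of each cs tag, and the io_error value in the sample is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the nesting and the "io_error" value are assumptions.
SAMPLE_CS_LIST = """<cs_list>
  <cs><id>1025</id><err_status>none</err_status></cs>
  <cs><id>1026</id><err_status>io_error</err_status></cs>
</cs_list>"""


def cs_with_errors(cs_list_xml):
    """Return (id, err_status) pairs for CSes whose err_status is not none."""
    cs_list = ET.fromstring(cs_list_xml)
    failing = []
    for cs in cs_list.iter("cs"):
        status = (cs.findtext("err_status") or "").strip()
        # A missing tag is skipped rather than treated as an error.
        if status and status != "none":
            failing.append((cs.findtext("id"), status))
    return failing


failing = cs_with_errors(SAMPLE_CS_LIST)
print(failing)
```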
The io_stat tag contains rate statistics on the following counters:
| Tag | Description |
|---|---|
| reads | Data reads, in bytes per second. |
| read_ops | Data reads, in operations per second. |
| writes | Data writes, in bytes per second. |
| write_ops | Data writes, in operations per second. |
| repl_reads | Data replication reads, in operations per second. |
| repl_writes | Data replication writes, in operations per second. |
| maps | Map operations, in operations per second. |
| sync | Synchronization, in operations per second. |
| datasync | Data synchronization, in operations per second. |
| iowait | Percentage of time spent waiting for IO operations including synchronization. |
| syncwait | Percentage of time spent waiting for synchronization. |
| ioqueue | . |
| jfull | Percentage of SSD journal to be stored on HDD. |
2.10. clients_list
The clients_list tag contains the list of clients tags with the following structure:
| Tag | Description |
|---|---|
| id | Automatically generated global client identifier. |
| period_ms | . |
| leases | Number of shared and exclusive leases belonging to the client. |
| reads | Data reads in bytes per second. |
| read_ops | Data reads in operations per second. |
| writes | Data writes in bytes per second. |
| write_ops | Data writes in operations per second. |
| fsyncs | . |
| latency | . |
| host_info | Information on the host running the client in the following tags: |
| build_version | Client build version. |
2.11. host_list
The host_list tag contains the list of host_stat tags with the following structure:
| Tag | Description |
|---|---|
| host_roles | Contains tags with the following numbers: |
| net_stat | . |
| space | Amount of physical space on the host in bytes: total, free, and available. |
| host_info | Information on the host in the following tags: |
| host_ip | Host IP address. |
2.12. rate
The rate tag can be wrapped in various tags and contains statistics of corresponding counters in the following fields:
| Tag | Description |
|---|---|
| total | (Optional) Maximum value of the counter in question. |
| units | (Optional) Unit of measure. |
| avg5s | Average value over a 5-second period. |
| avg1m | Average value over a 1-minute period. |
| avg5m | Average value over a 5-minute period. |
| avg15m | Average value over a 15-minute period. |
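A rate tag could be turned into a Python dictionary roughly as follows. This section does not say whether the fields are stored as XML attributes or as child elements, so this sketch accepts either form; both sample encodings, the bytes/sec unit string, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

AVERAGE_FIELDS = ("avg5s", "avg1m", "avg5m", "avg15m")


def field(rate, name):
    """Read a field stored either as an attribute or as a child element."""
    value = rate.get(name)
    return value if value is not None else rate.findtext(name)


def read_rate(rate):
    """Return the rate fields as a dict; absent optional fields become None."""
    parsed = {"units": field(rate, "units")}
    for name in ("total",) + AVERAGE_FIELDS:
        raw = field(rate, name)
        parsed[name] = float(raw) if raw is not None else None
    return parsed


# Two hypothetical encodings of the same data.
as_attributes = ET.fromstring(
    '<rate units="bytes/sec" avg5s="10" avg1m="8" avg5m="6" avg15m="4"/>'
)
as_children = ET.fromstring(
    "<rate><units>bytes/sec</units><avg5s>10</avg5s><avg1m>8</avg1m>"
    "<avg5m>6</avg5m><avg15m>4</avg15m></rate>"
)
print(read_rate(as_attributes) == read_rate(as_children))
```

Comparing the short and long averages (for example, avg5s against avg15m) is one simple way to tell a momentary spike from a sustained change in a counter.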