mirror of
https://github.com/rqlite/rqlite.git
synced 2022-10-30 02:37:32 +03:00
Update CLUSTER_MGMT.md
@@ -130,7 +130,9 @@ In the event that multiple rqlite nodes are lost, causing a loss of quorum and a
To begin, stop all remaining nodes. You can attempt a graceful node-removal, but it will not work in most cases. Do not worry if the remove operation results in an error. The cluster is in an unhealthy state, so this is expected.
The next step is to go to the _data_ directory of each rqlite node. Inside that directory, there will be a `raft/` sub-directory. You need to create a `peers.json` file within that directory, which will contain the desired configuration of your recovered rqlite cluster (which may be smaller than the original cluster, perhaps even just a single recovered node). This file should be formatted as a JSON array containing the node ID, `address:port`, and suffrage information of each rqlite node in the cluster. An example is shown below:
The next step is to go to the _data_ directory of each rqlite node. Inside that directory, there will be a `raft/` sub-directory. You need to create a `peers.json` file within that directory, which will contain the desired configuration of your recovered rqlite cluster (which may be smaller than the original cluster, perhaps even just a single recovered node). This file should be formatted as a JSON array containing the node ID, `address:port`, and suffrage information of each rqlite node in the cluster.
Below is an example of bringing a 3-node cluster back online.
```json
[
@@ -154,6 +156,6 @@ The next step is to go to the _data_ directory of each rqlite node. Inside that
`id` specifies the node ID of the server, which must not be changed from its previous value. If the ID was auto-generated, it can be found in the logs written when the node starts up. `address` specifies the desired Raft IP address and port for the node, which need not match the node's previous address. You can use hostnames instead of IP addresses if you prefer. `non_voter` controls whether the server is a read-only node. If omitted, it defaults to `false`, which is typical for most rqlite nodes.
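The fields described above can be assembled programmatically rather than by hand. The following is a minimal sketch, not part of rqlite itself: the helper name `build_peers`, the node IDs, and the addresses are all hypothetical and must be replaced with the real values for your cluster.

```python
import json


def build_peers(nodes):
    """Return peers.json text for a list of (id, address, non_voter) tuples.

    This is an illustrative helper, not an rqlite API.
    """
    entries = []
    for node_id, address, non_voter in nodes:
        # The Raft address must carry both a host (IP or hostname) and a port.
        if ":" not in address:
            raise ValueError(f"address must be host:port, got {address!r}")
        entries.append(
            {"id": node_id, "address": address, "non_voter": non_voter}
        )
    return json.dumps(entries, indent=2)


# Hypothetical node IDs and Raft addresses for a 3-node recovery.
# Each ID must match the ID that node used before the failure.
peers_text = build_peers(
    [
        ("1", "10.1.0.1:4002", False),
        ("2", "10.1.0.2:4002", False),
        ("3", "10.1.0.3:4002", False),
    ]
)
print(peers_text)
```

Generating the file from one list makes it easier to keep the content the same on every node.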
Next simply create entries for all nodes. You must confirm that nodes you do not include here have indeed failed and will not later rejoin the cluster. Ensure that this file is the same across all remaining rqlite nodes. At this point, you can restart your rqlite cluster.
Next, simply create entries for all the nodes you plan to bring up (3 nodes in the example above). You must confirm that any nodes you don't include here have indeed failed and will not later rejoin the cluster. Ensure that this file is identical across all remaining rqlite nodes. At this point, you can restart your rqlite cluster. In the example above, this means you'd start 3 nodes.
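One way to check the "same file on every node" requirement before restarting is to write the file from a single source and compare checksums. The sketch below assumes every node's data directory is reachable from one machine, as in a local test cluster. For nodes on separate hosts you would copy the file over and compare checksums there instead. The function names and directory names are hypothetical, and the temporary directories only stand in for real data directories so that the example runs as written.

```python
import hashlib
import tempfile
from pathlib import Path


def install_peers(data_dirs, peers_text):
    """Write peers_text to <data_dir>/raft/peers.json; return SHA-256 digests."""
    digests = {}
    for data_dir in data_dirs:
        raft_dir = Path(data_dir) / "raft"
        if not raft_dir.is_dir():
            raise FileNotFoundError(f"no raft/ sub-directory in {data_dir}")
        target = raft_dir / "peers.json"
        target.write_text(peers_text)
        # Hash what was actually written to disk, not the in-memory string.
        digests[str(target)] = hashlib.sha256(target.read_bytes()).hexdigest()
    return digests


def all_identical(digests):
    """Return True if every installed peers.json has the same content hash."""
    return len(set(digests.values())) == 1


# Demonstration with throwaway directories standing in for node data dirs.
with tempfile.TemporaryDirectory() as root:
    data_dirs = []
    for name in ("node1", "node2", "node3"):  # hypothetical directory names
        (Path(root) / name / "raft").mkdir(parents=True)
        data_dirs.append(Path(root) / name)

    # Hypothetical single-entry configuration used only for the demo.
    peers_text = '[{"id": "1", "address": "10.1.0.1:4002", "non_voter": false}]'
    digests = install_peers(data_dirs, peers_text)
    identical = all_identical(digests)
    print("identical on all nodes:", identical)
```

Run this only while the nodes are stopped, since the file is read when a node starts up.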
Once recovery is complete, the `peers.json` file is renamed to `peers.info`. `peers.info` will not trigger further recoveries, and simply acts as a record for future reference. It may be deleted at any time.