Replace a failed YB-Master
You can replace a failed YB-Master server in a YugabyteDB cluster.
Examples included in this document use the following scenario:
- The cluster includes three YB-Master servers: M1, M2, and M3.
- YB-Master server M1 failed and needs to be replaced.
- A new YB-Master server (M4) will replace M1.
- The default master RPC port is 7100.
If the YB-Master to be replaced is already dead (for example, the VM was terminated), perform the REMOVE step first, and then the ADD step.
- Start the replacement YB-Master server in standby mode by setting the --master_addresses flag to an empty string (""), as follows:
  ./bin/yb-master --master_addresses="" --fs_data_dirs=<your_data_directories> [any other flags you would typically pass to this master process]
  When --master_addresses is "", this YB-Master server starts without joining any existing master quorum. The node will be added to the master quorum in a later step.
- Add the replacement YB-Master server into the existing cluster by running the yb-admin change_master_config ADD_SERVER command, as follows:
  ./bin/yb-admin -master_addresses M1:7100,M2:7100,M3:7100 change_master_config ADD_SERVER M4 7100
- Remove the failed YB-Master server from the cluster by using the yb-admin change_master_config REMOVE_SERVER command, as follows:
  ./bin/yb-admin -master_addresses M1:7100,M2:7100,M3:7100,M4:7100 change_master_config REMOVE_SERVER M1 7100
  Specify all YB-Master addresses, including M4, so that yb-admin can still find the leader if M4 has become the leader.
- Validate the cluster by checking that your set of masters is now M2, M3, and M4, as follows:
  ./bin/yb-admin -master_addresses M2:7100,M3:7100,M4:7100 list_all_masters
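The yb-admin invocations in the steps above differ mainly in the address list passed to -master_addresses. The following dry-run sketch only prints those three commands for the example scenario and executes nothing against a cluster. The helper names join_addrs and print_replace_plan are hypothetical, and the ./bin/ prefix is assumed for every command.

```shell
#!/bin/sh
# Dry-run sketch: prints the yb-admin commands from the steps above.
# Nothing is executed against a cluster. Host names and the port come
# from the example scenario; the helper names are hypothetical.

PORT=7100

# Join host names into "host:port,host:port,...".
# The body runs in a subshell so its variables do not leak.
join_addrs() (
  out=""
  for host in "$@"; do
    out="${out:+$out,}${host}:${PORT}"
  done
  printf '%s\n' "$out"
)

# Usage: print_replace_plan <failed> <new> <surviving masters...>
print_replace_plan() (
  failed="$1"; new="$2"; shift 2
  before=$(join_addrs "$failed" "$@")         # quorum before ADD_SERVER
  during=$(join_addrs "$failed" "$@" "$new")  # all four, so the leader is found
  after=$(join_addrs "$@" "$new")             # expected final quorum
  echo "./bin/yb-admin -master_addresses $before change_master_config ADD_SERVER $new $PORT"
  echo "./bin/yb-admin -master_addresses $during change_master_config REMOVE_SERVER $failed $PORT"
  echo "./bin/yb-admin -master_addresses $after list_all_masters"
)

print_replace_plan M1 M4 M2 M3
```

Printing the plan first makes it easy to review the address lists before running each command by hand.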
Until #1542 is implemented, each YB-TServer is by default aware only of the YB-Master servers encoded in the --tserver_master_addrs flag with which it was started. If any of those YB-Master servers are still part of the active quorum, they can propagate the new master quorum via heartbeats. If none of the current YB-Master servers are present in the YB-TServer flag, the YB-TServer cannot join the cluster. Therefore, it is important to update --tserver_master_addrs on every YB-TServer to the new set of master addresses: M2:7100,M3:7100,M4:7100.
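The overlap condition described above can be checked with plain string handling. The sketch below tests whether a given --tserver_master_addrs value still lists at least one master from the new set. The function name has_common_master is hypothetical, and the second flag value in the loop (M1:7100 alone) is an invented example, not part of the scenario.

```shell
#!/bin/sh
# Sketch: does a YB-TServer's --tserver_master_addrs value share at least
# one address with the current master set? Pure string comparison; it does
# not contact any server. The function name is hypothetical.

# Usage: has_common_master <tserver_flag_value> <current_masters>
# Both arguments are comma-separated host:port lists.
has_common_master() (
  set -f      # no glob expansion while splitting
  IFS=','     # split the first list on commas
  for addr in $1; do
    case ",$2," in
      *",$addr,"*) exit 0 ;;   # found a master that is still in the quorum
    esac
  done
  exit 1
)

NEW_MASTERS="M2:7100,M3:7100,M4:7100"

for flag in "M1:7100,M2:7100,M3:7100" "M1:7100"; do
  if has_common_master "$flag" "$NEW_MASTERS"; then
    echo "$flag: lists a current master; still update the flag to $NEW_MASTERS"
  else
    echo "$flag: lists no current master; this YB-TServer cannot join the cluster"
  fi
done
```

Either way, the flag should end up set to the new master list on every YB-TServer; the check only shows which servers are at risk in the meantime.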