This article describes the steps to take when a cluster is unable to discover its nodes. The steps assume Mongo is hosted as an add-on on the Nex!™ PaaS, but they can in essence be applied to any Mongo hosting.
...
1 - Context
After multiple restarts, a Mongo add-on replica set can become stale. In that specific case every node recognises that a replica set exists, but each node holds an outdated list of its members. As a result, no node takes leadership of the cluster.
...
Code Block
    > rs.status()
    {
        "startupStatus" : 1,
        "ok" : 0,
        "errmsg" : "loading local.system.replset config (LOADINGCONFIG)"
    }
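The output above can also be checked programmatically. The following is a minimal sketch, assuming the `LOADINGCONFIG` error message shown above is the symptom to look for; `looksStale` is a hypothetical helper name, not part of MongoDB or nex-cli.

```javascript
// Hypothetical helper (not a MongoDB or nex-cli function): classifies the
// document returned by rs.status().
function looksStale(status) {
  // A member that answers normally reports ok: 1.
  if (status && status.ok === 1) {
    return false;
  }
  var msg = (status && status.errmsg) || "";
  // Assumption: the LOADINGCONFIG marker from the output above identifies
  // the stale state described in this article.
  return msg.indexOf("LOADINGCONFIG") !== -1;
}

// Example using the output shown above.
var sample = {
  startupStatus: 1,
  ok: 0,
  errmsg: "loading local.system.replset config (LOADINGCONFIG)"
};
console.log(looksStale(sample)); // true
```

In the Mongo shell the same function could be called as `looksStale(rs.status())`; other error messages are deliberately not treated as this condition.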
2 - Preferred steps
Warning |
---|
The steps described in this article are drastic and must only be used when the cluster cannot be reconfigured. |
We recommend first trying to reconfigure the Mongo cluster using the steps on this link.
If that does not resolve the issue, proceed to section 3 - Reset the replica set.
3 - Reset the replica set
If the cluster cannot be reconfigured, follow the instructions below to recover the replica set:
First choose a container in your Mongo add-on, preferably the last master if you can identify it
Code Block
    nex-cli cubes --addon name_of_mongo_addon
Stop all other Mongo containers
Code Block
    nex-cli cubes:stop other_container_id
SSH to the rack hosting the container that is still alive
Code Block
    nex-cli racks:ssh 10.1.1.1
Connect to the container
Code Block
    sudo nex-attach my_container_id
Connect to Mongo using the shell
Code Block
    mongo -u $MONGO_USER -p $MONGO_PASSWORD admin
Delete the local replication information. This command does NOT delete your actual data but will destroy uncommitted replication transactions.
Code Block
    use local
    db.dropDatabase()
Exit the Mongo shell and the container
On the rack, restart the container. Upon restart the Mongo instance will initiate a new replica set (the previous one having been deleted).
Code Block
    # Restart
    nex-restart --name my_container_id
    # Monitor logs
    docker logs -f my_container_id
Exit the rack and bring up the other nodes using nex-cli
Code Block
    nex-cli cubes:start other_container_id
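Once the other nodes are back up, it may be worth confirming that the replica set has actually re-formed. The sketch below is one possible check, not an official procedure: it inspects the `members`, `name` and `stateStr` fields that `rs.status()` reports. The helper names, the host names in the sample and the expected member count are all hypothetical, and the check assumes a set made only of one primary plus secondaries (no arbiters).

```javascript
// Hypothetical helpers (not MongoDB or nex-cli functions).

// Summarises an rs.status() document.
function summarizeReplicaSet(status) {
  var members = (status && status.members) || [];
  var primaries = members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  });
  // Assumption: any state other than PRIMARY or SECONDARY needs attention.
  var unhealthy = members.filter(function (m) {
    return m.stateStr !== "PRIMARY" && m.stateStr !== "SECONDARY";
  });
  return {
    ok: Boolean(status) && status.ok === 1,
    memberCount: members.length,
    primary: primaries.length === 1 ? primaries[0].name : null,
    unhealthy: unhealthy.map(function (m) {
      return m.name + " (" + m.stateStr + ")";
    })
  };
}

// True when exactly one primary exists, every member is PRIMARY or
// SECONDARY, and the member count matches what the caller expects.
function isRecovered(status, expectedMembers) {
  var s = summarizeReplicaSet(status);
  return s.ok &&
    s.primary !== null &&
    s.unhealthy.length === 0 &&
    s.memberCount === expectedMembers;
}

// Example with made-up host names.
var afterReset = {
  ok: 1,
  members: [
    { name: "mongo-a:27017", stateStr: "PRIMARY" },
    { name: "mongo-b:27017", stateStr: "SECONDARY" },
    { name: "mongo-c:27017", stateStr: "SECONDARY" }
  ]
};
console.log(isRecovered(afterReset, 3)); // true
```

In the Mongo shell this could be run as `isRecovered(rs.status(), 3)`, replacing `3` with the number of containers in the add-on. Members that are still starting up will show up in the `unhealthy` list of `summarizeReplicaSet`, so the check may need to be repeated after a short wait.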