This article describes the steps to take when a cluster is unable to discover its nodes. The steps assume Mongo is hosted as an add-on on the Nex!™ PaaS, but they can in essence be applied to any Mongo hosting.


1 - Context

After multiple restarts, a Mongo add-on replica set may become stale. In that specific case every node recognises that a replica set exists, but each node holds an outdated list of its members. As a result, no node takes leadership of the cluster.

...

Code Block
> rs.status()
{
 "startupStatus" : 1,
 "ok" : 0,
 "errmsg" : "loading local.system.replset config (LOADINGCONFIG)"
}
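
The stale state shown in the output above can also be detected from a script. The following is a minimal sketch, assuming the status document has the same fields as that output (`ok`, `startupStatus`, `errmsg`). `isLoadingConfig` is a hypothetical helper name used for illustration; it is not part of the mongo shell.

```javascript
// Hypothetical helper (not a mongo shell built-in).
// Returns true when an rs.status() result looks like the stale state above:
// the command failed (ok is 0) and the node is still loading its replica set config.
function isLoadingConfig(status) {
  if (!status || Number(status.ok) !== 0) {
    return false; // command succeeded, or no status document was supplied
  }
  var errmsg = String(status.errmsg || "");
  return status.startupStatus === 1 || errmsg.indexOf("LOADINGCONFIG") !== -1;
}
```

In the mongo shell this could be called as `isLoadingConfig(rs.status())` on each node before deciding whether the reset procedure is needed.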


2 - Preferred steps

The steps described in this article are drastic and must only be used when the cluster cannot be reconfigured.

We recommend first trying to reconfigure the Mongo cluster using the steps on this link.

If reconfiguration does not work, proceed to section 3.

3 - Reset the replica set

If the cluster cannot be reconfigured, follow the instructions below to recover the replica set:

  1. First choose a container in your Mongo add-on, preferably the last master if you can identify it

    Code Block
    nex-cli cubes --addon name_of_mongo_addon

     

  2. Stop all other Mongo containers

    Code Block
    nex-cli cubes:stop other_container_id


  3. SSH to the rack hosting the container that is still alive

    Code Block
    nex-cli racks:ssh 10.1.1.1


  4. Connect to the container

    Code Block
    sudo nex-attach my_container_id

     

  5. Connect to Mongo using the mongo shell

    Code Block

    mongo -u $MONGO_USER -p $MONGO_PASSWORD admin


  6. Delete the local replication information. This command does NOT delete your actual data, but it removes the replica set configuration and destroys uncommitted replication transactions.

    Code Block
    use local
    db.dropDatabase()


  7. Exit the mongo shell and the container
  8. On the rack, restart the container. Upon restart the Mongo instance will initiate a new replica set, since the previous one was deleted.

    Code Block
    # Restart
    nex-restart --name my_container_id
    
    # Monitor logs
    docker logs -f my_container_id


  9. Exit the rack and bring up the other nodes using nex-cli

    Code Block
    nex-cli cubes:start other_container_id
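
Once the other nodes are back up, the result of the procedure can be checked from the mongo shell with `rs.status()`. The sketch below assumes the usual `members` array in which each member has a `name` and a `stateStr` field. `summarizeReplicaSet` is a hypothetical helper written for illustration; it is not part of the mongo shell or of nex-cli.

```javascript
// Hypothetical helper (not a mongo shell or nex-cli built-in).
// Summarises an rs.status() document: which member is primary, and which
// members are not yet in a steady state (for example still resyncing).
function summarizeReplicaSet(status) {
  var members = (status && status.members) || [];
  var steadyStates = ["PRIMARY", "SECONDARY", "ARBITER"];

  var primaries = members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  });
  var notReady = members.filter(function (m) {
    return steadyStates.indexOf(m.stateStr) === -1;
  });

  return {
    // Exactly one primary is expected; anything else is reported as null.
    primary: primaries.length === 1 ? primaries[0].name : null,
    notReady: notReady.map(function (m) { return m.name; }),
    healthy: primaries.length === 1 && notReady.length === 0
  };
}
```

Calling `summarizeReplicaSet(rs.status())` on the restarted container should eventually report `healthy: true`. Members that are still synchronising appear in `notReady` until they settle.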