Viewing the status of the Application:
Code Block |
---|
nex-cli apps:info my-app |
If the App Details section "STATUS" is "active" and the Source Control Management section "LAST COMMIT" is correct, your deployment should be OK and running.
If the status is "restarting" and it has been that way for some time, something may have gone wrong on startup and it's worth investigating the cubes.
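Rather than re-running the status command by hand while waiting, the check above can be polled. The following is only a sketch: `wait_for_word` is a hypothetical helper (not part of nex-cli), and it assumes the status word appears verbatim, as a whole word, somewhere in the command's output. The exact output layout of `apps:info` is not documented here, so adjust the match if needed.

```shell
# Hypothetical helper: re-run a command until its output contains a given word.
# Usage: wait_for_word WORD ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for_word() {
  word=$1; attempts=$2; delay=$3
  shift 3
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -w matches whole words only, so "inactive" does not count as "active"
    if "$@" 2>&1 | grep -qw -- "$word"; then
      echo "found '$word' after $i attempt(s)"
      return 0
    fi
    # Only sleep if another attempt is coming
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "gave up after $attempts attempt(s)" >&2
  return 1
}

# Example (not run here): check every 30 seconds, up to 10 times
#   wait_for_word active 10 30 nex-cli apps:info my-app
```

If the helper gives up, carry on with the cube checks in the next section.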
Viewing the status of the cubes, and checking the logs
List the cubes for your app:
Code Block |
---|
nex-cli cubes --app my-app --all |
If one or more cubes have the "STATUS" => "running", then the app appears healthy and you should try triggering a redeployment (replay the GitHub webhook).
Triggering a redeployment
Code Block |
---|
nex-cli apps:restart my-app |
The command above should trigger a redeployment and restart the cubes.
To view the events and see what is going on, run:
Code Block |
---|
nex-cli apps:events my-app |
Note that the GitHub webhook replay does not get logged in the event feed at the moment.
If one or more cubes have the "STATUS" => "performing:starting", most likely the cube has crashed.
Using the crashed cube's IP address, SSH into the rack (the IP is printed by the cubes command shown above):
Code Block |
---|
nex-cli racks:ssh 10.0.1.234 |
View the logs for the cube:
Code Block |
---|
sudo su
docker logs CUBE-INSTANCE-ID |
This should give you more information. Typically the app has returned a 500, and finding that entry in the log should give you your answer.
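If the log is long, a quick filter can help find the 500. This is only a sketch: `find_500s` is a hypothetical helper, and since the log format of your app isn't specified in this guide, it simply looks for a standalone "500" on any line.

```shell
# Hypothetical helper: print (with line numbers) every log line that contains
# "500" as a standalone number, so values like "1500ms" are ignored.
find_500s() {
  grep -n -E '(^|[^0-9])500([^0-9]|$)' || {
    echo "no lines mentioning 500" >&2
    return 1
  }
}

# Example (not run here), on the rack after `sudo su`.
# 2>&1 is needed because docker logs writes the container's stderr to stderr:
#   docker logs CUBE-INSTANCE-ID 2>&1 | find_500s
```

Once you have a line number, read the lines around it in the full log for the stack trace or error message.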
Logs don't look like your new code has been applied?
While still SSH'ed into the rack, list the Docker containers to check their status:
Code Block |
---|
sudo su
docker ps -a |
If a container has "Exited" in its STATUS column, something is wrong. View its logs:
Code Block |
---|
docker logs CONTAINER-NAME |
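On a rack with many containers, it can be easier to narrow things down. A small sketch, assuming a reasonably recent Docker on the rack: `list_exited` and `tail_logs` are hypothetical wrapper names, while `--filter`, `--format` and `--tail` are standard Docker options.

```shell
# Hypothetical wrapper: show only exited containers, with name and status.
list_exited() {
  docker ps -a --filter "status=exited" --format 'table {{.Names}}\t{{.Status}}'
}

# Hypothetical wrapper: show the last N log lines (default 100) of a container.
# Usage: tail_logs CONTAINER-NAME [LINES]
tail_logs() {
  docker logs --tail "${2:-100}" "$1" 2>&1
}

# Example (not run here), on the rack after `sudo su`:
#   list_exited
#   tail_logs CONTAINER-NAME 200
```

The status column also says how long ago each container exited, which is handy when checking whether a recent deploy was applied.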
Hopefully you have identified your issue :)
If the container with the same NAME as the crashed cube instance ID shows that it exited hours or days ago, and you recently deployed, then the new code hasn't been applied properly. Try "Triggering a redeployment" via `nex-cli apps:restart my-app`.
Sometimes Nex! seems to lose control of cubes. If one is stuck and not restarting, try deleting the Nex! locks on the racks:ssh machine, as described under "Deleting the Nex! Locks & Restarting the Cubes".
Scaling down / up the application's cubes
As a form of "hard reset", we can scale down & then back up the application's cubes.
Scale the application down as many times as there are Cube Instances:
Code Block |
---|
nex-cli apps:down my-app |
Once `nex-cli apps:info` shows "STATUS" => "stopped" for all Cube Instances, scale the application up as many times as there are Cube Instances:
Code Block |
---|
nex-cli apps:up my-app |
When `apps:info` shows "STATUS" => "running" for all Cubes, check whether the app is now working. If not, repeat the steps above, especially viewing the crashed cube's logs, as they may be different after it has restarted with the latest code.
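Running the same command once per Cube Instance gets tedious, so the scale down / up steps above can be sketched with a tiny loop. `repeat` is a hypothetical helper, not a nex-cli feature, and the count of 3 below is just an example. Use your app's actual number of Cube Instances.

```shell
# Hypothetical helper: run a command COUNT times, stopping at the first failure.
# Usage: repeat COUNT COMMAND [ARGS...]
repeat() {
  count=$1
  shift
  i=0
  while [ "$i" -lt "$count" ]; do
    "$@" || return 1
    i=$((i + 1))
  done
}

# Example (not run here), for an app with 3 Cube Instances:
#   repeat 3 nex-cli apps:down my-app
#   (wait until apps:info reports "stopped" for all Cube Instances)
#   repeat 3 nex-cli apps:up my-app
```

Stopping at the first failure is deliberate: if a scale command errors, it's better to look at the output than to keep firing more of them.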
Deleting the Nex! Locks & Restarting the Cubes
The lock files are used to control the state of actions currently being performed on containers. If a container crashes during provisioning, the lock stays in place and the container will forever be considered “starting” or “provisioning”.
SSH into the rack, and then delete the Nex! Locks:
Code Block |
---|
rm -fr /var/lock/nex/* |
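Since `rm -fr` gives no feedback, a slightly more careful variant lists the locks before deleting them. This is a sketch only: `clear_nex_locks` is a hypothetical helper, and the default path is the one from the command above.

```shell
# Hypothetical helper: list the lock files, then delete them.
# Usage: clear_nex_locks [LOCK_DIR]   (defaults to /var/lock/nex)
clear_nex_locks() {
  lock_dir=${1:-/var/lock/nex}
  if [ ! -d "$lock_dir" ]; then
    echo "lock directory not found: $lock_dir" >&2
    return 1
  fi
  echo "Locks about to be removed from $lock_dir:"
  ls -la "$lock_dir"
  # ${lock_dir:?} aborts if the variable is somehow empty, so this can never expand to "/*"
  rm -fr "${lock_dir:?}"/*
}

# Example (not run here), on the rack after `sudo su`:
#   clear_nex_locks
```

Keeping a copy of the listing is useful if you later need to work out which container the stale lock belonged to.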