This article explains how to restore service when the routing layer of Nex!™ stops responding and your browser reports that the "site can't be reached" (connection timeout).
1 - Context
If you get a connection timeout error when attempting to access an app on Nex!™, the problem most likely affects all the routing racks. When this issue occurs, Chrome typically reports that the site can't be reached because the connection timed out.
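To confirm from a terminal that the failure is a timeout rather than a DNS or TLS problem, a small probe along these lines may help. This is a minimal sketch: the function names are made up for illustration, and it relies only on curl's documented exit codes.

```shell
#!/usr/bin/env bash
# Map curl exit codes (documented in the curl man page) to a short label.
describe_curl_exit() {
  case "$1" in
    0)     echo "ok" ;;               # connection succeeded (any HTTP status)
    6)     echo "dns-failure" ;;      # could not resolve host
    7)     echo "connect-failed" ;;   # failed to connect to host
    28)    echo "timeout" ;;          # operation timed out
    35|60) echo "tls-error" ;;        # TLS handshake or certificate problem
    *)     echo "other" ;;
  esac
}

# Probe a URL and print the label for the outcome.
# Pass the app address you normally open in the browser.
probe_app() {
  local url="$1"
  local rc=0
  curl --silent --output /dev/null --connect-timeout 5 --max-time 10 "$url" || rc=$?
  describe_curl_exit "$rc"
}
```

For example, `probe_app https://myapp.mydomain.com` (a placeholder hostname) printing `timeout` would match the symptom described in this article.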
2 - Checking Nginx on the routing racks
The following steps assume that you have installed the Nex!™ CLI.
You can display the list of routing racks and access one of them using the following commands:
# Display all routing racks
nex-cli racks --type routing

# Access rack by internal IP
nex-cli racks:ssh 10.1.1.1
The first thing to do is to check that Nginx is up. If it is not, attempt to restart it.
# Check if any nginx process is running
ps -ef | grep nginx

# Perform a full restart of nginx
service nginx restart

# Check if nginx is up
ps -ef | grep nginx
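The check, restart, check sequence can also be scripted. The sketch below is one possible approach, not part of the Nex!™ tooling: it assumes `pgrep` is available on the routing rack, and the function names and the two-second pause are arbitrary choices.

```shell
#!/usr/bin/env bash
# Return 0 if a process with exactly this name exists.
# Unlike `ps -ef | grep nginx`, pgrep never matches its own command line.
is_running() {
  pgrep -x "$1" >/dev/null 2>&1
}

# Check nginx, restart it if it is down, then check again.
ensure_nginx() {
  if is_running nginx; then
    echo "nginx is running"
    return 0
  fi
  echo "nginx is down, restarting"
  service nginx restart
  sleep 2   # arbitrary pause to give the master process time to start
  if is_running nginx; then
    echo "nginx is back up"
  else
    echo "nginx did not start; run it in the foreground to see the error" >&2
    return 1
  fi
}
```

Calling `ensure_nginx` on the rack returns a non-zero status when the restart did not bring Nginx back.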
If Nginx has not come back up after restarting, then you will need to start it in the foreground to understand the issue:
# Run the following command
# This will start nginx in foreground mode. Any startup issue will appear here.
/usr/local/openresty/nginx/sbin/nginx
Nginx not starting is usually related to the default SSL certificates not being set up properly. If the above command yields an error related to SSL, the private key, or the public key, then go to section 3.
3 - Fixing the default SSL certificates
If the default Nginx certificates are not properly set up (e.g. missing public or private key, keys not matching), Nginx will simply refuse to start. This problem typically indicates that the Nex!™ Orchestrator configuration is incorrect.
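Whether a private key matches a certificate can be checked by comparing the public key derived from each. The sketch below assumes `openssl` is installed; the function name is illustrative. On a routing rack you would pass /etc/nginx/default-certificates/default.crt and /etc/nginx/default-certificates/default.key.

```shell
#!/usr/bin/env bash
# Compare the public key embedded in a certificate with the public key
# derived from a private key.
# Returns 0 on match, 1 on mismatch, 2 if either file cannot be read or parsed.
# When the certificate file is a chain, openssl reads only the first entry.
cert_matches_key() {
  local cert="$1"
  local key="$2"
  local cert_pub key_pub
  cert_pub="$(openssl x509 -in "$cert" -noout -pubkey 2>/dev/null)" || return 2
  key_pub="$(openssl pkey -in "$key" -pubout 2>/dev/null)" || return 2
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}
```

A return status of 1 corresponds to the "keys not matching" case, while 2 usually points to a missing or unreadable file.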
3.1 - Immediate resolution
The fastest way to resolve this issue is to manually set up the Nginx default SSL certificates again. For that you will need access to:
- Your wildcard certificate public key: e.g. mydomain.crt
- Your wildcard certificate private key: e.g. mydomain.key
- Your Certificate Authority (CA) chain: e.g. my_cert_provider_bundle.crt
On the routing rack, copy the content of your certificate private key into the following file (e.g. using vim):
/etc/nginx/default-certificates/default.key
On your local computer, concatenate your certificate public key and CA chain:
# Cat both files to obtain a chained certificate
cat mydomain.crt my_cert_provider_bundle.crt
On the routing rack, copy the content of the above into the following file (e.g. using vim):
/etc/nginx/default-certificates/default.crt
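One pitfall when concatenating PEM files is a certificate file without a trailing newline, which puts the END and BEGIN markers on the same line and makes the chain unreadable. The following sketch guards against that and counts the certificates in the result; the function names are illustrative only.

```shell
#!/usr/bin/env bash
# Concatenate the certificate and the CA chain into one file.
# A newline is inserted if the first file does not end with one, because
# otherwise the END and BEGIN markers would end up on the same line.
build_chained_cert() {
  local leaf="$1"
  local bundle="$2"
  local out="$3"
  {
    cat "$leaf"
    if [ -n "$(tail -c 1 "$leaf")" ]; then
      echo
    fi
    cat "$bundle"
  } > "$out"
}

# Count the certificates in a PEM file (one BEGIN marker per certificate).
count_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}
```

For example, `build_chained_cert mydomain.crt my_cert_provider_bundle.crt default.crt` followed by `count_certs default.crt` should print one more than the number of certificates in the CA bundle.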
Once done, start Nginx in the foreground to check that everything is fine:
# This will start nginx in foreground mode. Any startup issue will appear here.
# Use Ctrl-C to exit the process
/usr/local/openresty/nginx/sbin/nginx
Finally, restart the Nginx service to bring it back up:
service nginx restart
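As an optional safeguard, the restart can be gated on Nginx's built-in configuration test (`nginx -t`), which parses the configuration and exits without serving traffic. It should also surface certificate loading errors, though that is worth confirming on your version. This is a sketch: the `NGINX_BIN` variable and the function name are assumptions introduced for illustration.

```shell
#!/usr/bin/env bash
# Binary path taken from the steps above; override NGINX_BIN if yours differs.
NGINX_BIN="${NGINX_BIN:-/usr/local/openresty/nginx/sbin/nginx}"

# Restart the service only if `nginx -t` accepts the configuration.
restart_if_config_ok() {
  if "$NGINX_BIN" -t; then
    service nginx restart
  else
    echo "nginx -t failed; fix the reported error before restarting" >&2
    return 1
  fi
}
```

Running `restart_if_config_ok` leaves the service untouched when the configuration test fails.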
After bringing Nginx back up and verifying that your web applications are accessible again, proceed to section 3.2 to permanently resolve the issue.
3.2 - Permanent resolution
As mentioned at the beginning of this section, this kind of SSL issue is symptomatic of a configuration issue with the orchestrator itself. You should therefore review the SSL certificate configuration variables in Nex!™ to ensure they match your certificate keys.
a) Using Ansible
b) Using Rails configuration