...
New Relic
...
New Relic provides in-depth performance analytics for every part of your platform, both applications and servers. It lets you view and analyse large volumes of data and gain actionable insights in real time. It is a key tool during an incident.
1 - Applications / APM
To learn more about APM, see Application Performance Monitoring (APM)
Important KPIs to follow on an APM:
- Web or non-web Transactions
- Response time
- Error rate
- Apdex score
- Availability
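Two of the KPIs above can be computed by hand, which helps when reading the charts. The sketch below uses the standard Apdex definition: samples at or below a threshold T count as satisfied, samples up to 4T count as tolerating (half weight), and the rest are frustrated. The function names, the sample values and the threshold of 0.5 s are illustrative assumptions, not values taken from this platform's configuration.

```python
def apdex(response_times, threshold_t):
    """Apdex = (satisfied + tolerating / 2) / total samples."""
    if not response_times:
        raise ValueError("at least one sample is required")
    # Satisfied: response time at or below T
    satisfied = sum(1 for t in response_times if t <= threshold_t)
    # Tolerating: above T but at or below 4T
    tolerating = sum(
        1 for t in response_times if threshold_t < t <= 4 * threshold_t
    )
    return (satisfied + tolerating / 2) / len(response_times)


def error_rate_pct(error_count, transaction_count):
    """Percentage of transactions that ended in an error."""
    if transaction_count == 0:
        return 0.0
    return 100.0 * error_count / transaction_count


# Hypothetical samples, in seconds, with an assumed threshold T = 0.5 s
samples = [0.2, 0.4, 0.6, 1.2, 2.5]
score = apdex(samples, threshold_t=0.5)
rate = error_rate_pct(error_count=3, transaction_count=200)
```

With these samples, two requests are satisfied, two are tolerating and one is frustrated, so the score is (2 + 2/2) / 5 = 0.6.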
2 - Servers
A dedicated page about Servers is coming soon.
Important KPIs to follow on a server:
- CPU usage
- Memory
- Load
3 - Synthetics
To learn more about Synthetics, see: Synthetics
4 - Alerting
KPIs
Ping
Channels: Slack and email
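To make the idea of KPI-based alerting concrete, here is a minimal sketch of how threshold checks and channel routing fit together. It does not call New Relic, Slack or any mail API. The KPI names, the limits and the `Threshold`, `evaluate` and `route` names are all hypothetical; the real conditions are configured in New Relic itself.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    kpi: str
    limit: float
    above: bool = True  # True: breach when value > limit; False: when value < limit


# Hypothetical limits, for illustration only
THRESHOLDS = [
    Threshold("error_rate_pct", 5.0),
    Threshold("response_time_s", 2.0),
    Threshold("apdex", 0.7, above=False),
    Threshold("cpu_pct", 90.0),
]

# Channels named in this page
CHANNELS = ["slack", "email"]


def evaluate(metrics, thresholds=THRESHOLDS):
    """Return one message per KPI whose value breaches its limit."""
    alerts = []
    for th in thresholds:
        value = metrics.get(th.kpi)
        if value is None:
            continue  # KPI not reported in this sample
        breached = value > th.limit if th.above else value < th.limit
        if breached:
            alerts.append(f"{th.kpi}={value} breached limit {th.limit}")
    return alerts


def route(alerts, channels=CHANNELS):
    """Pair every alert message with every notification channel."""
    return [(channel, message) for message in alerts for channel in channels]


notifications = route(evaluate({"error_rate_pct": 7.5, "apdex": 0.9, "cpu_pct": 50.0}))
```

In this example only the error rate breaches its limit, so one message is produced and sent to both channels.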
5 - Process
Once alerts are configured correctly, follow these steps whenever an alert is received:
...
- Raise an "Incident" ticket in JIRA describing: "Incident description and consequences", "Actions taken", "Root Cause", "Recommendation"
- If the incident is closed, close the JIRA ticket; otherwise, move the ticket to the correct status.
- If the incident is closed but the root cause has not yet been identified or corrected, raise a "Problem" ticket in JIRA, linked to the incident ("caused by")
- Before leaving an investigation, make sure that someone from Operations Management and/or another investigator is across the incident and ready to investigate.
- Assign the JIRA tickets appropriately and publish the JIRA ticket(s) on Slack in the #incident channel
...
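The ticket rules above can be summarised as a small data model. This is only a sketch under stated assumptions: the `IncidentTicket` class, its field names, the ticket key and the message format are hypothetical, and nothing here calls the JIRA or Slack APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncidentTicket:
    key: str  # hypothetical JIRA key
    description_and_consequences: str
    actions_taken: str
    root_cause: Optional[str] = None  # None until identified
    recommendation: Optional[str] = None
    closed: bool = False
    root_cause_corrected: bool = False
    assignee: Optional[str] = None


def needs_problem_ticket(ticket):
    """A "Problem" ticket is needed when the incident is closed but the
    root cause is still unidentified or uncorrected."""
    return ticket.closed and (
        ticket.root_cause is None or not ticket.root_cause_corrected
    )


def slack_summary(ticket):
    """Build the text to post in the #incident channel."""
    status = "closed" if ticket.closed else "open"
    owner = ticket.assignee or "unassigned"
    return (
        f"[{ticket.key}] {status}, assigned to {owner}: "
        f"{ticket.description_and_consequences}"
    )


ticket = IncidentTicket(
    key="OPS-1",
    description_and_consequences="Checkout API returning errors",
    actions_taken="Restarted the application servers",
    closed=True,
    assignee="on-call engineer",
)
follow_up = needs_problem_ticket(ticket)
message = slack_summary(ticket)
```

Here the incident is closed without a root cause, so a linked "Problem" ticket should be raised before the summary is posted.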
Want to set up New Relic to support your platform? See: Setup New Relic for Monitoring
...
...