Case Studies

Ops/Server Alerts


  • The ops team at a tech startup monitored thousands of servers and the services running on them.
  • Each server and service was programmed to send multiple notifications every day with status updates, warnings and alerts.
  • The email- and SMS-based “fire-hose” of ops notifications was frustrating the team. They were inundated with alerts, making it hard to find the signal in the noise.
  • Users outside the operations team wanted to receive high-priority alerts (e.g. website down), but also received far too many irrelevant notifications.
  • Escalation processes were difficult to automate. There was no efficient way to update all stakeholders during fire-fighting. This was particularly challenging when users were not in front of the online dashboard.
  • The company needed a better way to manage ops notifications.

Teamchat Solutions:

  • The company deployed Teamchat for the ops team and integrated it with their Nagios monitoring server. The Nagios “bot” (now a Teamchat user) sent alerts via Teamchat. All alerts were posted as “smart” messages with additional context such as host, service and severity.


  • Messages related to the same host or service were threaded and aggregated, dramatically reducing clutter. Messages were color-coded by severity, making them easier to track.
  • Messages with specific priorities were sent to relevant people; a failure to respond triggered the escalation process automatically.
  • The ops team set up workflows for handling escalation paths, status updates and more.
  • In fire-fighting mode, the engineers only had to post updates in one place; all stakeholders were updated automatically via Teamchat.
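
The threading and escalation behaviour described above can be illustrated with a minimal Python sketch. It is not Teamchat's or Nagios's actual API: the Alert class, the severity ranking, the function names and the timeout-based escalation rule are all hypothetical, chosen only to show how alerts might be grouped by host and service and how an unacknowledged alert might move up an escalation path.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical ranking of Nagios-style state names (higher = more severe).
SEVERITY_RANK = {"OK": 0, "WARNING": 1, "UNKNOWN": 2, "CRITICAL": 3}


@dataclass
class Alert:
    """Hypothetical alert record carrying the context mentioned in the text."""
    host: str
    service: str
    severity: str
    message: str
    timestamp: float  # seconds since some reference point


def group_into_threads(alerts):
    """Group alerts by (host, service), oldest first within each thread."""
    threads = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        threads[(alert.host, alert.service)].append(alert)
    return dict(threads)


def thread_severity(thread):
    """Return the highest severity seen in a thread."""
    return max((a.severity for a in thread), key=SEVERITY_RANK.__getitem__)


def current_recipient(first_alert_time, now, escalation_path,
                      ack_timeout, acknowledged):
    """Pick who should be notified for a thread right now.

    Assumed rule: each time ack_timeout seconds pass without an
    acknowledgement, the alert moves one step up escalation_path and
    stays with the last person once the path is exhausted.
    """
    if acknowledged:
        return None  # someone responded, so nothing to escalate
    steps = int((now - first_alert_time) // ack_timeout)
    return escalation_path[min(steps, len(escalation_path) - 1)]


if __name__ == "__main__":
    alerts = [
        Alert("web01", "http", "WARNING", "slow response", 10.0),
        Alert("web01", "http", "CRITICAL", "website down", 20.0),
        Alert("db01", "disk", "WARNING", "disk 80% full", 15.0),
    ]
    for (host, service), thread in group_into_threads(alerts).items():
        print(host, service, thread_severity(thread), len(thread), "message(s)")
    print(current_recipient(0, 400, ["on-call admin", "ops manager"], 300, False))
```

In this sketch the three sample alerts collapse into two threads, one per host and service pair, and the severity shown for a thread is the worst one it has seen. A real deployment would take its escalation path and timeout from whatever workflow the ops team configured.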



  • The ops team became much more productive as ops notifications became more manageable.
  • High-severity alerts were not missed. Low-severity alerts were handled promptly because they were routed straight to the right person.
  • Whenever the right systems administrator was unavailable, the escalation process kicked in automatically.
  • Notifications also came with a set of pre-defined corrective actions that the ops manager could initiate, leading to faster resolution of issues.
  • The ops team did not have to be at the dashboard constantly because they could get alerts on their phones.
  • Other stakeholders outside the ops team got full visibility too: they received the relevant high-severity alerts, as well as real-time updates on the corrective measures being taken. This eliminated the communication overhead of answering concerned stakeholders.

Connect with Teamchat to learn how it can benefit your business.