We’re close to releasing a big new feature for Hosted Graphite – the ability to define alerts that notify you when your metric data indicates something might be wrong with your infrastructure. You can choose to be notified via email, webhooks, PagerDuty, Slack, and now, HipChat. The alerting feature is in beta right now, but many teams are already using it. If you’d like early access to try it out, get in touch with our support team and we’ll flip the switch for you.
When our friends at Atlassian asked if we’d be interested in building an add-on for HipChat, we weren’t immediately sold on the idea. However, once we saw that we could embed parts of the Hosted Graphite experience inside HipChat, we paid closer attention!
We thought the new alerting feature would be a great place to start. Most of the notification methods are relatively straightforward – when something goes wrong, we notify you. When it’s fixed, we notify you.
Nothing ground-breaking there, but the HipChat Connect API let us offer a much richer experience by embedding parts of Hosted Graphite right into the HipChat interface. This is pretty powerful, and something that other chat tools don’t offer.
Notifications
First, the basics. When an alert goes off, we post a notification to your HipChat room:
This is a good start – there’s a thumbnail of a graph, a link to the full-size graph, the name of the metric related to the alert, and the conditions that caused it to fire. This “card” view lets us pack a lot of information into a small space, making the notification as useful as possible and giving the signal-to-noise ratio a welcome boost.
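To make the card idea concrete, here is a minimal sketch of what a notification payload carrying those four pieces of information might look like. It is an illustration only, not Hosted Graphite’s actual implementation: the function name, metric name, condition text, and URLs are all hypothetical, and the field names follow our understanding of HipChat’s room notification card format, so check them against the HipChat API reference before relying on them.

```python
import json


def build_alert_card(alert_id, metric, condition, graph_url, thumbnail_url):
    """Build a room-notification payload with a card for a fired alert.

    Field names are assumed from HipChat's card format; all values are
    supplied by the caller and are hypothetical in the example below.
    """
    summary = f"Alert triggered: {metric} {condition}"
    return {
        # Plain-text fallback for clients that do not render cards.
        "message": summary,
        "message_format": "text",
        "color": "red",
        "notify": True,
        "card": {
            "id": alert_id,
            "style": "link",
            "title": f"Alert triggered: {metric}",
            # The conditions that caused the alert to fire.
            "description": condition,
            # Link to the full-size graph.
            "url": graph_url,
            # Small graph preview shown inside the card.
            "thumbnail": {"url": thumbnail_url},
        },
    }


# Hypothetical example values, for illustration only.
payload = build_alert_card(
    alert_id="alert-123",
    metric="servers.web01.cpu.load",
    condition="above 90 for 5 minutes",
    graph_url="https://example.com/graphs/alert-123",
    thumbnail_url="https://example.com/graphs/alert-123/thumb.png",
)
print(json.dumps(payload, indent=2))
```

Keeping a plain-text `message` alongside the card means the notification still reads sensibly anywhere the card itself is not rendered.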
Keeping up with the chaos
One of the challenges of ChatOps
Unhealthy looks like this:
Context switching
With other tools you’d have to switch to another browser tab and navigate the UI to get a high-level view of the state of your infrastructure, and that’s assuming you’re already logged in. Having the information right next to the relevant discussion is powerful, and it’s available to your entire team, which adds up to a lot of saved time.
Acting quickly
These dashboard links include a Hosted Graphite access key, so everyone in the room gets one-click read-only access as quickly as possible to help them diagnose the problem. They’ll need to log in to make any changes, of course.
Summary
Using our HipChat add-on is the richest way to take advantage of our new alerting feature. It keeps your team informed, improves the signal-to-noise ratio in your chat rooms, reduces context switching, and lowers time-to-resolution, which directly impacts your customers.
Want to try it out? You can install the Hosted Graphite for HipChat integration in the Atlassian Marketplace.