Adventures in fault tolerant alerting with Python

We gave a presentation a couple of weeks ago at Python Ireland’s April meetup, where we described our experiences with PySyncObj, a relatively new but solid library for building fault-tolerant distributed systems in Python. Most of the services that run Hosted Graphite are built in Python, including our alerting system. While that talk wasn’t recorded, this blog post…

Hosted Graphite’s Alerting now integrates with OpsGenie!

TL;DR: Hosted Graphite’s alerting feature now integrates with OpsGenie, including auto-resolving incidents according to the alerting rules. Hosted Graphite’s alerting feature continues to sprout new functionality – we just launched the ability to send notifications of infrastructure problems straight to your on-call engineering team via OpsGenie. If you’re not familiar with OpsGenie, here’s how they…

Enabling remote work

At Hosted Graphite, we rely on remote work – our CEO works full-time from the US and the rest of the team work from Ireland. We have a flexible policy on working from home (essentially, Nike-style: just do it). As long as work gets done, we don’t sweat the details of when or where it…

Alerting from first principles

An Introduction to Alerting: Having recently added our Alerting for Graphite, we thought it’d be useful to put together a short primer on Alerting. What do you need to look at when considering what you alert on, and where those alerts go? An early warning system is only as good as its alarms. What is…

No brogrammers: Practical tips for writing inclusive job ads

A common problem with hiring for tech companies is that job ads often use strong, off-putting language that alienates women, people of colour, and other minorities in the tech community. By paying attention to the language we use to describe ourselves, our ideal candidates, and the job responsibilities, we can broaden the net of candidates that might apply…

Managing ChatOps Signal-to-Noise with HipChat and Hosted Graphite

We’re close to releasing a big new feature for Hosted Graphite – the ability to define alerts that can notify you when your metric data indicates something might be wrong with your infrastructure. You can choose to be notified via email, webhooks, PagerDuty, Slack, and now, HipChat. The alerting feature is in beta right now but…

New Year’s Resolution: Stop Self-Hosting Graphite

Obvious fact alert: New Year’s Resolutions don’t work for anyone (this lifehack clickbait has five reasons why). They especially don’t work for companies because if the resolution was really that important you probably would have taken it on already. The January-specific orbital tilt of the earth won’t make you especially free to take new stuff…

New webhook sources

We’ve just launched some new webhooks available to Hosted Graphite. These services allow you to create new events in your Grafana Dashboard when something interesting (or worrying…) happens. Hosted Graphite webhook integrations are available under the “Data sources” tab. Continuous Integration services: Create events when your builds run: flag successful or failed builds (CircleCI, Codeship). Website…

Multiple AWS accounts and Google Auth / AzureAD Login now supported!

AWS CloudWatch: Our AWS CloudWatch metrics integration has been given a feature facelift – now you can add multiple AWS CloudWatch accounts to your Hosted Graphite account and pull metrics from all of them! Once your accounts are added you can also describe what data you want to pull – you have the option to…

Grafana 2.1

We’re pleased to announce that we have upgraded to the latest version of Grafana (2.1). Building upon the UI changes, new features and structural changes introduced in Grafana 2.0, Grafana 2.1 brings with it a number of small changes which will help you to get even more out of your dashboards. Features such as multi-valued…

Uptime monitoring with Pingdom and Hosted Graphite

When we ask our customers what they use to track the uptime (or more importantly downtime) of their sites, Pingdom is typically the answer. As part of an integrated monitoring philosophy, it’s essential that your internal monitoring process can also relate to external checks – combining multiple sources of information gives you a clear picture…

StatsD Noise reduction!

If you like metrics (and we do!) StatsD is a fantastic way to get a wide variety of statistical views of your data, as well as providing the very useful ability to sample your data instead of firing giant streams out over the intertubes. It’s also got wide language support and there are dozens of…