When Hosted Graphite was set up, our founders took the opportunity to address some issues they knew existed within Graphite. Even though it was a ground-breaking service when it first launched, Graphite hasn't always kept up with the changing needs of monitoring. Hosted Graphite, on the other hand, was built from the ground up to cope with the real-life situations that today's engineers face. Ever since, our SRE team has been expanding and improving on those changes.
Here are a few ways in which Hosted Graphite succeeds where vanilla Graphite struggles.
Use our Data Views instead of StatsD
Almost every introduction to Graphite talks about using StatsD to send your application metrics. StatsD is a pre-aggregation service which is required because of a restriction in vanilla Graphite. Specifically, Graphite's Whisper database can only accept one datapoint per timestamp; if you have a busier metric, the last value you send wins and everything else is lost. StatsD deals with this by collecting datapoints over each 10-second window and creating a series of new metrics derived from the values it has received.
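To make the difference concrete, here is a minimal Python sketch of the two behaviours described above. It is an illustration only, not Whisper's or StatsD's actual code: the function names, the sample data, and the choice to bucket by the datapoint's own timestamp are all assumptions made for the example.

```python
from collections import defaultdict


def last_write_wins(datapoints):
    """Keep one value per timestamp; a later value overwrites an earlier one."""
    store = {}
    for timestamp, value in datapoints:
        store[timestamp] = value
    return store


def aggregate_by_window(datapoints, window=10):
    """Group values into fixed-size windows and derive summary metrics."""
    buckets = defaultdict(list)
    for timestamp, value in datapoints:
        # Align each timestamp to the start of its window.
        buckets[timestamp - timestamp % window].append(value)
    return {
        start: {
            "count": len(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical (timestamp, value) pairs; three of them share timestamp 100.
points = [(100, 12), (100, 30), (100, 18), (104, 40), (111, 5)]

lww = last_write_wins(points)
agg = aggregate_by_window(points)

print(lww)
print(agg)
```

In the first result only one of the three values sent at timestamp 100 survives, while the windowed version keeps a count, sum, minimum, maximum and mean for each 10-second span.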
This adds an additional layer of complication and a potential point of failure. It also hinders scaling: if you're running multiple StatsD servers, you need to ensure each metric's datapoints are all sent to the same instance, and since the basic StatsD server isn't capable of multithreading, it can get overwhelmed.
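One common way to keep each metric on a single instance is to pick the instance by hashing the metric name. The sketch below is an assumption-laden illustration of that idea, not a description of any particular StatsD proxy; the instance names and the `pick_instance` helper are hypothetical.

```python
import hashlib


def pick_instance(metric_name, instances):
    """Map a metric name to one instance, always the same one for that name."""
    # A cryptographic hash is used only because it is stable across processes,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.md5(metric_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


# Hypothetical StatsD instances.
statsd_instances = ["statsd-a:8125", "statsd-b:8125", "statsd-c:8125"]

first = pick_instance("app.checkout.latency", statsd_instances)
second = pick_instance("app.checkout.latency", statsd_instances)
print(first, second)
```

Note that with simple modulo hashing like this, changing the number of instances remaps most metric names, which is part of the operational overhead described above.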
Hosted Graphite deals with datapoint aggregation for you automatically using Data Views. Data Views help new users get started quickly and remove the hassle of dealing with StatsD on top of monitoring your application. Every metric Hosted Graphite receives is stored as 9+ different Data Views, and you can switch between views at any time. All you need to do is change the suffix on the metric.
Native Cluster for Easy Scaling
Another benefit HostedGraphite has over Graphite's Whisper storage is that it's cluster-based, which makes scaling easy and allows us to ingest millions of datapoints every minute.
Whisper uses a single file for every resolution of each metric, which allows it to create new metrics with no preparation. This was the major step forward that Graphite introduced when it was first developed. Unfortunately, modern monitoring of cloud and container-based services tends to produce metrics that are high-cardinality and low-frequency: containers and hosts can be scaled up and down easily, so there are lots of different metric names, but each one is short-lived. Whisper's file-based model struggles under that type of load.
Hosted Graphite, on the other hand, is built around a large-scale, cluster-based, non-relational database which only stores the datapoints it receives, unlike Whisper's whole-file system. Datapoints are only stored and retrieved when necessary, and writes are prioritised over reads - you can read more about that here.
Nine Views, Nine Times the Value
One of the advantages of this whole setup is that your metric limit gives you a lot more value than it would appear at first. With standard Graphite, if you want to see the average, minimum, maximum and 95th percentile of a timer metric, that's 4 different StatsD metrics that you need to store. But with Hosted Graphite, all of those views are available for your metric automatically. Hosted Graphite lets you see the same information with 1 metric instead of standard Graphite's 4. You can also choose different percentile values on the fly, and there are 5 other data views available if you need them!
With Hosted Graphite, you get the simplicity of sending metrics directly without needing StatsD, the reliability and speed of a cluster-based model, and more bang for your buck, since every metric received is stored as the equivalent of 9 metrics!
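As a rough illustration of why one stored series can answer all four questions, the Python sketch below derives the average, minimum, maximum and a percentile from a single list of raw timer values. It is not Hosted Graphite's implementation: the view labels, the sample timings, and the nearest-rank percentile method are assumptions chosen for the example.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def derive_views(values, pct=95):
    """Compute several summary views from one series of raw datapoints."""
    return {
        "avg": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
        f"p{pct}": percentile(values, pct),
    }


# Hypothetical request timings in milliseconds for one metric.
timings = [120, 80, 95, 300, 110, 105, 90, 100, 115, 85]

views = derive_views(timings)
median_view = derive_views(timings, pct=50)

print(views)
print(median_view)
```

Because the percentile is computed from the raw values on request, asking for a different percentile only means calling `derive_views` with another `pct`; nothing extra has to be stored in advance.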