Request schema validation, a double-edged sword

May 15, 2018
Engineering

Making sure data is valid can be a tedious process, especially for complex systems. We have many models in our system that are changed constantly - these models are controlled by our APIs. An example is our alerting API, which allows users to control their alerts via HTTP requests.

Over the past few years, we added some cool new stuff to our alerting feature. Because of this, we had to make changes to how we accept HTTP requests from our users. The request formats for some endpoints have changed - a lot - and to account for these changes, we had to change the validation logic for requests too.

The stone age (manual validation)

Previously, we used what we called “manual” validation: the bog-standard, line-by-line checking of fields and values to see if a request is valid. Yes, it can do all the validation that we need, but every change to the API meant adding more “manual” validation logic to match.

A major disadvantage of this approach is that we could not anticipate every possible edge case. The result was a never-ending cycle of users “discovering” edge cases and us devs scrambling to fix them.
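To make the problem concrete, here is a minimal sketch of what “manual” validation tends to look like. It is not our production code: the function name, the name field and the "above"/"below" type values are made up for illustration, while alert_criteria, above_value and below_value are borrowed from the error message example later in this post.

```python
def validate_alert_manually(request):
    """Return a list of error strings; an empty list means the request passed."""
    errors = []

    # Hypothetical field: every alert needs a non-empty name.
    name = request.get("name")
    if not isinstance(name, str) or not name:
        errors.append("name must be a non-empty string")

    criteria = request.get("alert_criteria")
    if not isinstance(criteria, dict):
        errors.append("alert_criteria must be an object")
        return errors

    # One hand-written branch per alert type (type names are assumptions).
    alert_type = criteria.get("type")
    if alert_type == "above":
        if "above_value" not in criteria:
            errors.append("above_value is required when type is 'above'")
    elif alert_type == "below":
        if "below_value" not in criteria:
            errors.append("below_value is required when type is 'below'")
    else:
        errors.append("type must be 'above' or 'below'")

    return errors


# Passes, as expected.
ok = validate_alert_manually(
    {"name": "cpu", "alert_criteria": {"type": "above", "above_value": 90}}
)

# Also passes, although above_value is a string and "notify" is a field
# nobody ever defined: neither case was anticipated by the checks above.
missed = validate_alert_manually(
    {
        "name": "cpu",
        "alert_criteria": {"type": "above", "above_value": "ninety"},
        "notify": 1,
    }
)
```

Every check that is missing here has to be discovered by someone first, which is exactly the cycle described above.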

Discovering metal (schema validation)

Our answer to this dilemma? The world of schema validation - a rigorous, adaptive form of validation that is easy to change and also protects against the myriad of unforeseen edge cases. For this, we used Cerberus: a simple, lightweight data validation library for Python that has zero dependencies. We defined a schema for these endpoints in our alerting API:

  • Creating alerts.
  • Updating alerts.
  • Muting an alert.
  • Muting multiple alerts.
  • Unmuting an alert.
  • Creating notification channels.
  • Updating notification channels.

For each of these endpoints, we wrote a request schema that defines exactly what is and isn’t acceptable for each field in the request. Cerberus validates the received request against the endpoint’s schema in one line of code, and reports an error if the request doesn’t follow the schema.

Comparing the two validation methods, here’s what we found:

[Figure: a comparison chart displaying the advantages and disadvantages of schema validation and manual validation]

Accidentally stabbing ourselves

The biggest drawback that we saw was less helpful error messages, since Cerberus supplies its own set of error messages. For complex cases those messages were very unhelpful, so we resorted to leaving the “manual” validation in place for those cases. Here’s an example of a very unhelpful error message returned for a mistake in the alert criteria field when creating an alert:

https://gist.githubusercontent.com/HG-blog/2e42434f2876d032256e0f33a0238e4e/raw/57177575d9db41aa82eede24457aca55be9d7544/gistfile1.txt

It’s up to the developer to gauge whether schema validation is the right tool, and whether it’s worth the tradeoff of having less-clear error messages for your users. We also saw that not all “manual” validation logic can be replaced by schema validation. The primary purpose of schema validation is to check if request parameters are valid. We like to think of schema validation as bolting on steel plates to your kevlar vest to make it more bulletproof - but you can’t bolt it on just anywhere.

Wrapping up

Putting these problems aside, the outcome that we wanted from schema validation was a more rigorous validation method with minimal impact on the quality of error messages. It worked out pretty well, and we hope to implement it in our other APIs in the near future.

Error message from the gist linked above:

{
  "errors": {
    "alert_criteria": [
      {
        "type": [
          {
            "oneof": [
              "none or more than one rule validate",
              {
                "oneof definition 0": [
                  "unallowed value missing",
                  "field 'above_value' is required"
                ],
                "oneof definition 1": [
                  "unallowed value missing",
                  "field 'below_value' is required"
                ],
                "oneof definition 2": [
                  "unallowed value missing",
                  "field 'above_value' is required",
                  "field 'below_value' is required"
                ],
                "oneof definition 3": [
                  "field 'time_period' is required"
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}

Jerico Alcaras

Junior Developer at Hosted Graphite.
