Kruse-Net.dk

You are what you blog...

JSON Schema Proposal

Inspirations

During the design of this JSON Schema Proposal, two other proposals were major inspirations:

  1. Kris Zyp’s JSON Schema Proposal, and
  2. Thomas Messier’s JSON schemas/validation with CFJSON.

The pros and cons of these two proposals, as I see them, can be summarized like this:

json.com (Kris Zyp)

  Pros:
  • High level of detail
  • Many relevant attributes
  • At first glance looks suitable for both documentation and validation
  • Well-featured

  Cons:
  • Separation of “property definition” and “schema definition”
  • Some unclear attributes, like “unconstrained” and “format”
  • Two methods of defining both objects and arrays, some broken or unclear
  • Impossible to create schema for objects containing properties “items”, “*”, “final”, “id” and “basis”
  • Overly complex

epiphantastic.com (Thomas Messier)

  Pros:
  • Simple
  • Well structured

  Cons:
  • Poorly documented
  • Low level of detail
  • Duplicate property names
  • No separation of integers and floats

With this JSON Schema Proposal I aim for the simplest possible JSON-based format that is suitable for both documentation and validation. Kris Zyp’s proposal seems very promising, and mine resembles it quite a lot, but his has some severe defects that I try to address. Thomas Messier’s proposal (if it is to be seen as such; he makes no claim in that respect) does not have these defects but is generally not as “complete”. Hopefully you will find that this proposal includes the best features of both, with none of the flaws.

Schema definitions

A JSON Schema is a JSON object containing one or more properties (called “Schema attributes”) describing the structure or property which is the target of the Schema. All JSON Schemas must contain at least a “type” attribute. Depending on the value of the “type” attribute, other attributes may apply.

Schema types

These are the valid values of the Schema “type” attribute:

  • “array” – Schema target is a JSON array structure
  • “object” – Schema target is a JSON object structure
  • “integer” – Schema target is a JSON number value without fraction
  • “float” – Schema target is a JSON number value with or without fraction
  • “string” – Schema target is a JSON string value
  • “boolean” – Schema target is a JSON boolean value
  • “datetime” – Schema target is a JSON string value conforming to a specified date/time mask
  • “any” – Schema target can be any of the above-mentioned types
  • [ <type>, <type>, ... ] – Schema target can be one of the specified types

Note that “any” is equivalent to [ "array", "object", "integer", "float", "string", "boolean", "datetime" ] but is the preferred syntax.
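To make the type list concrete, here is a minimal Python sketch of how a decoded JSON value could be tested against a “type” attribute. The names matches_type and is_number are hypothetical and not part of the proposal. Two assumptions are made: a number such as 3.0 counts as an “integer” because its fraction is zero, and “datetime” is only checked as a string here, with the mask left out.

```python
# Sketch only: maps each proposed "type" value to a check on a decoded JSON value.
ALL_TYPES = ["array", "object", "integer", "float", "string", "boolean", "datetime"]

def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def matches_type(value, type_attr):
    """Return True if value matches a Schema "type" attribute."""
    if isinstance(type_attr, list):              # [ <type>, <type>, ... ]
        return any(matches_type(value, t) for t in type_attr)
    if type_attr == "any":                       # shorthand for the full list
        return matches_type(value, ALL_TYPES)
    if type_attr == "array":
        return isinstance(value, list)
    if type_attr == "object":
        return isinstance(value, dict)
    if type_attr == "integer":
        # Assumption: a float with a zero fraction (3.0) is accepted.
        return is_number(value) and (isinstance(value, int) or value.is_integer())
    if type_attr == "float":
        return is_number(value)
    if type_attr in ("string", "datetime"):      # mask check not done here
        return isinstance(value, str)
    if type_attr == "boolean":
        return isinstance(value, bool)
    raise ValueError("unknown type: %r" % (type_attr,))
```

In this sketch null matches no type at all, on the assumption that null is handled separately through the “nullable” attribute.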

Schema attribute map

The following table lists the other available Schema attributes and their applicability for each “type” value.

attr. \ type  array  object  integer  float  string  boolean  datetime  any
description   Valid  Valid   Valid    Valid  Valid   Valid    Valid     Valid
optional      Valid  Valid   Valid    Valid  Valid   Valid    Valid     Valid
transient     Valid  Valid   Valid    Valid  Valid   Valid    Valid     Valid
nullable      Valid  Valid   Valid    Valid  Valid   Valid    Valid     Valid
default       Valid  Valid   Valid    Valid  Valid   Valid    Valid     Valid
options       N/A    N/A     Valid    Valid  Valid   Valid    Valid     N/A
items         Valid  N/A     N/A      N/A    N/A     N/A      N/A       N/A
minlength     Valid  N/A     N/A      N/A    Valid   N/A      N/A       N/A
maxlength     Valid  N/A     N/A      N/A    Valid   N/A      N/A       N/A
properties    N/A    Valid   N/A      N/A    N/A     N/A      N/A       N/A
final         N/A    Valid   N/A      N/A    N/A     N/A      N/A       N/A
basis         N/A    Valid   N/A      N/A    N/A     N/A      N/A       N/A
min           N/A    N/A     Valid    Valid  N/A     N/A      N/A       N/A
max           N/A    N/A     Valid    Valid  N/A     N/A      N/A       N/A
pattern       N/A    N/A     N/A      N/A    Valid   N/A      N/A       N/A
mask          N/A    N/A     N/A      N/A    N/A     N/A      Valid     N/A
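The table above can be encoded directly as data. The following Python sketch does that and reports attributes that the table marks N/A for a Schema's type. The names COMMON, APPLICABLE and inapplicable_attributes are hypothetical. The sketch assumes a single string “type” (the table does not say how an array of types should be treated) and ignores keys starting with "$", such as "$id".

```python
# Attributes marked Valid for every type, plus "type" itself.
COMMON = {"type", "description", "optional", "transient", "nullable", "default"}

# One entry per column of the attribute map.
APPLICABLE = {
    "array":    COMMON | {"items", "minlength", "maxlength"},
    "object":   COMMON | {"properties", "final", "basis"},
    "integer":  COMMON | {"options", "min", "max"},
    "float":    COMMON | {"options", "min", "max"},
    "string":   COMMON | {"options", "minlength", "maxlength", "pattern"},
    "boolean":  COMMON | {"options"},
    "datetime": COMMON | {"options", "mask"},
    "any":      COMMON,
}

def inapplicable_attributes(schema):
    """Return, sorted, the attributes in schema that are N/A for its type."""
    allowed = APPLICABLE[schema["type"]]
    # Assumption: "$"-prefixed keys are referencing properties, not attributes.
    used = {key for key in schema if not key.startswith("$")}
    return sorted(used - allowed)
```

For example, a Schema of type “string” that also carries a “min” attribute would be reported with ["min"].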

Schema attributes

“description”
Textual description of this schema, possibly in multiple languages. The value can be either a string or an object where the keys are language codes and the values are description strings, like { "en": "English description", "da": "Dansk beskrivelse" }. Default value is the empty string.

“optional”
A boolean indicating whether the target is optional, or a list of types for which the target is optional. The value can be either a boolean or an array of type values. Default value is false.

“transient”
A boolean indicating whether the target is transient/volatile, and not meant for persisting. Default value is false.

“nullable”
A boolean indicating whether null is a valid value for the target. Default value is false.

“default”
For an optional target, the default value when the property is omitted. Only applicable if “optional” is true. Default value is null, meaning “no default”.
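One possible reading of “default” is that a consumer fills in omitted optional properties before using an instance. The Python sketch below illustrates that reading for an object Schema with a “properties” attribute. The name apply_defaults is hypothetical, and only the boolean form of “optional” is handled; the list-of-types form is ignored.

```python
import copy

def apply_defaults(instance, schema):
    """Return a copy of an object instance with omitted optional properties filled in."""
    result = dict(instance)
    for name, prop_schema in schema.get("properties", {}).items():
        if name == "*" or name in result:
            continue                      # wildcard entry, or value already present
        default = prop_schema.get("default")
        # A null default means "no default", so nothing is added in that case.
        if prop_schema.get("optional") is True and default is not None:
            result[name] = copy.deepcopy(default)
    return result
```

The deep copy keeps a mutable default (an array or object) from being shared between instances.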

“options”
For primitive data types, an array listing all valid values for the target. Default value is null, meaning “no options” (as in “everything goes”).

“items”
For arrays, a Schema for the items in the array (so the value must be a Schema). Default value is the Root Schema: { "type": "any" }.

“minlength”
For arrays and strings, the minimum length of the target. The value must be a non-fractional, non-negative number. Default value is 0.

“maxlength”
For arrays and strings, the maximum length of the target. The value must be a non-fractional, non-negative number or null. Default value is null meaning “no maximum”.

“properties”
For objects, an object where the keys are property names and the values are Schemas for those properties. Default value is the empty object: { }. One special property name “*” exists, meaning “any property name not explicitly mentioned”, thus providing a way to specify a Schema for unknown properties.

“final”
For objects, a boolean indicating whether the target is final, i.e. cannot contain any properties besides those described in the “properties” attribute of the Schema. Default value is false.
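The interplay of “properties”, “*”, “optional” and “final” can be sketched as a check on the property names of an instance. The Python function below is a hypothetical illustration, not a full validator. It rests on three assumptions the text above does not settle: a “*” entry permits unknown names even when “final” is true, only the boolean form of “optional” is considered, and "$"-prefixed instance properties such as "$schema" are exempt.

```python
def check_properties(instance, schema):
    """Return a list of problems with the property names of an object instance."""
    properties = schema.get("properties", {})
    named = {k: v for k, v in properties.items() if k != "*"}
    errors = []
    # Every explicitly named property is required unless marked optional.
    for name, prop_schema in named.items():
        if name not in instance and not prop_schema.get("optional", False):
            errors.append("missing required property: " + name)
    # Unknown names are rejected only by a final Schema without a "*" entry.
    for name in instance:
        if name.startswith("$") or name in named:
            continue
        if schema.get("final", False) and "*" not in properties:
            errors.append("unexpected property: " + name)
    return errors
```

Validating each property value against its own Schema (or against the “*” Schema) would be a further, recursive step that is left out here.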

“basis”
For objects, the basis Schema, i.e. one that we should inherit all attributes from. Schema attributes specified in this Schema override those in the basis Schema. Default value is null, meaning “no basis Schema”.
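Inheritance through “basis” can be sketched as a merge in which the inheriting Schema's attributes win. The function name resolve_basis is hypothetical. The sketch reads the rule literally: an overriding attribute replaces the inherited one as a whole, so a “properties” object in the inheriting Schema replaces the inherited “properties” and is not merged with it. Whether that is the intended behaviour is an assumption.

```python
def resolve_basis(schema):
    """Return a new Schema with the attributes of its "basis" chain merged in."""
    own = {k: v for k, v in schema.items() if k != "basis"}
    basis = schema.get("basis")
    if basis is None:                 # null means "no basis Schema"
        return own
    merged = resolve_basis(basis)     # the basis may itself have a basis
    merged.update(own)                # attributes of this Schema override
    return merged
```

The result no longer carries a “basis” attribute, since everything inherited has been copied in.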

“min”
For integers and floats, the minimum value of the target. Default value is null, meaning “no minimum value”.

“max”
For integers and floats, the maximum value of the target. Default value is null, meaning “no maximum value”.

“pattern”
For strings, a string representation of a regular expression matching all valid values of the target. Default value is ".*", which matches anything.
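The value constraints described so far (“options”, “minlength”, “maxlength”, “min”, “max” and “pattern”) could be checked along these lines. This is a hedged Python sketch with a hypothetical function name, check_primitive. It assumes the pattern must match the whole string, and that Python's re syntax is close enough to whatever regular expression dialect is intended, which the proposal does not specify.

```python
import re

def check_primitive(value, schema):
    """Return a list of constraint violations for a string or number value."""
    errors = []
    options = schema.get("options")
    if options is not None and value not in options:
        errors.append("value is not one of the listed options")
    if isinstance(value, str):
        if len(value) < schema.get("minlength", 0):
            errors.append("shorter than minlength")
        maxlength = schema.get("maxlength")
        if maxlength is not None and len(value) > maxlength:
            errors.append("longer than maxlength")
        pattern = schema.get("pattern")
        # Assumption: the pattern has to match the entire string.
        if pattern is not None and re.fullmatch(pattern, value) is None:
            errors.append("does not match pattern")
    elif isinstance(value, (int, float)):
        minimum, maximum = schema.get("min"), schema.get("max")
        if minimum is not None and value < minimum:
            errors.append("below min")
        if maximum is not None and value > maximum:
            errors.append("above max")
    return errors
```

An empty list means the value satisfies every constraint present in the Schema; absent attributes fall back to the defaults given above and impose no restriction.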

“mask”
For datetimes, a string containing a date/time format description. Default value is "YYYY-MM-DDThh:mm:ss", the suggested ISO 8601 composite format.
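The proposal does not define the mask syntax beyond the default value, so the following Python sketch assumes a token set inferred from that default only: YYYY, MM, DD, hh, mm and ss, with every other character taken literally. The names TOKENS, mask_to_strptime and matches_mask are hypothetical. The mask is translated to a format string for datetime.strptime from the standard library.

```python
import re
from datetime import datetime

# Assumed tokens, inferred from the default mask "YYYY-MM-DDThh:mm:ss".
TOKENS = {"YYYY": "%Y", "MM": "%m", "DD": "%d",
          "hh": "%H", "mm": "%M", "ss": "%S",
          "%": "%%"}   # a literal percent sign must be escaped for strptime

def mask_to_strptime(mask):
    """Translate a mask into a strptime format string in a single pass."""
    return re.sub(r"YYYY|MM|DD|hh|mm|ss|%", lambda m: TOKENS[m.group()], mask)

def matches_mask(value, mask="YYYY-MM-DDThh:mm:ss"):
    """Return True if value parses as a date/time under the given mask."""
    try:
        datetime.strptime(value, mask_to_strptime(mask))
    except ValueError:
        return False
    return True
```

Note that strptime is somewhat lenient (it accepts a one-digit month for %m, for instance), so a stricter validator would need an additional check on field widths.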

Referencing

For JSON Schemas I suggest using a variation of the JSPON scheme for referencing, in which two special instance properties are defined:

  • "$id" – defines a globally unique id (a string) for a JSON instance.
  • "$idref" – indicates that the object containing the "$idref" property should be (considered) replaced by a reference to the JSON instance with the specified id.

Schemas could be referenced from instance objects (and only from objects, unfortunately), with a third special property "$schema". Thus if you have a Schema like this:

{
    "$id": "my-schema",
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "age": { "type": "integer" },
        "gender": { "type": "string", "options": [ "M", "F" ], "optional": true }
    }
}

You could reference that Schema from an instance like this:

{
    "name": "Jakob",
    "age": 32,
    "$schema": { "$idref": "my-schema" }
}
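How a consumer might resolve such references can be sketched in Python as two passes: one that indexes every object carrying a "$id", and one that replaces every object carrying a "$idref" by its target. The function names index_ids and resolve are hypothetical, and the sketch assumes all referenced instances are available in memory; how ids are looked up globally is not covered.

```python
def index_ids(node, index=None):
    """Collect every object that has a "$id" into a dict keyed by that id."""
    if index is None:
        index = {}
    if isinstance(node, dict):
        if "$id" in node:
            index[node["$id"]] = node
        for child in node.values():
            index_ids(child, index)
    elif isinstance(node, list):
        for child in node:
            index_ids(child, index)
    return index

def resolve(node, index):
    """Return node with each {"$idref": ...} object replaced by its target."""
    if isinstance(node, dict):
        if "$idref" in node:
            return index[node["$idref"]]     # KeyError if the id is unknown
        return {key: resolve(child, index) for key, child in node.items()}
    if isinstance(node, list):
        return [resolve(child, index) for child in node]
    return node
```

Applied to the two examples above, resolving the instance against an index built from the Schema would leave "$schema" pointing at the Schema object itself.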

Schema for Schema

As a proof of concept, and for possible validation of JSON Schemas, please see this Schema-for-Schema. It also serves as an example for now.

Todo

  • Support for defining tuple formats
  • Possibly a “persist” property giving hints about how/where to persist data (for binding to a database)
  • Additional “@href” referencing scheme