Coffee Space 
Disclaimer: I’ve had a few drinks - I’m angry, upset and annoyed. This is me just offloading some random ideas that I’ll likely regret in the morning!
Let me start with a simple statement: I like JSON. I’ve been using it now for coming up to 10 years, since it was first introduced to me as part of a robotics project by a good mentor I was lucky to have. Back then (and likely still now), he was big on JavaScript and NodeJS - two things I don’t share a passion for.
JSON has served me well over the years. I’ve used it for some large projects, including another robotics team and my PhD. I’ve even been foolish enough to write my own JSON parser in two different languages. I’ve come to quite like it.
Generally I am a fan of a few things:
One thing I also like (but don’t use often) is templating. For example, you define something like the following:
{
  "key": { "type": "int", "min": "0", "max": "100", "default": "50" }
}
Then, when you are parsing your configuration, you have an easy way to check the values are valid and even fall back to sane defaults if they are not. Your JSON configuration can then fail-safe. With autonomous robotics, sometimes you want to be able to change configuration as the robot is up and running - and sometimes you accidentally set a bad or insane value. The last thing you want is a powerful humanoid robot trying to kill itself or you!
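As a rough sketch of how such a template could be applied, here is one way to do it in Python. The names TEMPLATE and load_config are made up for illustration, and only the "int" type from the example above is handled; anything unreadable or out of range falls back to the template default.

```python
import json

# Hypothetical template, in the same shape as the example above.
TEMPLATE = json.loads("""
{
  "key": { "type": "int", "min": "0", "max": "100", "default": "50" }
}
""")


def load_config(raw_text, template=TEMPLATE):
    """Return a dict where every templated key holds a valid integer."""
    try:
        raw = json.loads(raw_text)
    except json.JSONDecodeError:
        raw = {}  # unreadable config: use the defaults everywhere
    if not isinstance(raw, dict):
        raw = {}
    config = {}
    for name, rule in template.items():
        default = int(rule["default"])
        try:
            value = int(raw.get(name, default))
        except (TypeError, ValueError, OverflowError):
            value = default  # not usable as an integer
        if not int(rule["min"]) <= value <= int(rule["max"]):
            value = default  # out of range, so fail safe
        config[name] = value
    return config
```

With this, load_config('{"key": 5000}') quietly gives back the default of 50 rather than passing an insane value on to the robot.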
Now for the next part…
JSON, I like you and all, but we have some things to talk about.
It’s possible to set strings (at least in some parsers) as:
{
  "key": "value",
  'another-key': 'another-value'
}
Mixing and matching " and ' is definitely a mistake. Given that ' and a backtick can easily be confused visually, I think it would be best to just use ". This is technically what the specification requires, but not every parser enforces it.
But then if reducing confusion is the goal, JSON is by default UTF-8. Sounds all good and well, but consider that a literal Unicode character and its \uXXXX escape are both valid. So you may or may not need to decode the string depending on your application. I believe by default it should be ASCII with all Unicode pre-escaped.
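To illustrate the two spellings, here is a small Python sketch using the standard json module. The character é is just an arbitrary example. Both forms decode to the same string, and json.dumps happens to default to the ASCII-with-escapes output suggested above.

```python
import json

# The same value written as a literal character and as an escape.
literal = json.loads('{"key": "é"}')
escaped = json.loads('{"key": "\\u00e9"}')
same_after_parsing = literal == escaped

# ensure_ascii=True is the default, so non-ASCII is escaped on output.
ascii_only = json.dumps({"key": "é"})
# Opting out writes the character as-is instead.
utf8_text = json.dumps({"key": "é"}, ensure_ascii=False)
```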
Numbers in JSON can be arbitrarily large - the format sets no limits at all. Each library implements its own arbitrary numerical parsing. Depending on the library, this may or may not convert to something useful - and may or may not throw an error of some type:
{
  "key": 58962345984235890432756982347652735347594375624938759483574398572349573454398257023495704893
}
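To make the library-dependence concrete, here is what Python's standard json module does with that value. This only shows one library's behaviour; parse_int is a real hook in json.loads, used here to imitate a parser that only has doubles and one that leaves the digits alone.

```python
import json

text = '{"key": 58962345984235890432756982347652735347594375624938759483574398572349573454398257023495704893}'

# Python keeps every digit, because its int type has arbitrary precision.
as_int = json.loads(text)["key"]

# A parser that only has doubles would silently round the value.
as_float = json.loads(text, parse_int=float)["key"]

# Handing the digits over as text leaves the decision to the caller.
as_text = json.loads(text, parse_int=str)["key"]

precision_lost = int(as_float) != as_int
```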
As there is no consensus, personally I would just suggest that everything is a string, and let the implementer parse their own data types. I can imagine that some people may want to put letters around their numbers to indicate their type too, such as:
{
  "binary": "b1100",
  /* Common format for addresses */
  "hex": "0xC",
  /* Common format for general hex */
  "also-hex": "Ch",
  /* Common format for hex colours */
  "again-hex": "#C",
  "byte": "12b",
  "integer": "12",
  "long": "12l",
  "float": "12.0f"
}
As you can see, there are literally tonnes of possible notations - and some will be right for your application, some will not. Parsers should leave this to the implementer and only offer helper functions.
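A minimal sketch of what such helper functions might look like in Python, covering only the spellings in the example above. All four function names are hypothetical, and the caller decides which helper applies to which key.

```python
def parse_binary(text):
    """Parse the "b1100" spelling: a leading b, then binary digits."""
    body = text.strip().lower()
    if body.startswith("b"):
        body = body[1:]
    return int(body, 2)


def parse_hex(text):
    """Parse the three hex spellings: "0xC", "Ch" and "#C"."""
    body = text.strip().lower()
    if body.startswith("0x"):
        body = body[2:]
    elif body.startswith("#"):
        body = body[1:]
    elif body.endswith("h"):
        body = body[:-1]
    return int(body, 16)


def parse_integer(text):
    """Parse "12", "12b" (byte) or "12l" (long), ignoring the suffix."""
    body = text.strip().lower()
    if body.endswith(("b", "l")):
        body = body[:-1]
    return int(body)


def parse_float(text):
    """Parse "12.0" or "12.0f"."""
    body = text.strip().lower()
    if body.endswith("f"):
        body = body[:-1]
    return float(body)
```

Note that the string alone is ambiguous: "12b" is the byte 12 to parse_integer but would be a perfectly good hex number to parse_hex. That is exactly why the choice belongs to the implementer rather than the parser.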
This is especially true when it comes to support for exponents using
E and e characters in the numbers. Most people
aren’t going to be using this functionality and it isn’t obvious how it
should be supported.
And if it wasn’t complex enough already, numbers support signing. Not just negative signing with -, but in some lenient parsers also positive signing with + (the specification itself only allows + inside an exponent). Technically that includes zero too, which can lead to awkward things like this:
{
  "num-1": "+0",
  "num-2": "-0"
}
Some languages implement numbers such that +0 and -0 are distinct values - ouch. Depending on your application, this may or may not be a bug. Do you want your parser to leave this in or not? Personally I recommend dropping + altogether and leaving the zero case to the implementer. Some people may even want that distinction!
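As a rough illustration of the signed zero problem, here is how Python and its standard json module behave. Other languages and parsers may well differ, which is rather the point. The function name accepts_leading_plus is made up for this sketch.

```python
import json
import math

plus_zero = 0.0
minus_zero = json.loads("-0.0")  # parsed as a float, sign kept

# The two zeros compare equal...
compare_equal = plus_zero == minus_zero
# ...but the sign is still there if you go looking for it.
sign_of_minus = math.copysign(1.0, minus_zero)

# Without a fraction the value becomes an int, and the sign is gone.
int_zero = json.loads("-0")


def accepts_leading_plus():
    """Report whether this parser tolerates a leading + on a number."""
    try:
        json.loads("+0")
    except json.JSONDecodeError:
        return False
    return True
```

In this particular parser compare_equal is True, sign_of_minus is -1.0, and a leading + is rejected outright.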
Okay, okay, that must be it? Nope. Booleans. Depending on the parser, all of these could or could not be ‘true’:
{
  "a": true,
  "b": True,
  "c": 1,
  "d": "true",
  "e": "True",
  "f": 3426435,
  "g": "random text"
}
Clearly we have a problem here. Again it makes the most sense to leave this problem to the user of the parser with helper functions, with everything as a string. Data types are simply not universal.
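One possible shape for such a boolean helper, sketched in Python. The name as_bool and the two word lists are assumptions for illustration; the idea is that the application, not the parser, spells out exactly what counts as true.

```python
# Hypothetical word lists: each application would choose its own.
TRUE_WORDS = {"true", "yes", "on", "1"}
FALSE_WORDS = {"false", "no", "off", "0"}


def as_bool(text, default=False):
    """Interpret a string as a boolean, or return default if unsure."""
    word = str(text).strip().lower()
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    return default  # "3426435", "random text" and friends end up here
```

With these lists, "True" and "1" are true, while "3426435" and "random text" fall through to whatever default the caller asked for.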
I believe in general it was arrogant to not offer some versioning, especially with so many ‘arbitrary’ implementation details. I believe the perfect place for this would have been at the beginning, like this:
"cool-json-version": {
  "key": "value"
}
Failing that, if comments are supported you could even have:
/* cool-json-version */
{
  "key": "value"
}
And now for some general points about implementations:
A lookup for a missing value should just return something like "" or null. There is no need to crash the program because a value could not be found. Sure, there is exception handling, but this just creates more code than you really want in order to access a simple configuration file.
Rant over.
I still like JSON and I’m not recommending yet another standard. I don’t believe I could do better. But for the most part, I will personally be using a subset of it and encourage others to do the same. It’s strings only for me.
Comments
No comment. Literally. There is zero way to explain why something has been assigned the value it has. This is one of the major drawbacks of JSON as a configuration language. In a properties file you could at least put a # comment line above the key.
In my opinion, the most obvious thing to do would have been to use C-style comments. I would literally have gone with:
{
  /* This is set like this for reasons */
  "key": "value"
}
Here /* indicates the start of a comment and */ indicates the end of a comment. This is the same approach as CSS. I would avoid // for comments as they are dependent on line endings, and for a parser things could get a little complicated with Windows/Unix line endings, for example. It’s my opinion that supporting that line-based type of comment would be a mistake.
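For what it’s worth, the block comment style is not hard to bolt on in front of an existing parser. Below is a minimal Python sketch, with strip_comments and loads_with_comments as made-up names. It removes anything between /* and */ before handing the text to the standard json module, while leaving string contents alone, and it does not try to handle // at all.

```python
import json
import re

# Match a whole JSON string first, so comment markers inside strings
# are never mistaken for comments; otherwise match a block comment.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|/\*.*?\*/', re.DOTALL)


def strip_comments(text):
    """Replace each block comment with a space, keeping strings intact."""
    def keep_strings(match):
        token = match.group(0)
        return token if token.startswith('"') else " "
    return _TOKEN.sub(keep_strings, text)


def loads_with_comments(text):
    """Parse JSON text that may contain block comments."""
    return json.loads(strip_comments(text))
```

Because the comment has an explicit end marker, line endings never come into it, which is the main attraction over the line-based style.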