Comments in JSON

127 points by p3drosola 12 years ago | 176 comments

There is an interview with the inventor of JSON somewhere. In that interview he explained why he did not allow comments in JSON like in XML. He said - if I remember correctly - that it was intentional to not have comments in JSON. The reason was that comments could be misused to add additional information for a parser. For example, in XML you could use comments, and a special parser could use these comments to create code while parsing. He did not want that. He wanted every JSON parser to be a JSON parser and nothing more. If you wanted to have comments in JSON, he said, you could simply make the comments inline and have a convention for the keys which are comments: for example, every key ending with _comment could have a value which is then seen as a comment by the application but not by the parser.
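
The key-naming convention described above can be sketched in a few lines. This is only an illustration in Python, not anything from the interview: the `_comment` suffix follows the comment's example, and the helper name `strip_comment_keys` is made up here. The parser stays a plain JSON parser; the application alone decides which keys to ignore.

```python
import json

COMMENT_SUFFIX = "_comment"  # assumed convention, not part of JSON itself


def strip_comment_keys(value):
    """Recursively drop object keys that end in COMMENT_SUFFIX."""
    if isinstance(value, dict):
        return {
            key: strip_comment_keys(item)
            for key, item in value.items()
            if not key.endswith(COMMENT_SUFFIX)
        }
    if isinstance(value, list):
        return [strip_comment_keys(item) for item in value]
    return value


raw = (
    '{"port_comment": "must match the proxy", "port": 8080, '
    '"hosts": [{"name": "a", "name_comment": "primary"}]}'
)
config = strip_comment_keys(json.loads(raw))
```

Because every key is unique, this does not depend on how a parser treats duplicate names.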

jaredmcateer 12 years ago | |

Yes the JSON spec was designed with interoperability in mind, I don't believe Crockford claims to have invented JSON, merely discovered it.

That said if you want your Static JSON objects to have comments, just pipe the JSON object through a minifier to strip comments before parsing.
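
A rough sketch of that pre-processing step, assuming JavaScript-style `//` and `/* */` comments. The function name `strip_js_comments` is hypothetical, and this is a hand-rolled stand-in for a real minifier rather than a description of any particular tool. The one thing it must get right is leaving comment-like text inside strings alone.

```python
import json


def strip_js_comments(text):
    """Remove // and /* */ comments that appear outside JSON strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character too, so an escaped quote
                # does not end the string early.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip to the end of the line, keep the newline.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            # Block comment: skip past the closing */ (or to the end if unterminated).
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


commented = """
{
    // which upstream to use
    "url": "http://example.com/a//b",  /* slashes inside strings survive */
    "retries": 3
}
"""
config = json.loads(strip_js_comments(commented))
```

The file on disk is then "JSON plus comments" rather than JSON, so anything that reads it without the stripping step will reject it.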

TranceMan 12 years ago | | |

You are correct - confirmed in this video: Lessons of JSON

'A recent (and short) IEEE Computing Conversations interview with Douglas Crockford about the development of JavaScript Object Notation (JSON) offers some profound, and sometimes counter-intuitive, insights into standards development on the Web.'

http://inkdroid.org/journal/2012/04/30/lessons-of-json/

{ Thank you Douglas for your vision :) }

jerf 12 years ago | | |

He both invented and discovered it. Yes, the object literal syntax existed, but he also carefully (and IMHO correctly) specified a strict subset as well, for these interoperability reasons. For instance, Javascript is happy with {a: 1}, but that is not legal JSON. It's a very well done standard.
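
The `{a: 1}` case is easy to check against a strict parser. A small illustration using Python's standard `json` module (the variable names are just for the example):

```python
import json

legal = json.loads('{"a": 1}')  # quoted name: valid JSON

try:
    json.loads("{a: 1}")  # unquoted name: fine as a JavaScript literal, not JSON
    unquoted_accepted = True
except json.JSONDecodeError:
    unquoted_accepted = False
```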

benesch 12 years ago | |

Douglas Crockford has also posted his explanation on Google+:

https://plus.google.com/118095276221607585885/posts/RK8qyGVa...

wissler 12 years ago | |

"I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability." -- Crockford

This is horrific design reasoning. It's an authoritarian, presumptuous, "punish everyone in the classroom because one child misbehaves" mentality.

Comments would be useful in JSON because comments are useful in code, and JSON is code. For example, I might have a config file that I'm typing in that I want to leave a documentation trail for.

Don't tell me I can do a silly thing like redefine a field, as if it's "neat". It's an abomination that I have to resort to such things. And guess what: by resorting to such things I can still do precisely what Crockford claims he was trying to prevent. So his rationale is not only insulting to one's intelligence, it's sheer stupidity.

IanCal 12 years ago | | |

> It's an authoritarian ...

Which is pretty much what a specification is.

It's one or more people saying "This is how things are if you call them X".

> presumptuous

Presumptuous? It was in response to the feature being abused!

> "punish everyone in the classroom because one child misbehaves" mentality

No more than creating laws is. A significant subset of the population are misusing it in such a way as could cause widespread damage. It is a minor inconvenience to the 'law abiding people' (particularly given that any comments would be removed if read in and spat out by any program). There are workarounds ("field_comment":"some comment") or if that's not enough, use another format. Use one that allows comments, there are many.

> Don't tell me I can do a silly thing like redefine a field, as if it's "neat". It's an abomination that I have to resort to such things

It's also completely unreliable, it's a terrible solution and nobody should use it. I think we're fully in agreement here.

> And guess what: by resorting to such things I can still do precisely what Crockford claims he was trying to prevent. So his rationale is not only insulting to one's intelligence, it's sheer stupidity.

No you can't. The point was to stop people adding pre-processing commands or other such things to json, which would be in random formats and invisible to some parsers (as comments should be), visible and important to others. You don't want to pass a valid piece of JSON through a parser and end up with two different outcomes dependent on something in a comment, do you? Or have to use parser X or Z because Y doesn't understand directive A, but it does understand directive B and C, and while Z understands C, and X knows B, Z doesn't, so I have to use the version from a pull request from DrPotato which I think supports...

What I'm saying is that there is a benefit in simple standards.

nonchalance 12 years ago | | |

> and JSON is code

JSON is data. It appears to be JS code, but JSON is data. Data is not code ( http://www.c2.com/cgi-bin/wiki?DataAndCodeAreNotTheSameThing ). That's why the idea of data holding parsing directives is silly. If you want to do that, then embed that in the data (hold a MsgType key in the data records). There's no need for comments unless you are trying to use it for something other than raw data.

pdeuchler 12 years ago | | |

So is all opinionated design "stupid"?

I do not presume to know who you are, or what you have accomplished, but there are few people with the professional and academic background that qualify to be able to call Douglas Crockford "stupid".

jdp 12 years ago | | |

JSON isn't a configuration language, it's just another data encoding format with the added benefit of being readable by humans. That and its ubiquity make it an appealing choice for stuff like ad-hoc configuration at first glance, but it's not the best choice. If you want a config language for shared human and machine consumption, use one designed for that purpose. JSON is pretty much just an encoding that is easy for humans to inspect and debug.

jmcdonald-ut 12 years ago |

I'm sure there are counter points to what I'm about to bring up, but three observations:

1. In my experience JSON is frequently output programmatically, and taken in programmatically. Comments are not useful in these cases.

2. The only time comments could be perceived as useful then would be when parsing JSON by eye or hand. However, it is not difficult to parse JSON and understand it unless the keys have used obfuscated names. If key naming is obfuscated, comments aren't really the correct solution.

3. "An object is an unordered set of name/value pairs", as mentioned by jasonlotito and others earlier. There is no guarantee that a JSON parser will give you the right value if there are two of the same keys in the same scope.

CanSpice 12 years ago |

Given the RFC says "The names within an object SHOULD be unique", there's nothing stopping me from writing a parser that takes the first name/value pair and throws all the others on the floor. Or even better, picks a random name/value pair when the same name appears. Both of these behaviours are allowed by the RFC, and would break this hack.

Putting comments into JSON in this way is a hack and shouldn't be used by anybody who has any interest in writing maintainable software. Relying on ambiguities in an RFC and someone saying "JSON parsers work the same way" is a good way to end up with a really obscure bug in the future.
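
To make the two behaviours concrete, here is a sketch using Python's standard `json` module, which by default keeps the last value for a repeated name. The `first_wins` hook is a hypothetical stand-in for a parser that keeps the first pair instead; the sample document is made up.

```python
import json

doc = '{"timeout": "comment: seconds before giving up", "timeout": 30}'


def first_wins(pairs):
    """Keep the first value seen for each name and ignore later ones."""
    result = {}
    for name, value in pairs:
        result.setdefault(name, value)
    return result


last = json.loads(doc)  # default behaviour: last value kept
first = json.loads(doc, object_pairs_hook=first_wins)  # the "comment" survives instead
```

Same input, two parsers that both accept it, two different configurations.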

serichsen 12 years ago | |

At least in ECMA-262 5, Ch. 15.12.2, there is a NOTE: "In the case where there are duplicate name Strings within an object, lexically preceding values for the same key shall be overwritten."

It still does not feel right.

bzbarsky 12 years ago | |

Assuming you mean RFC 4627, you're quoting the restrictions on what character streams can be called "JSON". The "should" means that if your names are not unique you can still call it "JSON", but you should think twice about it.

The parsing behavior for JSON is not defined at all in RFC 4627, actually. Browsers (and Node, since it's using a browser js engine) use the parsing specification in ECMA-262 edition 5 section 15.12.2.

Note that ES5 section 15.12 in general is much stricter than RFC 4627, as it explicitly points out if you read it.

adamtj 12 years ago |

This is misguided. You don't need comments in a JSON config file. Why? Because you don't use JSON for config files that need comments.

JSON is like duc(k|t) tape. It's really easy to stick two things together with it. That doesn't mean you always should. It's the simple thing that gets the job done so you can focus on what matters.

One shouldn't pick JSON for your config files and then hold it up as good design. "Look at me, I'm daring and _not using XML_!" Using JSON is crap design, but good engineering means sometimes picking something crappy and not wasting effort on things that don't matter in the end.

If your configuration files become both complicated and important enough that you need comments, then you should stop using JSON. If your duck tape job starts needing additional reinforcement, then you should probably just get rid of the duct tape and do it right.

If one of your requirements is a sufficiently trendy yet commentable config language, look into YAML. Also, gaffer tape. The white kind is easier to write on.

glhaynes 12 years ago | |

If crap design like JSON is the right engineering choice sometimes (and I agree that it is), that seems like an argument that adding comments in this crappy way may sometimes be the right engineering choice.

IanCal 12 years ago | | |

Relying on undefined behaviour in a parser for comments is something I find quite hard to define as "the right engineering choice" in any situation.

tieTYT 12 years ago | |

Yeah maybe you don't use JSON for config files that need comments, but that's because there's no documented way of how to put comments in JSON. The article solved the problem.

Actually, I'm 100% playing the devil's advocate here. I'll even flip-flop to prove it. Regarding the article, I doubt that every JSON parser will let this slide. To me that's an even better reason to avoid this practice.

IanCal 12 years ago | | |

> Regarding the article, I doubt that every JSON parser will let this slide. To me that's an even better reason to avoid this practice.

If someone uses undefined behaviour in config files for the sake of storing a comment, I reserve the right to hunt them down if I have to maintain their code.

nonchalance 12 years ago |

The JSON RFC (http://www.ietf.org/rfc/rfc4627.txt?number=4627) says

    The names within an object SHOULD be unique.

SHOULD is defined (http://www.ietf.org/rfc/rfc2119) as

    3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
       may exist valid reasons in particular circumstances to ignore a
       particular item, but the full implications must be understood and
       carefully weighed before choosing a different course.

Salient point is that you would need to ensure that you are only using JSON parsers that tolerate duplicate names (and use the last value)

IanCal 12 years ago | |

> Salient point is that you would need to ensure that you are only using JSON parsers that tolerate duplicate names (and use the last value)

To drive this home a bit more forcefully, it requires knowing the behaviour of your parser where it is marked as "undefined" in the spec.

If that isn't enough to stop you, DON'T USE JSON. A patch level change in a library could break your code in a non-obvious way and it would be your fault. If you want comments, DON'T USE JSON, JSON DOESN'T HAVE THEM.

bzbarsky 12 years ago | | |

Note that if your parser is the ES-standard JSON.parse, then the behavior here is in fact defined by ES5 section 15.12.2, even with duplicate names.

juandopazo 12 years ago | |

And the big point here is that the members of the RFC group were considering breaking with the EcmaScript standard and changing it to MUST, which would break existing programs and the "workaround" in the article.

tonyg 12 years ago | | |

I wish they had! I wonder why they didn't? JSON is already a subset; limiting it to non-duplicated keys would just tighten it a little.

NathanKP 12 years ago |

This hack, while nice, is still just a work around. I highly recommend that if you can, in as many places as possible use YAML instead of JSON.

JSON works great for on the fly communication with frontends that are running JavaScript, or for communication between JavaScript processes like Node.js servers. But for configuration files and other things that need comments YAML is many times better, both for its clean, Markdown reminiscent structure, and its native comment support.

Node.js has a great module called js-yaml (https://github.com/nodeca/js-yaml) which automatically registers handlers for .yml and .yaml files, allowing you to require them in your Node.js code just like you can with JSON files.

It also comes with a YAML parser for the browser side of things, so if you want you could even communicate YAML directly from the server to the client side, although frankly I don't see much advantage to sending YAML over the wire instead of JSON. (And as others have mentioned below untrusted YAML sources could insert malicious objects in YAML, so I wouldn't recommend this technique.)

You can even use YAML for your package.json in a Node program: (https://npmjs.org/package/npm-yaml)

hosay123 12 years ago |

This would completely break any event driven (streaming) parser.
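
An event-driven parser hands each name/value pair to the application as it is read, so there is no later step where a duplicate quietly replaces an earlier value. The following only simulates that with Python's `object_pairs_hook`; `on_pair` and `emit_pairs` are hypothetical names, and a real streaming parser would fire these callbacks while still reading the input.

```python
import json

events = []


def on_pair(name, value):
    """Hypothetical streaming callback: acts on each pair immediately."""
    events.append((name, value))


def emit_pairs(pairs):
    # Stand-in for a streaming parser: forward pairs in document order
    # without building an object that could deduplicate them.
    for name, value in pairs:
        on_pair(name, value)
    return None


json.loads(
    '{"port": "comment: proxy port", "port": 8080}',
    object_pairs_hook=emit_pairs,
)
```

The handler receives the "comment" string as a value for `port` before it ever sees 8080.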

the_gipsy 12 years ago | |

Or a parser that simply discards existing keys.

IanCal 12 years ago | | |

Which, importantly, would be perfectly fine according to the spec (as I understand it).

jgeerts 12 years ago | | |

It's overwriting existing keys, which is fine imo. When I use a map in any language and put a new value under an existing key, the expected behavior is that the previous value is overwritten.

jasonlotito 12 years ago |

My first thought in seeing this was that objects aren't guaranteed to maintain order: "An object is an unordered set of name/value pairs" - http://www.json.org

jfoutz 12 years ago | |

There is an intrinsic order in the text though. It's up to the parser to keep clobbering a value every time a new value comes in for a given key.

This seems like a bad idea. It seems heavily reliant on edge case behavior. But hey, might work well for the original author.

IanCal 12 years ago | | |

> it's up to the parser to keep clobbering a value every time a new value comes in for a given key

Nope, parsers are perfectly in their rights to do whatever they want with multiple keys. They could read them backwards, sort them, whatever. The behaviour in the instance of multiple keys is undefined.

> This seems like a bad idea.

It is an astonishingly bad idea. I'm concerned by it being so high on the page.

> But hey, might work well for the original author.

Depends on their parser. It's undefined behaviour according to the spec. It might work now, but I'd argue it doesn't work well, as a patch level change could bork this.

_ZeD_ 12 years ago | |

while this is true, I think it's irrelevant: the "trick" is about "abusing"

* the fact that the parser works from top to bottom of the text

AND

* the fact that assigning the same key many times with different values updates the key with the last value

your quote regards the order in which the different keys are saved.

masklinn 12 years ago | | |

Both are only "correct" for specific implementations; this is not specified behavior (and duplicate keys are strongly recommended against by the RFC)

JulianMorrison 12 years ago |

This definitely qualifies for a Zen style thwack over the head with a stick and a reprimand of "stop being clever!"

varikin 12 years ago |

This sounds great until some parser uses the comment definition instead of the value. Is it defined in the spec that parsers need to use the last defined value for a key?

dak1 12 years ago | |

Since the order of an object's keys is not guaranteed, it seems like even if a parser respected the last-defined rule, you could still potentially end up with the wrong field last.

ygra 12 years ago | |

Not really defined, but since an object is defined as an unordered collection of key/value pairs, a conforming parser could probably shuffle the pairs before parsing them.

treerex 12 years ago | | |

I suppose it could, but the point of the object being defined as an unordered collection is because the most straight-forward way of implementing this is through a hash table, where the order of the keys cannot be guaranteed without additional work. I'm sure they didn't consider a parser randomly permuting the lexical order of the pairs as something a sane person would do.

rwmj 12 years ago | |

About as defined as anything else in JSON, eg. the range of integers.
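
The integer-range point can be shown the same way: RFC 4627 lets an implementation set its own limits on numbers, so the same digits can decode to different values. A sketch in Python, where `parse_int=float` is used here only to mimic a parser that stores every number as an IEEE double:

```python
import json

big = "9007199254740993"  # 2**53 + 1

exact = json.loads(big)  # Python ints are arbitrary precision
as_double = json.loads(big, parse_int=float)  # mimic a double-only parser
```

`exact` keeps every digit, while `as_double` is rounded to the nearest representable double.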

masklinn 12 years ago | | |

Actually, duplicate keys is very specifically recommended against in the RFC, and left entirely unspecified.

avolcano 12 years ago |

Can we all just agree, as a community, to add comment support to our JSON parsers? Hell, I'd do a PR on V8 if I knew C++.

It's ridiculous that I can't document notes on dependencies in my NPM package.json, or add a little reminder to my Sublime Text configuration as to why I set some value, because we're using JSON parsers that can't handle the concept of ignoring a line with a couple slashes prefixing it.

IMO - either we add comments to JSON, or we stop using it for hand-edited configuration.

julius 12 years ago |

Funny story. JSLint[1] does not approve of this technique. I asked Crockford to implement the duplicate check in April 2009 via email. 20 minutes later, out of nowhere, he was done implementing that check and wrote back "Please try it now."

This guy is fast. Especially nice considering we do not know each other at all.

[1] http://www.jslint.com/ - JS checking tool from the inventor of JSON

WayneDB 12 years ago | |

I sent him an email once asking for the same JSLint license that he gave to IBM (you know, the one without the "do not use this for evil" clause.)

He responded that he was getting annoyed by everybody asking for this, so it was going to cost me $100K to obtain such a license.

I responded that I only asked for that license in order to annoy him (and thanks for the confirmation that it worked), because his immature license clause is annoying everybody else.

kalleboo 12 years ago |

Note that these comments would disappear the second you use a JSON-aware tool to manipulate one of these files.
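
A quick illustration of that round trip, using Python's standard `json` module as the "JSON-aware tool" (the sample document is made up):

```python
import json

original = '{"retries": "comment: keep this low", "retries": 3}'

# Parse and write back out, as a formatter or config editor would.
rewritten = json.dumps(json.loads(original))
```

With this particular parser the real value happens to be the one that survives; the "comment" is gone either way.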

mtkd 12 years ago | |

You hope it is the comment dupe that disappears and not the field you want.

kstenerud 12 years ago |

Instead of using tricks that rely on parser implementation behaviors, why not just put an actual comment field in the object?

    {
        "myvalue_comment": "This is a comment",
        "myvalue": 42
    }

MatthewPhillips 12 years ago | |

That example is fine, but you wouldn't want a long comment getting loaded into memory because the parser doesn't know any better.

dnautics 12 years ago | | |

for that matter, just do:

    {
        "comment": "this is a comment",
        "value": 45,

        "comment": "this is also a comment",
        "value2": 64,

        "comment": "we like overloading the comment field",
        "stringval": "but these stay the same"
    }

kstenerud 12 years ago | | |

Why not? It's just a configuration file.

nrivadeneira 12 years ago |

Terrible spec-violating hack aside, the idea of the author soliciting upvotes on StackOverflow doesn't sit well with me. I'd hate for SO solutions to become diluted by answers from users who are 'marketing' for upvotes.

jgeerts 12 years ago |

It is a 'hack' as discussed in the article and I will probably never use it. JSON should be either self explanatory or documented, I don't see any reason why you would add this unnecessary clutter to these messages.

It is already hard to read as is, and this makes it worse and more confusing. If some big service started using this, you would have to know about this 'hack'; otherwise you would have to look up what the hell is going on.

Also, this is the same information for each call and thus redundant, makes your messages larger when an advantage of JSON is that it's generally a small message.

JOnAgain 12 years ago |

This, to me, looks like an example of relying on a nondeterministic implementation. To my knowledge, the standard doesn't prescribe that parsers take the second/last of a duplicate key. As a result, this is relying on implementation-specific choices which can lead to a terrible upgrade process.

Switch to a different JSON parser, does it still work? probably. but I wouldn't bet that much.

If I were implementing a JSON parser, might I throw an error on a duplicate key? maybe. Maybe I would just print a warning?

If I were ever going to give someone advice it would be to never do this.
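
For what it's worth, the "throw an error on a duplicate key" option is a few lines on top of an existing parser. A sketch using Python's `json` module; `reject_duplicates` and `strict_loads` are hypothetical names for this example, not part of the library.

```python
import json


def reject_duplicates(pairs):
    """Build a dict, raising ValueError if any name repeats."""
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError(f"duplicate name in JSON object: {name!r}")
        result[name] = value
    return result


def strict_loads(text):
    """Parse JSON, refusing objects that repeat a name."""
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

A config loader built this way would turn the duplicate-key "comment" into a hard failure at startup.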

asnyder 12 years ago |

You should use standard JS comments and process them out. Douglas Crockford's official answer on comments: https://plus.google.com/118095276221607585885/posts/RK8qyGVa.... Essentially just process them out beforehand with something like jsmin, pretty straightforward.

sktrdie 12 years ago |

This is a horrible hack. You should use JSON-LD [1] to describe the fields of your JSON. It's a W3C standard!

Also, it's not defined in the JSON standard in which order an implementation needs to parse the JSON fields/keys. So you could end up with potentially wrong results!

1. http://json-ld.org/

basicallydan 12 years ago |

This is a nice trick, but it should probably only be used in systems where the set of people touching the code is limited and rarely changes, and where anything using the JSON is strictly going to treat the last defined value as the value to use. Dragons lurk elsewhere!

peterkelly 12 years ago |

> Believe it or not, it turns out JSON parsers work the same way

Please don't do this. There are almost certainly some parsers out there currently that don't work like this, and if not, there likely will be one day.

rcarmo 12 years ago |

I do something else that is a lot more readable:

    { 
      "#": "this is a comment for the next line",
      "url": "http://foo.bar"
    }

Simple.

IanCal 12 years ago | |

Hopefully you don't use the same key multiple times, as that's not guaranteed to work in different parsers.

zemo 12 years ago |

if I ever saw this in a project, I would remove those comments in a heartbeat. The behavior here is specific to the json parser. JavaScript is not the entirety of programming.

It does break the json parser in the Go standard library, in a totally nonobvious way: http://play.golang.org/p/BsDd47vWna

I would be surprised if it doesn't break many parsers, especially json parsers in static languages. If you want that sort of behavior, don't use json.

znmeb 12 years ago |

This is a celebration of programmers' ability to generate unmaintainable code by exploiting implementation dependencies. People get fired for pulling this horseshit every day!

M4rkH 12 years ago |

A common practice in config files is to comment out whole sections e.g. optional proxy server settings. This sort of multi-line comment is not addressed by this hack

kgabis 12 years ago |

Well, here we go: https://github.com/kgabis/parson/issues/7

wickedlogic 12 years ago |

Don't use them, there is no such thing. Make your comments first class citizens in the data.

lttlrck 12 years ago |

Nice hack but fails JSHint [1].

[1] http://jshint.com/

opminion 12 years ago |

JSON has comments already. It just requires you to decide what the comment marker is.

quantumpotato_ 12 years ago |

I thought JSON is mainly for machine to machine consumption.. who reads comments?

knodi 12 years ago |

This is a recipe for disaster.

davidradcliffe 12 years ago |

Neat trick! Not sure I'd trust it, and might be confusing for anyone reading who didn't know this.

8ig8 12 years ago |

That seems pretty fragile.
