Comments in JSON (fadefade.com)
That said, if you want your static JSON objects to have comments, just pipe the JSON object through a minifier to strip comments before parsing.
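A minimal sketch of that "strip comments, then parse" step, assuming `//` and `/* */` comment syntax. The helper name `strip_json_comments` and the sample config are made up for illustration; this is not any particular minifier's implementation. It tracks whether it is inside a string so that slashes in values such as URLs are left alone.

```python
import json


def strip_json_comments(text: str) -> str:
    """Remove // line comments and /* */ block comments outside of strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character verbatim so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical hand-edited config with comments.
commented = """
{
    // where the service lives
    "url": "http://foo.bar",  /* slashes inside strings are kept */
    "retries": 3
}
"""

config = json.loads(strip_json_comments(commented))
print(config)
```

The file on disk is then no longer JSON as far as any other tool is concerned, which is the trade-off the rest of the thread argues about.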
'A recent (and short) IEEE Computing Conversations interview with Douglas Crockford about the development of JavaScript Object Notation (JSON) offers some profound, and sometimes counter-intuitive, insights into standards development on the Web.'
http://inkdroid.org/journal/2012/04/30/lessons-of-json/
{ Thank you Douglas for your vision :) }
https://plus.google.com/118095276221607585885/posts/RK8qyGVa...
This is horrific design reasoning. It's an authoritarian, presumptuous, "punish everyone in the classroom because one child misbehaves" mentality.
Comments would be useful in JSON because comments are useful in code, and JSON is code. For example, I might have a config file that I'm typing in that I want to leave a documentation trail for.
Don't tell me I can do a silly thing like redefine a field, as if it's "neat". It's an abomination that I have to resort to such things. And guess what: by resorting to such things I can still do precisely what Crockford claims he was trying to prevent. So his rationale is not only insulting to one's intelligence, it's sheer stupidity.
Which is pretty much what a specification is.
It's one or more people saying "This is how things are if you call them X".
> presumptuous
Presumptuous? It was in response to the feature being abused!
> "punish everyone in the classroom because one child misbehaves" mentality
No more than creating laws is. A significant subset of the population are misusing it in such a way as could cause widespread damage. It is a minor inconvenience to the 'law abiding people' (particularly given that any comments would be removed if read in and spat out by any program). There are workarounds ("field_comment":"some comment") or if that's not enough, use another format. Use one that allows comments, there are many.
> Don't tell me I can do a silly thing like redefine a field, as if it's "neat". It's an abomination that I have to resort to such things
It's also completely unreliable, it's a terrible solution and nobody should use it. I think we're fully in agreement here.
> And guess what: by resorting to such things I can still do precisely what Crockford claims he was trying to prevent. So his rationale is not only insulting to one's intelligence, it's sheer stupidity.
No you can't. The point was to stop people adding pre-processing commands or other such things to json, which would be in random formats and invisible to some parsers (as comments should be), visible and important to others. You don't want to pass a valid piece of JSON through a parser and end up with two different outcomes dependent on something in a comment, do you? Or have to use parser X or Z because Y doesn't understand directive A, but it does understand directive B and C, and while Z understands C, and X knows B, Z doesn't, so I have to use the version from a pull request from DrPotato which I think supports...
What I'm saying is that there is a benefit in simple standards.
JSON is data. It appears to be JS code, but JSON is data. Data is not code ( http://www.c2.com/cgi-bin/wiki?DataAndCodeAreNotTheSameThing ). That's why the idea of data holding parsing directives is silly. If you want to do that, then embed that in the data (hold a MsgType key in the data records). There's no need for comments unless you are trying to use it for something other than raw data.
I do not presume to know who you are, or what you have accomplished, but there are few people with the professional and academic background that qualify to be able to call Douglas Crockford "stupid".
1. In my experience JSON is frequently output programmatically, and taken in programmatically. Comments are not useful in these cases.
2. The only time comments could be perceived as useful then would be when parsing JSON by eye or hand. However, it is not difficult to parse JSON and understand it unless the keys have used obfuscated names. If key naming is obfuscated, comments aren't really the correct solution.
3. "An object is an unordered set of name/value pairs", as mentioned by jasonlotito and others earlier. There is no guarantee that a JSON parser will give you the right value if there are two of the same keys in the same scope.
Putting comments into JSON in this way is a hack and shouldn't be used by anybody who has any interest in writing maintainable software. Relying on ambiguities in an RFC and someone saying "JSON parsers work the same way" is a good way to end up with a really obscure bug in the future.
It still does not feel right.
The parsing behavior for JSON is not defined at all in RFC 4627, actually. Browsers (and Node, since it's using a browser js engine) use the parsing specification in ECMA-262 edition 5 section 15.12.2.
Note that ES5 section 15.12 in general is much stricter than RFC 4627, as it explicitly points out if you read it.
JSON is like duc(k|t) tape. It's really easy to stick two things together with it. That doesn't mean you always should. It's the simple thing that gets the job done so you can focus on what matters.
One shouldn't pick JSON for your config files and then hold it up as good design. "Look at me, I'm daring and _not using XML_!" Using JSON is crap design, but good engineering means sometimes picking something crappy and not wasting effort on things that don't matter in the end.
If your configuration files become both complicated and important enough that you need comments, then you should stop using JSON. If your duck tape job starts needing additional reinforcement, then you should probably just get rid of the duct tape and do it right.
If one of your requirements is a sufficiently trendy yet commentable config language, look into YAML. Also, gaffer tape. The white kind is easier to write on.
Actually, I'm 100% playing the devil's advocate here. I'll even flip-flop to prove it. Regarding the article, I doubt that every JSON parser will let this slide. To me that's an even better reason to avoid this practice.
If someone uses undefined behaviour in config files for the sake of storing a comment, I reserve the right to hunt them down if I have to maintain their code.
The names within an object SHOULD be unique.
SHOULD is defined (http://www.ietf.org/rfc/rfc2119) as:
3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
The salient point is that you would need to ensure that you are only using JSON parsers that tolerate duplicate names (and use the last value).
To drive this home a bit more forcefully, it requires knowing the behaviour of your parser where it is marked as "undefined" in the spec.
If that isn't enough to stop you, DON'T USE JSON. A patch level change in a library could break your code in a non-obvious way and it would be your fault. If you want comments, DON'T USE JSON, JSON DOESN'T HAVE THEM.
JSON works great for on the fly communication with frontends that are running JavaScript, or for communication between JavaScript processes like Node.js servers. But for configuration files and other things that need comments YAML is many times better, both for its clean, Markdown reminiscent structure, and its native comment support.
Node.js has a great module called js-yaml (https://github.com/nodeca/js-yaml) which automatically registers handlers for .yml and .yaml files, allowing you to require them in your Node.js code just like you can with JSON files.
It also comes with a YAML parser for the browser side of things, so if you want you could even communicate YAML directly from the server to the client side, although frankly I don't see much advantage to sending YAML over the wire instead of JSON. (And as others have mentioned below untrusted YAML sources could insert malicious objects in YAML, so I wouldn't recommend this technique.)
You can even use YAML for your package.json in a Node program: (https://npmjs.org/package/npm-yaml)
This seems like a bad idea. It seems heavily reliant on edge case behavior. But hey, might work well for the original author.
Nope, parsers are perfectly in their rights to do whatever they want with multiple keys. They could read them backwards, sort them, whatever. The behaviour in the instance of multiple keys is undefined.
> This seems like a bad idea.
It is an astonishingly bad idea. I'm concerned by it being so high on the page.
> But hey, might work well for the original author.
Depends on their parser. It's undefined behaviour according to the spec. It might work now, but I'd argue it doesn't work well, as a patch level change could bork this.
* the fact that parsers work from top to bottom of the text
AND
* the fact that assigning the same key many times with different values updates the key with the last value
Your quote regards the order in which the different keys are saved.
It's ridiculous that I can't document notes on dependencies in my NPM package.json, or add a little reminder to my Sublime Text configuration as to why I set some value, because we're using JSON parsers that can't handle the concept of ignoring a line with a couple slashes prefixing it.
IMO - either we add comments to JSON, or we stop using it for hand-edited configuration.
This guy is fast. Especially nice considering we do not know each other at all.
[1] http://www.jslint.com/ - JS checking tool from the inventor of JSON
He responded that he was getting annoyed by everybody asking for this, so it was going to cost me $100K to obtain such a license.
I responded that I only asked for that license in order to annoy him (and thanks for the confirmation that it worked), because his immature license clause is annoying everybody else.
{
"myvalue_comment": "This is a comment",
"myvalue": 42
}
{
"comment":"this is a comment",
"value": 45,
"comment":"this is also a comment",
"value2": 64,
"comment":"we like overloading the comment field",
"stringval":"but these stay the same"
}
It is already hard to read as is, and this makes it worse and more confusing. If some big service started using this, you would have to know about the 'hack'; otherwise you would have to look up what the hell is going on.
Also, this is the same information for each call and thus redundant, makes your messages larger when an advantage of JSON is that it's generally a small message.
Switch to a different JSON parser, does it still work? probably. but I wouldn't bet that much.
If I were implementing a JSON parser, might I throw an error on a duplicate key? maybe. Maybe I would just print a warning?
If I were ever going to give someone advice it would be to never do this.
Also, it's not defined in the JSON standard in which order an implementation needs to parse the JSON fields/keys. So you could end up with potentially wrong results!
Please don't do this. There's almost certainly some parsers out there currently that don't work like this, and if not, there likely will be one day.
{
"#": "this is a comment for the next line",
"url": "http://foo.bar"
}
Simple.
It does break the JSON parser in the Go standard library, in a totally nonobvious way: http://play.golang.org/p/BsDd47vWna
I would be surprised if it doesn't break many parsers, especially json parsers in static languages. If you want that sort of behavior, don't use json.
In fact, reading the RFC:
> The names within an object SHOULD be unique.
I'm pretty sure an implementation could refuse to parse the form altogether.
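To make the "could refuse to parse" point concrete, here is a small sketch using Python's standard `json` module. By default that module keeps the last pair for a repeated name, but that is this library's choice and not something the RFC promises. The `reject_duplicates` hook and the sample document are illustrative names, not part of any spec.

```python
import json

doc = '{"value": "this is a comment", "value": 45}'

# Python's json module keeps the last pair for a repeated name.
# Other parsers are free to behave differently.
print(json.loads(doc))  # {'value': 45}


def reject_duplicates(pairs):
    """object_pairs_hook that refuses objects with repeated names."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    strict_result = "accepted"
except ValueError as exc:
    strict_result = f"rejected ({exc})"

print(strict_result)
```

Both behaviours are consistent with "SHOULD be unique", so a document that relies on duplicate keys works with the first parser and fails with the second.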
I know there is a lot of JSON handling that happens behind-the-scenes, but there is also a non-trivial amount of JSON that I have manually created and/or altered, and have to share with a team.
It's a blessing and a curse, these modern NodeJS projects -- it's awesome that I can simply create/modify a .json file with a few properties, run a command, and magic happens. However, if I want to try and communicate out the intent of the values to my team of 20+, it becomes really convoluted. The projects all magically work by looking for foo.json, but if I comment that file then it breaks.
So I have to create another foo.comments.json, add another script that will remove the comments and then call the original instructions. Then I need to create additional documentation instructing the team to ignore the developer's docs regarding native use, and to run the application with our own homebrew setup.
It also can make testing a pain in the ass, because now I can no longer comment out values, I have to remove them completely. Not a huge deal, annoying nonetheless.
For the past few years, I've generally been using either apache-style via http://p3rl.org/Config::General or some sort of INI derivative (git is proof that ini is good enough for a lot more things than you might expect).
For the future, ingy and I have been working on http://p3rl.org/JSONY which is basically "JSON, but with almost all of the punctuation optional where that doesn't introduce ambiguity" - currently there are perl and ruby parsers for it, javascript will hopefully be next.
Admittedly, we -haven't- got round to defining a format for comments yet, but my point is more "JSON wasn't really designed for that, let's think about something better".
The advantage I see in this way of commenting is that the comment becomes accessible inside the program instead of being stripped off by the parser. For the human reader it's also more obvious.
Unfortunately, it's not possible to add a comment to anything other than objects. But the same is true of the OP's proposal.
There's the famous Rails vulnerability due to YAML. Python needed to add 'yaml.safe_load'.
YAML is a little too rich. It's always one poorly thought out convenience feature away from disaster.
It has parsers for nearly every language, I wrote one for js: http://npmjs.org/package/tomljs
YAML is easy to type, even with the whitespace. So is INI. And as verbose as XML is, it's easier, ime, to type than JSON. Of those four, JSON is the hardest to write by hand; certainly it's the one I make most mistakes with, to the extent that I have a particular technique for writing it out (prefixing the commas). As a result JSON as a config file format is tedious, verbose, and error prone; its sweet spot is a machine interchange format that a human can debug/read if needed.
Rails RCE, sup
But I do like the Rails convention of using YAML format and have adopted that in my own code as much as possible.
Also, many of the security holes in YAML come from its use as a serialization format which can represent native classes. I wish the YAML parsers had more explicit support for simple data schemas which would reduce the security risk and be sufficient for most configuration files.
For -configuration- you want a simpler format; INI is worth considering, as is http://p3rl.org/JSONY which is ingy's implementation of a vision we thrashed out for a more sysadmin-friendly config format.
+1 re YAML
Even with indentation problems, the time saved in not typing curly brackets, extra quotation marks, and commas, and the time saved in not having to visually parse these when reading YAML more than makes up for the occasional data structure bug caused by bad indentation.
Why not have
{ "keyname" : "aldkjfhaldhfa",
"keyname_comment" : "asdfjnad" }
If that's not enough, use something other than JSON. Adding comments will just result in it being valid in some parsers and not others.
Regardless, of course, people add metadata to JSON already - there's zero reason you can't add "_type": "int". It's a completely arbitrary reason.
Bing, Bing, Bing. We have a winnar!!!
XML sucks in large part not because of XML but because people used it for everything, everywhere in places it was highly ill-suited. Don't fuckup JSON the same way.
s/^#.*//g
or yaml.safe_load(json_file_with_comments)
He never said that.
>I do not presume to know who you are, or what you have accomplished, but there are few people with the professional and academic background that qualify to be able to call Douglas Crockford "stupid".
Why, who do you think Douglas Crockford is and what is his "academic background"? He doesn't even have a related degree. Most of his JS fame he owes to his book.
> The reason to use semicolons is because coding rigor tends to produce significantly better software.
I also never said that "opinionated design is stupid".
Perhaps you could rephrase your question in such a way that you aren't presuming to speak for me.
3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
The consequences are undefined, I feel, for a reason. You can't put them all down on paper, it depends on what all the parsers do. The parsers can accept or reject things with duplicate keys, or they can play a nice little ditty through the speakers.
All it means is a parser isn't required to reject JSON with multiple keys. It can, however, do whatever the fuck it wants with them.
If the wording was precise, then it should be a MUST. SHOULD indicates a terrible world of unknown consequences.
Given a choice it's better to have a .ini style format like the one that Python's ConfigParser will digest. That way you can have sections, comments and you won't be tempted to have the application write things into the configuration on its own...
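For reference, a short sketch of what that looks like with the standard library's `configparser` (the Python 3 name for the ConfigParser module). The section and key names are invented for the example.

```python
import configparser

# Hypothetical config text; in practice this would live in a .ini file.
INI_TEXT = """
# Comments are part of the format, so no tricks are needed.
[server]
host = example.org
; a second comment style that configparser accepts by default
port = 8080

[logging]
level = info
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

host = parser["server"]["host"]
port = parser.getint("server", "port")  # values are strings unless converted
print(parser.sections(), host, port)
```

Sections and full-line comments come for free; the cost is that everything is a flat string value unless the application converts it.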
if key not in hash:
    hash[key] = value
That's a sensible approach, valid as per the spec.
> I'm sure they didn't consider a parser randomly permuting the lexical order of the pairs as something a sane person would do.
It could sort the keys, in which case the order is no longer guaranteed (again this doesn't seem insane).
The proposal is to rely on undefined behaviour for comments. I'm amazed we're still talking about this.
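The first-wins loop quoted a few lines up is easy to reproduce in Python through `object_pairs_hook`. The sketch below runs it on the duplicate-key comment trick and compares it with the module's default last-wins behaviour. The `first_wins` name and the `timeout` document are illustrative only.

```python
import json


def first_wins(pairs):
    """object_pairs_hook that keeps the first value seen for each name."""
    result = {}
    for key, value in pairs:
        if key not in result:
            result[key] = value
    return result


# The "comment" is written first and is meant to be overwritten by the real value.
doc = '{"timeout": "seconds to wait before giving up", "timeout": 30}'

last = json.loads(doc)                                 # default: last pair wins
first = json.loads(doc, object_pairs_hook=first_wins)  # first pair wins
print(last["timeout"], "|", first["timeout"])
```

With the first-wins hook the application receives the comment text where it expected a number, which is exactly the failure mode being warned about.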
However, I think we wholeheartedly agree, don't rely on this behavior. It is an outright strict mode error.
Adding easy syntax highlighting is my next step to address this problem.
"The primary objective of this revision is to bring YAML into compliance with JSON as an official subset."
http://yaml.org/spec/1.2/spec.html
If I understand it correctly, the earlier versions were close but not 100% compatible.
But that makes no sense at all to me. I agree that using comments as metadata/directives is typically an antipattern hack, but what about for non-metadata comments? Embedding comments into code is just as ass-backwards as embedding code into comments. Neither is right.
> For the human reader it's also more obvious.
Strongly disagree here -- if I open a file that I've never worked in before, I have faith that the comments were meant specifically for me. Likewise, I assume all code in the file is not for me (on account that I'm not a compiler/interpreter/etc.).
> PIs are not part of the document's character data, but must be passed through to the application
If they're being passed through and not being used by the parser, it's no different really than a
"directive" : "blah"
in JSON, which is fine. The application at the end needs to deal with it, but the parser doesn't, and that's really important. If it's just a comment, passing the file into and out of a program could remove the comment.
cat something.json | python -mjson.tool | myjsonprocessingapp
should be the same as
cat something.json | myjsonprocessingapp
If the parser does need to understand the directive, at least there's a difference between an error of "I don't understand directive X" and no error at all because your parser ignored the comments.
You are of course correct that JSON turns out not to quite be a strict subset in the set theory sense of "strict subset", though obviously that's a bug in the spec rather than a deliberate design decision.
In practice, many C implementations recognize, for example, #pragma once as a rough equivalent of #include guards — but GCC 1.17, upon finding a #pragma directive, would instead attempt to launch commonly distributed Unix games such as NetHack and Rogue, or start Emacs running a simulation of the Towers of Hanoi.[7]
--- !ruby/hash:ActionDispatch::Routing::RouteSet::NamedRouteCollection
'foo; eval(eval(puts '=== hello there'.inspect);': !ruby/object:OpenStruct
table:
:defaults: {}
Allowing people to run arbitrary code on rails servers.
[0] http://rubysource.com/anatomy-of-an-exploit-an-in-depth-look...
But for over the wire communication, JSON makes more sense than YAML, not only because parsing unsafe YAML from an untrusted client could cause exploits like you mentioned, but also because YAML is dependent on indentation and line breaks, and therefore makes communication with the client side much more awkward than just sending JSON to the client or receiving JSON from it.
Honestly, though, I think all major JSON parsers behave according to these two assumptions.
I think parsers for JSON and YAML, INI etc should be designed in such a way as to make it impossible to assign anything like an object, class, function, etc. Numbers, strings, and collections of numbers and strings... that's all you should get (though obviously "string" is fraught with peril.) Anything more is unnecessarily complex.
The way to have avoided the issue would have been for JSON to have a grammar that broke eval(). But one could argue the ability to pass JSON into eval() to get JavaScript is one of the reasons JSON became popular to begin with.
Keys SHOULD be unique.
And, no, this scheme doesn't break the go parser because there isn't a typeshift between the "comment" fields, they are all strings.
But if the spec says that keys SHOULD be unique, what's the behaviour when they aren't?
Which is fine, because it's in a format that everyone can parse. Adding
//COMMAND: Extension(github:IanCal/preparser).parsethis
is the kind of thing this forbids. Most people's parsers would ignore this as a comment (if we have comments), but maybe some would handle it in a special way. Either everyone ignores the comments in the parser (this is unlikely to carry on for long, someone will want to extend it) or nobody is allowed comments. That way everyone parses the same text.
And the "community" in question had repeatedly and grossly demonstrated itself to be unworthy of such trust.
Crockford was not hypothesizing that this might happen, he'd seen it. Repeatedly. If you want to argue against it even so, fine, but bear in mind that is what you are arguing against, real pain that real people experienced, not mere possibilities.
The problem is that JSON is not meant to be used as a configuration file format and just because it's really good for information exchange, doesn't mean it's good for configuration (and vice-versa). Configuration really requires comment support and information-exchange is better avoiding it. Two standards are needed.
{
"_type": "int",
"foo": "123"
}
a JSON parser will always know how to represent that object. How you process that object is up to you.
However, when you write:
{
// @type int this is a comment
"foo": "123"
}
and you call JSON.parse(), what would you expect to get back? You can no longer represent it as a simple object, you need some way to access the comment, how do you do that? Moreover, whose responsibility is it to process the annotation in that comment? The parser's? Should you get back an integer rather than a string for obj.foo? How would you support different types of annotation? What happens if you're using parser A and your client uses parser B? Does parser B support all the annotations that parser A supports? If you need to modify a JSON structure, e.g. JSON decoding, adding a property and re-encoding, should the comments be preserved? ...
You can see that having comments introduces a whole host of other questions, ambiguity and would only make it harder for different platforms to share data. Avoiding this kind of cruft is why JSON is winning vs XML for most things these days.
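The "annotation as data" variant can be sketched in a few lines: the parser returns a plain object, and the application decides what `_type` means. The `CONVERTERS` table and the `apply_type_hint` helper are a hypothetical convention for this example, not part of JSON or of any library.

```python
import json

# Hypothetical application-level convention for the "_type" key.
CONVERTERS = {"int": int, "float": float, "str": str}


def apply_type_hint(obj: dict) -> dict:
    """Convert the other values using the object's "_type" entry, if present."""
    convert = CONVERTERS.get(obj.get("_type"))
    if convert is None:
        return obj  # no hint, or one this application does not understand
    return {key: convert(value) for key, value in obj.items() if key != "_type"}


raw = json.loads('{"_type": "int", "foo": "123"}')
typed = apply_type_hint(raw)

print(raw)    # what any conforming parser hands back
print(typed)  # what this particular application makes of it
```

Every parser agrees on `raw`; only the application layer needs to know about the convention, so swapping parser A for parser B changes nothing.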
However, when using an ad hoc parser, all bets are off as to what the result is in both cases, not just the comment case. Regardless of comment support in JSON the same problem appears to exist.
Well it's not really about the spec changing, the spec doesn't have a defined behaviour for duplicate keys.
> But I wouldn't be sad if the spec were changed to allow for this, or to allow for comments.
I don't think duplicate keys should be allowed, but I've no strong feelings on comments. I don't think there's any real need for them though.
Is this a true statement? Even books have margins, and word docs comments. I think it’s not infrequent that pure data calls for metadata to put it into context for future users of that data.
And in computing most "pure data" formats have had either comments - or schemas and specifications which outline the contents. The latter sure look like comments stored externally to the documents, from my perspective.
In general I do not think data is self describing, and thus must be commented on in some form to describe it.
{
"data": "some data",
"data_comments": "here are my comments"
}
edit for clarity: You're assuming that the application code isn't doing something with each key that it reflectively sees in the object, e.g. creating database fields to match them, or launching missiles towards those destinations, etc. If you wouldn't automatically add dummy elements to a hashmap or dictionary in Java or Python, then you shouldn't add keys in a JavaScript object, unless you control the source to the program that will be processing the data. Even then you shouldn't, because it will become a habit to add comments this way, and that will bite you when an extra key does matter.
Lisp programmers disagree.
Code is data and vice versa. Look up what the acronym JSON means sometime.
Code more Lisp and read more Hofstadter ;).
All code is data, but not all data is code.
JSON is code because I use it as code. It's not your business to tell me it's not code -- you haven't seen how I'm using it. And don't go chirping that I should only do things your way, it's none of your god damned business what I'm using it for.
Further, if JSON was really only data, then it's an incredibly stupid way to store data, given that it has a human-readable syntax that the computer can only deal with after it's been parsed. As data, it's bloated and inefficient. To the extent that JSON is a good format, it's code. To the extent that it's data, it's not a good format.
If you don't like the format or feel that JSON is too restrictive/bad feel free to extend it or create your own format from scratch.
While I don't think that comments belong in JSON, I don't agree JSON is designed as a "data and not code" format. Trees of tokens are actually the natural format for writing code (also known as Abstract Syntax Trees, AST) and the data/code distinction is really, really blurry when those two meet together, so it's only to be expected that people will end up coding in JSON (what are the 'build definition' files for various build tools / package managers, if not very simple programs)?
> Further, if JSON was really only data, then it's an incredibly stupid way to store data, given that it has a human-readable syntax that the computer can only deal with after it's been parsed. As data, it's bloated and inefficient.
So use something else. Also, a computer can only read any file after it's been parsed in some way. I'm not really sure what you're suggesting as an alternative.
> To the extent that JSON is a good format, it's code
Is it executable? Is it turing complete?
It represents groups of more-less arbitrary tokens as trees, therefore it's a natural format for code representation as it's equivalent to an AST, therefore it's trivial to attach a basic execution context with if and lambda defined, and now it's executable and turing-complete.
You could use JSON as code, but that's somewhat silly, because there's already a superset of JSON designed for that use.
{"JSON":"ro
cks!"}
(there's a unicode line separator -- 2028)
You can't use JSON to compute things, therefore it is not code (unless you are willing to concede that any document format is code).
You could, if you were crazy enough, write perfectly valid JSON that passed the values to eval() or a parser or what have you. And while there are encodings in JSON that don't work in JavaScript (I've broken JS innumerable times trying to get that to work) JS does of course allow you to add closures as an object, or an array, whatever you like, and some forms of valid JSON (if not all) are also valid JavaScript. So you could indeed use JSON to compute things if you wanted to.
Because it is. Data vs. code distinction is arbitrary. The following sequence of characters:
"echo 'foobar';"
can be interpreted as describing a string, a series of tokens, a piece of code, a piece of music or a small icon, whatever interpretation you choose.
That's not the question. "All data is code" is not the same statement.
In a different context: "All apples are fruit" may be true but that doesn't imply "all fruit are apples"
- Ant build descriptions that look surprisingly like executable Lisp code if you replace "<tag> ... </tag>" with "(tag ...)".
- Musical notation which is obviously code for humans playing instruments (it even has loops, I think, AFAIR from my music lessons; don't know about conditionals; if it has them, maybe it's Turing-complete? (ETA it would seem it is[3])).
- Windows Metafile format for bitmap and vector graphics which is basically a serialized list of WinAPI calls [1].
- "fa;sldjfsaldf" - the "not code, just data" example from [2] that happens to be "a Teco program that creates a new buffer, copies the old buffer and the time of day into it, searches and then selectively deletes". Oh, and it's also "a brainfuck program that does nothing, and a vi program to jump to the second "a" forwards and replace it with the string "ldjfsaldf"".
[0] - https://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach
[1] - http://en.wikipedia.org/wiki/Windows_Metafile
[2] - http://www.c2.com/cgi-bin/wiki?DataAndCodeAreNotTheSameThing
[3] - http://programmers.stackexchange.com/questions/136085/is-mus...
There are conditionals in standard music notation, at least ones that involve "executing different code" based on the value of a loop counter.
Arbitrary data don't exist without some notion of an execution (or interpretation) platform.
We tend to use "code" as a word for "commands telling some execution process what to do" and "data" as a word for "information that is meant to be transformed" but in reality this distinction is meaningless; both are fundamentally the same thing, and even our "code" vs. "data" words have blurry borders. It's very apparent when you start reading configuration files. For example, aren't Ant "configuration files" essentially programs[0]?
We all know what we usually mean in context by saying what is "code" vs. what is "data", but one has to remember, that in fact they are the same - minding it leads to insights like metaprogramming. Forgetting about it leads to dumb languages and nasty problems, and is generally not wise.
[0] - the answer is: yes, they are, see http://www.defmacro.org/ramblings/lisp.html for more.
ETA:
Questions to ponder:
- are regular expressions code, or data?
- is source written in Prolog code, or data?
Also I recommend watching http://www.youtube.com/watch?v=3kEfedtQVOY to learn how what would be data, as defined by formal grammars of some real-world protocols, can - by means of sloppy grammars and bad parser implementation - cross the threshold of Turing-completeness and become code.
- Is the text of Hamlet code?
- Was it code as soon as Shakespeare wrote it?
- If not, did it become code once the electronic computer was invented? Or did that happen once a version was stored in a way accessible to an electronic computer?
- Did all the existing paper copies immediately become code at that point as well?
> Was it code as soon as Shakespeare wrote it?
Yes, the text of a play is code meant to be executed by humans.
I guess that's true.
The flavour of "code vs. data" discussion in this thread was one of representation formats. You could argue that when looking at works of art from past centuries one should immediately say "data!" [0]. But in case of JSON, a format suspiciously almost identical to Lisp in structure, one needs to be careful in saying "it's for data, not for code".
Actually, I'm not sure what kind of point I'm trying to make, as the more I think of it, the more examples of things that are borderline code/data come to my mind. Cooking recipes is the obvious candidate, but think about e.g. music notation - it clearly feels more like "code" than "data".
I feel that you could define a kind of difference between "code" and "data" other than in intent, something that could put bitmaps into the "data" category, and a typical function into "code" category, but I can't really articulate it. Maybe there's some mathematical way to describe it, but it's definitely a blurry criterion. But when we're discussing technology, I think it's harmful to pretend that there's a real difference. Between configuration files looking like half-baked Lisp listings and "declarative style" C++ that looks like datasets with superfluous curly braces, I think it's wrong to even try to draw a line.
[0] - there's a caveat though. "How to Read a Book" by Mortimer Adler[1] discusses briefly how the task of a poet is to carefully choose words that evoke particular emotional reactions in readers. It very much sounds like scripting the emotional side of the human brain.