Good intro: http://cognitect.github.io/transit-tour/
GitHub: https://github.com/cognitect/transit-js
Note that in the introduction they provide a simple benchmark where Transit is both more compact and faster to parse than JSON with custom hydration.
Unreadable format, as mentioned in this thread.
{"key:A<A<s>>":[["values"],["here"]]}
This doesn't mean anything to me as a developer unless I've seen the spec. It's kludgy. It's not backward-compatible if you don't install a TJSON parser.
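To make concrete what a consumer has to do with a name like this, here is a minimal sketch of splitting a tagged member name into its name and a nested tag. It only covers the tag shapes that appear in this thread (a short tag, optionally wrapping another tag in angle brackets). The function names `splitTaggedName` and `parseTag` are made up for illustration, and this is not the TJSON reference parser.

```javascript
// Hypothetical helper: split "key:A<A<s>>" into a name and a nested tag.
function splitTaggedName(member) {
  const idx = member.lastIndexOf(":");
  if (idx < 0) throw new SyntaxError(`untagged name: ${member}`);
  return { name: member.slice(0, idx), tag: parseTag(member.slice(idx + 1)) };
}

// Hypothetical helper: parse a tag such as "s", "d16", or "A<A<s>>".
// Assumption: a tag is letters/digits, optionally followed by <inner-tag>.
function parseTag(tag) {
  const m = /^([A-Za-z][A-Za-z0-9]*)(?:<(.*)>)?$/.exec(tag);
  if (!m) throw new SyntaxError(`malformed tag: ${tag}`);
  return m[2] === undefined
    ? { kind: m[1] }
    : { kind: m[1], of: parseTag(m[2]) };
}

const example = splitTaggedName("key:A<A<s>>");
// example: { name: "key", tag: { kind: "A", of: { kind: "A", of: { kind: "s" } } } }
```

Without something like this on the receiving end, the name stays the literal string "key:A<A<s>>", which is the compatibility complaint above.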
Two solutions immediately strike me as better; one has already been mentioned here.
(1) Not optimal, but actually spell out words in key names. There's no reason "A" has to mean Array. That doesn't mean anything to me. If I'm seeing it for the first time and have no idea what TJSON is, the very next value could be "key2:B<B<t>>".
(2) Far better: as in the "date" example that has already been given, just nest objects as values for any extended types. Then this spec is completely backward compatible and compliant, and as a developer I don't have to worry about parsing key names.
e.g.
{
  "some_nested_array": {
    "type": "array.array.string",
    "value": [
      ["values"],
      ["here"]
    ]
  }
}
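A minimal sketch of how a consumer might unwrap such a value, assuming the dotted "type" string reads outermost-first ("array.array.string" is an array of arrays of strings). The names `decodeAs`, `unwrap`, and the `scalarDecoders` table are hypothetical, and the "date" decoder is only an assumed example of an extended type.

```javascript
// Hypothetical table of scalar decoders, keyed by type name.
const scalarDecoders = new Map([
  ["string", (v) => {
    if (typeof v !== "string") throw new TypeError("expected string");
    return v;
  }],
  ["date", (v) => new Date(v)],
]);

// Walk the type path, descending one level per "array" segment.
function decodeAs(typePath, value) {
  const [head, ...rest] = typePath;
  if (head === "array") {
    if (!Array.isArray(value)) throw new TypeError("expected array");
    return value.map((item) => decodeAs(rest, item));
  }
  const decode = scalarDecoders.get(head);
  if (!decode || rest.length > 0) {
    throw new TypeError(`unknown type: ${typePath.join(".")}`);
  }
  return decode(value);
}

// Unwrap a { type, value } wrapper object.
function unwrap(wrapper) {
  return decodeAs(wrapper.type.split("."), wrapper.value);
}

const doc = JSON.parse(
  '{"some_nested_array":{"type":"array.array.string","value":[["values"],["here"]]}}'
);
const nested = unwrap(doc.some_nested_array);
```

A receiver that knows nothing about the convention still gets valid JSON with ordinary key names; it simply sees the wrapper object instead of the unwrapped value.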
Extremely easy to implement and not reliant on a governing body.
I have certainly studied XML and think XML Schema did fantastic work specifying datatypes:
https://www.w3.org/TR/xmlschema11-2/#built-in-datatypes
I briefly considered adopting this work wholesale:
https://github.com/tjson/tjson-spec/issues/37
If you'd like to see that happen, please make a note of it in the issue. Thanks!
Also note: I'm not a JS hipster, I'm part of the Rust Evangelism Strike Force.
Second, XML is extraordinarily complicated. Flipping through the XML 1.0 spec (https://www.w3.org/TR/xml/) doesn't really convince me that all of this is there for a reason. I'd love to be proved wrong though!
In contrast, RFC 7159 is incredibly short and readable: https://tools.ietf.org/html/rfc7159. The TJSON spec isn't bad either: https://www.tjson.org/spec/. Even combining both, the result is still far shorter and clearer than the XML spec.
{"hello-world:s": "Hello, world!"} → (hello-world "Hello, world!")
{"hello-base-sixteen:d16": "48656c6c6f2c20776f726c6421"} → (hello-base-sixteen #48656c6c6f2c20776f726c6421#)
{"base-sixty-four-is-default:d": "SGVsbG8sIHdvcmxkIQ"} → (base-sixty-four |SGVsbG8sIHdvcmxkIQ|)
{"hello-signed-int:i": "42"} → (some-int 42)
Ø → (some-big-int [bigint]|GY0+kwq94p4QRs2j4rHisQLgEN3zsFSZNJrgK+ZFcV0s1ShyMkMFOHip0oRuG7v+TAC7qmDaYSojFbZjNV5dSA==|)
{"hello-timestamp:t": "2016-10-02T07:31:51Z"} → (hello-timestamp [timestamp]2016-10-02T07:31:51Z)
Seriously, this is IMHO so clearly good I'm surprised more folks don't agree.
The main inspiration for this format is SPKI/SDSI, which was based on S-expressions. As beautiful as you think the S-expression version may be over the (T)JSON, I personally blame the use of S-expressions as one of many reasons SPKI/SDSI failed to gain more widespread traction, and personally think something like TJSON is a lot more likely to gain traction than the second coming of S-expressions. This is, of course, a debatable point, but you won't find me working on Sexp-based formats any time soon.
ASN.1 of course has a sordid history in the credential space as well, often reviled by security experts as the source of frequent vulnerabilities, particularly problematic encodings like BER. I will admit OER is nice, but nobody uses OER and the IETF prefers things be standardized in terms of DER.
"Research things", yes been there, done that.
I assume the TJSON libraries throw errors if invalid types or formats are provided --- which is good, but that makes this a validator. Developers have been representing non-standard formats in JSON for years.
Google's response to JSON's limitations was the Protocol Buffer [1], and as I understand it, it's used internally relatively extensively, but there hasn't been much adoption outside of Google. JSON is just the right mix of simple + robust for the majority of use cases.
So it feels more like a machine format, but in that case why not use a more efficient one, like a binary format?
> TJSON documents are amenable to "content-aware hashing" where different encodings of the same data (including both TJSON and binary formats like Protocol Buffers, MessagePack, BSON, etc) can share the same content hash and therefore the same cryptographic signature.
TJSON is designed to facilitate documents that retain the same content hash when transcoded to/from binary formats.
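As a rough illustration of the idea (not TJSON's actual hashing algorithm, which this thread does not spell out), the sketch below hashes a type label plus the decoded bytes rather than the encoded text. The `contentHash` function and the use of "d" as the label are assumptions for the example; it uses Node's built-in `crypto` module.

```javascript
const { createHash } = require("crypto");

// Hypothetical scheme: hash a type label followed by the decoded bytes,
// so the textual encoding of the bytes never enters the hash.
function contentHash(typeLabel, bytes) {
  return createHash("sha256").update(typeLabel).update(bytes).digest("hex");
}

// The same 13 bytes ("Hello, world!") in two different encodings.
const fromHex = Buffer.from("48656c6c6f2c20776f726c6421", "hex");
const fromBase64 = Buffer.from("SGVsbG8sIHdvcmxkIQ", "base64");

const hashFromHex = contentHash("d", fromHex);
const hashFromBase64 = contentHash("d", fromBase64);
// The two hashes are equal because the decoded bytes are equal.
```

The decoder has to know that the value is binary data before it can hash the bytes instead of the string, which is the reason given for carrying the type alongside the value.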
http://json-schema.org/examples.html
It has been around for almost a decade and is already supported by all the major JSON libraries in all the major languages.
{ "date": "1937-01-01T12:00:27.87+00:20" }
As you can see, JSON doesn't stop anyone from using RFC 3339 to encode dates.
What I'm getting at is that a date gets serialized into JSON as either a string or a number, depending on who wrote the toJSON method, and that the consumer of that JSON needs knowledge about the schema of the data in order to properly deserialize it.
Does {"foo:O":{}} really tell you more than {"foo":{}}?
The ability to encode sets, integers, binary data and time stamps is useful. But why tag things which are what they look like? It's a waste of space.
Or, a more mundane explanation: the parser will silently clobber the name because it contains a ":"
Leaving any names untagged is ambiguous.
Besides that, in JSON Schema the schema is not bundled with the data. This is a feature for input validation: the receiver must know what it allows, not just what is received. It is also a feature for readability (which is a great strength of JSON), as the data is not encumbered with the schema. A receiver is free to use a schema or not, whereas TJSON forces every receiver to recognize its dirty format.
So TJSON brings nothing new, except interoperability problems.
Only if the software running on the other end does not support your data format.
Meanwhile, HTML uses string attributes to declare languages, and no one ever complained that browsers may interpret the lang tag as a string.
const data = { name: 'foo', time: new Date('05 October 2011 14:48 UTC') };
const data2 = JSON.parse(JSON.stringify(data));
typeof data.time === 'object'   // true
typeof data2.time === 'string'  // true
What I'm getting at is: JSON.parse and JSON.stringify convert Date objects into strings. While dates are not JavaScript primitives, there are no other things I can think of that you'd use to hold state that aren't preserved across JSON.parse(JSON.stringify(thing)) boundaries. If you want to write a JSON handling API endpoint, you can pretty easily process all the incoming data and deal with it in a zero-knowledge fashion, but if you want to properly deal with dates there, you're going to have to add knowledge to the parser receiving the JSON object.
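"Adding knowledge to the parser" can be sketched with the standard reviver argument of JSON.parse. The set of date field names here (`DATE_FIELDS`) is a made-up stand-in for whatever schema knowledge the endpoint has.

```javascript
// Hypothetical schema knowledge: which member names hold dates.
const DATE_FIELDS = new Set(["time"]);

// Reviver: turn the string back into a Date only for known date fields.
function reviveKnownDates(key, value) {
  return DATE_FIELDS.has(key) && typeof value === "string"
    ? new Date(value)
    : value;
}

const original = { name: "foo", time: new Date("2011-10-05T14:48:00.000Z") };
const roundTripped = JSON.parse(JSON.stringify(original), reviveKnownDates);
// roundTripped.time is a Date again; roundTripped.name is still a string.
```

The point stands: nothing in the JSON text itself says "time" is a date, so the list of fields has to come from outside the document.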
This is a complaint I have about JSON in general, as a method for serializing data for transport across the wire. XML is an annoying format to work with, but at least metadata was possible (of course, dealing with DTDs or other metadata formats basically made zero-knowledge parsing impossible anyway; you just wrote a meta-parser that used the DTD to encode the knowledge).
Suppose that, instead of '2011-10-05T14:48:00.000Z' or 1317826080000, we took a page from binary / octal / hex numeric representations and defined a date prefix. 0xc0ffee is a number, so why can't Dx1317826080000 be parsed by JSON.parse as a date?
I mean, I know why. But it's still annoying that, when receiving data from an API, I basically can treat everything in the API as fine as written, except that I have to go in and fiddle date strings into date objects.
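JSON.parse cannot accept a bare Dx literal, so the closest thing available today is to carry the prefix inside a string and recognize it in a reviver. This is a sketch of that emulation only; the "Dx" convention is the hypothetical one from this comment, not anything standardized.

```javascript
// Hypothetical convention: a string "Dx<milliseconds since epoch>" is a date.
const DATE_PREFIX = /^Dx(\d+)$/;

// Reviver: any string matching the prefix becomes a Date, regardless of key.
function reviveDates(key, value) {
  if (typeof value === "string") {
    const m = DATE_PREFIX.exec(value);
    if (m) return new Date(Number(m[1]));
  }
  return value;
}

const parsed = JSON.parse(
  '{"name":"foo","time":"Dx1317826080000"}',
  reviveDates
);
// parsed.time.toISOString() === "2011-10-05T14:48:00.000Z"
```

The catch is that a genuine string that happens to look like "Dx123" would also be converted, which is one reason in-band markers like this are contentious.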
With top-level arrays, in the absence of this type information being explicitly specified in an object, implementations would have to rely on detecting homogeneity at decode time.
This is certainly possible, and in fact the serialization logic does it. But it seems like a sharp edge to include in deserialization logic in a security-oriented format. The format aims to keep the deserialization logic free of any sort of "guesswork".
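The kind of decode-time "guesswork" being described might look like the sketch below. The helper names `elementKind` and `inferHomogeneousKind` are hypothetical, and the kinds are plain JSON kinds rather than TJSON tags.

```javascript
// Classify a parsed JSON value by its JSON kind.
function elementKind(value) {
  if (Array.isArray(value)) return "array";
  if (value === null) return "null";
  return typeof value; // "string", "number", "boolean", or "object"
}

// Infer the single element kind of an array, or throw if it is mixed.
// Returns null for an empty array, where nothing can be inferred.
function inferHomogeneousKind(items) {
  const kinds = new Set(items.map(elementKind));
  if (kinds.size > 1) {
    throw new TypeError(`heterogeneous array: ${[...kinds].join(", ")}`);
  }
  return kinds.size === 0 ? null : [...kinds][0];
}

const kindOfStrings = inferHomogeneousKind(["values", "here"]);
const kindOfEmpty = inferHomogeneousKind([]);
```

Even this small version has awkward corners: an empty array yields no type at all, and a JSON number cannot say whether it was meant as an integer or a float.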
Correct:
https://github.com/tjson/tjson-spec/blob/master/draft-tjson-... https://www.tjson.org/spec/#rfc.section.3.8
Also, canonicalization is a bit of a mess. There are several incompatible canonicalization schemes for JSON, and even within a single one of those, people have a difficult time implementing it correctly. See e.g. https://github.com/theupdateframework/tuf/issues/362
Also, I'm collecting a list of all subsets of JSON here if anyone knows of more: https://housejeffries.com/page/7
EDIT: Wow there's a lot of criticism in this thread. For the record I think TJSON is great.