Cap’n Proto

535 points by implmentor 9 years ago | 182 comments

kentonv 9 years ago |

Hi all, Cap'n Proto author here. Thanks for the post.

Just wanted to note that although Cap'n Proto hasn't had a blog post or official release in a while, development is active as part of the Sandstorm project (https://sandstorm.io). Cap'n Proto -- including the RPC system -- is used extensively in Sandstorm. Sandboxed Sandstorm apps in fact do all their communications with the outside world through a single Cap'n Proto socket (but compatibility layers on top of this allow apps to expose an HTTP server).

Unfortunately I've fallen behind on doing official releases, in part because an official release means I need to test against Windows, Mac, and other "supported platforms", whereas Sandstorm only cares about Linux. Windows is especially problematic since MSVC's C++11 support is spotty (or was last I tried), so there's usually a lot of work to do to get it working.

As a result Sandstorm has been building against Cap'n Proto's master branch so that we can make changes as needed for Sandstorm.

I'm hoping to get some time in the next few months to go back and do a new release.

StephanTLavavej 9 years ago | |

As of MSVC 2015 Update 3, the support for C++11 is nearly complete. The missing pieces are that Expression SFINAE is partially supported (what's there is good enough for the STL and Boost, with certain workarounds that you must be aware of), and C99 preprocessor support is full of bugs. Aside from that, all features are present and generally usable. (Some are buggier than others; e.g. we're overhauling multithreading library support for the next major version of the libraries.)

Hydraulix989 9 years ago | | |

There's still a bunch of minor differences/annoyances with the MS toolchain that require me to go in and make a bunch of small changes everywhere to my project, which compiles fine on clang/g++ (and I wouldn't say I'm doing anything _crazy_ like the aforementioned "Expression SFINAE").

Right now, my project's rather small 100 kLOC codebase compiles (with very minimal #ifdef hackery) on Android, iOS, Mac, and Linux, but Windows is still very much a WIP; even some of my third-party libraries don't compile on VC. I'm actually considering trying the MinGW toolchain at this point, and I'd be curious to hear anyone's thoughts.

I'm sure part of it is that the compiler itself was probably developed with the tight feedback loop from developing and maintaining large codebases in house at Microsoft (like Windows and Office). It's pretty hard when part of the internal pressure to support large codebases like that conflicts with the need to conform to outside third-party standards. I've heard great things about the compiler people at Microsoft, and I'm sure they have a technically strong team, but they are most likely caught in the middle of this organizational deadlock.

I'm sure it doesn't help that I'm not mentioning any specifics. I'm going to be revisiting the Windows desktop port of my mobile app soon (I gave up on it and haven't touched it for a couple of months), and everything will be fresh in my mind again. I do vaguely remember something about having to explicitly add more #include statements to pull in header files that were already getting pulled in by my other compilers.

kixpanganiban 9 years ago | |

Kenton, thank you for your work on this! I currently use ZeroRPC (which uses protobuf and msgpack) and I was blown away by Cap'n Proto. Really excited to try it out soon! Some questions:

- Do you guys have an RPC library written in anything other than C++? If not, could you point me to protocol specs so I can start writing my own?

- Since it uses a streaming model to support random access, what encryption method do you think would work best with Cap'n Proto that would keep it speedy and still retain all functionality?

Thanks!

kentonv 9 years ago | | |

> - Do you guys have an RPC library written in anything other than C++? If not, could you point me to protocol specs so I can start writing my own?

People have written implementations in Rust, Go, and Erlang, and wrappers around the C++ library in JavaScript and Python: https://capnproto.org/otherlang.html

Scroll down that page for some info on how to start writing an implementation in another language.

The RPC protocol spec is here:

https://github.com/sandstorm-io/capnproto/blob/master/c++/sr...

> - Since it uses a streaming model to support random access, what encryption method do you think would work best with Cap'n Proto that would keep it speedy and still retain all functionality?

Hmm, I'm not clear on what you mean by "streaming model" -- I think of "streaming" as the opposite of random access.

Regarding encryption, this is a very big question and there are a lot of different needs and use cases to consider. Mostly I don't think that use of Cap'n Proto affects encryption decisions much, but if you want to make sure you don't lose random access, you should of course use a cipher that supports random access, like chacha20 or AES-CTR.
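
To illustrate what "a cipher that supports random access" means, here is a minimal Python sketch of a counter-style keystream. It is a toy built on SHA-256 purely so that it runs with the standard library; it is not AES-CTR or ChaCha20 and must not be used for real encryption. The names `keystream_block` and `xor_at` are hypothetical. The point is only that the keystream for any byte offset can be computed without processing the bytes before it.

```python
import hashlib

BLOCK = 32  # keystream bytes produced per counter value (SHA-256 digest size)

def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Toy keystream block derived from key, nonce and block counter.
    # Assumption for illustration only; not a vetted cipher.
    return hashlib.sha256(key + nonce + counter.to_bytes(8, "little")).digest()

def xor_at(key: bytes, nonce: bytes, offset: int, data: bytes) -> bytes:
    """Encrypt or decrypt `data` located at byte `offset` of the stream."""
    out = bytearray(len(data))
    current = -1
    block = b""
    for i, byte in enumerate(data):
        pos = offset + i
        if pos // BLOCK != current:
            # Jump straight to the block that covers this position.
            current = pos // BLOCK
            block = keystream_block(key, nonce, current)
        out[i] = byte ^ block[pos % BLOCK]
    return bytes(out)

key, nonce = b"k" * 32, b"n" * 12
message = bytes(range(200))
ciphertext = xor_at(key, nonce, 0, message)

# Decrypt only bytes 100..109 without touching the first 100 bytes.
middle = xor_at(key, nonce, 100, ciphertext[100:110])
```

A block-chaining mode would not allow this, because decrypting one position would depend on earlier ciphertext or state.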

jcoffland 9 years ago | |

Recently I've started compiling more of my code for Windows using Mingw (or clang). Performance is very good and you can build in Linux for Windows inside a Docker container. You can even build Windows installers using NSIS. I've also tried to do this for OSX but with less success.

kentonv 9 years ago | | |

Cap'n Proto supports MinGW as a target platform. (It has also supported Cygwin since very early on.) However, Cap'n Proto is a library, so it needs to support whatever compiler the users of that library are using, and for the vast majority of Windows developers that's MSVC.

socmag 9 years ago | |

It sounds really well designed. Great job Cap'n Proto, I've been following your work for some time!

I've also built another message format recently (yes I know, they are never ending). It can't do everything Cap'n Proto can do, although it shares a lot of the same values. One thing I chose to do is to have order preservation on types, which can be very useful. It does mean that the wire format is largely BE though. Anyway, that's an aside.

I'm curious what you think of Amazon ION by the way?

Good stuff!

kentonv 9 years ago | | |

I haven't looked at Ion before, but it appears to be similar to BSON or Msgpack in that it's a binary format that encodes field names as textual identifiers, to be "self-describing" and avoid the need for an external schema. I generally like schemas, because I like static typing. Of course, static types vs. dynamic types are another ancient flamewar and I'm unlikely to cover any new ground by stating arguments here. :)

forrestthewoods 9 years ago | |

Treating Windows like a red headed step child is why I, with great disappointment, passed on Cap'n'Proto.

Cross-platform libraries that really only care about a single platform are an endless source of frustration. :(

Twirrim 9 years ago | | |

Writing software that is safely and gracefully cross-platform is a pain in the neck :-/

wvh 9 years ago | |

Slightly off-topic, but it's good to see after all these years that companies like Microsoft (and Apple?) have to make a serious attempt to embrace the open-source community or miss out on a lot of cool new software and developers. Times have changed somewhat; the mantra used to be "doesn't work on Linux", while nowadays we've got a much more heterogeneous environment.

lmm 9 years ago | |

If those platforms are causing issues would it be worth downgrading them to a second-class "LTS releases only" state or some such?

anongoogler2 9 years ago | |

I could imagine implementing something like this as an alternate wire format and client library for protocol buffers. Can you outline the main reason you chose not to go this route? In particular, what aspects of the proto descriptor language don't fit Cap'n Proto? Or what aspects did you feel needed to be changed for some other reason?

kentonv 9 years ago | | |

To be clear, by "alternate wire format and client library for protobuf" you mean "reuse the protobuf IDL but change everything else", right?

There are a few major reasons I didn't go that route:

* The .proto language has a lot of weird quirks that I don't like. Some of the quirks are specific to the protobuf encoding (e.g. int32 vs. sint32 vs. fixed32 being different types), while other quirks have no particular rationale behind them. I didn't want that baggage.

* The .proto language does not treat interfaces (aka services) as a first-class type. That is, you cannot define a field whose type is an RPC interface type -- a reference to a remote object. The ability to do this is a critical part of Cap'n Proto's interface design.

* It's highly unlikely that the protobuf team would be interested in accepting changes to the language which were not actually supported by protobuf. This means that if I shared the language, I would have my hands tied when it comes to new features -- or I'd also have to implement Protobuf equivalents to make them happy.

lobster_johnson 9 years ago | |

I'd like to see Ruby bindings (not just serialization). We are using Go, Node.js and Ruby in production, and we have been looking to move from plain HTTP to gRPC. If Cap'n Proto is better, we might use it, but not without Ruby bindings.

kentonv 9 years ago | | |

I don't think anyone is currently working on Ruby bindings, but if you're interested, feel free to take it on!

(Note that Sandstorm is focused on making Cap'n Proto work well for Sandstorm. We welcome contributions, but we generally don't have resources available ourselves to work on third-party feature requests unrelated to Sandstorm. That is, unless you want to pay us a bunch of money, in which case, feel free to contact me. ;) )

lux 9 years ago | |

I recently ran into flatbuffers (https://google.github.io/flatbuffers), which claims the same no-serialization step. Curious how the two compare?

bchallenor 9 years ago | | |

I recently evaluated the two and went with FlatBuffers, largely because its Java support appeared to be more mature.

The FlatBuffers encoding is based on vtables and is relatively straightforward (the runtime library is tiny). This also means it's inefficient for small messages, but in my testing its vtable deduplication worked great for my use case (~100k messages of the same type per memory-mapped file), in that the vtable overhead tends quickly to zero.

Cap'n Proto has a more complex encoding that is probably more efficient in terms of wire size, and particularly for small/standalone messages, but the runtime is larger as a result.

adrianratnapala 9 years ago | | |

Does flatbuffers have an RPC mechanism? Or is it purely a serialisation tool?

Cap'n Proto is both. I.e. you have the serialisation, which can be used on its own, but you can also define interfaces with methods that can pass those serialised structs as parameters, and which return asynchronous promises.

DenisM 9 years ago | |

How about first-class map / dictionary support? That is, const or log time retrieval of key-value pairs.

Right now I can only see a way to create lists, so access will be in linear time.

kentonv 9 years ago | | |

The "lists" are actually arrays, so access by index is constant-time. You can build a hashtable on top of that. I'd like to add built-in support for maps at some point, but it's one of a very large number of wishlist items...
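
As a rough sketch of the "hashtable on top of an array" idea, the Python below lays key-value entries out in a flat list with open addressing, so a lookup only needs constant-time indexing. This is an illustration under assumptions, not anything Cap'n Proto provides: `build_table` and `lookup` are hypothetical names, and CRC32 is used only because a stored table needs a hash that is stable across processes.

```python
import zlib

EMPTY = None

def build_table(pairs):
    """Lay out (key, value) pairs in a flat list using open addressing."""
    size = 2 * len(pairs) + 1          # always leaves at least one empty slot
    table = [EMPTY] * size
    for key, value in pairs:
        i = zlib.crc32(key.encode()) % size
        while table[i] is not EMPTY:   # linear probing on collision
            i = (i + 1) % size
        table[i] = (key, value)
    return table

def lookup(table, key):
    """Expected constant-time lookup; returns None when the key is absent."""
    size = len(table)
    i = zlib.crc32(key.encode()) % size
    while table[i] is not EMPTY:
        if table[i][0] == key:
            return table[i][1]
        i = (i + 1) % size
    return None

table = build_table([("alice", 1), ("bob", 2), ("carol", 3)])
```

A sorted list with binary search would be the logarithmic-time alternative and needs no empty slots.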

easytiger 9 years ago | |

From reading the examples, it looks like one can only serialise to a file descriptor. What if I want to serialise to a byte array? Am I perhaps missing something obvious?

kentonv 9 years ago | | |

You can certainly serialize to a byte array. The example uses a file descriptor but in fact the code defines an abstract stream type which you can implement any way you want, and you can also obtain pointers to the message's underlying storage in order to extract the bytes directly (avoiding a copy).

_0w8t 9 years ago | |

I am curious, what is your take on Thrift? Is it really an improvement on protocol buffers?

kentonv 9 years ago | | |

Thrift to me is a Protobuf clone. Very similar design. Early on it was not very well optimized compared to protobuf, but I imagine they've fixed that by now. The big advantage they had was that they included an RPC system in their first release -- although it was a FIFO RPC system which struck me as an odd design choice (I think they may have fixed this more recently?). But now GRPC exists, so I don't think there's much reason to choose Thrift over Protobuf/GRPC.

However, I am obviously very biased. :)

zump 9 years ago | |

Why did you leave Google?

kentonv 9 years ago | | |

Well, I could write a lot about this, but...

In general, because I felt there was too much resistance to me pursuing ideas that I wanted to pursue. Google has become increasingly top-down, whereas it was fairly bottom-up when I started in 2005. That's not necessarily a bad thing, but I felt that I personally would be happier running a startup where I could call the shots.

FWIW, if you just want to write code, be comfortable, and make a crapton of money, and you don't care if you're implementing someone else's ideas, I highly recommend working for Google. That's not meant to be sarcastic or disparaging -- I totally respect that approach and there are days when I wish that were me. But if you have ideas of your own and you won't be happy unless you see them implemented... it probably won't happen at Google.

(To be fair, some people at Google would surely argue that my problem is that my ideas are crazy and bad, and I don't have any firm evidence -- yet -- that they are wrong.)

oneloop 9 years ago | |

What a great pitch, that website is. I don't even understand what it is, but I'm stocked.

crossroads091 9 years ago | | |

"Stoked". Sorry for the nitpick.

Cyph0n 9 years ago |

For some reason, the banner (infinitely faster?), name, and introductory FAQ-style responses made me think the whole thing is a joke - similar to Vanilla JS [1].

Anyways, it seems like a cool project, so I'll be sure to follow its development closely.

[1]: http://vanilla-js.com/

pjscott 9 years ago | |

The best jokes in software usually have running code. The best of the best are practical.

DonHopkins 9 years ago | | |

Like X-Windows: The world's first fully modular software disaster. [1]

[1] http://www.art.net/~hopkins/Don/unix-haters/x-windows/danger...

StavrosK 9 years ago | | |

Like Flask.

draw_down 9 years ago | |

I had the exact same reaction. I really had to read for a long time (including code) before deciding it was real. "infinitely faster" confused me too.

kentonv 9 years ago | | |

I originally released it on April 1st, 2013, with the announcement post:

So, uh… I have a confession to make.

I may have rewritten Protocol Buffers.

Again.

otoburb 9 years ago | | |

Although this is listed on the introduction page, the author is also the same person that co-founded Sandstorm.io[1].

[1] https://sandstorm.io/about

jtolmar 9 years ago | |

If Cap'n Proto is a joke, JSON beat it to the punchline.

What's an efficient binary representation? C Structs. What's an efficient text representation? Javascript objects. But casting arbitrary data to a struct is a horrible idea. And eval'ing anonymous javascript is a horrible idea. Back to the drawing board.

Then N years later someone has the brilliant idea of just... not parsing these formats that idiotic way. And wrote a code generator because the smart way is tedious.

edraferi 9 years ago | | |

I'd love to read the full story there - how JSON got so much momentum behind it. Do you know of a good overview blog post?

niftich 9 years ago |

I've always liked Cap'n Proto because it was (quite literally) the ideas behind Protobuf taken to an extreme, or, depending on your point-of-view, reduced to its most basic components: data structures already have to sit in memory looking a certain way, why can't we just squirt that on the wire instead of some fancy bespoke type-length-value struct?

Of course, the hardest part is convincing everyone that it's not your bespoke type-length-value struct, but that you have good reasons for what you're doing. I think the humorous, not-so-self-serious presentation has worked in its favor (but that's just a subjective opinion and I can't back it up with data).

throwaway13337 9 years ago |

To get an overview of the area of binary interchange formats that are language agnostic, the author of Cap'n Proto does a good job in this:

https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-...

"Protocol Buffers" has been the go-to for a long time but there are more options now.

For uses where serialization/deserialization CPU time is a concern, it seems to really be a question of Cap'n Proto versus flatbuffers ( https://google.github.io/flatbuffers/ ).

Perceptes 9 years ago |

Big fan of what Sandstorm is doing, both with Sandstorm itself and this component. I really want to use this instead of gRPC, as it seems technically superior, but language bindings and adoption across language ecosystems are likely to be a big downside given that (as Kenton mentions in a comment elsewhere here) Sandstorm isn't really interested in Cap'n Proto being widely adopted. All my new stuff is built in Rust, so the Sandstorm team's interest in and use of Rust are a good fit for me. But when it comes to interoperability with other languages, this may end up being a concern compared to gRPC. In any case, I hope to see the Rust implementation eventually replace the C++ one as the official reference implementation.

kentonv 9 years ago | |

FWIW, language interoperability is of interest to Sandstorm since apps can be written in many languages. But we're only 7 people with a lot on our plate, so unfortunately we can't currently be the ones to go around implementing Cap'n Proto in every language. That will change as Sandstorm grows.

Perceptes 9 years ago | | |

Great news, and thanks for the response! I don't have an immediate need for a system like this, so perhaps by the time I do there will be broader adoption.

venning 9 years ago |

Correct me if I'm wrong, but some of this sounds like blitting, except optimized for the in-memory structure and not the on-disk structure.

In 2008, Joel Spolsky wrote about 1990s-era Excel file formats and how they used this technique to deal with how slow computers were then [1]. Same technique, new problem set.

[1] http://www.joelonsoftware.com/items/2008/02/19.html

setori88 9 years ago |

Fractalide (http://githib.com/fractalide/fractalide) is an implementation of dataflow programming (specifically flow-based programming). Component build hierarchies are coordinated via the Nix package manager. Capnproto contracts are woven into each component just before build time. These contracts are the only way components talk to each other. Thanks Sandstorm.io for this great software.

edraferi 9 years ago | |

FTFY: https://github.com/fractalide/fractalide

megak1d 9 years ago |

I've always liked the look of this, saw it a while back but we are still using protobuf in our .NET environment simply due to the "free" schema generation using AOP/attributes [ProtoContract]/[ProtoMember] in Marc Gravell's excellent protobuf-net (https://github.com/mgravell/protobuf-net) project - I assume this would also be possible for cap'n proto.

zaptheimpaler 9 years ago |

Could this be used as an alternative to Apache Arrow[1]?

[1] https://arrow.apache.org/

mrfusion 9 years ago |

I hate to ask but can anyone explain this like I'm from 2000?

I guess it's a way to send data to your front-end JavaScript without using JSON, and it compresses the data so it's faster? How much better than using JSON is it?

striking 9 years ago | |

It lets you put data on the wire, in a structured format, right out of memory. Asking the question "how much faster is it" isn't even valid here, because it skips the usual serialization process.

Cap'n Proto generates you some code that contains some data structures. You put data into these structures, and they will automatically be in the right shape to put directly on the wire. And then that data can be pulled right off the wire and right into memory and be fully ready to access, with no intermediate step.

It's, in a sense, infinitely faster than JSON serialization or deserialization. Because it doesn't even perform any serialization. It's just data.

There are some other tricks at play here, but I won't go into them. This is plenty cool.
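
A minimal sketch of the idea in Python, assuming a made-up fixed record layout (this is not Cap'n Proto's actual encoding, and `new_record`, `get_age` and `get_balance` are hypothetical names). Fields are written in place at known offsets, the buffer is sent as-is, and a reader fetches a single field by offset without decoding anything else.

```python
import struct

# Hypothetical record layout (little-endian, no padding):
#   offset 0: uint32  id
#   offset 4: uint16  age
#   offset 6: uint16  flags
#   offset 8: float64 balance
RECORD_SIZE = 16

def new_record(rec_id, age, flags, balance):
    """Build the record directly in its wire layout."""
    buf = bytearray(RECORD_SIZE)
    struct.pack_into("<I", buf, 0, rec_id)
    struct.pack_into("<H", buf, 4, age)
    struct.pack_into("<H", buf, 6, flags)
    struct.pack_into("<d", buf, 8, balance)
    return buf

def get_age(buf):
    # Reads two bytes at offset 4; nothing else in the buffer is touched.
    return struct.unpack_from("<H", buf, 4)[0]

def get_balance(buf):
    return struct.unpack_from("<d", buf, 8)[0]

wire = bytes(new_record(7, 42, 0, 12.5))   # "sending" is just copying the bytes
received = memoryview(wire)                # the reader wraps the bytes, no parse step
```

In a generated API the accessors would come from the schema compiler; here they are written by hand to keep the example self-contained.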

Matthias247 9 years ago | | |

It's infinitely faster if you have control over the data layout in your application, which most likely means you are developing your application in C/C++, or maybe Rust. In JS you don't have that control, so accessing and writing the data is slow (since you would need [de]serialization there to write it into a byte array). In fact any serialization in JS is slower than JSON, since JSON is natively implemented in the JS VMs while the others are not.

Drdrdrq 9 years ago | | |

Well, to be fair, serialization is still performed - but interchange format is cleverly picked out to be as similar as possible to common memory representation.

mrfusion 9 years ago | | |

So could django send data to jquery with this? Or what are some simple use cases?

lmm 9 years ago | |

It's a lot higher-performance, because you virtually don't have to parse. Remember when MySQL started supporting the memcached protocol because it'd reached the point where for simple pkey lookups it was spending more time parsing an SQL query than actually executing it? It's like doing that for your program.

But for me at least, the real advantage over JSON isn't the performance but the schema compatibility. You have a spec for your data and generate code from that, which means the spec is guaranteed to be correct, and there's clear documentation about what changes to the spec are or aren't forward or backward compatible. (You get the same thing from the original Protocol Buffers, though.)

wtbob 9 years ago |

> The Cap’n Proto encoding is appropriate both as a data interchange format and an in-memory representation, so once your structure is built, you can simply write the bytes straight out to disk!

Eh, I'd rather pay the cost of serialisation once and deserialisation once, and then access my data for as close to free as possible, rather than relying on a compiler to actually inline calls properly.

> Integers use little-endian byte order because most CPUs are little-endian, and even big-endian CPUs usually have instructions for reading little-endian data.

*sob* There are a lot of things Intel has to account for, and frankly little-endian byte order isn't the worst of them, but it's pretty rotten. Writing 'EFCDAB8967452301' for 0x0123456789ABCDEF is perverse in the extreme. Why? Why?

As pragmatic design choices go, Cap'n Proto's is a good one (although it violates the standard network byte order). Intel effectively won the CPU war, and we'll never be free of the little-endian plague.

It's all so depressing.
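
For concreteness, the byte order in question can be checked with Python's built-in integer methods. This is just a worked example of the two layouts for the value quoted above.

```python
value = 0x0123456789ABCDEF

little = value.to_bytes(8, "little")   # least significant byte first
big = value.to_bytes(8, "big")         # "network byte order"

print(little.hex().upper())  # EFCDAB8967452301
print(big.hex().upper())     # 0123456789ABCDEF

# Either layout round-trips as long as reader and writer agree.
roundtrip = int.from_bytes(little, "little")
```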

sandGorgon 9 years ago |

How do you pronounce the name? If libreoffice is bad.. This name is absolutely impossible.

Is it captain? Is it cap+n+proto?

A lot of collaboration is verbal - people sit around and talk about stuff. I don't know if it is a fun take on an American word... But it is impossible to use in the rest of the world.

I really wish you would call it something else... Unless it is personal for you :(

kentonv 9 years ago | |

CAP-ən PRO-to

But "Captain Proto" is acceptable if you have trouble with the contraction.

Or you can also think of it as "Cap-and-Proto". Which is an intentional pun ("capabilities and protocols", or something).

Googling either of these will get you to the right place, so I think it gets the job done.

nemo1618 9 years ago | | |

>Or you can also think of it as "Cap-and-Proto". Which is an intentional pun ("capabilities and protocols", or something).

I never realized that! I like the name much more now.

sandGorgon 9 years ago | | |

Protocap maybe?

Btw, you rank in the 5th result for "protocap" on Google.

Now that's a name all of us can pronounce!

haneefmubarak 9 years ago | |

So yeah, in the states, we have this cereal called Cap'n Crunch that some people love. I think he was making a pun out of that.

The pronunciation would thus be "cap [the sound the letter 'n' makes] crunch".

couchand 9 years ago | | |

Seems like a good opportunity to point out that the pun is that the RPC model is based on capabilities. Capabilities and Protobuf -> Cap. 'n' Proto. -> Cap'n Proto

DonHopkins 9 years ago | | |

Gee, all this time I thought he was hitching his wagon to the right honorable, dapper, charismatic, articulately spoken, upstanding guru of phone phreaking culture.

amelius 9 years ago | |

> How do you pronounce the name?

More importantly, how do you Google for it?

dekhn 9 years ago | | |

It's pronounced "cap enn proto". Cap'n is a contraction, you just leave sounds out of the fully expanded word.

you can search for cap'n proto in any number of ways, including its literal name [ cap'n proto ] or [ capn proto ] or [ captain proto ].

hendzen 9 years ago | | |

At $dayjob we use capnproto as our primary interchange format. I google for 'capnp'.

joshuawarner32 9 years ago |

Here's the discussion from a while ago: https://news.ycombinator.com/item?id=5482081

chaotic-good 9 years ago |

I think that compression is a must for a serialization library. Protobuf uses roughly half as much memory as Cap'n Proto. Using external compression is not an option in some cases. E.g. consider building a TCP server that communicates with thousands of clients simultaneously. Each client connection will have its own LZ4 context that should be heap allocated. I believe it's about 16KB per connection + buffers. This results in large memory consumption and a lot of random memory access and TLB misses.

kentonv 9 years ago | |

Cap'n Proto offers "packed" encoding which applies light compression (removing zero-valued padding bytes), brings it in-line with Protobuf, and ought to be much faster than the things Protobuf does for "compression" (varint is a very slow encoding!).
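
A simplified sketch of the zero-removal idea in Python: each 8-byte word is replaced by a tag byte whose bits mark the nonzero bytes, followed by only those bytes. As far as I understand the published encoding, the real packed format also run-length-encodes runs of all-zero and all-nonzero words; this sketch omits that, so its output is not wire-compatible. `pack_words` and `unpack_words` are hypothetical names.

```python
def pack_words(data: bytes) -> bytes:
    """One tag byte per 8-byte word, then that word's nonzero bytes."""
    if len(data) % 8:
        raise ValueError("input must be a whole number of 8-byte words")
    out = bytearray()
    for start in range(0, len(data), 8):
        word = data[start:start + 8]
        tag = 0
        for bit, byte in enumerate(word):
            if byte:
                tag |= 1 << bit        # bit i set means byte i is nonzero
        out.append(tag)
        out.extend(b for b in word if b)
    return bytes(out)

def unpack_words(packed: bytes) -> bytes:
    """Inverse of pack_words: re-insert the zero bytes the tag says were dropped."""
    out = bytearray()
    pos = 0
    while pos < len(packed):
        tag = packed[pos]
        pos += 1
        for bit in range(8):
            if tag & (1 << bit):
                out.append(packed[pos])
                pos += 1
            else:
                out.append(0)
    return bytes(out)

# Three words holding the small integers 5, 0 and 0x0102 (little-endian).
data = b"".join(n.to_bytes(8, "little") for n in (5, 0, 0x0102))
packed = pack_words(data)              # 24 bytes shrink to 6
```

Unlike a varint, decoding here never has to inspect a continuation bit per byte; one tag describes a whole word.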

chaotic-good 9 years ago | | |

I'm glad that I was wrong about it.

flatline 9 years ago |

Interfaces! Inheritance! Looks promising. Protocol buffers are nice for their compact encoding and multi-language generator support but as a schema language they are really cumbersome. Composition is pretty much all you get, there are no longer required fields, you can't even use enums as a key type in a map. I'm sure their use cases are not necessarily the same as mine but sometimes I miss just using plain old XML.

kentonv 9 years ago | |

To be clear, Cap'n Proto's serialization layer, from a schema perspective, is almost exactly the same as Protobuf (though with a very different underlying encoding).

The interfaces and inheritance relate to the RPC system. The interfaces are for remote objects.

See: https://capnproto.org/rpc.html

morecoffee 9 years ago |

> capability-based RPC system.

This sounds like a cool idea, but so far I haven't seen any good explanation of how it works, and why it will save me from rolling my own ACL system. For bragging about it in the very first sentence, there is surprisingly little detail about how it works.

kentonv 9 years ago | |

It's a complicated topic -- it requires thinking about things in a different way, and tends not to make a lot of sense until at some point it "clicks" and you realize all sorts of patterns you were already using are actually special cases of capabilities.

Here is some reading:

https://capnproto.org/rpc.html#security

https://sandstorm.io/how-it-works#capabilities

http://zesty.ca/capmyths/usenix.pdf
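
As a very small illustration of the capability idea (not of Cap'n Proto's RPC API), the Python sketch below treats an object reference as the permission itself. Handing someone a `ReadOnly` wrapper grants reading and nothing else, with no ACL lookup anywhere. All class and function names are hypothetical, and Python does not actually enforce the encapsulation; the assumption is that with remote references the holder can only invoke the methods the reference exposes.

```python
class Document:
    def __init__(self, text=""):
        self._text = text

    def read(self):
        return self._text

    def write(self, text):
        self._text = text

class ReadOnly:
    """Attenuated capability: forwards read() and exposes no write()."""
    def __init__(self, doc):
        self._read = doc.read

    def read(self):
        return self._read()

def untrusted_viewer(doc_cap):
    # The viewer can do only what the reference it was handed allows.
    return doc_cap.read().upper()

doc = Document("hello")
view = ReadOnly(doc)            # the owner decides what to delegate
shown = untrusted_viewer(view)
doc.write("changed")            # the owner still holds the full capability
```

Access control becomes a question of who was given which reference, which is why patterns like passing a file handle instead of a path already count as capabilities.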

mixmastamyk 9 years ago |

Apache thrift doesn't seem to be mentioned, how does it compare?

kentonv 9 years ago | |

Thrift is very much equivalent to Protobuf for the purpose of everything discussed on the web site. As the site mentions, I prefer to pick on Protobuf because I am also the author of Protobuf v2. :)

Twonneilb22ll 9 years ago |

I'm happy to hear from you all in glad to see you're doing all you can for updates Programs codes appreciate the hard work

Paul_S 9 years ago |

Would probably be a good idea to have a no-exceptions version.

kentonv 9 years ago | |

You can compile Cap'n Proto with -fno-exceptions, and it does a bunch of things differently to make that work. Basically, where it would otherwise throw an invalid-input exception, it instead replaces the invalid data with a reasonable default value and sets a flag on the side that you can query to see if there was any invalid input. Assertion failure exceptions (where there is no way to recover) largely turn into fatal errors.

imaginenore 9 years ago |

Is it faster than MsgPack?

kentonv 9 years ago | |

Depends on the use case! In fact, the answer to "is X format faster than Y format" always depends on the use case. It's always easy to construct cases where one or the other looks better. People of course want to know "on average", but in reality there's no such thing as an "average" use case. You'll ultimately have to test the case you have in mind to find out.

With that said, here are some considerations:

- msgpack is usually used as a binary encoding of JSON, with no schemas. That means that textual field names are included in the encoded message. Formats like Protobuf and Cap'n Proto that have schemas known in advance can avoid this bloat, making them faster and smaller.

- msgpack is not a zero-copy encoding. It's necessary to parse the whole message upfront before you can use it, like with protobuf. Cap'n Proto is zero-copy, the advantages of which are described extensively on the page. For example, if you have a multi-gigabyte file containing a massive Cap'n Proto message, and you just want to read one field from one place in that message, you can do that by memory-mapping the file. No need to read it all in. That's not possible with Protobuf or Msgpack.

I think it's best to focus on these kinds of paradigm shifts when trying to reason about performance. You can always micro-optimize the encoding path later on, but you can't suddenly switch to zero-copy later if your data format wasn't designed for it.
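
The memory-mapping point can be sketched in Python with the standard `mmap` module. The record format below is an assumption made up for the example (a fixed 16-byte record, not a Cap'n Proto message), and `write_records` and `read_record` are hypothetical helpers. What it shows is that reading one field from a large file only requires computing an offset.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: uint64 id, float64 value (16 bytes, little-endian).
RECORD = struct.Struct("<Qd")

def write_records(path, count):
    with open(path, "wb") as f:
        for i in range(count):
            f.write(RECORD.pack(i, i * 0.5))

def read_record(path, index):
    """Read one record by offset; the rest of the file is never parsed."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return RECORD.unpack_from(view, index * RECORD.size)

path = os.path.join(tempfile.mkdtemp(), "records.bin")
write_records(path, 1000)
record = read_record(path, 734)   # (734, 367.0)
```

With a format that must be parsed front to back, the equivalent lookup would have to decode everything before record 734 first.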

honkhonkpants 9 years ago | |

"Is X faster than Y" will be tough to determine without a really detailed treatment of the use case. For example in C++ you want to account for the unavoidable construction of your class type, its inevitable destruction, how long it takes to put it on or take it off the wire, how often you might expect to miss the cache when referencing it, whether the type is movable or copyable and how expensive that is, whether or not you can reset it for reuse without destroying it, and much more. If you were to compare to protobufs I doubt that you'd find serialization and deserialization to be the predominant costs. I don't know what the cost breakdown looks like for capnproto, but maybe kentonv has standing benchmarks.

matmann2001 9 years ago |

TazeTSchnitzel 9 years ago |

Blender's file format does something similar: it essentially saves a core dump to disk.