REST Hooks - Stop the polling madness (resthooks.org)
http://blog.friendfeed.com/2008/08/simple-update-protocol-fe...
Erm. Is there a source for that?
"Over a representative time period, Zapier polled for changes 30 million times but only took action on 460,000 of those polls (1.5% efficient)."
If someone has a better understanding of how this works, I would appreciate a TL;DR version for programmers (which are probably the target audience?). Thanks!
And I got oversold on an idea that sounds good, but I have no clue how it works.
REST Hooks itself is not a specification, it is a collection of patterns that
treat webhooks like subscriptions. These subscriptions are manipulated via a
REST API just like any other resource. That's it. Really.
Thanks for the feedback!
You also need to explain who you are and why you're doing this, too. Right now it looks like you have an interest in selling something but you never explain what it is. It reminds me a lot of those sites which sell some miracle product, and are deliberately very long and vague, and it ends finally with some button to order an ebook or something.
Is a network connection kept open? Is there an assumption that the user has some port open that can be contacted?
I see that you somehow reduced your calls, but I don't see how. Please tell me in actual socket level networking terms how this is done; who establishes a connection to what, for what, and for how long?
Also, how do you deal with disconnects, timeouts, and missed realtime data?
I also agree with a previous commenter who said it's unclear what you guys' role is. Are you just defining this "spec" to help the world, and hoping that the world gets on board? Or, are you offering some sort of service? Also, how would a dev just start using these, when it requires that the services they are consuming "speak Rest Hook?". It just seems like a good idea, but not immediately useful. If there was a simple call to action that said, "Hey devs, let's all use these when building services", then the pitch would be more understandable (assuming that's what you guys are promoting).
BTW, I do think it's a worthwhile idea. Personally, I've found Webhooks to be very simple, thus I'm not really sold on the "research" that indicates devs struggle with them more than they would Resthooks. So, that's not the sell to me. It's having a standard, programmatic way of discovering Webhook availability and consuming them across various services/APIs that's the draw in my book.
Anyway, good luck.
Seems like it adds data that would normally be included in a poll response to requests that were going to be made anyway, and possibly adds requests that were normally going to be made independently to long polls, though I'm not sure about any of this.
If I am right, would be nice to explain this and how it compensates if there is not a "regular" request within a certain period of time. And well, to basically explain WTH REST hooks are....
Is the most important piece of information really: "REST Hooks itself is not a specification, it is a collection of patterns that treat webhooks like subscriptions"? That seems of tertiary importance at best. How about telling us what REST Hooks are and how they are that - not what they aren't from a marketing perspective.
This is targeting a technical audience. It shouldn't so drastically underestimate its readers.
I'm sure if I put more time into trying to understand what is going on I could figure it out, but to me, the message isn't that clear.
I looked through several pages (waste of time) and I still can't figure out how this is real time without websockets, polling, server side events. No way I would use this service if they don't know who their audience is.
Unfortunately a lot of us developers see the term REST API and think a-ha, this is something I can consume in my browser. This isn't that, so confusion ensues.
Bummer. No matter if you like it or not, a collection of patterns with a name _is_ a specification, just possibly a poorly defined one. See the confusion in this thread. If it were a link to a spec, nobody would be confused.
> Skip the pedantic arguments about standards and implementation details
This reads to me as "everyone is going to have a slightly incompatible implementation. One library won't be good enough, I'll need to write a new one per site that uses this."
Furthermore, what about PubSubHubbub?
Finally, polling is great: I've found few situations where it doesn't work well, people just tend to only do the most basic of implementations and blame polling. See http://roy.gbiv.com/untangled/2008/paper-tigers-and-hidden-d... for a really interesting example.
EDIT: numbers were real, added a supporting link
[1]: http://mqtt.org/
It sounds trivial, but you'd be surprised how many APIs don't support one or both of those features. When you're writing an API it might seem unnecessary to start (after all, who could ever have 1000s of <object>?), but if someone ends up polling your API frequently, having those two features can reduce a lot of unnecessary load for both you and the poller. And, of course, make sure you have an index on the created and/or updated dimensions.
That said, webhooks are terrific. Few things to consider when implementing them:
- Think carefully about the payload you send to the webhook. It's usually a good idea to send some related objects/data because many times when someone gets a webhook payload, that'll trigger calls to your API to get related information you could've reasonably sent them in the initial payload.
- You'll likely want some way to keep track of errors so if an endpoint starts returning 404s or 500s you have a way to programmatically discard it after X failed attempts.
- In your docs, give sample, "real world" payloads developers can test against. It saves time over creating a RequestBin, pushing there, copying, cURLing, etc. (Remember, you can't set up a webhook to localhost.)
- A nice to have is some sort of retry capability with an exponential back-off. Servers go offline and if they get pushed webhook data then, those messages are lost. You could say, "tough, it's the consumer's responsibility," but if having all the data is important, most people will resort to polling. (Somewhat related, you'd be surprised how often the APIs of some larger SaaS companies are "offline" -- e.g. returning 503 --, so these things do happen.)
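To make the error-tracking and back-off points above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `backoff_delays`, `Endpoint`, and `deliver`, the doubling schedule, and the `send` callable standing in for whatever HTTP client actually does the POST.

```python
from dataclasses import dataclass
from typing import Callable, List


def backoff_delays(base_seconds: float, max_attempts: int, cap_seconds: float) -> List[float]:
    """Delay before each retry: base, 2*base, 4*base, ... capped at cap_seconds."""
    return [min(base_seconds * (2 ** i), cap_seconds) for i in range(max_attempts)]


@dataclass
class Endpoint:
    url: str
    consecutive_failures: int = 0
    active: bool = True


def deliver(endpoint: Endpoint, send: Callable[[str], int], max_failures: int = 5) -> bool:
    """Attempt one delivery; deactivate the endpoint after max_failures in a row.

    `send` is a hypothetical transport that returns an HTTP status code.
    """
    if not endpoint.active:
        return False
    status = send(endpoint.url)
    if 200 <= status < 300:
        endpoint.consecutive_failures = 0  # any success resets the counter
        return True
    endpoint.consecutive_failures += 1
    if endpoint.consecutive_failures >= max_failures:
        endpoint.active = False  # stop sending to an endpoint that keeps failing
    return False
```

A sender would sleep for each value from `backoff_delays` between calls to `deliver`, and skip endpoints whose `active` flag has been cleared.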
Great points though.
[0] https://github.com/zapier/django-rest-hooks
[1] http://demo.resthooks.org/
Here are more resources:
Intro: https://zapier.com/engineering/introducing-resthooksorg/
Has any thought been given to the concept of supporting the same framework over something useful in a browser? http://resthooks.org/docs/alternatives/ lists some problems with the common methods, and rightfully so, but I don't see a different recommendation.
This is a collection of patterns, right? Well why not make this really REST and rather than list a bunch of URL templates provide REL types for each of these like so:
REL
- subscriptions-list -> GET
- subscription-create -> POST
- subscription-read -> GET
- subscription-update -> PUT
- subscription-destroy -> DELETE
Now it doesn't matter what the URL structure is, I can pull the <link> elements from the page and be TOLD the URL, rather than follow a URL template. That way, this doesn't rely on out-of-band knowledge, i.e. this web page and its (poor) description.
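A rough sketch of what that discovery could look like on the client side, using only Python's standard library. The rel names and methods are the ones proposed in this comment; the `discover` helper and the sample markup are made up for illustration and are not part of anything resthooks.org defines.

```python
from html.parser import HTMLParser
from typing import Dict

# Rel names and HTTP methods as proposed in the comment above.
REL_METHODS = {
    "subscriptions-list": "GET",
    "subscription-create": "POST",
    "subscription-read": "GET",
    "subscription-update": "PUT",
    "subscription-destroy": "DELETE",
}


class LinkRelParser(HTMLParser):
    """Collects href values of <link> elements whose rel is a known subscription rel."""

    def __init__(self) -> None:
        super().__init__()
        self.links: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel, href = attr.get("rel"), attr.get("href")
        if rel in REL_METHODS and href:
            self.links[rel] = href


def discover(html: str) -> Dict[str, str]:
    """Return {rel: href} for every subscription-related <link> found in the page."""
    parser = LinkRelParser()
    parser.feed(html)
    return parser.links
```

The client then looks up the method in `REL_METHODS` and the URL in the result of `discover`, so the URL layout is whatever the server says it is.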
The quick win of PubSubHubbub compared to Resthook is that it has a 'security' mechanism in it, which means that the subscriber can be 100% sure that the service sending the notification is not someone impersonating it.
Aside from this, I feel like PubSubHubbub and Resthook solve the same problem: programmatically setting up webhooks.
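For anyone wondering what that security mechanism amounts to: as far as I understand PubSubHubbub, the subscriber hands over a shared secret when subscribing, and the hub signs each notification body with HMAC-SHA1 and sends the result in an X-Hub-Signature header. A minimal sketch of the check follows; the function names `sign` and `verify` are mine, and nothing here is specific to REST Hooks.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Header value in the form 'sha1=<hex HMAC-SHA1 of body>'."""
    return "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()


def verify(secret: bytes, body: bytes, header_value: str) -> bool:
    """Compare the received header with the expected one in constant time."""
    return hmac.compare_digest(sign(secret, body), header_value)
```

A receiver would call `verify` with the raw request body before parsing it, and drop the notification if the result is false.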
1. Webhooks
2. A subscription layer via REST
Several major players are already doing this, but it doesn't have a name.
REST Hooks are a way to consolidate that momentum and push it to a broader audience.
I'm thinking these are just Webhooks, but the REST part is throwing me off because I think of it more as a consumption concept (consuming resources, etc).
Update: Found this, which I think explained it best: "REST Hooks itself is not a specification, it is a collection of patterns that treat webhooks like subscriptions. These subscriptions are manipulated via a REST API just like any other resource. That's it. Really." - but it was in /docs, not the main page.
This project (er hmm, "initiative") is core to Zapier's business. If every service out there had a hook atop their service, it'd make things a lot easier for Zapier. That's cool. What bugs me is the feeling that Zapier's branding in the whole thing is less than transparent.
It seems that open source projects are getting more and more marketing driven, and the way this "initiative" is packaged is a sign of things to come.
Which brings me to the question: should I care about something like REST Hooks? Is it a mistake to become complacent and assume that the lower-level infrastructure will Just Work? Or can MeteorJS (and presumably other high-level frameworks) be trusted to handle this kind of stuff in a way that makes it safe for me to forget about it?
Just curious what the esteemed HN denizens think about this, as I'm sure there are some strong and reasonably well-informed opinions out there...
There should be some standard link relation like <link rel="subscription"> so clients have some hint that this is available and where to request it.
I'd also want some way to manage the freshness/load tradeoff, like "please notify me within one hour but no more than every five minutes".
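One way to read "within one hour but no more than every five minutes" is as a send window per subscriber. The sketch below is only my interpretation: `send_window`, `min_interval`, and `max_delay` are hypothetical names, and times are plain seconds.

```python
from typing import Optional, Tuple


def send_window(
    oldest_pending_at: Optional[float],
    last_sent_at: Optional[float],
    min_interval: float = 300.0,   # "no more than every five minutes"
    max_delay: float = 3600.0,     # "within one hour"
) -> Optional[Tuple[float, float]]:
    """Return (earliest, latest) timestamps for the next notification.

    None means nothing is pending, so nothing needs to be sent.
    """
    if oldest_pending_at is None:
        return None
    earliest = oldest_pending_at
    if last_sent_at is not None:
        # Respect the throttle: wait at least min_interval after the last send.
        earliest = max(earliest, last_sent_at + min_interval)
    # The oldest pending change must go out before max_delay has passed.
    latest = max(earliest, oldest_pending_at + max_delay)
    return (earliest, latest)
```

The sender is then free to batch changes and pick any moment inside the window, which is where the freshness/load tradeoff gets decided.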
Most REST APIs these days seem to be consumed from the client side.
> In other words, if everyone implemented REST Hooks, server load for both the sender and receiver could be reduced by 66x.
No, the number of requests could be reduced by a factor of 66. I'm not saying that's not impressive, I'm saying that the polling requests that ended up resulting in no action are cheaper than actionable requests, so, server load will go down by much less than a factor of 66x. The amount of work is the same, just busywork is less.
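To put numbers on that, here is a quick back-of-the-envelope calculation using the figures quoted from the site (30 million polls, 460,000 actionable). The relative cost of an empty poll, `empty_cost`, is a number I am assuming purely for illustration; nobody in the thread has measured it.

```python
POLLS = 30_000_000
ACTIONABLE = 460_000

efficiency = ACTIONABLE / POLLS          # fraction of polls that led to action
request_reduction = POLLS / ACTIONABLE   # how many times fewer requests with hooks


def load_reduction(empty_cost: float) -> float:
    """Load ratio (polling / hooks) when an empty poll costs `empty_cost`
    relative to an actionable request, which is given a cost of 1.0.
    `empty_cost` is an assumed number, not a measurement."""
    empty = POLLS - ACTIONABLE
    polling_load = empty * empty_cost + ACTIONABLE
    hook_load = ACTIONABLE
    return polling_load / hook_load
```

Here `efficiency` is about 1.5% and `request_reduction` is about 65, in line with the quoted figures. If an empty poll costs a tenth of an actionable one, `load_reduction(0.1)` is only about 7.4, and the ratio reaches the full 65 only when empty polls are as expensive as actionable ones.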
I personally think you should be using one that gives you permanent URLs (ngrok or Passageway) but that also keeps a log when your client is disconnected. I'm pretty sure only our Passageway does that.
I have a solution for polling. It's called websockets. Even more, there's a well-supported library called socket.io that transparently handles it for all browsers.
http://developer.github.com/v3/repos/hooks/#pubsubhubbub
We also have a regular JSON endpoint for our hooks resource (which is essentially a subscription API).
> See the confusion in this thread. If it were a link to a spec, nobody would be confused.
We definitely had a little confusion in the thread, I think that was mostly because we put too much marketing on the homepage for this audience, but the absence of a proper spec could have definitely contributed. We're correcting some of this.
This is all about adoption of some sort of subscription-based HTTP callback: at this point in time any flavor will do. We have no doubt that a formal spec will pop up someday (and that would make us very happy!).
> Furthermore, what about PubSubHubbub?
PubSubHubbub always seemed a little heavy and was a departure from APIs that weren't XML/ATOM based (meaning most JSON API providers wouldn't touch it). Not saying I agree with it, but that is the feedback we got.
> Finally, polling is great for 99% of use cases (I can make up statistics, too)
We posted the numbers driving our stats on the homepage. They are not made up and come straight from our Elasticsearch cluster. I'm happy to elaborate on them!
Even then, this page (/docs) barely says anything. Almost anything can be 'compliant,' because there's almost nothing to say. I really think some rigor would help a lot. I can appreciate not wanting to get into full RFC2119 right away, but you need some amount of description.
PuSH is not limited to XML/ATOM, and companies like SuperFeedr use PuSH with JSON.
In fact, I'm pretty sure PuSH is 98% of a webhooks implementation.
I didn't understand that the numbers were from anything real, my mistake and apologies.
Instead of polling `facebook.com/api/posts` every 2/5/15/60 minutes, you'd set up a subscription for Facebook to ping you at `yoursite.com/hook.php`. The subscription would be managed under `facebook.com/api/subscriptions`.
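In code, the sender's side of that boils down to something like the sketch below. It is an in-memory stand-in, not a real API: the class `SubscriptionStore`, the field names `event` and `target_url`, and the `post` callable (which would be an HTTP POST in practice) are all assumptions for illustration.

```python
import itertools
from typing import Callable, Dict


class SubscriptionStore:
    """In-memory stand-in for a /api/subscriptions resource (hypothetical shape)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._subs: Dict[int, Dict[str, str]] = {}

    def create(self, event: str, target_url: str) -> int:
        """Roughly what POST /api/subscriptions would do."""
        sub_id = next(self._ids)
        self._subs[sub_id] = {"event": event, "target_url": target_url}
        return sub_id

    def list_all(self) -> Dict[int, Dict[str, str]]:
        """Roughly what GET /api/subscriptions would return."""
        return dict(self._subs)

    def delete(self, sub_id: int) -> None:
        """Roughly what DELETE /api/subscriptions/<id> would do."""
        self._subs.pop(sub_id, None)

    def notify(self, event: str, payload: dict, post: Callable[[str, dict], None]) -> int:
        """Call post(url, payload) for every subscriber of `event`; return the count."""
        targets = [s["target_url"] for s in self._subs.values() if s["event"] == event]
        for url in targets:
            post(url, payload)
        return len(targets)
```

The subscriber calls `create` once with its hook URL, and from then on the sender calls `notify` whenever something changes, so nobody has to poll.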
server.com/files/a-cs3/important.pdf
server.com/files/a-cs4/important.pdf
server.com/files/a-cs5/important.pdf
You'll see server.com/files/important.pdf
The information about the version of that PDF is contained in, and managed by, the document itself. What this allows for is updating the PDF in any PDF editor without breaking how everyone else on the internet will process the file. You can edit the PDF with anything, re-save it, and not break older versions. This is the same concept that you should use to extend an API. If you think of the API as a document for a version of data (just like any file/image is an endpoint), it will help you get a better grasp of why you shouldn't version in URLs. Here is a more accurate description of how you should design a REST API, from the author of the HTTP spec: http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hyperte...
That said, I don't think this actually mandates any URLs. I read that part as just a suggestion for how it could be done.
A wants to get updates from B. Instead of polling B for changes, it sends B (or C) a URL of A and says: POST to this URL the moment something changes on B.
If that's the case, this really gets lost on all the hooks/subscriptions/load-mumbojumbo on this site.
> REST Hooks are a lightweight subscription layer on top of your existing REST API.
That is too short. The explanation is missing there, the same way it is on the website. Maybe in better English if mine is broken, but this addition could fix it: "..., so that instead of having to poll that API regularly, the subscriber is notified with a POST the moment a change occurs."
I'm probably biased, because that is kind of the same way I once tried to explain the concept, but I think it would be clearer that way.
> An initiative by Zapier 2013.
Zapier is a company that makes money off sites having REST endpoints to push and pull data. It looks like Zapier would prefer that more sites have endpoints to accept push, and this marketing effort is a brand-campaign (think Public Service Announcement) to get more people to tailor their products to work with Zapier.
That explains why more emphasis was put on marketing, etc. The primary audience is not developers but PHBs and PMs, to give them sound bites to parrot to developers.
(I'm not cynical, I work in advertising, most of my products already have 'REST hooks'...)
Having felt that pain before, and not opting for WebSockets yet, I was hoping for some kind of simple alternative. I have a portion of code that occurs during a registration process that is currently polling the backend for status updates as to the worker queue's progress (it's a lengthy registration with several moving pieces and calls to multiple remote APIs).
(I'm not saying that I think it should be that way, just elaborating.)
But this thing here they are talking about is something I've been doing for a long time. I didn't think about making a brochure site for it though.
Just follow the principle of making the parts as dumb as possible. They should be the stupidest simplest things ever. Sometimes you can't get to this model on the first try, and that's ok. But the closer you get to it, the better the product becomes. I swear this is true and part of the soul of the machine's ghost.
If you can't make things laterally connect this way, then make them vertically stack this way and then laterally connect this way. But don't obfuscate the objective.
As for timeouts, check out http://resthooks.org/docs/retries/ for ideas on how those problems can be solved.
Of course, Server A must "speak Rest Hook" in order to enable that, so in a manner of speaking I guess it is a way to configure Webhooks on Server A.
But, it basically exposes a common interface for Server A and Server B to establish Webhooks. Server A can tell Server B which notification subscriptions are available (i.e. for which events), and Server B can then choose to subscribe, providing the callback URL for the Webhook.