Automatic Flushing: The Rails 3.1 Plan

Automatic Flushing: The Rails 3.1 Plan (yehudakatz.com)

46 points by ivey 15 years ago | 30 comments

judofyr 15 years ago |

I did some research on this earlier, and my conclusion was: this is very hard, if not impossible, to implement automatically. The main problem is that it’s impossible to handle exceptions correctly without making the whole stack aware of flushing.

Currently, when an exception occurs, the system can simply change the response (since the response hasn’t been sent to the client yet, but is only buffered inside the system). With this approach, a response can be in x different states: before flushing, after the 1st flushing, … and after the xth flushing. And after the 1st flushing, the status, the headers and some content have already been sent to the client.

Imagine that something raises an exception after the 1st flushing. Then a 200 status has already been sent, together with some headers and some content. First of all, the system has to make sure the HTML is valid and at least give the user some feedback. It’s not impossible, but still quite a hard problem (because ERB doesn’t give us any hint of where tags are opened or closed). The system also needs to take care of all x different states and return correct HTML in each of them.

Another issue is that we’re actually sending an error page with a 200 status. This means that the response is cacheable with whatever caching rules you decided on earlier in the controller (before you knew that an error would occur). Suddenly you have your 500.html cached all over the place: on the client side, in your reverse proxy and everywhere else.

Let’s not forget that exceptions don’t always render the error page, but do other things as well. For instance, sometimes an exception is raised to tell the system that the user needs to be authenticated or doesn’t have permission to do something. These are often implemented as Rack middlewares, but with automatic flushing they also need to take care of each of the x states. And if such a middleware needs to redirect the user, for instance, it can’t change the status/headers to a 302/Location if it’s already in the 1st state, and therefore needs to inject a <script>window.location='foo'</script> into a cacheable 200 response.
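
The failure mode described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `FailingBody`, `serve` and `flushed_response_after_error` are hypothetical names, and the toy "server" is a stand-in for a real Rack server, not how any of them is implemented. It shows that once the status line and headers are on the wire, the only thing left to do after an exception is to append more HTML.

```ruby
require "stringio"

# Hypothetical streaming body: yields one chunk, then fails mid-render.
class FailingBody
  def each
    yield "<html><body><h1>Dashboard</h1>"
    raise "database went away" # simulated failure during rendering
  end
end

# Hypothetical toy server: commits the status line and headers before
# the body is iterated, the way a flushing server has to.
def serve(status, headers, body, socket)
  socket << "HTTP/1.1 #{status}\r\n"
  headers.each { |name, value| socket << "#{name}: #{value}\r\n" }
  socket << "\r\n"
  body.each { |chunk| socket << chunk }
rescue StandardError
  # Too late for a 500 or a 302: the 200 and the caching headers are
  # already sent, so all we can do is append some closing HTML.
  socket << "<p class=\"error\">Something went wrong</p></body></html>"
end

def flushed_response_after_error
  socket = StringIO.new
  headers = { "Cache-Control" => "public, max-age=3600" }
  serve("200 OK", headers, FailingBody.new, socket)
  socket.string
end

puts flushed_response_after_error
```

The output starts with a 200 status line and a public Cache-Control header, even though the page ends in an error message.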

Of course, the views shouldn’t really raise any exceptions because they should be dumb. However, in Rails it’s very common to defer the expensive method calls to the view. The controller sets everything up, but the work isn’t actually done until it’s needed for rendering. This increases the possibility that an exception is raised in the rendering phase.

Maybe I’m just not smart enough, but I just can’t come up with a way to tackle all of these problems (completely automated) without requiring any changes in the app.

runeb 15 years ago | |

Good points, and I agree. Sending 200 OK before you know if the response truly is OK is problematic, at best. I'm not sure I agree with you about authentication / authorization. These should do their job before a controller gets a chance to render / first flush a view and would thus get a chance to send other headers. In the extreme, the first middleware would flush a 200 header right away and then call down the middleware stack, but that's not how I read this proposal.

Anything that might happen while actually rendering a view is a concern here (and as you get more lazy that could be quite a lot), but you'd normally sort out auth before actually rendering/flushing anything.

judofyr 15 years ago | | |

The authentication / authorization was merely an example, and I agree that most of these solutions are done before rendering occurs.

However, my point is that in order to take advantage of flushing, you want to start sending the HTML as soon as possible and this forces you to decide the status and the headers. If there's something in your stack which requires different status/header, you either need to evaluate it earlier in the request or hack around it by appending different HTML.

The more you decide to evaluate earlier, the less efficient the flushing becomes. So for every piece in the stack you have to make a choice: What is the chance that this requires different status/headers/content? How much do we gain by deferring? How can we hack around it if we've already started sending a response?

This is something you can't do automatically, and as far as I can see, this isn't mentioned in Yehuda's post at all.

raggi 15 years ago | |

It does require changes in the app, but app authors who need this kind of performance benefit will be willing to accept that hit. The solution is far better than the alternatives:

- Allowing users to flush manually (people screw this up real bad)
- Changing the rack spec (allowing for #each on the body to be lazily yielded, and terminating on nil or the like)
- Moving to an always async stack (totally kills most users)

Yes, there are plenty of issues with this, and I agree with your concern, but it is also something which can have a marked effect on performance for users. It's also worth noting that a well componentised partial can render an error in place of the partial itself, for example, rendering a page that contains the whole layout and a single red box of errors (say a render of the _new partial can be added to the buffer after a _create fails, instead of rendering the success box). Yes, that requires some refactoring of the application (rather than using, for example, the standard 302 approach).
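
A rough sketch of the "error in place of the partial" idea, in plain Ruby rather than real Rails partials. `render_component` and `render_page` are hypothetical helpers invented for this illustration, and the component names are made up; the point is only that a failure is contained to one box while the rest of the layout still goes out.

```ruby
require "erb"

# Hypothetical helper: renders one component, and on failure returns an
# error box for that component instead of aborting the whole page.
def render_component(name)
  yield
rescue StandardError => e
  message = ERB::Util.html_escape(e.message)
  %(<div class="error">#{name} failed: #{message}</div>)
end

# Hypothetical page: the layout and the sidebar still render even though
# the form component raises.
def render_page
  page = +""
  page << "<html><body><h1>Things</h1>"
  page << render_component("new_thing_form") { raise "Name is required" }
  page << render_component("sidebar") { "<ul><li>Recent activity</li></ul>" }
  page << "</body></html>"
  page
end

puts render_page
```

This only helps when the error can sensibly be shown inline; it does nothing for cases that really need a different status code or a redirect.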

It's also worth noting that a larger class of applications that would find this actually useful should generally have reasonable test coverage and code maturity. Whilst this isn't always the case, we also don't protect users from eval, and other evil tools, in ruby or rails.

judofyr 15 years ago | | |

From the post:

For Rails 3.1, we wanted a mostly-compatible solution with the same programmer benefits as the existing model, but with all the benefits of automatic flushing

And from there he goes on with very specific implementation details, and the only caveat is some API change. This gives the impression that this is something you can easily enable for any app.

I just want to point out that 100% automatic flushing is pretty much impossible with the current state of Rack/Rails, and there's still plenty of work before there's anything near flushing support in Rails.

In addition, everyone should be aware of the trade-off you're making with flushing (potentially sending 500 responses as 200 OK, etc.).

runeb 15 years ago | | |

That breaks caching, as pointed out by judofyr. I can imagine having to remove caching will eat up the performance gains pretty fast.

Twisol 15 years ago |

So how does this work with Rack? Unless I'm mistaken, you have to return the body all at once, which entirely negates the benefits here. I don't see Rails mandating that an asynchronous server be used (i.e. Thin, Mongrel2, etc.), so I'm rather confused.

judofyr 15 years ago | |

In Rack you need to return a body which responds to #each (which yields strings); it doesn't need to return the body all at once:

    class Dummy
      def initialize(controller)
        @controller = controller
      end
      
      def each
        @controller.render.each { |part| yield part }
      end
    end
    
    @body = Dummy.new(self)

Twisol 15 years ago | | |

Aaaah, and the work is done within #each and not within #call. I see. The only issue is if you have a middleware that modifies the output, because unless you're careful and/or you're doing something extremely minor near the start of the page, it'll all be processed in the middleware rather than the server. So the server still gets it all in one piece, and so does the client.
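
To illustrate the middleware concern, here is a minimal sketch comparing a middleware that consumes the body inside #call with one that wraps it lazily. Every class name here (`SlowBody`, `BufferingUpcase`, `StreamingUpcase`) is hypothetical, and the "app" is just a lambda returning a Rack-style triple; nothing depends on the rack gem.

```ruby
# Body that records when each chunk is actually rendered.
class SlowBody
  def initialize(log)
    @log = log
  end

  def each
    %w[head main footer].each do |part|
      @log << "render #{part}"
      yield part
    end
  end
end

# Buffering middleware: iterates the whole body inside #call, so the
# server only ever sees the finished result.
class BufferingUpcase
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    chunks = []
    body.each { |chunk| chunks << chunk.upcase }
    [status, headers, chunks]
  end
end

# Streaming middleware: returns a wrapper whose #each transforms chunks
# only when the server pulls them.
class StreamingUpcase
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, Wrapper.new(body)]
  end

  class Wrapper
    def initialize(body)
      @body = body
    end

    def each
      @body.each { |chunk| yield chunk.upcase }
    end
  end
end

# Counts how many chunks were rendered before #call returned.
def chunks_rendered_during_call(middleware_class)
  log = []
  app = ->(_env) { [200, {}, SlowBody.new(log)] }
  middleware_class.new(app).call({})
  log.size
end

puts chunks_rendered_during_call(BufferingUpcase) # prints 3
puts chunks_rendered_during_call(StreamingUpcase) # prints 0
```

With the buffering version all rendering has finished by the time the server gets the response; with the wrapper nothing is rendered until the server starts iterating.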

briandoll 15 years ago |

From Yehuda on twitter: "BTW: Those who have brought up issues with exceptions/status codes re: flushing, you're right, but it's not specific to the fiber solution"

zbanks 15 years ago |

Cool idea. It's one of those "cheap" speed boosts which are always nice to find/have.

It'd be nice to see this implemented in Django as well...

aaronblohowiak 15 years ago |

This encourages having SQL queries initiated by the view, after the header has rendered. This seems antithetical to MVC to me.

collint 15 years ago | |

Not really, in Rails, you might have this controller code:

    @things = Thing.where(:it => "good")

And this view code:

    <% for thing in @things %>
      <%= thing.name %>
    <% end %>

But the SQL query doesn't fire in the controller. It gets kicked off in the view when you iterate with "for x in y".

Concerns still wonderfully separated.
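
The deferred behaviour can be imitated in plain Ruby. `LazyQuery` below is a hypothetical stand-in written for this sketch, not ActiveRecord's actual implementation: it holds a loader block and only runs it the first time something iterates over it.

```ruby
# Hypothetical stand-in for a lazily evaluated query object.
class LazyQuery
  include Enumerable

  def initialize(log, &loader)
    @log = log
    @loader = loader
  end

  # The "query" runs on first iteration and the result is reused afterwards.
  def each(&block)
    @records ||= begin
      @log << "SELECT fired"
      @loader.call
    end
    @records.each(&block)
  end
end

log = []

# "Controller": builds the query object, nothing is loaded yet.
things = LazyQuery.new(log) { %w[good better best] }
puts log.size # prints 0

# "View": iterating is what triggers the load.
names = []
for thing in things
  names << thing
end
puts log.size # prints 1
```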

aaronblohowiak 15 years ago | | |

Not really, in Rails, you might have to do more than just retrieve some models. For instance, you might have to load up the current user, grab some stuff from memcache, check with your SSO system to validate the session, and then retrieve the data pertinent to the current request. Then, you might have to make some data modifications (which will create transactions and hit your db.) Finally, the view rendering can begin.

In only the trivial cases can you defer the actual SQL queries from being performed before the view is rendered.

jtgeibel 15 years ago | |

In some cases the query is lazy loaded with a call to .each or something similar in the view already. In other cases, the query is run in the controller before anything is rendered. I think the main goal is to improve client side performance (by downloading scripts and other referenced files sooner) and see the most benefit on pages that load multiple models, for instance a sidebar showing popular posts and recent activity.

aaronblohowiak 15 years ago | | |

Lazy loading is bad for overall performance, though. Retrieving all of the models with eager loading will avoid exploding query counts and lower the total time to render a page (and lower the load on your servers).
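
A toy illustration of the query-count difference, using a hypothetical in-memory `FakeDb` that simply counts calls. The data and method names are invented for the sketch and say nothing about real timings.

```ruby
# Hypothetical in-memory "database" that counts every query it serves.
class FakeDb
  attr_reader :queries

  def initialize
    @queries = 0
    @authors = { 1 => "ann", 2 => "bob" }
    @posts = [
      { title: "a", author_id: 1 },
      { title: "b", author_id: 2 },
      { title: "c", author_id: 1 }
    ]
  end

  def posts
    @queries += 1
    @posts
  end

  # One query per author lookup.
  def author(id)
    @queries += 1
    @authors[id]
  end

  # One query for a whole batch of authors.
  def authors(ids)
    @queries += 1
    @authors.slice(*ids)
  end
end

# Lazy: each post looks up its author when it is rendered.
def lazy_query_count
  db = FakeDb.new
  db.posts.each { |post| db.author(post[:author_id]) }
  db.queries
end

# Eager: fetch all needed authors in one batch up front.
def eager_query_count
  db = FakeDb.new
  posts = db.posts
  authors = db.authors(posts.map { |post| post[:author_id] }.uniq)
  posts.each { |post| authors[post[:author_id]] }
  db.queries
end

puts lazy_query_count  # prints 4
puts eager_query_count # prints 2
```

The lazy count grows with the number of posts, while the eager count stays at two.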

fizx 15 years ago | |

There are other schools besides MVC, including "component-oriented."
