Nginx design details (aosabook.org)
Also be sure to read the chapter about the LLVM compiler family (written by LLVM's creator, who is an Apple employee now): http://www.aosabook.org/en/llvm.html
It's great. I learnt a lot from that chapter.
This book (The Architecture of Open Source Applications) is a treasure trove. Just look at the index here: http://www.aosabook.org/en/index.html and tell me you don't want to skip work (or school, or whatever else it is you're doing) for a week to read it all :D
On more than one occasion, you have to unintuitively try to figure out whether a rule at the bottom is getting matched before a rule at the top. The matching algorithm is quite intricate and easy to trip up on. It tries to help novices by matching literal strings first for them, rather than simply making a note of the performance benefits of putting literal locations at the top, but it ends up failing to reveal the true matching order at a glance: you need to check the verbose logs, yuck.
In addition to "location" evaluation order, "if" statements are another common gotcha.
But the only way to deal with regexps is to execute them sequentially.
btw, the official doc is here: http://nginx.org/en/docs/http/ngx_http_core_module.html#loca...
The complexity is entirely unneeded. mod_rewrite could definitely use some minor convenience tweaks, but it is far more intuitive and therefore more effective for both understanding and debugging. I assure you that "match in order defined, unless passed through explicitly by a sub-condition" is sufficient and simple for 99.5% of cases.
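To make the "match in order defined" idea concrete, here is a minimal Python sketch. It is not mod_rewrite itself: the `first_match` function and the rule list are hypothetical names made up for illustration, Python's `re` stands in for the real regex engine, and the "passed through by a sub-condition" part is left out.

```python
import re

def first_match(rules, path):
    """Return the target of the first rule (in definition order) whose pattern matches."""
    for pattern, target in rules:
        if re.search(pattern, path):
            return target
    return None  # no rule matched

# Hypothetical rules: whatever is written first wins, nothing else to remember.
RULES = [
    (r"^/images/", "static-files"),
    (r"\.php$", "php-backend"),
    (r"^/", "default-app"),
]
```

With these rules, `/images/x.php` goes to `static-files` simply because that rule is listed first, which is the whole mental model.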
nginx's "match literals, then match everything, then choose most specific, in the order defined by its type" is craziness. "Getting used to it" does not make it good; it's simply the first unnecessary step in making it usable.
For those looking for full-featured and robust embedded scripting support in nginx, Yichun Zhang's lua-nginx-module is highly recommended. https://github.com/chaoslawful/lua-nginx-module
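For anyone trying to picture that order, here is a rough Python model of location selection as I understand it from the nginx.org documentation for current versions (older releases reportedly behaved differently). Everything here is an assumption for illustration: `select_location` and `EXAMPLE_LOCATIONS` are made-up names, Python's `re` stands in for PCRE, and named or nested locations are ignored.

```python
import re

def select_location(locations, uri):
    """locations: list of (modifier, pattern) in config order.
    modifier is one of "", "=", "^~", "~", "~*"."""
    best = None  # longest matching prefix location seen so far
    for mod, pat in locations:
        if mod == "=":
            if uri == pat:
                return (mod, pat)  # exact match ends the search at once
        elif mod in ("", "^~"):
            if uri.startswith(pat) and (best is None or len(pat) > len(best[1])):
                best = (mod, pat)
    if best is not None and best[0] == "^~":
        return best  # "^~" on the longest prefix skips the regex pass
    for mod, pat in locations:  # regexes are tried in config order
        if mod in ("~", "~*"):
            flags = re.IGNORECASE if mod == "~*" else 0
            if re.search(pat, uri, flags):
                return (mod, pat)  # first matching regex wins
    return best  # fall back to the longest prefix, or None

EXAMPLE_LOCATIONS = [
    ("=", "/"),
    ("", "/"),
    ("", "/images/"),
    ("^~", "/static/"),
    ("~*", r"\.(gif|jpg|jpeg)$"),
]
```

In this model `/images/logo.GIF` ends up in the regex location even though `/images/` is listed above it, while `/static/logo.gif` stays with `^~ /static/`. That is the "rule at the bottom beats a rule at the top" surprise in miniature.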
Just like Java is sometimes the best language to use because it has huge mind share despite the warts, sometimes Apache is the right server to use just because that's what almost everyone uses and it makes things so much easier than using something more rare.
Returning status codes from nginx is easy. Adding headers is easy. Perhaps you could elaborate about "certain things handled easily by Apache's large ecosystem of modules".
Apache's large ecosystem of modules is built around doing everything in the webserver. More often these days, that stuff is handled by a dynamic application. Nginx talks to those via fastcgi, scgi, uwsgi, or plain http proxying. Nginx does not need to know anything about how the apps do whatever it is they do.
Here's the wiki on how to set headers: http://wiki.nginx.org/HttpHeadersModule
Here's the wiki on returning a particular http code: http://wiki.nginx.org/HttpRewriteModule#return
I'm not sure that says what you think it does about Nginx and the developers you deal with...
* you can serve pages faster;
* you can mix more than one app from different servers in the DMZ on the same domain by rewriting the URL (useful for sharing domain-based mechanisms such as Flash or cross-site Ajax);
* you can have one front end (nginx) and several back-end servers which, with a heartbeat mechanism, can handle failover;
* you could (when SSL certificates were only IP-based) share one SSL certificate across more than one back end (VIP).
It is a pretty low-cost, quite scalable architecture. I guess you could do it with Apache, but I dropped Apache since its licence is as understandable as its configurations.
Locations definitely need reworking, never said they didn't. I just took issue with the claim that you have to check the verbose logs, as it's perfectly possible without.
Those rules are not well defined either. What is a "literal string" / "conventional string"? Ok, "=" is clear, but what is /images/, how about /im[ag]es/, and /images/$? Since these literal strings are not quoted, what black magic is used to classify the location directive as regex or literal? I would actually love to know this, probably the biggest stumbling point for me. Especially since it uses decoded uris, because "$" could have been a "%24" in a literal string path, OR it could be a regex end of line assertion...wtf?
If you can somehow understand this without having to poke the logs or through vast experience of prior trial and error, my hat's off to you, sir.
The real WTF is how the most specific location gets chosen when both a string literal and a regex match.
Ignoring ^~ locations, which will always be preferred over regex, the only locations that can actually be chosen over a regex are an exact match "=" or a default string literal (no prefix) that matches exactly. This part is what is typically the real cause of "I don't know WTF is going on here."
"location = string" and "location string" and "location ^~ string" are non-regex matches.
"location ~ pattern" and "location ~* pattern" are regex matches.
Matching order is unintuitive and therefore bad, and there are plenty of other quirks with nginx configs that should be made more intuitive, but confusion between regex and non-regex is unusual.
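Put differently, the modifier in front of the pattern decides the kind of location, not the characters inside the pattern. A small Python sketch of that rule, under some simplifying assumptions: `classify_location` is a hypothetical helper, the modifier is assumed to be separated from the pattern by whitespace, and named (`@name`) locations are not handled.

```python
MODIFIERS = {
    "=": "exact",
    "^~": "prefix (regexes skipped)",
    "~": "regex (case-sensitive)",
    "~*": "regex (case-insensitive)",
}

def classify_location(directive):
    """Split 'location [modifier] pattern' into (kind, pattern)."""
    tokens = directive.split()
    if tokens and tokens[0] == "location":
        tokens = tokens[1:]
    if len(tokens) == 2 and tokens[0] in MODIFIERS:
        return MODIFIERS[tokens[0]], tokens[1]
    if len(tokens) == 1:
        # No modifier: a plain prefix string, even if it looks like a regex.
        return "prefix", tokens[0]
    raise ValueError("unrecognised location directive: " + directive)
```

So `location /im[ag]es/` and `location /images/$` would both be treated as literal prefix strings here, and only `location ~ /images/$` makes the `$` an end-of-string assertion.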
It's all described right here: http://wiki.nginx.org/HttpCoreModule#location
I generally disagree with ^ being the universal "not" indicator out of context, since it's a "beginning of line assertion" when not preceded by "[" within a regex. The fact that it indicates that what follows is a regex-halting literal string prefix match (exactly what ^ would indicate in a normal regex) is plain confusing.
thanks