How the EVE Online Servers Deal with a 3,000 Person Battle(penny-arcade.com) |
In most of the games I play, the pay off of the grinding is getting to enjoy fighting more. But I could never play EVE, because battles would be so anxiety provoking. There would be no pleasure at all in earning a Titan, because I would be forever terrified of losing it.
Of course, all games are kind of there to destroy your time. But for some reason it doesn't feel quite as wasted when you have a virtual thing there that is bad-ass in proportion to your time investment.
I don't really think about a pvp ship much differently than about ammunition. They're an expendable resource, and you probably have a hangar full of replacements just waiting for you to wake up in your cloning pod. It's probably a bit more involved with capital ships and the like, as they are really a group effort so the replacement capacities might belong to your alliance and not you personally, but they're still accounted for before a shot is fired.
In a way, this is really what makes EVE interesting. Losses needing to be replaced doesn't only make EVE battles meaningful on a different level than matches in other games (though I hesitate to say more meaningful...), but it also allows the internet spaceship economy to be very central to the game rather than just some mini-game-like distraction. "Item creation" becomes a question of securing resources and production pipelines and supply lines and whatnot, rather than a one-off effort that precedes the "actual" game.
Otherwise, they'd never get their supercap pilots out of the hangar :)
Original > Years before, the Test Alliance was part of the HoneyBadgers in a hulking super-coalition. Seeking to carve out a piece of the galaxy for its own, a large portion of Test broke away from the accord to form HoneyBadgers, an independently operating group still pledging allegiance to Test but not to GoonSwarm.
Years before, TEST was part of the Clusters (really: Clusterfuck Coalition, aka CFC). Only in the past year did TEST split off from the CFC and form the Honeybadger Coalition (HBC). TEST and Goons are old friends, so it will be very interesting to see how the current hostilities shake out.
In the case of this particular battle, CFC lost 3 titans. This alone is several thousand dollars worth of real-world money, but it will barely be noticed. CFC has hundreds of titans, and the pilots who lost ships will simply have them replaced by their corps. So no one actually lost anything; the coalition will have to drag a few titans and a few dozen supercapital ships out of its reserves to re-arm its pilots.
Except for that year or two when they could shoot smaller ships with immunity, a LOT of people in nullsec alliances got one with their own funds just to shoot bigger guns with impunity.
Oh, and since you cannot dock a Titan (it won't fit inside a station), you will need to park it somewhere, most likely near a POS. And if you plan to do anything else with your pilot, you need TWO pilots to fly a Titan, because once you exit the ship someone can steal it. So you need an alt who will always be sitting in the Titan when you log off :)
That's why ppl fly blops in cruisers and battlecruisers. Fun. Fast. Cheap.
In general, you get that kind of ship especially for these types of fights. There isn't much base-building or PvE inside of EVE; it's all about the PvP. Being able to say that you flew a Titan into a battle is a major accomplishment inside of the EVE universe, so you don't care as much if you lose it (of course, you want to keep it safe, but only to fly it into more battles).
(Unless, that is, the pilot presses the wrong button and jumps the titan itself into battle, rather than the fleet it was trying to move. Which is what started this whole fight.)
It's a very fascinating read.
That means that what socially could be called a flash crowd accretes in that sector and drags the servers further and further down.
It's much easier to play the market and/or prey on new players moving expensive stuff in areas they think are safe.
Incidentally, probably the most fascinating part of Eve is really the economy. It's by far the closest thing to a 'real' market ever created in a virtual setting.
Incidentally, this is what makes some territory more valuable to hold for an alliance, and worth fighting over - some areas can generate much more income than others. Most of the high-level metagame happens over who controls what moons, which is actually invisible to the average player.
Most alliances that own space in EVE don't actually hold the valuable moons, though, so it's pretty much all risk for them and no reward, at least in terms of money. You do get the reward of having fun.
For the player, as I said, the risk vs. reward ratio is very off balance, because there is really no advantage to living in an area where you are more likely to be killed, or to participating in any PvP of your own choice.
This blows my mind. I've been thinking about writing a small multiplayer game for a few weeks. And today I was actually doing back of the napkin calculations. The thing that quickly became clear is bandwidth is O(n^2) where n is the number of players in the same location, since you have to share the location and velocity of each player with each other player.
Here's an example of a calculation I was using. It's the monthly bandwidth in TB of having a certain amount of players in a shared space (3000 in this case), assuming that the location/orientation/velocity of each player's ship is contained in a measly 48 bytes and you attempt 30fps.
(48 * 30 * 3000 * 3000 * 3600 * 24 * 30) / 1e12 = 33592 TB = 33 PB
THIRTY THREE PETA BYTES to simulate a single battle for a month.
I'm looking around at various VPS providers and the bandwidth they offer. Using the naive calculation above and the cheapest Linode offering, it'd take $3 million/month to support a single one of these ongoing battles.
Now, I'm sure the numbers above aren't used in reality. Obviously, you can reduce the updates per second... but not by much. You can't shave much off how much you send. It at least gives you a sense of the orders of magnitude required to support such a thing. Even contemplating supporting 100-1000 people at once is looking very difficult for me to pull off.
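To make it easier to play with the numbers, here is a minimal Python sketch of the same naive all-to-all estimate. The function name and default parameters are only illustrative; the 48 bytes, 30 updates per second, and 30-day month are the figures from the calculation above.

```python
SECONDS_PER_MONTH = 3600 * 24 * 30  # 30-day month, as in the estimate above


def monthly_bandwidth_tb(players, bytes_per_update=48, updates_per_sec=30):
    """Naive O(n^2) estimate: every player's state is sent to every player."""
    bytes_per_sec = bytes_per_update * updates_per_sec * players * players
    return bytes_per_sec * SECONDS_PER_MONTH / 1e12


if __name__ == "__main__":
    for n in (100, 1000, 3000):
        print(n, "players:", round(monthly_bandwidth_tb(n), 1), "TB/month")
```

Because the cost grows with the square of the player count, dropping from 3000 to 100 players cuts the estimate by a factor of 900, not 30.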
HBC routed the CFC, which has Goonswarm as its core, which is organized around the Something Awful forums.
TEST Alliance was nurtured by the CFC for a long time, splitting off only in the past year to forge a more independent identity. As such, the hostilities between these old friends are hard to gauge. At this point, the presumption is that it's all in good fun. But that may change.
http://themittani.com/features/why-didnt-hbc-and-cfc-go-war
As far as I understand the article, the CFC and HBC are the two biggest coalitions in the galaxy, and war between them is a risky bet for either in the current conditions. There's not enough incentive for them to do it just yet, and old ties come into play. (Some commenters mentioned technetium moons as an underlying factor - very profitable sources of income which PL stands to lose in a war.)
From my outsider point of view, the political situation in the EVE galaxy has matured: the Allies (CFC) beat the Axis (BoB), some residual fighting has gone on, and everyone's trending either towards a Cold War or a United Nations. The ethos of EVE predisposes alliances to conflict, but there's a lot of territory and months of sleepless nights at stake when supercoalitions decide to go to war. And these supercoalitions used to be (at least in part) allied to each other against older enemies.
The pendulum might swing again, but this time it won't end as easily as BoB crushed ASCN in 3 months.
The time dilation is a neat solution to the server load problem, but it's sooo annoying as a player. In beta, it was interesting to watch the entire game desync and grind to a halt, but we could still chat and look around. Interesting, but frustrating.
A friend of mine did some research for DARPA/NSF on internet 'crowds.' His research was looking at the question of fractal gatherings, which was basically you were 'near' the people around you in the virtual 3D space and could hear what they were saying, you were adjacent to people who were one space away but could hear them if they 'shouted' and the crowd was at a gathering that was sharing an experience from a presenter who was 'projected'. Basically a virtual concert venue for example where process affinity scheduling took into account where you "were" in the 3D space. The questions are many and as far as I can tell the folks who research this are few and far between.
Process migration continues to be a hard thing to do. VMware and others have made progress, virtualized interfaces and peripherals, self contained 'state'. Works 'ok' at the VM level but still doesn't work at all AFAIK at the 'thread' level.
World of Warcraft addresses this somewhat by 'instances' which is that under certain circumstances (entering a dungeon for example) you and your party of 5, 10, 25, or 40 people all transition (through a loading screen) into a place with nobody else (except you). They can dedicate machines to host an instance and do process migration at that level.
DARPA is interested of course because they would love to have a way of connecting warfighters into a virtual command and control center regardless of where they are physically. Essentially being in two places at once is a big force multiplier.
If anyone is interested in talking about this, helping, or funding me, please get in touch. My server is called Proxima and my game is First Earth (you can find it if you're interested). I've been slowly bootstrapping it for 3 years.
Someday I'll make time to blog about the server.
What makes this all even crazier is that the game has been around for over 10 years and counting. I still play Eve every day. Once you get past the menial tasks, you can have some real fun with this game being a drug runner, miner or just straight up renegade roaming the infinite environment.
It is the only game I ever played where I actually felt an enormous thrill and adrenaline rush when playing pvp. It doesn't matter if you're pvping alone or with a fleet. The fact that you actually lose/gain on a fight differentiates the game from any other MMO. Events that you may have experienced in-game can be recorded forever and actually impact the whole game universe.
I really hope more companies invest in having one persistent world in their MMOs. Instead we see these segregated servers in which great players/achievements/events are limited to one little server. It ends up lessening the interest of many players, especially older ones.
Funny thing is that, normally, when the hit MMO stops being a hit they end up merging lots of servers anyway. Which IMHO ends up devaluing even more the time people have spent on that game.
Could you please expand on this? I'm interested in this point, but I haven't played Eve or other MMOs.
But back when I played EVE (that's a very long time ago, the game's changed since)... I was playing as a trucker in low-sec, essentially carrying goods in somewhat dangerous territory. I was really lusting for T2 transports (blinged out trucks), and it was my goal to get one. So I trained and saved for months, until I finally could afford it. I'm not a very patient person, so I took it for a ride before it was fully fitted with the modules I needed for my protection. Some pirate jerk blew it up. It was insured, but I wasn't quite able to easily replace the modules I had lost. In one fight, weeks of my in-game time were lost.
In other MMOs, winning and losing can mean that you've wasted a few minutes of your time. In EVE, it can mean months of wasted effort.
The entire game is a virtual machine.
When a player leaves a server boundary, their player VM state is packaged up and sent to the next world server and then resumed on that server.
Players don't even notice they crossed server boundaries, and this was possible because the game can be paused mid flight for that player - pretty neat.
It seems like a fairly small addition to the game logic to allow the game to freeze for a few seconds and be forcibly moved to another server. Is this kind of scenario so rare that it's not worth the trouble, or am I missing something else?
As you say, it could probably be done, but it would be a lot of work, which wouldn't get used very often.
Great article, great battle.
* You can bit-pack your structures a lot more efficiently - let's say 2 bytes for a local player ID, 10 bytes for each of velocity/orientation (quaternion representation, heavily quantizing the theta component), and 10 bytes for the location. Clients are never allowed to determine the canonical physics simulation, so all of the above are really just for display purposes and can be trimmed down as appropriate - we don't need to worry about desyncing (as I'll explain in a bit)
* 30 packets-per-second is way too high for network play, with a game of EVE's mechanics you could probably get away with something perhaps as low as 1 pps. Intermediate simulation of the player entities is done by a technique called dead reckoning that's linked earlier (though in practice you'd use a slight improvement on it to stop entities leaping around the world).
* Sometimes game mechanics allow you to strongly cut down the number of packets you send. For example, in EVE, it might be desirable to not send information about a ship if it's completely occluded by another ship, or to only send information about ships that are in a frontal cone ahead of you. This usually doesn't affect the bandwidth function (though sometimes it does), but you can almost always cut a constant factor of (rule of thumb) 50% off the bill.
So that ends up being 32 * 1 * 3000 * 3000 ~= 288MB/s before any culling, which looks about right to me. One thing you didn't account for is that you don't typically allow MMO game clients to connect to each other, but multiplex everything through a server. So it ends up being twice as large as that - roughly 576MB/s.
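As a rough illustration of the 32-byte layout suggested in the list above (2-byte ID plus 10 bytes each for location, velocity and orientation), here is a Python sketch. The bit widths (26 bits per position/velocity component, 20 bits per quaternion component) and the `WORLD` and `MAX_SPEED` ranges are assumptions picked to fit the byte budget, not anything EVE is known to use.

```python
WORLD = 1.0e6      # assumed half-extent of the play area, in metres
MAX_SPEED = 1.0e4  # assumed speed cap, in metres per second


def quantize(value, lo, hi, bits):
    """Map a float in [lo, hi] to an unsigned integer of `bits` bits."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def dequantize(q, lo, hi, bits):
    """Inverse of quantize, up to rounding error."""
    return lo + q / ((1 << bits) - 1) * (hi - lo)


def pack_fields(values, lo, hi, bits, nbytes=10):
    """Pack quantized components back to back into `nbytes` bytes."""
    acc = 0
    for v in values:
        acc = (acc << bits) | quantize(v, lo, hi, bits)
    return acc.to_bytes(nbytes, "big")


def unpack_fields(data, count, lo, hi, bits):
    """Recover `count` components in the order they were packed."""
    acc = int.from_bytes(data, "big")
    mask = (1 << bits) - 1
    return [dequantize((acc >> (i * bits)) & mask, lo, hi, bits)
            for i in reversed(range(count))]


def pack_ship(ship_id, pos, vel, quat):
    """2-byte ID + 10 bytes each for position, velocity, orientation."""
    return (ship_id.to_bytes(2, "big")
            + pack_fields(pos, -WORLD, WORLD, 26)
            + pack_fields(vel, -MAX_SPEED, MAX_SPEED, 26)
            + pack_fields(quat, -1.0, 1.0, 20))
```

With these assumed ranges the position resolution is about 3 cm, which is far finer than a display-only update needs, so there is room to shrink further.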
Good luck with your game! 100 interacting players in an MMO is a challenging target but not an unreachable one for a single-developer game, and it's definitely a very interesting project to undertake.
You don't have to send 48 bytes each frame (you can usually send only delta), and 30/sec is only necessary in a reaction-based FPS type of game. Hosting players in a local area, local meaning they can see each other and interact, is n^2. My calculations a year ago came up with a per-minute cost for the amount of AWS hardware it would require to host 1,000 and 10,000 'local' players. Basically, it wouldn't be affordable to have a huge gathering/battle more than a few times a month. It would be VERY expensive. Plus, each player has to have enough download bandwidth.
This type of thing will definitely happen in the future. Clustering is still in its infancy, and it's picking up steam. And there are smart optimizations to be made, such as sending more frequent data about nearer players. You can also design around it by not making players usually local to each other even though there is a single seamless game world. I'm doing that.
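One way to picture "more frequent data about nearer players" is a per-pair update interval that grows with distance. This is only a sketch; the function name, the linear scaling and all the default values are assumptions for illustration.

```python
def update_interval(distance, base_interval=1.0, near=10_000.0, max_interval=8.0):
    """Seconds between updates about a ship `distance` metres away.

    Ships within `near` get the base rate; beyond that the interval
    grows linearly with distance, capped at `max_interval`.
    """
    scale = max(1.0, distance / near)
    return min(base_interval * scale, max_interval)
```

Since most of the n^2 pairs in a large fight are far apart, a scheme like this mainly trims the constant factor, though it can trim it a lot.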
This greatly reduces the bandwidth requirements because you have extra time to transmit the data to the user; you only send some of the unit data in each packet. Moreover, you can limit the data to units within X radius of the player/camera and simulate any ships outside of this radius, transmitting only major events like the destruction of a ship. (By simulate, I mean the game could just create random ships and explosions client-side to give the effect of a major battle without actually transmitting or receiving data about those ships; they would be eye-candy only.)
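A small Python sketch of the two ideas in this comment: filtering by radius around the player/camera, and spreading the remaining units over several packets. The function names and data shapes are made up for illustration.

```python
import math


def visible_ships(viewer_pos, ships, radius):
    """IDs of ships within `radius` of the viewer; the rest are not sent."""
    return [ship_id for ship_id, pos in ships.items()
            if math.dist(viewer_pos, pos) <= radius]


def batches(ship_ids, per_packet):
    """Split the visible list so each packet carries only part of it."""
    for start in range(0, len(ship_ids), per_packet):
        yield ship_ids[start:start + per_packet]
```

Ships outside the radius would then be represented only by major events, or by client-side eye-candy as described above.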
So if a ship has set a course and has stopped accelerating it doesn't need to send anything else unless that course is changed.
This allows a client to have a light-weight "cached" view of the reality, and the only trade-off is the occasional discrepancy from the reality. However, if you send updates about changes in speed/directions in real time, you get a pretty reliable representation of the state known by the server.
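Here is a minimal dead-reckoning sketch of that idea in Python: the server sends a position, velocity and timestamp when a course changes, and the client extrapolates until the next update. The `tolerance` check on the server side is an assumed refinement, and all names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class CourseUpdate:
    t: float     # server time at which the course was set
    pos: tuple   # position at time t
    vel: tuple   # constant velocity from time t onward


def extrapolate(update, now):
    """Client side: predicted position assuming the course is unchanged."""
    dt = now - update.t
    return tuple(p + v * dt for p, v in zip(update.pos, update.vel))


def needs_update(last_sent, true_pos, now, tolerance=50.0):
    """Server side: resend only if the client's prediction has drifted."""
    predicted = extrapolate(last_sent, now)
    error = sum((a - b) ** 2 for a, b in zip(predicted, true_pos)) ** 0.5
    return error > tolerance
```

A ship coasting on a fixed course never trips `needs_update`, so it costs nothing after the initial packet.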
What I'm describing is definitely leaning more towards players that are constantly moving around though. It helps explain to me why in Asheron's Call, for instance, when you had too many people in the same city a "portal storm" arose and people started getting teleported out of the city randomly.
If you start doing P2P, you're opening yourself to all kinds of hacks, and increasing the amount of duplicate data traveling around.
So if you're in a battle of 3000 people, you would have to somehow connect to 3000 people, send 3000 times more data. This would probably saturate your uplink and get you banned by your ISP as a suspected botnet victim.
A) Allow a single solar system to span multiple machines. Very hard, especially if the server software isn't architected for this. Retrofitting this can be nigh on impossible.
B) Have a few huge machines that can be used to host scenarios like this and, more importantly, have a way of migrating users over to the huge machine seamlessly.
The latter can be done but it's tricky, especially if transferring game state between instances of the server is not simple (I'm not talking about transferring the VM itself with something like vMotion). It comes down to:-
1) Being able to make the bigger machine act as a temporary proxy pushing connections data back to the smaller machine.
2) Having a way of telling clients to make a new connection to the bigger machine and, once that connection is made (and the data is being proxied to the smaller machine) cut the connection to the smaller machine. Users see no loss of service or reconnects at all.
3) Once all clients are being proxied by the bigger machine, pause and transfer the game state from the smaller machine to the big machine and then continue. Obviously it works best if a chunk of state can be transferred in the background, so that the final transfer (and pause) of the bang-up-to-the-minute state is as short as possible.
Option (A) is always the proverbial "In v2 of the server we'll do it a completely different way..."
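The three steps for option (B) can be sketched as a toy in-process simulation. This is not how any real game server does it; `Node`, `handle` and `migrate` are hypothetical names, and real networking, background state transfer and failure handling are all left out.

```python
import copy


class Node:
    def __init__(self, name):
        self.name = name
        self.state = {}       # authoritative game state when this node owns it
        self.upstream = None  # set while this node is only acting as a proxy

    def handle(self, client, command):
        """Apply a command, or forward it if we are currently a proxy."""
        if self.upstream is not None:
            return self.upstream.handle(client, command)
        self.state.setdefault(client, []).append(command)
        return self.name      # which node actually processed the command


def migrate(connections, small, big):
    """Move every client in `connections` from `small` to `big`."""
    big.upstream = small                    # step 1: big proxies to small
    for client in connections:              # step 2: clients reconnect to big
        connections[client] = big
    big.state = copy.deepcopy(small.state)  # step 3: pause, copy state...
    big.upstream = None                     # ...and resume on the big node
    small.state = {}
```

From the client's point of view, `connections[client].handle(...)` keeps working throughout; only the node that ends up processing the command changes.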
Yet players have a tendency to figure out when places are too overcrowded to be fun. So your old problematic load is almost never representative of how many players wanted to be in that area, but merely how many players were willing to put up with that level of degraded performance.
So upon release (or sufficiently close to it to start stress testing, which is conveniently when it's too late to really change architecture) the new limits are quickly hit.
For instance, the article actually talks about having said huge machines. There's a way in EVE to inform the GMs about anticipated big fights, at which point they'll do the reinforcement preemptively. In this case, there wasn't such a convenient warning.
Usually on the Telecom equipment, the backup / state transfer is done at a process level, not at a VM level as suggested, but it's quite common practice.
The best equipment I've seen does this by spawning many equivalent processes and distributing them among the available blades in the chassis. If you have process mgr1, you get a backup1 process on another blade. As mgr1 processes your call state, it checkpoints all critical data to the backup1 process. If the mgr1 process itself crashes, or the entire blade fails, all the processes are simply re-spawned, contact their corresponding backup process, transfer all the state information back, and resume. Using this method, I've seen equipment recover well over 30,000 subscriber sessions in under 5 seconds. Most end users wouldn't even notice, and even if you did, it wouldn't be enough to drop your data connection (VPN, video streaming, or whatever you're doing). We also don't lose your bill for the usage either ;)
The challenge with applying this to the game environment is that in telecom each user session is independent and doesn't really interact with other sessions, so we don't have the issue of a single process becoming overloaded and needing to free up resources to handle it. However, it would probably be easy to do within this model, since failure is expected to occur and be recovered from.
As a programmer, you have to be properly diligent in the software design: what gets checkpointed, and when does it occur? I couldn't even imagine trying to retroactively apply this type of design to "legacy" software that wasn't built from the ground up with this model in mind.
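A toy Python sketch of the mgr1/backup1 pattern described above, with both "processes" as plain objects in one interpreter. The class names and the `billed_bytes` field are invented for illustration; a real system would checkpoint over IPC or the network to another blade.

```python
import copy


class Backup:
    """Stands in for the backup process running on another blade."""

    def __init__(self):
        self.checkpoint = {}

    def store(self, session_id, data):
        self.checkpoint[session_id] = copy.deepcopy(data)


class Manager:
    """Stands in for the active process that owns the live sessions."""

    def __init__(self, backup):
        self.backup = backup
        self.sessions = {}

    def process(self, session_id, billed_bytes):
        session = self.sessions.setdefault(session_id, {"billed_bytes": 0})
        session["billed_bytes"] += billed_bytes
        # Checkpoint the critical data after every state change.
        self.backup.store(session_id, session)

    @classmethod
    def respawn(cls, backup):
        """After a crash: start fresh and pull all state back from backup."""
        fresh = cls(backup)
        fresh.sessions = copy.deepcopy(backup.checkpoint)
        return fresh
```

The design question raised above shows up directly here: anything not passed to `store` is lost on a crash, so deciding what counts as critical data is most of the work.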
I wasn't sure how to interpret that either.
I've always been a bit surprised that the backend systems for these MMOs aren't a bit more flexible. Though, EvE did launch nearly 10 years ago and it has never had a huge number of subscribers.
I have no idea whether more recent games have solved these problems.
Most games solve this problem with completely separate servers and population caps. I know in Guild Wars 2 they actually stop displaying players past a certain number to improve performance. This works fine in calm areas like cities, but in PvP it causes issues with being killed by "invisible" groups that the game fails to load in time.
I may be remembering incorrectly, but last I remember reading, CCP was basically in uncharted territory on the tech front as far as EVE Online is concerned. No other game developer has even tried to tackle this problem at the level they have, and I can't think of many applications in the world that could be comparable in scale and complexity to what the EVE servers have to deal with.
Edit: To summarize, even being 10 years old, no game has come close to matching it in this context.
So actually, EVE is way ahead of the rest, technology-wise.
What they said is they run many solar systems on one server. During a major battle they migrate those other solar systems to different servers so that the server only has to deal with the one system.
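In code, that reinforcement step might look something like the following Python sketch: given a mapping of solar systems to server nodes, move everything except the busy system off its node. The function name, the least-loaded placement rule and the system/node names in the usage are assumptions for illustration, not CCP's actual scheduler.

```python
from collections import Counter


def isolate_system(assignment, hot_system, nodes):
    """Return a new system->node mapping where `hot_system` has its node to itself."""
    hot_node = assignment[hot_system]
    load = Counter(assignment.values())
    others = [n for n in nodes if n != hot_node]
    new_assignment = dict(assignment)
    for system, node in assignment.items():
        if node == hot_node and system != hot_system:
            # Send each evicted system to the currently least-loaded other node.
            target = min(others, key=lambda n: load[n])
            new_assignment[system] = target
            load[target] += 1
            load[hot_node] -= 1
    return new_assignment
```

For example, with systems `A`, `B`, `C` on node `n1` and `D` on `n2`, isolating `A` leaves `n1` hosting only `A` and spreads `B` and `C` across the remaining nodes.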
So in a sense, yes, each solar system is its own shard.
That seems like a fairly substantial amount of users in a single world. Is there any other game world with this many active users in a single world?
That said, I wouldn't call something huge when there are comparables (even if it's just one or two) which dwarf it and others which (at some point) exceeded it.
You can always just live in high-sec (safe zone) and probably never lose your ship. Of course the game tends to reward you for taking risks.
If they prioritized displaying just party members (parties only go up to 5 people) and maybe people on your friends list, and then enemies, it would have worked better.
The ship movement commands in Eve are limited to:
* Set course / speed (double clicking in space, speed set via clicking a speed dial thing). This is not something that gets constant tweaks, more a 'move in a general direction' command.
* Orbit x @ y distance
* Keep x @ y distance
* Approach x
In addition, other actions in Eve are relatively long lived: there isn't FPS-style aiming but 'locking on' and activating modules, which have cycle times of 2-60 seconds.
In practice, even in the heat of battle, I doubt the average player gets even close to 1 input action per second over the course of a fight.
I think for some really hard engineering problems like this the best course could be to try and find a 'good enough' approach that will allow a significant amount of players to interact while limiting the upper bound with a game mechanic to make it feel less artificial.
http://code.google.com/p/stableorbit
here are a couple videos from the never ending pyopencl port.
http://www.youtube.com/watch?v=lnOmy1ly6M0 http://www.youtube.com/watch?v=XCvRBHtPbzE
That doesn't seem like a sensible comparison considering how much the gameplay of EvE differs from the typical MMO.
In any case, I'm not making a dig at EvE, rather acknowledging that the game has had a long life and has never been a multi-million subscriber behemoth. Therefore, they might have made concessions, or allowed previous limitations / decisions in design/infrastructure to persist, for lack of resources [1][2].
1: http://news.ycombinator.com/item?id=5135873
2: http://massively.joystiq.com/2008/09/28/eve-evolved-eve-onli...
The "typical" MMO doesn't even try to handle large populations. The use of separate servers, population caps and login queues is how just about everyone else deals with congestion problems.
Even from your second link there is the quote "Working with IBM, the EVE server cluster is maintained in London and is currently the largest supercomputer employed in the gaming industry." Sure that is from 2008, but newer MMOs are built using the same overall server architecture you saw back in the days of Ultima Online.
Also, travel takes a long time. This battle lasted 2.5 hours and both sides scrambled everyone, waking people up at 4am to go fight. The coalition headed by reddit arrived with 500 ships just as the battle was ending. Part of the reason the folks from Something Awful got beaten as bad as they did is that many different alliances, not just TEST, showed up just for a chance to stick it to them.
Why was everyone trying to stick it to Something Awful?
For example, Sovereignty Wars in EVE (taking territory from others) is a long, gruelling process that can go on for months. So the goons started launching their attacks at 3 or 4 AM on a Monday, to force their adversaries to wake up and go into work with no sleep. They'd do this daily, for a month. In the end, rather than being defeated militarily, the opponent would get demoralized and give up, surrendering all of their territory rather than have EVE continue to affect their real lives.
Beyond that, though, they're simply an incredibly powerful faction. There's a resource called technetium that's needed in order to build any of the massive ships, so it's vital for any sizeable corp to have. The Goons grabbed a virtual monopoly on it and formed OTEC, a cartel whose goal is to fix the price of technetium so that other groups could never match the Goons' strength in massive ships, while growing ridiculously wealthy in the meantime.
Basically, the goons are just really shitty to play against. Couple that with the fact that they and reddit (who used to be the closest of allies) own about 2/3s of everything there is to own in the game, and many players really, really want them to go to war and kill each other off.
They're actually a major cultural force in EVE. ("Shoot blues" "Little bees") They're a bit like America in international relations, except there are fewer redeeming qualities.
I'm mainly talking about EC2, since some other offerings aren't suitable for responsive games. I tried a SimpleDB backend to be hip a couple years back, but it has latency and designed failure rates that are impossible.
Not sure how spikey game traffic is, but it would seem unlikely that your game is super popular for a day and then drops right off the next as would be the case for many websites.
For example if your game business model is charging some nominal fee per month (say $10) you might find that heavy players can burn through way more than $10 worth of EC2 traffic in a month.
Anyone know if any popular games are hosted on EC2?
With higher bandwidths and server capacities I'm guessing these timeouts have been reduced, but never underestimate the player's ability to abuse your trade-offs.
The effect was to allow you to slingshot past the destination at warp speed, and if you logged back in at the right time (seconds later) you could set up a waypoint far outside the outermost celestial body.
(Normally you can only warp to planets and moons, etc... you can't just pick an arbitrary direction and engage warp).
But even if you only send a "real" update once a second, I still don't see how this thing scales (that still leaves 1 PB in the above calculation). And if you send it less often than that, I imagine things would start to look rather jittery. The "occasional discrepancy from reality" would be awful.
Last but not least, compression is your friend.
"Peace means having a bigger stick than the other guy"
Alliances accumulate supercapital fleets so they can fly their collections around with relative impunity. Of course it tends to be awful for everyone who doesn't get to be in the big fleet - as they rarely get to play. This leads to the congregation of supercapital pilots in the same alliances.
Pandemic Legion is famous for this - as they have long had the biggest fleet, used it the most, and thus attracted large numbers of other supercap pilots. But they do not hold space.
And to correct a seemingly prevalent misconception - most titans and supercarriers are personal possessions. There are some alliance/corp owned ones - but the majority are simply owned by long time players who either botted (completely un-policed for a very very long time), scammed, traded, speculated, Wormhole'd or Ratted/missioned (Oh god please no - boring monotonous PvE in the extreme for low returns) their way to 80b isk fortunes.
In a case like this, it seems the non-SA folk got fed up and ganged up on them. As it should be. No dominant force can remain dominant; if they piss off enough folk, either by being evil or just by owning all the things, said folk will band together and take them on.
And that's one part I love about Eve (despite not playing it); the dynamic and self-organizing nature of the universe and its market.
http://en.wikipedia.org/wiki/Something_Awful http://en.wikipedia.org/wiki/4chan
It's actually more the other way around.
Just reinforcing JonnieCache's point.
EDIT: Apparently SA was around before 4chan. grr. In any case, they are fairly independent cultures. One charges $10 to register, the other doesn't require a username/login...
Nowadays of course SA generally stays away from 4chan, which has become about a lot more than anime and games.