Thursday, February 9, 2012

Playing with fire

Intro
I started my game dev career looking after the networking code in Battlefield 2142. Throughout the years (and engines) the networking model of authoritative servers has been the natural choice as I've been working on multiplayer shooters.
As with any model it has its strengths and weaknesses. One of the weaknesses is that it's rather complex to get right.
This spurred me to start experimenting with simpler models. Or really, the simplest possible model?


My goal was to write as little (net)code as possible and see where I ended up. Would the differences be noticeable? What strengths and weaknesses would the model I ended up with have?


So what's up with the title? Well it turns out I'm violating the most sacred rule of multiplayer networking...


Worth noting is that I'm not claiming to have invented something new and revolutionary. Consider this post as a travel journal :)

Rule no 1: Never trust the client
I started my project with the ambition to use a networking model suitable for coop. The sole reason was that I could completely ignore the concept of cheaters. Friends playing together don't cheat on each other. Right? And by ignoring cheaters I could break rule number 1 in multiplayer networking: never trust a client. Any data originating from a game client can be tampered with. In an authoritative setup the "only" thing that can be tampered with is player input. And even though that's enough to create a whole range of cheats, the impact is limited.
For some reason I ended up testing my model not in coop but in multiplayer...


In a normal setup the game client sends its input to the game server. The game server runs a full simulation of the game: it moves objects, applies the players' inputs to the objects they control, and sends the results to all clients, which update their respective object states. We call this ghosting or replication of objects (aka ghosts). The controlling client doesn't wait for the server to respond, as that would introduce latency: you would have to wait a full ping time before your input actions got a response.
Instead the client predicts what's going to happen and, when the response from the server arrives, corrects itself based on what the server said. Prediction and correction are two central concepts in authoritative multiplayer networking for making sure objects are positioned at the same place at the same (network) time on all clients and the server.
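
To make prediction and correction concrete, here's a minimal Python sketch, assuming simplified 2D movement with made-up names and numbers (not code from any actual engine): the client applies its input immediately, keeps unacknowledged inputs around, and re-applies them on top of whatever state the server sends back.

SPEED = 6.0       # m/s, matching the 6 m/s characters mentioned later in this post
TICK = 1.0 / 30   # fixed simulation tick length in seconds

class PredictedCharacter:
    def __init__(self):
        self.pos = (0.0, 0.0)
        self.pending = []        # inputs sent to the server but not yet acknowledged
        self.next_sequence = 0

    @staticmethod
    def apply_input(pos, move):
        mx, my = move
        return (pos[0] + mx * SPEED * TICK, pos[1] + my * SPEED * TICK)

    def local_tick(self, move):
        # Prediction: apply the input right away, remember it until the server acks it.
        entry = (self.next_sequence, move)
        self.next_sequence += 1
        self.pending.append(entry)
        self.pos = self.apply_input(self.pos, move)
        return entry             # this is what would be sent to the server

    def on_server_state(self, acked_sequence, server_pos):
        # Correction: snap to the server's state, then re-apply the unacked inputs.
        self.pending = [(s, m) for (s, m) in self.pending if s > acked_sequence]
        self.pos = server_pos
        for _, move in self.pending:
            self.pos = self.apply_input(self.pos, move)

c = PredictedCharacter()
c.local_tick((1.0, 0.0))
c.local_tick((1.0, 0.0))
# The server has only processed input 0 and disagrees slightly about where that left us.
c.on_server_state(acked_sequence=0, server_pos=(0.19, 0.0))
print(c.pos)   # the server's position with the still-unacked input re-applied on top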


In my setup I did a bit of the reverse. I use authoritative clients (common in racing games). Instead of sending my input to the server to do the simulation, I run my character's simulation on the client and send its transform to the server, which relays it to the other clients. This completely removes the need for client prediction and correction, as the controlling player gets a perfect simulation and the server/other clients get the result of that simulation.
Code-wise this makes for a very slim implementation, as all that's needed is literally a few lines of code for sending/reading the character's position and direction.
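
A sketch of what those few lines can boil down to; the packet layout (object id, position, yaw) and the names are illustrative Python, not the actual implementation:

import struct

TRANSFORM_FMT = "<I4f"   # object id, x, y, z, yaw (radians)

def pack_transform(object_id, position, yaw):
    x, y, z = position
    return struct.pack(TRANSFORM_FMT, object_id, x, y, z, yaw)

def unpack_transform(data):
    object_id, x, y, z, yaw = struct.unpack(TRANSFORM_FMT, data)
    return object_id, (x, y, z), yaw

# Controlling client: simulate locally, then just send the result every net tick.
packet = pack_transform(object_id=7, position=(12.5, 0.0, -3.25), yaw=1.57)

# Server: read it, update the ghost, and relay the same data to the other clients.
object_id, position, yaw = unpack_transform(packet)
print(object_id, position, yaw)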


As with any networking model the send/receive frequency is much lower than the framerate, so to provide smooth movement, interpolation/extrapolation is applied to replicated objects.
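
A sketch of snapshot interpolation/extrapolation for a ghost, assuming the client keeps the two latest snapshots and samples between (or slightly beyond) them; the numbers and names are illustrative:

def lerp(a, b, t):
    return tuple(x + (y - x) * t for x, y in zip(a, b))

class Ghost:
    def __init__(self):
        self.snapshots = []                    # (timestamp, position) pairs

    def on_snapshot(self, timestamp, position):
        self.snapshots = (self.snapshots + [(timestamp, position)])[-2:]

    def position_at(self, render_time):
        if len(self.snapshots) < 2:
            return self.snapshots[-1][1] if self.snapshots else (0.0, 0.0, 0.0)
        (t0, p0), (t1, p1) = self.snapshots
        t = max((render_time - t0) / (t1 - t0), 0.0)
        return lerp(p0, p1, t)                 # t > 1 extrapolates past the latest snapshot

ghost = Ghost()
ghost.on_snapshot(0.00, (0.0, 0.0, 0.0))
ghost.on_snapshot(0.10, (0.6, 0.0, 0.0))       # 10 Hz updates, character moving at 6 m/s
print(ghost.position_at(0.05))                 # halfway between snapshots: (0.3, 0.0, 0.0)
print(ghost.position_at(0.12))                 # no new snapshot yet, extrapolate: ~(0.72, 0.0, 0.0)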


Also worth noting is that I only used authoritative clients for objects directly controlled by the player, such as the player character itself or networked projectiles. All other networked objects (such as NPCs and movable crates) are simulated on the server and ghosted to the clients.


Dealing damage
Moving characters are nice and all. But as I was building an FPS I needed projectiles, hit detection, and damage models. 
In an authoritative server model everything is controlled by the server (hence the name...). What happens when a player fires a gun is similar to the moving character scenario above, i.e. the client predicts what's going to happen (fire the gun) but the server controls the outcome (fire the weapon, deal damage). This means that (especially in high latency scenarios) the client and server can get out of sync for a short period of time before the client is corrected by the server. This is often noticeable when you fire at your opponent and get an impact effect, but no damage is dealt.
To cater for latency, the concept of latency compensation is used, where the server takes the player's latency into account when doing hit detection (for instance). If latency were constant this wouldn't be a problem, but as latency is very volatile (varying by milliseconds from packet to packet) you can't be 100% accurate.
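
For reference, the core of latency compensation is that the server keeps a short position history per player and rewinds it by the shooter's latency before doing hit detection. A simplified Python sketch of that idea (my own illustration of the concept, not my model and not Valve's exact implementation):

import bisect

class PositionHistory:
    def __init__(self):
        self.times = []
        self.positions = []

    def record(self, server_time, position):
        self.times.append(server_time)
        self.positions.append(position)

    def rewind(self, server_time):
        # Return the recorded position closest to, but not after, server_time.
        i = bisect.bisect_right(self.times, server_time) - 1
        return self.positions[max(i, 0)]

history = PositionHistory()
for tick in range(10):                                   # 20 Hz server ticks
    history.record(tick * 0.05, (tick * 0.3, 0.0, 0.0))  # target running at 6 m/s

shooter_latency = 0.120                                  # measured ping, in seconds
server_time_now = 0.45
# Hit detection is done against where the target was roughly one ping ago.
print(history.rewind(server_time_now - shooter_latency))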


In my setup I let the controlling client decide over its own projectiles. When a player fires a weapon, the projectile is simulated on the client. The client also performs hit detection. So far this is similar to an authoritative server setup (except no prediction/correction is needed). The difference is that the client will also request that damage be dealt. It does so by sending a "deal damage" request to the server, which applies the damage and replicates the updated health state of the object to all clients.
The effect is that when a player fires a weapon and hits a target, damage will be dealt.
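
In sketch form the round trip looks something like this (the message names and the in-memory "server" are purely illustrative):

class Server:
    def __init__(self, clients):
        self.clients = clients
        self.health = {}                        # object id -> hit points

    def spawn(self, object_id, hp):
        self.health[object_id] = hp
        self.replicate(object_id)

    def on_deal_damage(self, target_id, amount):
        # Trust the client: apply the requested damage, then ghost the new health state.
        self.health[target_id] = max(self.health[target_id] - amount, 0)
        self.replicate(target_id)

    def replicate(self, object_id):
        for client in self.clients:
            client.on_health_update(object_id, self.health[object_id])

class Client:
    def __init__(self):
        self.known_health = {}                  # replicated health state

    def on_hit_detected(self, server, target_id, damage):
        # Hit detection already happened locally; just request the damage.
        server.on_deal_damage(target_id, damage)

    def on_health_update(self, object_id, hp):
        self.known_health[object_id] = hp

shooter, victim = Client(), Client()
server = Server([shooter, victim])
server.spawn(object_id=2, hp=100)
shooter.on_hit_detected(server, target_id=2, damage=35)
print(shooter.known_health, victim.known_health)   # both now see {2: 65}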


From a network code perspective this again resulted in super simple code, as the only things networked are a replay buffer of commands (e.g. fire, fire, reload, zoom in, fire, zoom out, reload, etc.) together with simple messages requesting damage to be dealt. The other clients just replay the buffer; no weapon simulation needed.
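
A rough sketch of such a replay buffer, with command names and effect playback made up for illustration:

FIRE, RELOAD, ZOOM_IN, ZOOM_OUT = "fire", "reload", "zoom_in", "zoom_out"

class CommandRecorder:
    # Runs on the controlling client.
    def __init__(self):
        self.buffer = []

    def record(self, tick, command):
        self.buffer.append((tick, command))

    def flush(self):
        # Grab everything recorded since the last network send.
        commands, self.buffer = self.buffer, []
        return commands

class CommandReplayer:
    # Runs on the other clients: no weapon simulation, just playback of effects.
    def replay(self, commands):
        for tick, command in commands:
            print(f"tick {tick}: play '{command}'")

recorder = CommandRecorder()
recorder.record(10, FIRE)
recorder.record(11, FIRE)
recorder.record(14, RELOAD)
recorder.record(20, ZOOM_IN)

packet = recorder.flush()              # this list is what actually gets networked
CommandReplayer().replay(packet)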


The result
To be able to measure some results I implemented a simple telemetry system that tracks players' positions when they get killed, together with their killers' positions. The data was saved to a file that the participants sent me.
I ran a number of playtests to get some input data. And to be fair, I've only tested the setup in a low-latency environment, as I wanted to see whether my network model would break immediately or not.
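
The logging can be as simple as appending a row per kill to a file. A Python sketch, assuming a CSV layout of my own invention (not necessarily the actual format used):

import csv

def log_kill(path, victim_id, victim_pos, killer_id, killer_pos):
    # One row per kill: who died where, and who killed them from where.
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([victim_id, *victim_pos, killer_id, *killer_pos])

log_kill("telemetry_client_a.csv", "player_2", (10.1, 0.0, 4.2),
         "player_1", (14.8, 0.0, 7.9))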


The tests were done in 1-on-1 fights so that it would be apparent where the opponent was during a firefight. After each playtest the big questions were: "Did you ever feel you were killed when you shouldn't have been?" and "Did you feel you hit the other player when you thought you should have?"


The consensus was that it felt good and snappy. Feelings are good, but did the telemetry data back them up?
To get some hard facts I implemented a simple viewer for my telemetry data where I can load and group each client's data to compare the results.
What it actually does is create spheres indicating the positions of victims and their killers. I then use a free camera to fly around and view the data. In future versions I will add the possibility of doing this on the actual level to get an even better view of the data.
I also wrote some simple tools to calculate the distance between the positions to find out how big the difference was between the clients.
It turned out that the biggest distance between where client A thought he was killed and where client B thought the same kill happened was 40 cm (for characters running at 6 m/s). Considering the limited amount of code written, that is pretty awesome.
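
The distance check itself is trivial; here's a rough sketch (the coordinates are made up, only the ~0.4 m worst case is real):

import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# The same kill as recorded by client A and by client B (illustrative coordinates).
victim_pos_client_a = (10.10, 0.00, 4.20)
victim_pos_client_b = (10.42, 0.00, 4.44)
print(f"{distance(victim_pos_client_a, victim_pos_client_b):.2f} m")   # 0.40 m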


Naturally the results would look different in high latency scenarios. But then again any multiplayer game will behave differently in high latency scenarios.


Conclusion
So what are the conclusions, other than that it would be so much nicer if people didn't cheat?


Did I learn something? The thing that amazed me most was the small amount of code needed to provide a full-scale multiplayer experience. Sure, it has the enormous drawback of being hacker friendly, but used in the right context it certainly has its place. Due to its simplicity it's robust, easy to maintain, and easy to extend when needed.


I was also pleased to see the simplicity of the code dealing with replicated clients (due to the absence of prediction/correction and the use of the replay buffer), as virtually no simulation was needed. An extra bonus was that this led to a very lightweight implementation on both the server side and the client side.




Further reading
If you want to read more about the authoritative networking model, Valve's documentation is great. I've included a link covering the parts touched on in this post (prediction, correction, latency compensation).
[Valve] Latency Compensating Methods in Client/Server In-game Protocol Design and Optimization