6IT

TCP Server serving Multiple Clients

[[This is Chapter 13(d) from "beta" Volume IV of the upcoming book "Development&Deployment of Multiplayer Online Games", which is currently being beta-tested. Beta-testing is intended to improve the quality of the book, and provides free e-copy of the "release" book to those who help with improving; for further details see "Book Beta Testing". All the content published during Beta Testing, is subject to change before the book is published.

To navigate through the book, you may want to use Development&Deployment of MOG: Table of Contents.]]

As we’ve (sort of) described problems and benefits resulting from using UDP, let’s see what can be done with TCP in the context of games. In most aspects, Websockets are very similar to plain TCP (see also "Websockets: TCP in Disguise" section below).

The Most Popular Bug when using TCP

Assumption is a mother of all {disasters|screw-ups|…}
— attributions vary —

Before we start with game specifics, I want to mention one bug which from my experience is by far The Most Popular Bug in TCP-related code (whether it is game code or not). Just one example – at some point in my programming career, I joined a Really Big Company; the project I joined, was three-days-before-release, and on my very first day in their team I found this very bug in their code.

The problem (or maybe misunderstanding?) which a lot of developers are having with TCP (and other stream-oriented solutions such as pipes), is that they erroneously assume each call to send() function (or equivalent) to result in exactly one call to recv() function (or equivalent). While during local-machine testing (and sometimes LAN testing) this assumption MIGHT stand (though even there it is not guaranteed in any way), this assumption falls apart pretty much the same second you’re moving to the Internet.

To avoid running into this problem, it is necessary to remember that

TCP is a byte stream, the whole byte stream and nothing but the byte stream¹

In particular, call to send() does NOT put any markers into the byte stream, so if you want to pass some messages over TCP, you need to put those markers into the stream yourself

In particular, call to send() does NOT put any markers into the byte stream, so if you want to pass some messages over TCP, you need to put those markers into the stream yourself (effectively creating your own subprotocol on top of TCP).

In one example, we may decide that we want to communicate messages over TCP, and say that our TCP stream always consists of 2-byte message-size (do NOT forget to agree whether it is going to be big-endian or little-endian), and then of the message-size bytes which represent message body.

Note that on the receiving side we will need to have two loops (each with recv() inside) to parse this format: the first kinda-loop will read message-size (yes, it is possible that recv() will get only one byte), and the second loop will read message body until message-size bytes has been read. While such parsing is certainly not a rocket science, it does require attention to details.

While we’re at it, let’s also discuss message sizes; quite a few people might think that 2-byte message-size is not enough for their messages, and expand to 4-byte ones – and these days even to 8-byte message-size. While doing it, you need to keep in mind that accepting arbitrary large message-sizes creates a potential for a DoS attack; if an attacker can force you to allocate 1000G of RAM (and it doesn’t matter much whether you’re allocating it all-at-once or in chunks) – you’re ending up in a Big Trouble. It means that:

You DO need a limit on your message sizes
- While it IS possible to deal with Really Large Messages, in practice handling them is too error-prone to do it on a regular basis. In other words - for file transfer services with only 5 different messages and inherently Large Files, handling Large Messages in an ad-hoc manner might be a viable option, but for games it Very Rarely qualifies as a good idea.
The smaller this limit is – generally the better; going above few-M-in-size is rarely a good idea (and if you can fit into 64K – it is even better)
Therefore:
- 8-byte message-size is almost-universally a Bad Idea
- 4-byte messages-sizes are ok, but usually require additional check for not exceeding a pre-defined threshold (such as few-M-in-size)
- if you can fit your messages into 2-byte message-size – it is even better
If you’re transferring fragments of max 64K each, but assembling up to 64K of those fragments on receiving side before processing – from DoS point of view it is not better than having one fragment of 4G size
BTW, when speaking about DoS via large messages: it is NOT sufficient to limit message size on the wire. If you’re transferring fragments of max 64K each, but assembling up to 64K of those fragments on receiving side before processing – from DoS point of view it is not better than having one fragment of 4G size (which is usually too risky from DoS perspective). Moral of the story: it is a size of logical message which matters to avoid DoS.

¹ well, except for OOB but normally you don’t use OOB, more on it in “On OOB” section below

TCP: Reducing Latencies

Nagle algorithm and TCP_NODELAY

When using TCP, take into account that quite a few TCP features which affect interactivity (usually - to the worse 🙁). One of them is so-called “Nagle Algorithm” (which is by default enabled on a TCP connection). Nagle algorithm (when enabled) restricts the connection to having only one single “packet in transit” - that is, unless outgoing TCP buffer has got a full packet of MSS size, where one typical MSS value is 1460. I won't argue too much whether having Nagle as a part of TCP specification is a Good Idea,² but we need to work with TCP-which-we-have.

Effects of Nagle algorithm on games is often devastating. For example, if Nagle is enabled on a TCP connection, and over 5 seconds we’re trying to send 100 updates (at 20 network ticks/second) of 50 bytes each over a connection with RTT=100ms, then in reality packets will be sent only every 100ms (as soon as the previous packet gets acknowledged; by that time the whole outstanding packet will be only around 250 bytes, which is much less than typical MSS, so most of the time it is an acknowledgement which will be a trigger for sending a new packet, and not a packet becoming full).

To disable Nagle algorithm setsockopt() function with TCP_NODELAY parameter may be used.³

² IMNSHO it is not; it is not a job of the TCP stack to introduce an additional layer of complexity to fix poorly written programs, and calling send() one byte at a time does qualify as such (this stands even with Nagle enabled(!)), see also below

³ that is, assuming that you’re using Berkeley sockets or equivalent

TCP with TCP_NODELAY: minor caveat

One thing to be remembered when using TCP stream with TCP_NODELAY flag on, is to

call send() ONLY when the whole packet is ready

With TCP_NODELAY, each call to send() causes a TCP packet to be sent; the stream is still correct, and your program will still work, but it will cause a significant (and unnecessary) overhead. For example, if you’re implementing the protocol mentioned above (2-byte-message-size + message-body-of-message-size) in the following manner:

void MsgSender::send_msg(const Message& msg) { 
  uint16_t sz = msg.sz;
  
  send(sock, &sz, 2, 0);  // (*) Send the message size first
  send(sock, msg.buf, sz, 0);  // (**) Then send the message buffer
}

- then it will work more-or-less ok without TCP_NODELAY,⁴ but with TCP_NODELAY it will cause two packets to be sent (the first one in line (*), and the second one in line (**)), causing an additional 40+ bytes of overhead (20 bytes for IP header, another 20 for TCP header, and that’s not counting Ethernet headers).

Things become even worse if you’re constructing your packet with more-than-two send() calls; for example, if you’re writing each field with a separate send(), and your message consists of six 4-byte fields, then you’ll get an overhead of 40+*5=200+ bytes for an otherwise 40+24=64-byte packet, ouch!

Bottom line:

If you’re using TCP_NODELAY, avoid multiple send() calls for the same logical message at all costs

If you Really Really cannot combine your send() calls together - use TCP_CORK (or a reportedly equivalent workaround for Windows described in StackOverflow); while these options will incur additional costs CPU-wise (due to multiple kernel-level calls), they will still save your traffic.

even if you’re NOT using TCP_NODELAY, combining calls to send() is usually a good idea

And while we’re at it: even if you’re NOT using TCP_NODELAY, combining calls to send() is usually a good idea (to avoid doing a relatively expensive kernel call more than once), though “avoiding at all costs” might be a bit of overkill in some cases, as memory copying and especially allocations can easily outweigh gains from avoiding one kernel call (kernel call is usually in the range of 300-500 CPU clocks, so one extra call is not that much). Ideally, CPU-performance-wise, to combine buffers residing in different places in memory, you should aim to use some kind of “vectored I/O” (also known as “scatter/gather”), such as sendmsg() in Linux or WSASendMsg() in Windows.

⁴ Strictly speaking, even without TCP_NODELAY it is suboptimal, as it causes an extra call to kernel, and kernel calls are expensive, but with TCP_NODELAY it becomes a very obvious problem

TCP with TCP_NODELAY: still not a match to UDP with fast-paced sync algorithm, BUT might be necessary at least for TCP fallback

Now let’s compare pushing fast-paced updates via TCP-with-TCP_NODELAY with the UDP algorithm aimed for fast-paced sync (the one from “Fast-paced Updates: Compression without built-in reliability” section above).

Actually,

if we’re using TCP_NODELAY, and there is no packet loss whatsoever, the difference between TCP and UDP is negligible.

if all packets reach the Client, there isn’t that much difference between TCP and UDP

For each TCP-with-TCP_NODELAY send() call there is a TCP packet sent, and for each UDP’s sendto() call there is a UDP packet sent, and if all packets reach the Client, there isn’t that much difference between the two approaches. Yes, TCP has a bit more overhead, but on the other hand, for TCP we can use compression-compared-to-previous-packet (and for UDP we’re bound to use compression-compared-to-last-acknowledged-packet, see “Fast-paced Updates: Compression without built-in reliability” section for discussion); in any case, the difference will be pretty much negligible in the Grand Schema of things (that is, as long as there is no packet loss(!)).

However, the very first lost packet will show the difference. With UDP we won’t care about it and will keep sending packets to the other side, so just one network tick later the situation will be corrected (that is, if there are no two packets lost in a row). With TCP, however, situation will be quite different and much more complicated.

Usually, with TCP_NODELAY enabled, our next call to send() (on the next network tick) will cause sending-side TCP stack to send our second packet to the Client. However,

on the receiving side this second packet WILL NOT be delivered to our program for quite a while

This happens because TCP is a stream and part of this stream (corresponding to the lost packet) is missing, so receiving-side TCP stack will not let application access the second packet until the first one (originally lost) arrives. Drats, drats, and double-drats! We do have all the information we need, on the receiving computer, but we cannot access it!

Sending-side TCP stack will retransmit the lost packet, but it will happen around 2*RTT time later.⁵ As typical RTT for a multi-player game is within 50-100ms, it means that for TCP with TCP_NODELAY we’re speaking about "lag spike" of about 100-200ms (for UDP state-sync at 20 network ticks/sec – it is only 50ms, and even less if network ticks are more frequent).

Therefore:

If your game is sensitive to 100ms-or-so delays – you ARE better with UDP
- However, as UDP is not that-universally available as TCP (see discussion on it in [[TODO]] section above) – I still suggest to implement TCP as a fallback.
On the other hand, if you’re ok with 1+ second delays – you MIGHT be able to get away with TCP-only (though keep reading for further related trickery below)

⁵ exact number will probably vary (and depends not only on RTT, but also on “jitter”, see RFC6298 for exact specification as it should be implemented by all the modern TCP stacks), but usually won’t differ too much from 2*RTT. To get more specific numbers, I suggest to simulate network loss (more on it in [[TODO]] section) and see behavior of the relevant TCP stacks in Wireshark yourself, as behavior can vary significantly

“Hanged” TCP connections

If you have seen that some page in your browser got stuck, you hit refresh, and bingo! – it is here in no time, chances are that you’ve just seen such “hanged” TCP connection.

One Big Problem when it comes to using TCP for games, is related to “hanged” TCP connections. If you have seen that some page in your browser got stuck, you hit refresh, and bingo! – it is here in no time, chances are that you’ve just seen such “hanged” TCP connection. Yes, quite a bit of these situations can be attributed to coincidences and human "selective memory" (we tend to remember these things better than the opposite ones), but in real world “hanged” TCP does happen (and for a player, it can be Very Annoying – especially as usually there is no “refresh” button in games).

My First Guess - Exponential Backoff

I’ve seen a LOT of these “hanged” connections in games (both as player and as developer), and tend to attribute them primarily to TCP’s “exponential backoff” algorithm (see, for example, RFC6298). Whenever sender’s TCP stack doesn’t get an ACK to the sent packet within reasonable time, it keeps retransmitting the packet, but

Doubles retransmission time after each attempt

Now let’s consider a connection which has not-too-high-by-modern-standards 5% end-to-end packet loss rate (you can be sure that such a thing will happen for quite a few of your players) – let’s also assume for the time being that packet loss is completely random.⁶ Let’s assume that RTT is 100ms, and that first retransmit happens in 200ms from the first time when the packet got sent. Now, the second retransmit will happen in 400ms after the first one, the third one – in 800ms after the second one, and so on, and the sixth one – in 6.4 seconds after the fifth one (i.e. 12.6 seconds from the original packet), which is already too much for quite a few games out there.

In practice, however, it will be several orders of magnitude more frequent because of correlations between packet losses.

With the completely random distribution, chances of getting 6 packets lost in a row are 0.05^6 ~= 1e-8, which may seem as “it is not going to happen”. However, if you’re sending 20 packets per second, it would mean that each of your players will experience “hanged” connection problem every few days, which is not that good. In practice, however, it will be several orders of magnitude more frequent because of correlations between packet losses. In particular, with modern “active queue management” algorithms becoming widespread, chances of packet loss rate going to 20+% for a few seconds are rather high (and actually, these situations are to be expected).

⁶ In practice, it is not, and these correlations will make things significantly worse

Sudden IP Change - the Curse of Mobile

On mobile devices (phones and tablets, AND on WAN-connected PCs too) there is a rather well-known issue, which arises when you're moving. In such cases, your IP address CAN be changed. What happens in this scenario, is quite bad for TCP 🙁.

Whenever your IP address changes over an existing TCP connection, packets coming from server to that TCP connection, won't come back to your device anymore. Then, your TCP connection on the Client Side will stay in a "hanged" state, until your Client sends some TCP packet, which reaches the Server, and Server (most likely) issues an RST in response. However, there are numerous guys on the way who will be willing to drop this RST (or even your original packet, as it lacks valid TCP context), so this RST may never reach your device, leaving your TCP "perfectly hanged forever" 🙁.

OS features which switch providers automagically (such as WiFi Assist), while generally good for end-user experience, tend to cause more frequent IP switches, and exasperate this problem (that is, unless you're actively fighting it according to this very book 😉).

Other Possibilities

I’ve heard quite a few alternative plausible theories explaining “hanged” TCP connections⁷ ranging from PMTUD (mis)-implementations causing persistent packet loss in case of route changes, to different handling of SYN packets by routers and especially firewalls.

However, it doesn’t really matter what exactly causes those “hanged” connections.⁸ Whatever the reason for them, the only thing which really matters from our purely pragmatic perspective is a very practical observation that

If your TCP connection “hangs” for several seconds, there is a 30% to 70% chance that a new connection will be ready to transmit data before the “hanged” connection goes back to life (YMMV, batteries not included)

⁷ note that under “hanged” TCP connection we’re considering ONLY situations when both ends of the connection are perfectly fine, AND routes in both directions are generally fine (though they MAY have statistical drops along the road).

⁸ Most likely, there are several contributing scenarios (which also interplay with each other). After I’ve seen a TCP connection which got “hanged” each time when sending a very specific data pattern to a specific IP address, because the packet with this pattern was consistently dropped by a faulty next-to-backbone router (this problem has lasted for 2 weeks until the card in the router has failed completely and was replaced) – I can easily believe in pretty much everything.

Dealing with “Hanged” connections – Opportunistic Re-Establish

If during this process the original connection springs back to life – drop the second one and resume working over the first one

Given the observation above, the most obvious way to deal with “hanged” TCP (that is, besides “let’s drop TCP completely” 😉) goes along the following lines:

Detect that connection got “hanged” (how to do it is a separate story described below)
Try to establish a second TCP connection
- If during this process the original connection springs back to life – drop the second one and resume working over the first one
As soon as second one is ready to be used (this usually includes TLS session if applicable) – drop the first connection and switch to the second one.

This schema has been seen to work reasonably well for a not-so-fast major game (with acceptable delays around 5 seconds). Applying it to faster games, however, faces significant problems, mostly due to “hanged” detection taking too much time; we’ll discuss alternatives for fast sim-based games later, in "Dealing with “Hanged” Connections – Dual TCP" section.

One important property of the algorithm above is that we don’t really take any risks latency-wise – if the original connection goes back to life while we’re establishing the second one, we just go back to the original connection without losing anything latency-wise.

Detecting “Hanged” connection – app-level Keep-Alives

every second or so (if there was no other traffic), transport layer of our Server will send a special “Keep-Alive” message over TCP

One way to detect those “hanged” connections is application-level Keep-Alives. For example, every second or so (if there was no other traffic), transport layer of our Server will send a special “Keep-Alive” message over TCP, and if the Client didn’t see any of the messages (Keep-Alive or not) for 5 seconds on their side – the process of establishing a new connection (described above) is started. On the Server side, if there is no activity for 15 seconds, Server can simply drop the connection to release resources (leaving it to the Client to re-establish connection).

This thing was seen to work reasonably well, and IMHO at least once it has made a significant contribution to the player perception of “these guys have better connectivity then competition”.

Detecting “Hanged” connection – TCP-level Keep-Alives

I am a scientist. I don't take any risks!
— Scientist from 'Garfield and Friends' —

An alternative to app-level Keep-Alives is to use TCP-level Keep-Alives. TCP does have it’s own mechanism for Keep-Alives, but until recent years, there was no API to control times for TCP Keep-Alives, and with default being 2 hours, it was rather useless for games. However, with a relatively recent addition of TCP_KEEPINTVL/TCP_KEEPCNT/TCP_KEEPIDLE for Linux (and SIO_KEEPALIVE_VALS for Windows) it became possible to control keep-alive times via a simple setsockopt() call.

That being said, I still prefer application-level Keep-Alives to TCP-level ones. There are two reasons for it (that's besides me being a DIY guy in general 😉):

Controlling Keep-Alive times is not a standard feature, and it is often simpler to implement app-level keep-alive then to check that it is available for all the Client platforms
when using TCP-level Keep-Alives, we're bound to take a risk that connection was about to restore connectivity just at the moment when it was broken by TCP Keep-Alive
More importantly, when TCP-level Keep-Alives detect that connection is broken, they first drop the connection, and only then notify you about the failure. It precludes us from playing the “no-risk opportunistic game” described above, when we're trying to establish new connection while keeping the original one intact in hope that it may “unhang” while we’re playing around with the second one. In other words, when using TCP-level Keep-Alives, we're bound to take a risk that connection was about to restore connectivity just at the moment when it was broken by TCP Keep-Alive; it doesn't happen when application-level Keep-Alives are handled in the manner described above.

Dealing with “Hanged” Connections – Dual TCP

Of course, if your game sends packets every 50ms no matter what, you don’t need a special Keep-Alive; on the other hand, if you’re doing it – chances are that detect-then-reestablish way of handling that “hanged” TCP connection will take too much time for your game ☹️.

In this case, Client keeps two TCP connections to the same Server, and Server sends each message to both TCP connections pretty much simultaneously (even better - shifting one message by half of your network tick or so).

For such cases, dual TCP connections can be used. In this case, Client keeps two TCP connections to the same Server, and Server sends each message to both TCP connections pretty much simultaneously (even better - shifting one message by half of your network tick or so). On the Client side, whichever message arrives first – it gets processed (and the subsequent duplicate silently ignored). If one of connections gets “behind” the other one too much – it is considered a “hanged” connection and is dropped-and-re-established.

For the data flying in the opposite direction (from Client to Server) it works pretty much the same (though Servers don’t re-establish connections themselves, they drop the offending one, and also MAY signal Client to re-establish “hanged” connection on remaining connection).

Honestly, I didn’t use this schema myself, BUT I’ve heard rather good things about it (and I think it has Really Good potential). Feel free to try it, but don’t hit me too hard if it doesn’t work 😉 .

As a side bonus, different parts of such Dual TCP can be bound to different interfaces on the Client side, to achieve redundancy over two connections (and get packets delivered with a minimally possible jitter too).

The only negative downsides of Dual TCP are related to a bit more resources needed on the Server-Side (those TCP connections tend to eat RAM, but fortunately RAM of the order of 32K per player is not-that-precious these days), and more importantly - twice more traffic coming from the Server (and as you’re normally paying for outgoing traffic – it can be rather important ☹️). On the other hand, it can be seen as a monetization opportunity too (i.e. Dual TCP – or Dual UDP for that matter - can become a paid option, or VIP option, or whatever-else-your-marketing-guys-decide; while I didn’t see it myself implemented in games, I know quite a few players in different genres who would readily pay for such a feature). Alternatively, you may want to enable Dual TCP only for Important Tournaments.

TCP Buffers and Priorities

Another Pretty Bad Thing happens if you’re trying to send both large-but-slow data (like a new theme) and small-and-fast data (like those coordinate updates coming 20 times per second), simultaneously.

First of all, if you’re sending a 10Mbyte-file as a single TCP send(), it will often be accepted (even in non-blocking mode). However, with “stream and only stream” TCP ideology, it means that by sending such a large chunk of data, you have just occupied your Client’s incoming channel for quite a while (even for 100Mbit/s client-side connection it would take around a second – even more if something goes wrong so TCP congestion control kicks in). If you try to send something (like coordinate update) during this time, your send() call is likely to be blocked (or return EWOULDBLOCK or equivalent for non-blocking sockets), and you won’t be able to send your urgent data within this second-or-so, ouch.

this problem can be mitigated by splitting your file in smaller chunks (say, 4K each)

To a certain (and IMHO quite large) extent, this problem can be mitigated by splitting your file in smaller chunks (say, 4K each); these chunks would sit in some kind of priority queue on the sender’s side (before send() is called), and would be pushed to the TCP stack (via calling send()) only when socket indicates that it is ready to accept more data for writing (for example, such a socket MAY be observed via select() writefds or equivalent function⁹). Then, if new more-urgent data (such as coordinate update) comes in, it goes to the same priority queue, but with higher priority, so that when socket becomes available for writing the next time, it is more-urgent data which will be fed to send().

However, it is not the end of the story (yet). One further thing here is that all the data which is fed to send() function, doesn’t necessarily result in the packet being immediately sent ☹️ (yes, it applies even if TCP_NODELAY is set). Specifically, in the scenario when you’re in the middle of sending a large file in 4K chunks, when the socket is reported to be available for writing, it only means that there is some space in sending buffer. As modern TCP sending buffers are typically in the range of 8K-32K, it means that you’re likely to wait for additional 8-32K to be sent as packets before your higher-priority update gets a chance to kick in.

In other words, if you’re sending high-priority data together with low-priority one over the same TCP channel, and splitting your data into maximum-N-Kilobytes-chunks, and your TCP buffers are M kilobytes in size, then you can easily experience an additional delay of up to (N+M)*client-side-channel-bandwidth. In practice, if N=4K and M=32K, and your player has a 10Mbit/s connection (which is not used for some heavy download at the same time), we’re looking at additional delays of about 30ms – which is not exactly fatal, but does add to not-so-perfect player experience.

if you reduce the TCP-level buffer to, say, 4K (using setsockopt()’s SO_SNDBUF) – then in the example above you’ll be able to reduce additional latency from 30ms to around 6ms

On the positive side, it is possible to reduce the impact of sender-side buffers: if you reduce the TCP-level buffer to, say, 4K (using setsockopt()’s SO_SNDBUF) – then in the example above you’ll be able to reduce additional latency from 30ms to around 6ms.¹⁰

⁹ we’re not at the stage of discussing select-vs-poll-vs-whatever-else yet [[TODO: where?]]

¹⁰ this reduction in latency comes at the cost of somewhat reduced throughput for those large chunks of data, but for game traffic latencies are usually MUCH more important

On Multiple TCP Connections for Different Priority Data

NB: this is different from "Dual TCP" described above\

It might seem that with all the problems with different priority messages going over the same TCP connection, it is obviously better to use different TCP connections for different-priority data. While sometimes it MIGHT be the case, in practice it is not always that obvious.

The main argument against using multiple TCP connections for different priority-data is related to two observations:

when you have two IP connections going over the same route, there WILL be interaction between IP packets

and

all interactions between separate TCP connections are pretty much out of your control

For example, if you’ll be transferring that large 10M file over one TCP connection, and try to send your high-priority data over another TCP connection, these two TCP connections may still interact (!). In particular, if your file is large enough to saturate Client’s downlink, then your high-priority packets will compete with this file transfer over Client’s downlink (at the ISP-side router facing Client’s “last mile” deciding which of the packets to transfer and which ones to drop).

With multiple TCP connections, you won’t have any control over this choice-made-by-router, which in turn often leads to rather poor player experiences. Using the same TCP connection, you at least can test behaviour of your system within your lab, and to figure out how it will behave in the field. In practice, I’ve seen single-TCP-connection-with-priority-queues working very well for a not-so-time-critical-game (with acceptable latency of 1+ second).

Bottom line about Single-vs-Multiple-TCP-connections for data with different priority

Let’s summarize my personal feelings about single and multiple TCP connections for the purposes of transferring data with different priority (YMMV, batteries not included):

For not-so-fast games (those with acceptable latencies above 1 second or so), single TCP link with priorities tends to work pretty well
For fast time-critical games, it is better to avoid sending different-priority data at the same time completely
For fast time-critical games, it is better to avoid sending different-priority data at the same time completely (and BTW, it is better to use TCP as a backup only)
- If this is not possible – I suggest to try both single-with-priority and multiple-TCP-connections-for-different-priorities over real-world links (BTW, simulation won’t give you enough information for multiple TCP connections, as algorithms which routers use to choose a packet to drop, vary significantly and are poorly documented)

On OOB

Remember I’ve told above that TCP “is nothing but byte stream”? Well, I've lied (that is, if you didn't read a footnote). There is one thing within TCP which goes against this principle, and it is known as OOB (Out Of Band data).

Unfortunately, use of OOB (indicated by MSG_OOB flag when calling send()) is extremely limited. In particular, only one measly byte can be sent via OOB at a time Stevens: if you specify MSG_OOB flag for a block-larger-than-1-byte, only the last byte will be considered out-of-band data.

Up to this point, I didn’t see any use for OOB in games (and know of only one instance of it being used in any app-level protocol at all). However, keep it in mind – it is quite an interesting capability which MIGHT come handy some day.

Detecting Link Saturation. Conflation

One good thing about the model described above (the one with a priority queue storing incoming data) is that it can be generalized to deal with scenarios where there is only one priority, but your Client’s link still gets saturated one way or another (reasons for saturation can be numerous – from you sending too much to a player’s roommate starting a Huge Download).

if you’re about to push another coordinate update into the pre-TCP queue, BUT the queue already has the coordinate update – you MAY (and SHOULD) drop previous coordinate update from the queue and replace it with a new one

This feature may work the following way: if you’re about to push another coordinate update into the pre-TCP queue, BUT the queue already has the coordinate update – you MAY (and SHOULD) drop previous coordinate update from the queue and replace it with a new one (you still MAY use the-last-coordinate-message-already-pushed-to-send()-function as a baseline for compression). This kinda-merging of updates is known as “conflation” and first time I've heard about it in the context of TCP updates, was from Alessandro Alinone from Lightstreamer (though Lightstreamer implements it in a somewhat different manner).

In a sense, such “conflation” acts as an additional way of flow control, which allows to avoid congesting the client link with the data which don’t really fit there (at the cost of dropping some of the updates, but this is still MUCH better than falling further and further behind).

Terminating connection and SO_LINGER

One setsockopt() option which is often beneficial when using TCP in game context, is SO_LINGER with l_onoff = true, and l_linger = 0.

Normally, when you close TCP socket, it still delivers everything-you’ve-already-called-send()-for to the other side of connection. While such behavior is exactly what’s necessary for serving file transfers and web requests, it is not necessarily good for interactive games. In short – using SO_LINGER with options described abovewill cause TCP stack to terminate connection immediately rather than wait for graceful termination.

In practice, using such non-graceful connection termination for routine disconnects has both positive and negative aspects. On the positive side, it allows to avoid TCP connections in FIN_WAIT2 state from eating too much of server resources (and it MAY become a Big Problem).

On the negative side, non-graceful termination causes a different TCP packet (RST instead of normal FIN) to be used for connection termination; while this is perfectly legal per all the RFCs, some monitoring tools (and quite a few admins) will make a lot of noise about it. On the other hand, except for misreports by some tools, I didn’t see any real negative side for such a non-graceful termination; in particular, I know of a X00’000-simultaneous-player game which has been doing exactly this for 10+ years without any problems.To summarize my opinion on SO_LINGER with l_onoff = true, and l_linger = 0:

There isn’t that much difference, and usually it is not that big deal to change later, so it is not that important to decide on it right away
However, if you ever run into “too many FIN_WAIT2 connections eating too much of your server resources” – it MAY save your bacon
As for the noise made about too many RST packets by tools and admins – don’t worry much about it (see reasoning above).

TCP and Compression

One thing which is obviously simpler with TCP than with UDP, is compression.

One thing which is obviously simpler with TCP than with UDP, is compression. Basically (unlike with UDP), TCP allows for your usual streaming compression. However (as with pretty much anything else), games have their own specifics even for TCP-based compression.

In short – in game environments, your typical messages are small, and quite a few algorithms out there, while technically working correctly under such conditions, become suboptimal.

One example of such an algorithm which works well for file transfers but starts to behave less-than-optimal for games, is well-known deflate. For deflate, the cost of preparing the block of data for sending (known as flush) is quite large, which causes significant losses in terms of traffic SI98. One alternative to deflate, based on the same LZ77+Huffman ideas but optimized for small updates, is LZHL algorithm SI98.

Another thing to keep in mind with regards to TCP-based compression is that for potentially-conflated fast coordinate-like updates (see discussion in [[TODO!!]] section above), you will actually need two different compression algorithms. The first one is a coordinate-based stuff (using delta compression and dead reckoning), and this one will also participate in conflation. The second one is your usual TCP-layer compression (such as LZHL).

TCP Checksums and Encryption

TCP checksums (just as UDP checksums) are 16-bit and are not exactly reliable as a result.

TCP checksums (just as UDP checksums) are 16-bit and are not exactly reliable as a result. In other words, if you recv() something over TCP, it is not necessarily exactly the same as sender has provided to send() function.¹¹ In general, if using unencrypted TCP, you would need to add your own larger-then-16-bit checksum (at the very least something along the lines of CRC-32 or Fletcher32).

On the other hand, in game environments most of the time there are Big Fat Reasons to keep your traffic encrypted¹² (see "Why Encrypt??" subsection above for discussion of these reasons). When using any reasonably good encryption, which aims to protect traffic against malicious changes, any accidental changes along the road can be can considered as it is not going to happen, ever category. Note, however, that encryption does NOT eliminate the need for data sanitizing at least on the Server-Side (as malicious Client can easily push the maliciously malformed data to one of the ends of a perfectly encrypted channel).

¹¹ This effect can sometimes be observed on not-so-perfect connections with multi-gigabyte file transfers; the larger your file is – the more chances it gets corrupted

¹² also integrity-checked and authenticated

Encryption for TCP

When speaking about encryption protocols and libraries for TCP, we’ll see that there are MUCH more options available in this department. These include:

TLS. This is default option for security over TCP and it tends to work pretty well. It also have advantages of being more firewall-friendly (see also section "Magic port 443" below).
- TLS libraries¹³ include: OpenSSL, LibreSSL, GnuTLS, S2N, mbed TLS, Botan, and probably some others which I forgot about (though for others make sure to double-check for their license – if it is “only GPL or commercial”, usually it won’t be good for your game unless you want to pay).
SSH. SSH protocol is widely used for secure shells around the world. It CAN be used for applications, but in practice app-level SSH is rather rare.
- There is at least one decent and currently-supported library libssh,¹⁴ and you MAY implement your own messages over SSH. However, using it for application-level is quite uncommon, so you're likely to have more problems with SSH than with TLS. As a result, I suggest to use SSH only if you have strong feelings against TLS security (and honestly, you shouldn't - at least not in most of traditional game environment where cost of in-transit compromise is usually fairly low compared to attack costs).
CurveCP. CurveCP protocol is a very interesting development, and would be very interesting for games; in particular, CurveCP is slim and fast (and the fact that it isn't as mature as TLS or SSH has little bearing on most of the games out there).
- However, as of beginning of 2016, there seems to be no supported implementations of CurveCP (CurveCP was separated from supported libsodium into libchloride, and the latter looks pretty much abandoned now), and it is a pity, as implementing a secure protocol yourself is the last thing you want to do when developing your game.

Overall, a brief summary of my take for TCP encryption:

if your game is not not-so-security-critical - use any of the implementations above, it won't matter too much
if your game IS security-crticial (stock exchange or so) - consider double-encryption
if your game IS security-crticial (stock exchange or so) - consider double-encryption (along the lines described in "Common Encryption-Related Notes" section above)
DO make sure to read "Common Encryption-Related Notes" section above and to follow all the advice there. It still applies.

¹³ excluding those which will force you either to publish your game sources, or to get commercial licenses

¹⁴ I am not mentioning libssh2, as last time I’ve checked, it was client-only

On writing ‘better TCP’ (on top of UDP or via RAW sockets)

One thing which pops up in discussions about “TCP better suitable for games” on a regular basis, is writing “better TCP” (either on top of UDP, or on top of RAW sockets).

While it IS possible, I would rather stay away from doing it. TCP stack is quite a complicated beast, and re-implementing it is not easy (even very basic RFC793 is 80 pages long, and in practice you’ll need to implement much more than just this one RFC).

As a result, for protocols-over-UDP I would certainly prefer a ready-to-use library such as libquic.

On the other hand, using hacked-TCP over RAW sockets MIGHT have some merit because of being more firewall-friendly; however, I would still prefer if somebody implemented (and tested in the wild!) such a library for me instead of doing it myself. One of the things I would look for in such hacked-for-gaming-TCP, would be working without exponential backoff (but for a very limited time, to avoid congesting the Internet beyond what-we-really-need, see discussion about “time-critical mode” and “connection-seeking mode” in “Retransmission Policies” section above).

However, even if only 'hacking' the Server-Side of your TCP connections, it MIGHT provide some useful results

One further note about hacked-TCP over RAW sockets: you certainly SHOULD NOT count on RAW sockets being available "everywhere" (and even less on them allowing you to create packets-pretending-to-be-TCP). In practice, this means that your "hacked" version will most likely be limited to the Server-Side (most likely running Linux).¹⁵ However, even if only "hacking" the Server-Side of your TCP connections, it MIGHT provide some useful results (such as limitations on exponential backoff when sending data, and/or allowing to get out-of-order Client-Side data before retransmission-of-missing-packet happens).

¹⁵ as, for example, Windows do NOT allow to use RAW sockets for TCP packets)

Dealing with Firewalls

It is usually a not-so-bad idea to allow your players to play from behind the firewalls. In modern world, you never know where your player will be located next time she plays (and with mobile/hotel/public-WiFi networks, running into a firewall is rather likely).

Magic Port 443

to allow playing from behind firewalls, one very simple thing tends to help significantly: it is mere listening on TCP port 443

Surprisingly, to allow playing from behind firewalls, one very simple thing tends to help significantly: it is mere listening on TCP port 443¹⁶ 🙂 . Especially if you’re using standard TLS, ISP cannot possibly tell what’s going on inside encrypted portion of TLS, and port 443 is a standard port for HTTPS (which is usually enabled even in hotels), so – the traffic goes through (as a rule of thumb¹⁷).

¹⁶ And Client attempting to connect there, of course

¹⁷ it IS still possible to disable games over the firewalls – especially in high security work environments – but for hotels and other semi-public networks I didn’t see it happening (nor they have Good Reasons to do it either), and I certainly don’t have any intentions to mess with Really High-Security work environments (read: DoD etc.)

Web sockets

One other thing you may want to try to be more firewall-friendly, is to use Websockets (WSS variation a.k.a. WebSockets Secure) as one of fallbacks; I didn't do it myself, but there were reports that some of the firewalls are less annoying timeout-wise when you explicitly declare that you're using Websockets (while some other firewalls were reported to disallow Websockets completely). On the other hand, I do NOT expect the difference with plain-TCP-over-port-443 to be anywhere significant.

Note though that quite a few of HTTP proxies tend to have problems with non-HTTPS (WS opposed to WSS) variation of Websockets; as a result, using non-HTTPS Websockets is NOT recommended.

Be ready to Frequent Reconnects

Some of the more annoying firewalls out there happen to limit time of your HTTPS session to single-digit minutes. While it is not too big of a problem per se, you need to keep it in mind (and quite often methods described in "Dealing with Hanged Connections” section above, seems to help too).

Websockets: TCP in Disguise

For most of intents and purposes, you can think of a Websocket connection as of a TCP one (and for WSS connections - as of 'TLS-over-TCP').

In a nutshell, Websocket connection starts over TCP as your usual HTTP/HTTPS conversation, and then switches from HTTP Request/Response into good ol' TCP-as-bidirectional-stream. For most of intents and purposes, you can think of a Websocket connection as of a TCP one (and for WSS connections - as of "TLS-over-TCP"). Most of the differences between Websockets and TCP are related to their support in browsers, but other than that the differences are usually pretty much negligible (especially if you're using TLS-over-port-443).Bottom line:

if your game is browser-based - use Websockets over HTTPS (i.e. WSS)
- on the Server-Side - make sure to use those TCP options described above (as long as they're relevant to your game)
- on the Client-Side, you MAY have quite a bit of problems, which have their roots in the TCP stuff discussed above, but which MAY be more difficult to resolve within JS and without direct access to options such as TCP_NODELAY. See, for example, StackOverflow.Websockets.
other than that - feel free to experiment, but most likely, you won't see much difference
- note that websockets normally have a bit longer initial connection times (of the order of one extra RTT)

[[TODO: passing embedded Web browser over Client channel]]

[[To Be Continued...

This concludes beta Chapter 13(d) from the upcoming book "Development and Deployment of Multiplayer Online Games (from social games to MMOFPS, with social games in between)". Stay tuned for beta Chapter 13(e), describing different APIs for handling sockets on different platforms, as well as ways to test network-related stuff.]]

References

[RFC6298] "https://tools.ietf.org/html/rfc6298"

[Stevens] W. Richard Stevens, Bill Fenner, Andrew M. Rudoff, "The Sockets Networking API (3rd Edition)"

[StackOverflow] "Is there an equivalent to TCP_CORK in Winsock?"

[Lightstreamer] "LightStreamer Server"

[SI98] Sergey Ignatchenko, "An Algorithm for Online Data Compression", C Users Journal

[QUIC] Jim Roskind, "Quick UDP Internet Connections. Multiplexed Stream Transport over UDP"

[libquic] "https://github.com/devsisters/libquic"

[libssh] "https://www.libssh.org/"

[libsodium] "https://download.libsodium.org/doc/"

[libchloride] "https://github.com/jedisct1/libchloride"

[RFC793] "https://tools.ietf.org/html/rfc793"

[StackOverflow.Websockets] "WebSocket TCP packets clumping together?"

6IT Hare

TCP and Websockets for Games