Why do we need a 3-way handshake? Why not just 2-way?

104

69

The TCP 3-way handshake works like this:

Client ------SYN-----> Server
Client <---ACK/SYN---- Server
Client ------ACK-----> Server

Why not just this?

Client ------SYN-----> Server
Client <-----ACK------ Server

smwikipedia

Posted 2015-11-04T14:36:05.997

Reputation: 621

23Why do we even need a handshake? Why can't the message be sent with the first packet? – Mehrdad – 2015-11-04T20:50:54.917

4If you want to skip the handshake you could use UDP instead. – OzNetNerd – 2015-11-04T22:26:14.983

5

@Mehrdad, if you have a question of your own, please use the Ask Question link at the top of the page to post your own.

– YLearn – 2015-11-04T23:12:12.590

38@YLearn: Sorry, it's not really a question of my own, but rather it was to motivate readers to give answers that dig a little deeper than what is literally stated in the question. – Mehrdad – 2015-11-05T00:03:37.847

2Don't forget about TCP Fast Open (RFC 7413) – Alnitak – 2015-11-06T23:25:34.813

QUIC https://en.m.wikipedia.org/wiki/QUIC

– kmiklas – 2018-03-05T23:16:50.757

Although the 3-way handshake might be a waste, I think that it does not matter if the client starts to send data. Think of a simple HTTP request: the client can start to send payload immediately after the ACK packet without waiting for yet another server response. The bloated HTTP headers might be a much larger waste. – Mike76 – 2018-10-03T22:05:17.807

Answers

140

Break down the handshake into what it is really doing.

In TCP, the two parties keep track of what they have sent by using a Sequence number. Effectively it ends up being a running byte count of everything that was sent. The receiving party can use the opposite speaker's sequence number to acknowledge what it has received.

But the sequence number doesn't start at 0. It starts at the ISN (Initial Sequence Number), which is a randomly chosen value. And since TCP is a bi-directional communication, both parties can "speak", and therefore both must randomly generate an ISN as their starting Sequence Number. Which in turn means, both parties need to notify the other party of their starting ISN.
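As a rough illustration of the "running byte count that starts at a random ISN" idea, here is a minimal Python sketch. The helper names (`pick_isn`, `next_seq`, `bytes_acked`) are made up for this example, and real TCP stacks derive their ISNs differently; the only facts relied on are that sequence numbers are 32 bits wide, wrap around, and that the SYN itself consumes one sequence number.

```python
import secrets

SEQ_MOD = 2**32  # TCP sequence numbers are 32-bit and wrap around


def pick_isn() -> int:
    """Pick a random 32-bit initial sequence number (illustrative only)."""
    return secrets.randbelow(SEQ_MOD)


def next_seq(seq: int, length: int) -> int:
    """Advance a sequence number by the amount of sequence space used."""
    return (seq + length) % SEQ_MOD


def bytes_acked(isn: int, ack: int) -> int:
    """Payload bytes the peer has confirmed, given our ISN and its ACK number.

    The SYN occupies one sequence number, so data starts at isn + 1.
    """
    return (ack - (isn + 1)) % SEQ_MOD


isn = pick_isn()
seq = next_seq(isn, 1)        # the SYN occupies one sequence number
seq = next_seq(seq, 100)      # send 100 bytes
seq = next_seq(seq, 250)      # send 250 more bytes
print(bytes_acked(isn, seq))  # 350 once the peer acknowledges everything
```

The modulo arithmetic is what lets the byte count keep working when a randomly chosen ISN sits close to the top of the 32-bit range.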

So you end up with this sequence of events for a start of a TCP conversation between Alice and Bob:

Alice ---> Bob    SYNchronize with my Initial Sequence Number of X
Alice <--- Bob    I received your syn, I ACKnowledge that I am ready for [X+1]
Alice <--- Bob    SYNchronize with my Initial Sequence Number of Y
Alice ---> Bob    I received your syn, I ACKnowledge that I am ready for [Y+1]

Notice, four events are occurring:

  1. Alice picks an ISN and SYNchronizes it with Bob.
  2. Bob ACKnowledges the ISN.
  3. Bob picks an ISN and SYNchronizes it with Alice.
  4. Alice ACKnowledges the ISN.

In actuality though, the middle two events (#2 and #3) happen in the same packet. What makes a packet a SYN or ACK is simply a binary flag turned on or off inside each TCP header, so there is nothing preventing both of these flags from being enabled on the same packet. So the three-way handshake ends up being:

Bob <--- Alice         SYN
Bob ---> Alice     SYN ACK 
Bob <--- Alice     ACK     

Notice the two instances of "SYN" and "ACK", one of each, in both directions.
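To make the flag and number bookkeeping concrete, here is a small Python sketch of the three segments. The `Segment` class and `three_way_handshake` function are hypothetical names for this example; the flag bit values are those of the TCP header, and 32-bit wraparound is ignored for brevity.

```python
from dataclasses import dataclass

SYN = 0x02  # flag bit values as laid out in the TCP header
ACK = 0x10


@dataclass
class Segment:
    flags: int
    seq: int
    ack: int = 0  # only meaningful when the ACK flag is set

    def describe(self) -> str:
        names = [n for n, bit in (("SYN", SYN), ("ACK", ACK)) if self.flags & bit]
        return f"{'+'.join(names)} seq={self.seq} ack={self.ack}"


def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged when Alice (ISN x) opens to Bob (ISN y)."""
    syn = Segment(flags=SYN, seq=x)                       # event 1
    syn_ack = Segment(flags=SYN | ACK, seq=y, ack=x + 1)  # events 2 and 3 combined
    ack = Segment(flags=ACK, seq=x + 1, ack=y + 1)        # event 4
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=1000, y=5000):
    print(segment.describe())
```

With the example ISNs of 1000 and 5000, the middle segment carries both flags, Bob's own sequence number, and the acknowledgment of Alice's ISN plus one.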


So to come back to your question, why not just use a two-way handshake? The short answer is that a two-way handshake would only allow one party to establish an ISN, and the other party to acknowledge it. Which means only one party can send data reliably.

But TCP is a bi-directional communication protocol, which means either end ought to be able to send data reliably. Both parties need to establish an ISN, and both parties need to acknowledge the other's ISN.

So in effect, what you have is exactly your description of the two-way handshake, but in each direction. Hence, four events occurring. And again, the middle two flags happen in the same packet. As such three packets are involved in a full TCP connection initiation process.

Eddie

Posted 2015-11-04T14:36:05.997

Reputation: 8 911

4Why do we need ISNs at all? Humans don't need it, why do computers? Is there a proof of this, or do we just have them because they're convenient? – Mehrdad – 2015-11-04T20:52:50.080

16

@Mehrdad: You need sequence numbers for retransmissions to work properly (or indeed at all). The ISN can't just be zero because of sequence prediction attacks.

– Kevin – 2015-11-04T21:04:41.263

1@Mehrdad Two humans talking have non-verbal and visual cues to indicate reception of a message. Nodding, or affirming, or even sometimes repeating what the speaker said. Whereas two computers talking are like two humans with no functioning senses except for hearing. Think of the theory of it, how would you devise a set of rules which both speakers and receiver could follow, that would guarantee the reception of each message, in the right order, and re-transmit as necessary when something didn't get through. You'll find you end up at a solution that will work just like Sequence numbers. – Eddie – 2015-11-04T21:37:45.857

1@Eddie: But: "that would guarantee the reception of each message"... that's not something you can ever guarantee with an algorithm either, so that's a moot point. Furthermore, are you telling me that you can't communicate purely verbally with those around you without a SYN-ACK-SYNACK sequence? Like, when someone asks you a question from a nearby cubicle, you don't understand what they're saying unless they say "hello" and you say "hi" back first? – Mehrdad – 2015-11-04T21:40:58.833

@Mehrdad Maybe guarantee the reception isn't the right phrase, but guarantee the awareness that the message has either been received, or not received, and react appropriately. Either way, you're asking great questions, but the comments might not be the ideal way to talk through them. Shall we continue the discussion in chat?

– Eddie – 2015-11-04T22:07:04.703

1@Eddie: I'm too busy for real-time chat unfortunately. I intended these to be somewhat rhetorical questions -- I wasn't so much expecting a response to myself, but rather trying to point out that the question deserves a deeper answer than just "because X needs it", since it's not fulfilling if the natural question of "why do we need X?" is left unanswered. – Mehrdad – 2015-11-04T22:13:00.667

4@Mehrdad The chat room doesn't necessarily have to be 'real time', we can leave messages for each other. The reason I thought to direct it elsewhere is because you are now asking a different question. The OP asked "why is it a 3 way handshake instead of 2", but now you are questioning "why do we need Sequence numbers at all", which is different. Rather than derail this thread, I thought we should discuss the other question in chat. Alternatively, you can post a new question, I'm sure it will net some good answers. – Eddie – 2015-11-04T22:19:00.773

@Eddie I just realized that besides the need for sync sequence numbers of each party, there's some logical/psychological reason as well. Only after the active sender got the acknowledgement from the target receiver, can the sender be sure that the message he sent is actually well received by the receiver. So both parties of the communication need to be the active sender for at least one time, respectively. – smwikipedia – 2015-11-05T05:09:57.280

1@smwikipedia Yup, exactly. That is why the ACKnowledgment step is to acknowledge with [X+1], or [Y+1]. It proves X and Y were received, confirming to the active sender (as you put it) that full 2-way communication is working. – Eddie – 2015-11-05T07:17:21.850

4Great, concise answer. Reading "ACK SYN" feels fundamentally wrong but you even explained that so +1. – Lilienthal – 2015-11-05T12:52:30.397

@Mehrdad sequence numbers are essential to any stream of data that could arrive out of order, or be re-ordered for efficiency. IP packets are not guaranteed to arrive in-order. SAS/SATA requests are also reordered by drives/controllers and have command numbers for a similar reason. – mikebabcock – 2015-11-05T14:28:44.307

2

According to RFC 793, Transmission Control Protocol: "The principle reason for the three-way handshake is to prevent old duplicate connection initiations from causing confusion."

– Ron Maupin – 2015-11-05T18:05:20.357

@Mehrdad - people do need sequence numbers (or at least synchronization). Have you never been in the middle of a conversation and someone comes in the middle and says something like "Wait, so he did that thing before she did this other thing or was it the other way around?" Most of the time ordering is obvious from context (and because out of order transmission is rare in spoken conversation), but sometimes resynchronization is necessary. – Johnny – 2015-11-05T23:09:02.013

@Johnny: My point was that it's not necessary to initiate a conversation in many cases. My point was not that there exists no situation in which it's necessary. – Mehrdad – 2015-11-05T23:45:53.487

@Mehrdad I still think you should ask your questions as their own questions ;), it would bring about some fantastic responses. Even though I know you were only asking to spur depth in answering this question (which honestly, I don't think we quite answered to that level of depth). The attention your questions have gotten definitely deserve their own thread ;). – Eddie – 2015-11-06T00:02:21.733

1

@Eddie: Did you consider switching the position of Alice and Bob so that SYN ACK will be in the right order? e.g.: http://paste.ubuntu.com/13124370/

– Paul – 2015-11-06T14:24:49.370

2@Mehrdad Saw this in Hot Questions and wanted to point out: We do have such a feature. In linguistics, one of the purposes of common greetings ("Hi", "Hello", etc) is to acclimate the other person to your voice. – Izkata – 2015-11-06T15:00:48.500

@Paul Brilliant! Took your suggestion. Great call! – Eddie – 2015-11-06T17:20:23.303

1This is not a very good answer. One could easily imagine another reliable-delivery transport protocol wherein both parties start on a common ISN - for example, zero. The OP was asking fundamentally why a three way handshake is desirable, not requesting a rote regurgitation of the particulars of TCP. – Brian Gordon – 2015-11-07T02:46:37.023

1@Mehrdad Yes, you can have network communication that works like you think. It's called UDP and it's great for things like streaming video. It occasionally drops packets, but most of the message gets through most of the time. If you're transferring things that need perfectly accurate transcription, however, (code, finance, documents...) then you need a rigorous protocol. I challenge you to remember your last conversation - word for word, every expression, timing, and piece of body language - and guarantee that you didn't make even the slightest mistake. TCP can do that. ISN is part of how. – J... – 2015-11-07T11:08:51.403

@BrianGordon Well said. It is amazing how people can be experts on a protocol without thinking about why the protocol exists. See my answer below. – Tuntable – 2016-05-23T02:03:00.087

22

The three-way handshake is necessary because both parties need to synchronize the segment sequence numbers used during their transmission. For this, each of them sends (in turn) a SYN segment with a sequence number set to a random value n, which is then acknowledged by the other party via an ACK segment with an acknowledgment number set to n+1.

dr01

Posted 2015-11-04T14:36:05.997

Reputation: 1 105

Why is the acknowledgement needed? – Paŭlo Ebermann – 2015-11-04T21:18:12.740

4@PaŭloEbermann: Because otherwise the Server has no idea if the client ever received the SYN, and it's important that the client receives that. – Mooing Duck – 2015-11-04T21:21:56.900

2@PaŭloEbermann And to prove it, the ACK step is to acknowledge with [X+1]. -- quoted from Eddie's comment to his answer. – smwikipedia – 2015-11-05T08:22:10.970

12

In order for the connection to work, each side needs to verify that it can send packets to the other side. The only way to be sure that you got a packet to the other side is by getting a packet from them that, by definition, would not have been sent unless the packet you sent got through. TCP essentially uses two kinds of messages for this: SYN (to request proof that this packet got through) and ACK (which only gets sent after a SYN gets through, to prove that the SYN got through). There's actually a third kind of message, but we'll get to that in a moment.

Before the connection starts, neither side really knows anything about the other. The client sends a SYN packet to the server, to request proof that its messages can get through. That doesn't tell either person anything, but it's the first step of the handshake.

If the SYN gets through, then the server knows that the client can send packets to it, because, well, it just happened. But that doesn't prove that the server can send packets back: clients can send SYNs for lots of reasons. So the server needs to send two messages back to the client: an ACK (to prove that the SYN got through) and a SYN (to request an ACK of its own). TCP combines these two messages into one -a SYN-ACK message, if you will- to reduce network traffic. This is the second step of the handshake.

Because a SYN-ACK is an ACK, the client now knows for sure that it can send packets to the server. And because a SYN-ACK is a SYN, it also knows that the server wants proof that this message got through. So it sends back an ACK: just a plain ACK this time, because it doesn't need proof anymore that its packets can get through. This is the final step of the handshake: the client now knows that packets can go both ways, and that the server is just about to figure this out (because it knows the ACK will go through).

Once that ACK gets through, now the server knows that it can send packets to the client. It also knows that the client knows this, so it can start sending data right away. The handshake is complete. We have a good channel.
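The reasoning above can be traced mechanically. The following Python sketch is only a model of "who has proof of what" after each message arrives; the function and direction labels are invented for this example and say nothing about how a real TCP stack is implemented.

```python
BOTH = {"client->server", "server->client"}


def trace_handshake() -> list:
    """Record which directions each side has proof of after each message arrives."""
    knows = {"client": set(), "server": set()}
    trace = []

    def arrive(message: str, receiver: str, proven: set) -> None:
        knows[receiver] |= proven
        # Store a copy so later steps do not alter earlier snapshots.
        trace.append((message, {side: set(dirs) for side, dirs in knows.items()}))

    # SYN arrives: the server has now seen a packet from the client.
    arrive("SYN", "server", {"client->server"})
    # SYN-ACK arrives: it exists only because the SYN got through, and its
    # own arrival shows that the server can reach the client.
    arrive("SYN-ACK", "client", {"client->server", "server->client"})
    # ACK arrives: it exists only because the SYN-ACK got through.
    arrive("ACK", "server", {"server->client"})
    return trace


for message, knows in trace_handshake():
    status = {side: dirs == BOTH for side, dirs in knows.items()}
    print(f"after {message}: both directions proven? {status}")
```

After the second message only the client has proof of both directions; the server gets its proof from the third message, which is the step a two-message exchange would be missing.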

Well, strictly speaking, we can't be certain we have a good channel. Just because this sequence of packets got through does not strictly guarantee that others will. We can't prove that without sending an infinite number of SYNs and ACKs, and then nothing else would ever get done, so that's not really a practical option. But in practice, three steps turn out to be good enough for most purposes.

The Spooniest

Posted 2015-11-04T14:36:05.997

Reputation: 221

This is untrue: "an ACK (which only gets sent in response to SYNs, and thus proves that the SYN got through)."

Only the first packet sent from each end has the SYN flag set, and all packets other than the very first packet of the 3-way handshake have the ACK flag set. The first packet can't ACK because the second party hasn't SYNed yet, but every packet after the first must ACK whatever has already been received from the other end, whether any data be sent back or not. – Monty Harder – 2016-10-31T14:43:09.330

Thanks. Rewording: ACKs get sent once a SYN gets through, rather than only being sent in response to SYNs. – The Spooniest – 2016-10-31T15:53:07.153

7

Actually, a 3-way handshake isn't the only means of establishing a TCP connection. Simultaneous SYN exchange is also allowed: http://www.tcpipguide.com/free/t_TCPConnectionEstablishmentProcessTheThreeWayHandsh-4.htm

That could be seen as a sort of double 2-way handshake.
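A simultaneous open can be sketched as a tiny state machine. This Python example is a simplified model that covers only the two transitions involved (state names follow the usual TCP state diagram); the function names and the in-order delivery queue are assumptions of the sketch.

```python
from typing import Optional, Tuple


def on_segment(state: str, segment: str) -> Tuple[str, Optional[str]]:
    """Return (new_state, reply) for the transitions used in a simultaneous open."""
    if state == "SYN_SENT" and segment == "SYN":
        # The peer's SYN crossed ours: acknowledge it and keep waiting.
        return "SYN_RECEIVED", "SYN-ACK"
    if state == "SYN_RECEIVED" and segment == "SYN-ACK":
        # Our own SYN has now been acknowledged.
        return "ESTABLISHED", None
    raise ValueError(f"transition not modelled: {state} + {segment}")


def simultaneous_open():
    """Both peers send a SYN before either has received the other's."""
    states = {"A": "SYN_SENT", "B": "SYN_SENT"}
    in_flight = [("A", "B", "SYN"), ("B", "A", "SYN")]
    sent = list(in_flight)
    while in_flight:
        src, dst, segment = in_flight.pop(0)
        states[dst], reply = on_segment(states[dst], segment)
        if reply is not None:
            in_flight.append((dst, src, reply))
            sent.append((dst, src, reply))
    return states, sent


states, sent = simultaneous_open()
print(states)     # both peers end up ESTABLISHED
print(len(sent))  # four segments in total
```

Each peer performs its own SYN/ACK exchange, which is why the model ends with four segments rather than three.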

Lexelby

Posted 2015-11-04T14:36:05.997

Reputation: 71

1Good point, however this is very rare as both devices will have to be using the same source/destination port and both devices will need to send a SYN before the other receives the SYN. Even when it does occur, it involves four packets being sent, which is more than the three packets required by the traditional 3-way handshake; ultimately it offers only the possibility of being slightly faster to set up in terms of overall time, at the cost of lower overall efficiency (33% more packets must be transmitted). – YLearn – 2015-11-05T16:39:44.320

4

A TCP connection is bidirectional. What this means is that it is actually a pair of one-way connections. The initiator sends SYN, the responder sends ACK: one simplex connection begins. "Then" the responder sends SYN, the initiator sends ACK: another simplex connection begins. Two simplex connections form one duplex TCP session, agree? So logically there are four steps involved; but because the SYN and ACK flags are different "fields" of the TCP header, they can be set simultaneously - the second and the third steps (of the four) are combined, so technically there are three packet exchanges. Each simplex (half-)connection uses a 2-way exchange, as you proposed.
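The "four logical steps, three packets" point can be shown with a few lines of Python. The `combine_steps` helper is a made-up name for this illustration: it simply merges consecutive flags sent by the same party into one packet.

```python
def combine_steps(steps):
    """Merge consecutive flags sent by the same party into a single packet."""
    packets = []
    for sender, flag in steps:
        if packets and packets[-1][0] == sender:
            packets[-1][1].append(flag)  # same sender twice in a row: piggyback
        else:
            packets.append((sender, [flag]))
    return packets


logical_steps = [
    ("initiator", "SYN"),  # open the initiator -> responder direction
    ("responder", "ACK"),  # ...which the responder acknowledges
    ("responder", "SYN"),  # open the responder -> initiator direction
    ("initiator", "ACK"),  # ...which the initiator acknowledges
]

for sender, flags in combine_steps(logical_steps):
    print(sender, "+".join(flags))
```

The four logical steps collapse into three packets because steps two and three share a sender.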

Sergio

Posted 2015-11-04T14:36:05.997

Reputation: 81

1

It is not necessary at all. It is obvious that a short message should only require one packet to the server which includes the start + message, and one packet back acknowledging it.

The previous answers just describe the system without discussing the need for random sequence numbers etc. in the first place. The original question was about the design of TCP itself -- obviously if you use the TCP protocol then you need three messages because that is the protocol. But why was TCP designed that way in the first place?

I believe the original idea was that there was no distinction between clients and servers. Both knew the other's ports in a bidirectional manner, and either could start the conversation. And that required Syns etc.

But this is not, of course, how it is used today. The server listens on a well-known port and does an "accept"; the client port number is ephemeral. I do not even think it is possible for a server waiting on an "accept" to send a request to another on the same client port number in normal operating systems.

(Note that this is about bidirectional initiation of the connection, which is never done today. That is quite different from sending bidirectional messages down a connection once established.)

To work around the TCP inefficiency, we use protocols like HTTP 1.1 which can reuse the same connection for multiple requests, and thus avoid the TCP handshake which was not necessary in the first place.

But HTTP 1.1 is relatively new. And SSL/TLS needed a way to reuse sessions from the beginning due to the cost of the PKI algorithms. So that protocol includes its own session reuse mechanism which runs on top of HTTP 1.1 which runs on top of TCP.

Such is the way with software. Fudges on kludges which when combined, produce an acceptable result.

Tuntable

Posted 2015-11-04T14:36:05.997

Reputation: 119

Anything above OSI layer-4 (e.g. HTTP, FTP, etc.) is explicitly off-topic here. In layers 1 to 4, there is no such thing as client/server. TCP is a connection between peers. Yes, upper layer protocols create a client/server relationship, but that is off-topic here. – Ron Maupin – 2016-05-23T02:10:32.393

1

By the way, HTTP uses TCP, so the TCP handshake is still necessary. Read RFC 793 TRANSMISSION CONTROL PROTOCOL to understand why. Protocols like HTTP require the application to do the multiplexing that TCP would normally do for the application.

– Ron Maupin – 2016-05-23T02:18:14.153

@RonMaupin The original question was why? And the answer is to support a use case that is never used by the upper level layers in practice. So, seems pretty relevant. – Tuntable – 2016-05-23T02:18:57.740

@RonMaupin Yes, HTTP uses TCP. Which I have clarified, thanks. But that does not make the TCP handshake necessary in any deep sense. – Tuntable – 2016-05-23T02:20:01.340

1The applications and application-layer protocols are explicitly off-topic here. @Eddie answered the question, and if you read and understand the TCP RFC, you will get why the handshake is necessary. I don't think it adds anything for you to claim, without any support, that the handshake is not necessary, when it clearly is. – Ron Maupin – 2016-05-23T02:21:42.643

@RonMaupin Eddie and the others describe why the double handshake is needed to establish Syn and the bidirectional establish. But they do not mention that nobody actually uses bidirectional establishment today, and therefore at a more meaningful level the handshake is unnecessary because it supports a non-existent use case. – Tuntable – 2016-05-23T02:26:16.097

You are completely incorrect about that. TCP has a separate connection for each browser client to the HTTP server. The same thing happens for things like FTP. From the perspective of the network (what is actually on-topic here) The three-way handshake is used, and it is necessary. – Ron Maupin – 2016-05-23T02:30:22.157

@RonMaupin IP connections are identified by both server port and (ephemeral) client port. That is why they do not get mixed up. Nothing to do with the handshake. – Tuntable – 2016-05-23T07:38:17.453

IP knows nothing about ports. That is a layer-4 (TCP, UDP, etc.) address. You don't seem to understand the different network layers. Ethernet, and some other layer-2 protocols, use MAC addresses, IP uses IP addresses, TCP and UDP use ports as addresses, and some layer-4 protocols don't even have addressing. You cannot attribute ports to IP, since IP knows nothing about ports. – Ron Maupin – 2016-05-23T14:10:19.910

You're correct in saying that the top accepted answer doesn't offer any reason why a three-way handshake is better than a two-way handshake. However, you're wrong in saying that there is no reason. The reason you're looking for is in @TheSpooniest 's answer. – Brian Gordon – 2016-05-23T21:09:49.277

Thanks @Tuntable, you provided me the real answer I wanted to the actual question "why a sync/handshake is needed in the first place ?". The most satisfying answer (which you gave) is for me : to avoid both machines to start conversations simultaneously. With handshake, they first agree on who will be the initiator of the conversation so that there is at most one conversation at any given time. – ismax – 2018-11-03T10:43:49.733

1

After reading Eddie's answer (accepted as correct), there is still the question of why the first host cannot assign both ISNs with random numbers and have the second host simply accept them. The real reason for using a 3-way handshake is to avoid half-connections. Half-connection scenario in a 2-way handshake:
1) Client ---SYN--> Server
2) The client changes its mind and doesn't want to connect anymore
3) Client <-X-ACK-- Server //ACK was lost
The server doesn't see a resent SYN, so it assumes that the client got its ACK and that the connection is established. As a result, the server holds a connection that will never be closed.
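The scenario can be modelled in a few lines of Python. This is a toy model: the "two-way" branch describes the hypothetical protocol from this answer, not TCP, and the state labels and the `server_state` function are invented for the example. The "three-way" branch assumes the server times out while waiting for the final ACK.

```python
def server_state(handshake: str, client_abandons: bool) -> str:
    """Server-side outcome after it has received a SYN and sent its reply."""
    if handshake == "two-way":
        # Hypothetical protocol: the server considers the connection open as
        # soon as its ACK is sent, whether or not the client is still there.
        return "ESTABLISHED"
    if handshake == "three-way":
        # The server stays in SYN_RECEIVED until the final ACK arrives and
        # gives up after a timeout if it never does.
        return "CLOSED (timed out)" if client_abandons else "ESTABLISHED"
    raise ValueError(f"unknown handshake: {handshake}")


for handshake in ("two-way", "three-way"):
    print(handshake, "->", server_state(handshake, client_abandons=True))
```

In the two-way model the server is left holding an established connection to a client that has already gone away, while in the three-way model the missing final ACK lets the server discard the attempt.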

Sanzhar Yeleuov

Posted 2015-11-04T14:36:05.997

Reputation: 11

Actually, if a host (clients and servers are an application concept about which TCP knows nothing) receives an ACK or any traffic on a non-existent connection (step 3 in your scenario), it will send a RST, not ignore the received segment. – Ron Maupin – 2018-05-15T14:42:06.543

@RonMaupin Then let's assume situation when ACK packet was lost. – Sanzhar Yeleuov – 2018-05-15T16:34:43.687

If the ACK is lost, then the initiated connection in step 1 will time out. RFC 793 has a full explanation of all types of scenarios, including diagrams. – Ron Maupin – 2018-05-15T16:38:08.707

@RonMaupin I mean if scenario from my post remain same, only thing that changed, that ACK was lost. – Sanzhar Yeleuov – 2018-05-15T16:43:13.023

It's all in the RFC. Until a connection is open, any traffic received will result in a RST. The three-way handshake negotiates the connection parameters, so the "server" cannot send anything back to the "client" but its SYN/ACK until it receives an ACK from the "client". If the "server" SYN/ACK back to the "client" is lost, the "server" will try again. The RFC explains all this. – Ron Maupin – 2018-05-15T16:49:51.200

@RonMaupin Please re-read my answer, I mean that the accepted answer doesn't answer the question completely. I wanted to show that a 2-way handshake may lead to a half-connection, which is not good for the server. – Sanzhar Yeleuov – 2018-05-15T16:55:08.537

There is no two-way handshake. You cannot negotiate the parameters in a two-way handshake, and any traffic received until the three-way handshake is complete will result in a RST. "Server doesn't see resent SYN, so he thinks that client got his ACK and connection is established." No, it doesn't because it has a timer waiting for an ACK back from its SYN/ACK. It's not very complicated. – Ron Maupin – 2018-05-15T17:02:13.823

@RonMaupin you are not even trying to understand what I mean with my post. Be patient and re-read. The question asks why we do not use a 2-way handshake (right under "Why not just this?"). I gave a scenario in my answer (btw I modified it) where a 2-way handshake may lead to a problem called a half-connection. And to avoid such situations we use a 3-way handshake – Sanzhar Yeleuov – 2018-05-15T17:10:48.010

I understand what you are trying to say, but you don't seem to understand what I am trying to say. I think part of it is that you are stuck on client/server, which TCP doesn't do. TCP is peer-to-peer, and there must be a negotiation for each peer. In client/server, one side can dictate, but TCP knows nothing about clients or servers, which is an application concept. Each TCP peer is equal, and each can both send and receive, so each has a two-way handshake that gets compressed to a three-way handshake. – Ron Maupin – 2018-05-15T17:17:32.987

@RonMaupin stop trying to apply TCP to my scenario. In my scenario there is a protocol that uses 2-way handshake to make a connection. Both sides can send SYN to request for connection, and both sides if got a SYN and agree to make connection, reply with ACK and think that connection is established. And after getting ACK also think that connection is established – Sanzhar Yeleuov – 2018-05-15T17:32:21.513

"stop trying to apply TCP to my scenario." The question is specifically about TCP. Answering for another, hypothetical protocol is gratuitous. – Ron Maupin – 2018-05-15T17:36:42.993

@RonMaupin question is not about TCP, it is about why TCP was developed in that way – Sanzhar Yeleuov – 2018-05-15T17:37:47.807

"question is not about TCP, it is about why TCP was developed in that way " That makes no sense. A question about how/why TCP was developed is a question about TCP. Your hypothetical, client/server protocol has nothing to do with the TCP peer-to-peer model. In client/server, one side is in charge, but in peer-to-peer, each side is equal. – Ron Maupin – 2018-05-15T17:43:21.343

@RonMaupin but my protocol is peer-to-peer, not client/server please read carefully. It is just names for example. You could change direction of arrows if you want. – Sanzhar Yeleuov – 2018-05-15T17:53:58.243

In peer-to-peer, like TCP, each side must negotiate the connection. Your example only has only one side negotiating the connection. That is why TCP must have a three-way (really a compressed four-way) handshake. There must be a SYN from each side that is ACKed by the other side to negotiate the two-way connection. – Ron Maupin – 2018-05-15T17:58:26.767

When one peer sends a SYN, it says, "I want to talk, and these are my capabilities." The ACK from the other side says, "OK. I can send that way." Then the other side sends a SYN that says, "I want to talk, and these are my capabilities," and the first side sends an ACK saying, "OK. I can send that way." That is peer-to-peer. When only one side sends a SYN, then it is dictating the terms for one-way traffic, but TCP is two-way traffic. – Ron Maupin – 2018-05-15T18:08:57.690