.\"	$NetBSD: tcp.4,v 1.3 1994/11/30 16:22:35 jtc Exp $
.\"
.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)tcp.4	8.1 (Berkeley) 6/5/93
.\"
.Dd March 18, 2015
.Dt TCP 4
.Os BSD 4.2
.Sh NAME
.Nm tcp
.Nd Internet Transmission Control Protocol
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_STREAM 0
.Sh DESCRIPTION
The
.Tn TCP
protocol provides reliable, flow-controlled, two-way
transmission of data.
It is a byte-stream protocol used to
support the
.Dv SOCK_STREAM
abstraction.
.Tn TCP
uses the standard
Internet address format and, in addition, provides a per-host
collection of
.Dq "port addresses" .
Thus, each address is composed
of an Internet address specifying the host and network,
with a specific
.Tn TCP
port on the host identifying the peer entity.
.Pp
Sockets utilizing the
.Tn TCP
protocol are either
.Dq active
or
.Dq passive .
Active sockets initiate connections to passive
sockets.
By default,
.Tn TCP
sockets are created active; to create a
passive socket, the
.Xr listen 2
system call must be used
after binding the socket with the
.Xr bind 2
system call.
Only passive sockets may use the
.Xr accept 2
call to accept incoming connections.
Only active sockets may use the
.Xr connect 2
or
.Xr connectx 2
call to initiate connections.
.Pp
Passive sockets may
.Dq underspecify
their location to match
incoming connection requests from multiple networks.
This technique, termed
.Dq "wildcard addressing" ,
allows a single
server to provide service to clients on multiple networks.
To create a socket which listens on all networks, the Internet
address
.Dv INADDR_ANY
must be bound.
The
.Tn TCP
port may still be specified
at this time; if the port is not specified, the system will assign one.
Once a connection has been established, the socket's address is
fixed by the peer entity's location.
The address assigned to the
socket is the address associated with the network interface
through which packets are being transmitted and received.
Normally, this address corresponds to the peer entity's network.
.Pp
.Tn TCP
supports a number of socket options which can be set with
.Xr setsockopt 2
and tested with
.Xr getsockopt 2 :
.Bl -tag -width ".Dv TCP_CONNECTIONTIMEOUT"
.It Dv TCP_NODELAY
Under most circumstances,
.Tn TCP
sends data when it is presented;
when outstanding data has not yet been acknowledged, it gathers
small amounts of output to be sent in a single packet once
an acknowledgement is received.
For a small number of clients, such as window systems
that send a stream of mouse events which receive no replies,
this packetization may cause significant delays.
The boolean option
.Dv TCP_NODELAY
defeats this algorithm.
.It Dv TCP_MAXSEG
By default, a sender- and
.No receiver- Ns Tn TCP
will negotiate among themselves to determine the maximum segment size
to be used for each connection.
The
.Dv TCP_MAXSEG
option allows the user to determine the result of this negotiation,
and to reduce it if desired.
.It Dv TCP_NOOPT
.Tn TCP
usually sends a number of options in each packet, corresponding to
various
.Tn TCP
extensions which are provided in this implementation.
The boolean option
.Dv TCP_NOOPT
is provided to disable
.Tn TCP
option use on a per-connection basis.
.It Dv TCP_NOPUSH
By convention, the
.No sender- Ns Tn TCP
will set the
.Dq push
bit, and begin transmission immediately (if permitted) at the end of
every user call to
.Xr write 2
or
.Xr writev 2 .
When this option is set to a non-zero value,
.Tn TCP
will delay sending any data at all until either the socket is closed,
or the internal send buffer is filled.
.It Dv TCP_KEEPALIVE
The
.Dv TCP_KEEPALIVE
option specifies the amount of time, in seconds, that the
connection must be idle before keepalive probes (if enabled) are sent.
The default value is specified by the
.Tn MIB
variable
.Va net.inet.tcp.keepidle .
.It Dv TCP_CONNECTIONTIMEOUT
The
.Dv TCP_CONNECTIONTIMEOUT
option specifies the timeout, in seconds, for new
.Tn TCP
connections that are not yet established.
This option can be useful for both active and passive
.Tn TCP
connections.
The default value is specified by the
.Tn MIB
variable
.Va net.inet.tcp.keepinit .
.It Dv TCP_KEEPINTVL
When keepalive probes are enabled, this option sets the amount of time,
in seconds, between successive keepalives sent to probe an unresponsive peer.
.It Dv TCP_KEEPCNT
When keepalive probes are enabled, this option sets the number of times
a keepalive probe is repeated if the peer is not responding.
After this many probes, the connection is closed.
.It Dv TCP_SENDMOREACKS
When a stream of
.Tn TCP
data packets is received, OS X uses an algorithm to reduce the number of
acknowledgements by generating a
.Tn TCP
acknowledgement for every 8 data packets instead of acknowledging every
other data packet.
When this socket option is enabled, the connection will always send a
.Tn TCP
acknowledgement for every other data packet.
.It Dv TCP_ENABLE_ECN
Using Explicit Congestion Notification (ECN) on
.Tn TCP
allows bi-directional end-to-end notification of congestion without
dropping packets.
Conventionally, TCP/IP networks signal congestion by dropping packets.
When ECN is successfully negotiated, an ECN-aware router may set a mark in
the IP header instead of dropping a packet in order to signal impending
congestion.
The
.Tn TCP
receiver of the packet echoes the congestion indication to the
.Tn TCP
sender, which reduces its transmission rate as if it had detected a dropped
packet.
This avoids unnecessary retransmissions and improves latency by saving
the time required to recover a lost packet.
.It Dv TCP_NOTSENT_LOWAT
The send socket buffer of a
.Tn TCP
sender holds both unsent and unacknowledged data.
This option allows a
.Tn TCP
sender to control the amount of unsent data kept in the send socket buffer.
The value of the option should be the maximum amount of unsent data in bytes.
.Xr kevent 2 ,
.Xr poll 2
and
.Xr select 2
will generate a write notification when the unsent data falls below the
amount given by this option.
This allows an application to generate just-in-time fresh updates for
real-time communication.
.It Dv TCP_FASTOPEN
The
.Tn TCP
listener can set this option to use the TCP Fast Open feature.
After setting this option, an
.Xr accept 2
may return a socket that is in the SYN_RECEIVED state but is readable and
writable.
.It Dv TCP_CONNECTION_INFO
This socket option can be used to obtain
.Tn TCP
connection-level statistics.
The
.Vt "struct tcp_connection_info"
defined in
.In netinet/tcp_var.h
is copied to the user buffer.
.El
.Pp
The option level for the
.Xr setsockopt 2
call is the protocol number for
.Tn TCP ,
available from
.Xr getprotobyname 3 ,
or
.Dv IPPROTO_TCP .
All options are declared in
.In netinet/tcp.h .
.Pp
Options at the
.Tn IP
transport level may be used with
.Tn TCP ;
see
.Xr ip 4 .
Incoming connection requests that are source-routed are noted,
and the reverse source route is used in responding.
.Ss "Non-blocking connect"
.Pp
When a
.Tn TCP
socket is set non-blocking, and the connection cannot be established immediately,
.Xr connect 2
or
.Xr connectx 2
returns with the error
.Dv EINPROGRESS ,
and the connection is established asynchronously.
.Pp
When the asynchronous connection completes successfully,
.Xr select 2 ,
.Xr poll 2 ,
or
.Xr kqueue 2
will indicate the file descriptor is ready for writing.
If the connection encounters an error, the file descriptor
is marked ready for both reading and writing, and the pending error
can be retrieved via the socket option
.Dv SO_ERROR .
.Pp
Note that even if the socket is non-blocking, it is possible for the connection
to be established immediately.
In that case
.Xr connect 2
or
.Xr connectx 2
does not return with
.Dv EINPROGRESS .
.Sh DIAGNOSTICS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width Er
.It Bq Er EISCONN
when trying to establish a connection on a socket which
already has one;
.It Bq Er ENOBUFS
when the system runs out of memory for
an internal data structure;
.It Bq Er ETIMEDOUT
when a connection was dropped
due to excessive retransmissions;
.It Bq Er ECONNRESET
when the remote peer
forces the connection to be closed;
.It Bq Er ECONNREFUSED
when the remote
peer actively refuses connection establishment (usually because
no process is listening on the port);
.It Bq Er EADDRINUSE
when an attempt
is made to create a socket with a port which has already been
allocated;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a
socket with a network address for which no network interface
exists;
.It Bq Er EAFNOSUPPORT
when an attempt is made to bind or connect a socket to a multicast
address;
.It Bq Er EINPROGRESS
returned by
.Xr connect 2
or
.Xr connectx 2
when the socket is set non-blocking, and the connection cannot be
immediately established;
.It Bq Er EALREADY
returned by
.Xr connect 2
or
.Xr connectx 2
when a connection request is already in progress for the specified socket;
.It Bq Er ENODATA
returned by
.Xr recv 2
or
.Xr send 2
when a connection is experiencing a data stall (probably due to a
middlebox issue).
The application is advised to close the current connection and make a
new attempt.
.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr connectx 2 ,
.Xr getsockopt 2 ,
.Xr kqueue 2 ,
.Xr poll 2 ,
.Xr select 2 ,
.Xr socket 2 ,
.Xr sysctl 3 ,
.Xr inet 4 ,
.Xr inet6 4 ,
.Xr ip 4 ,
.Xr ip6 4 ,
.Xr netintro 4 ,
.Xr setkey 8
.Sh HISTORY
The
.Tn TCP
protocol appeared in
.Bx 4.2 .
.Pp
The socket option
.Dv TCP_CONNECTIONTIMEOUT
first appeared in Mac OS X 10.6.
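.Sh EXAMPLES
The following C fragment is a minimal sketch, not a definitive implementation,
of the non-blocking connect procedure and of setting a boolean option at the
.Dv IPPROTO_TCP
level.
The helper names
.Fn tcp_connect_nonblocking ,
.Fn tcp_set_nodelay
and
.Fn fail_close
are hypothetical and exist only for illustration; the sketch assumes an
.Dv AF_INET
peer address and uses
.Xr poll 2
to wait for completion.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: close fd, keep the original errno, return -1. */
static int
fail_close(int fd)
{
	int saved = errno;

	close(fd);
	errno = saved;
	return -1;
}

/*
 * Hypothetical helper: start a non-blocking connect to addr and wait up
 * to timeout_ms milliseconds for it to complete.  Returns the connected
 * descriptor (still non-blocking), or -1 with errno set.
 */
int
tcp_connect_nonblocking(const struct sockaddr_in *addr, int timeout_ms)
{
	struct pollfd pfd;
	socklen_t len;
	int fd, flags, n, err;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return fail_close(fd);

	/* The connection may be established immediately. */
	if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == 0)
		return fd;
	if (errno != EINPROGRESS)
		return fail_close(fd);

	/* Wait until the descriptor is ready for writing. */
	memset(&pfd, 0, sizeof(pfd));
	pfd.fd = fd;
	pfd.events = POLLOUT;
	do {
		/* Simplification: an interrupted wait restarts the full timeout. */
		n = poll(&pfd, 1, timeout_ms);
	} while (n < 0 && errno == EINTR);
	if (n < 0)
		return fail_close(fd);
	if (n == 0) {
		errno = ETIMEDOUT;
		return fail_close(fd);
	}

	/* Readiness alone does not mean success; fetch the pending error. */
	err = 0;
	len = sizeof(err);
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return fail_close(fd);
	if (err != 0) {
		errno = err;
		return fail_close(fd);
	}
	return fd;
}

/* Hypothetical helper: enable (on != 0) or disable TCP_NODELAY. */
int
tcp_set_nodelay(int fd, int on)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```

A caller would fill in a
.Vt "struct sockaddr_in"
for the peer, call
.Fn tcp_connect_nonblocking
with a timeout, and, if small writes must not be delayed, call
.Fn tcp_set_nodelay
on the returned descriptor.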