On Thu, Jul 28, 2005 at 01:15:55AM -0400, Brenda J. Butler wrote:
> > Once connections are established, they're tracked independently.
> > It's the combination of source address, source port, target
> > address, and target port that defines a TCP connection.
[...]
> The thing (eg apache) listening to a well-known port either answers
> very fast, or sends a random port back to the client for the longer-
> duration transaction and resumes listening on the well-known port.
>
> If the server spent significant time answering via the well-known
> port, then other people trying to access that service would not be
> able to connect because it would be busy.

Not quite; HTTP may be fast, but it's not *that* fast. On a busy
webserver, there can be dozens of requests happening simultaneously.
There's also the possibility that some clients will use "keepalive"
mode, where they keep the connection open for many seconds while they
make additional requests.

TCP is a stateful protocol. When a client wants to connect to a
server, it picks an unused *local* port (normally one above 1024), so
that all of its outbound connections are unique at all times, but it
uses the well-known service port as its remote port. I'm a little
rusty on the process, but as I recall, the client sends a SYN
(connection request), the server sends back a SYN+ACK (acknowledging
the request and sending its own), and the client sends an ACK to
fully open the connection.

After that, every connection is uniquely identified by the source
host, the source port, the destination host, and the destination
port. This combination of four is guaranteed to be unique according
to the TCP protocol specification, and this is how both sides keep
track of the packets for that connection. The ports cannot change
throughout this process. But the target port remains willing to
accept more connections, because all the listener is really doing is
waiting for SYNs and opening a brand new connection every time it
receives one.
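If you'd rather see the four-part identification in code than in a
packet dump, here is a minimal Python sketch. It is only an
illustration on the loopback interface: the function name
demo_tcp_tuples and the choice of port 0 (let the OS pick a free
port) are my own assumptions, not anything Apache or SSH does. It
opens several connections to one listening port and collects each
connection's (source host, source port, destination host,
destination port).

```python
import socket


def demo_tcp_tuples(n_clients=2):
    """Connect n_clients times to a single listening port and return
    (server_addr, list of 4-tuples) for the accepted connections."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: OS picks a free port (assumption for the demo)
    listener.listen(n_clients)
    server_addr = listener.getsockname()

    clients, accepted, tuples = [], [], []
    try:
        for _ in range(n_clients):
            # Each client gets its own unused local port from the OS.
            clients.append(socket.create_connection(server_addr))
            # The listener hands back a brand new socket per connection
            # and keeps listening on the same port.
            conn, peer = listener.accept()
            accepted.append(conn)
            # (client host, client port, server host, server port)
            tuples.append(peer + conn.getsockname())
    finally:
        for s in clients + accepted + [listener]:
            s.close()
    return server_addr, tuples


if __name__ == "__main__":
    addr, conns = demo_tcp_tuples(2)
    print("listening on", addr)
    for t in conns:
        print(t)
```

Every tuple should show the same server port and a different client
port, which is the same thing the netstat output shows for real
connections.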
This is why Apache has several settings relating to how many "spare"
servers should be running to handle additional requests while the
others are busy. For example, if all thirty servers are busy serving
requests non-stop, and MinSpareServers is set to 5, it'll spawn five
more. The way Apache is written, each child can only handle one
client at a time, but they take turns.

Here's an example of an Apache status page on a server I run:

    Total accesses: 250449 - Total Traffic: 3.5 GB
    14 requests currently being processed, 17 idle servers
    __K_K___K__K____K_K_KKKK__W_KWW.................................

That last line is the 'dashboard' of what's currently going on. 17
servers (_) are standing by for incoming connections but are
otherwise idle. 14 are processing clients. Of those, three (W) are
actually handing out pages, while the other 11 (K) are serving
clients with 'keepalive' set, who are holding the connection open to
avoid constantly connecting and disconnecting.

My Apache currently has 31 servers total, but the status page shows
it's been as high as 80 just in the past 5 days. With
MinSpareServers set to 10, that means at one point it was serving
about 70 clients simultaneously. This all happens on port 80.

Go ahead and try it out -- run "tcpdump port 22" as root. Then SSH
in to that machine twice. On tcpdump, you'll see the ports never
change, and both connections remain open. Every time you send stuff
on an SSH session, data will be sent down that session's connection.
If you do

    netstat -nt | grep :22

you'll see something like this

    tcp 0 0 1.2.3.4:22 6.7.8.9:1025 ESTABLISHED
    tcp 0 0 1.2.3.4:22 6.7.8.9:1026 ESTABLISHED

where 1.2.3.4 is the local server and 6.7.8.9 is the remote client.
Or, if you do it on the client end, you'll see the two positions
reversed.

Now UDP, on the other hand, is stateless. Packets are sent without
concern for existing connections, since connections don't exist in
UDP.
Think of listening on TCP as willingness to build a bridge, while listening on UDP is just willingness to accept stuff people throw over the river. :)
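For the "throwing stuff over the river" half of that analogy, here is
a rough Python sketch, again on loopback only. The function name
demo_udp, the sample messages, and the 2-second timeout are
assumptions I made up for the example. Note that there is no
listen(), accept(), or connect() anywhere: the receiver binds a port
and simply takes whatever datagrams arrive.

```python
import socket


def demo_udp(messages=(b"one", b"two")):
    """Send each message from a fresh UDP socket to one bound receiver
    and return a list of (data, sender_address) pairs."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: OS picks a free port (assumption for the demo)
    receiver.settimeout(2.0)         # don't hang forever if a datagram is lost
    addr = receiver.getsockname()

    senders, received = [], []
    try:
        for msg in messages:
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            senders.append(s)
            s.sendto(msg, addr)      # no handshake: just throw the packet
            received.append(receiver.recvfrom(1024))
    finally:
        for s in senders + [receiver]:
            s.close()
    return received


if __name__ == "__main__":
    for data, sender in demo_udp():
        print(sender, data)
```

The receiver only learns who sent each packet from the packet itself;
nothing is set up beforehand and nothing is torn down afterwards.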