Understanding Sockets in Unix, NT, and Java
By Ken Nordby, 2003-05-26
Basic sockets concepts
Sockets belong to a group of software mechanisms that enable interprocess communication (IPC), which simply means that concurrently executing processes can exchange data. Some IPC mechanisms support data exchange only between processes running on the same machine, while others connect processes running on geographically dispersed machines. In System V Unix, message queues, semaphores, and shared memory are examples of IPC mechanisms. In Windows NT, named pipes, mailslots, and memory-mapped files are part of the IPC family. Sockets are one of the most common IPC mechanisms; they support both local and remote communication and are available on both Unix and NT.
Sockets technology was developed in the early 1980s at the University of California at Berkeley and reached general release in version 4.2 of BSD (Berkeley Software Distribution) Unix. Because of the connection with BSD, sockets are sometimes referred to as "Berkeley sockets." Sockets support was gradually adopted by other versions of Unix, and eventually by other families of operating systems, on platforms ranging from Intel-based systems to mainframes. Sockets were added to the Microsoft Windows world in the form of the Winsock API, and to the Java world as Socket and ServerSocket objects in the java.net package.
The simplest way to understand a socket is to think of it as a byte stream between processes, which may run on the same computer or on different ones. In the original Unix implementation, sockets were treated like files. For example, the standard Unix system calls open(), read(), write(), and close() are used for processing files. But read(), write(), and close() can also be used to process sockets.
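As a rough illustration of the file-like behavior described above, the following sketch pushes a short message through a pair of connected local sockets using only write(), read(), and close(). The helper name echo_through_socket is hypothetical, and the use of socketpair() with AF_UNIX to obtain two already-connected sockets is an assumption made to keep the example in a single process; it assumes a POSIX system and a message small enough to fit in the socket buffer.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes msg into one end of a connected socket pair
 * and reads it back from the other end, using the same calls that would be
 * used on a file. Returns the number of bytes read, or -1 on error. */
ssize_t echo_through_socket(const char *msg, char *out, size_t outlen)
{
    int fds[2];
    ssize_t n = -1;
    size_t len = strlen(msg);

    if (outlen == 0)
        return -1;

    /* socketpair() creates two local sockets that are already connected. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        return -1;

    /* write() and close() work on the socket descriptor as they do on a file. */
    ssize_t written = write(fds[0], msg, len);
    close(fds[0]);

    if (written == (ssize_t)len) {
        /* read() returns the buffered bytes, or 0 at end of stream. */
        n = read(fds[1], out, outlen - 1);
        if (n >= 0)
            out[n] = '\0';
    }

    close(fds[1]);
    return n;
}
```

A caller would pass a message and a buffer, for example echo_through_socket("hello", buf, sizeof buf), and then inspect buf. A real program reading from a network socket would normally loop on read(), because a single call may return fewer bytes than were sent.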
When you create a file with open(), Unix sets up internal data structures to manage the file, and returns an integer, referred to as the file descriptor, by which you can manipulate the file. In similar fashion, when you create a socket with the socket() system call, the system sets up data structures to manage the socket and returns an integer, referred to as the socket descriptor, that gives you access to the socket. However, the socket() call, unlike open(), does not take a name argument.
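The parallel between the two calls can be sketched as follows. The helper name get_descriptors is hypothetical, and the choice of AF_INET with SOCK_STREAM is just one possible set of arguments; note that open() receives a path name while socket() does not.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: obtains one file descriptor and one socket descriptor.
 * Both are plain integers. Returns 0 on success, -1 on failure. */
int get_descriptors(const char *path, int *file_fd, int *sock_fd)
{
    /* open() takes a name (the path) and returns a file descriptor. */
    *file_fd = open(path, O_RDONLY);
    if (*file_fd == -1)
        return -1;

    /* socket() takes no name: only a family, a type, and a protocol. */
    *sock_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (*sock_fd == -1) {
        close(*file_fd);
        *file_fd = -1;
        return -1;
    }
    return 0;
}
```

After a successful call, both descriptors are non-negative integers drawn from the same per-process descriptor table, and each should eventually be released with close().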
Another important concept is support for multiple protocols. Sockets were designed to support different kinds of communications protocols and data streams. For example, the protocol family argument in the socket() call allows you to specify different protocols. You can specify Internet protocols, but you can also specify the Xerox Network Systems (XNS) protocols and other protocols.
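One way to see the protocol family argument at work is to ask the system whether it can create a socket in a given family. The sketch below is only an illustration: family_supported is a hypothetical helper, and AF_INET and AF_UNIX are used because they are defined on POSIX systems, whereas a constant for XNS may not be defined on many current platforms.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a stream socket can be created in the
 * given protocol family on this system, 0 otherwise. */
int family_supported(int family)
{
    /* The first argument to socket() selects the protocol family. */
    int fd = socket(family, SOCK_STREAM, 0);
    if (fd == -1)
        return 0;   /* family unknown, unsupported, or not permitted */
    close(fd);
    return 1;
}
```

For example, family_supported(AF_INET) checks for Internet (IPv4) protocols and family_supported(AF_UNIX) checks for local sockets; the rest of the socket calls stay the same whichever family is chosen.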
Type of connection is another important concept in sockets programming. When you create a socket you decide whether data transfer will be connection-oriented or connectionless. Connection-oriented data transfer implies the existence of a session, or dialogue, between computers. Much like a dialogue between human beings, a computer communications session has an existence separate from the data messages exchanged within it. For example, a human dialogue may begin when both parties say "hello," and continue until both parties say "good-bye" even though the conversation includes periods of silence.
A human example of connection-oriented communication is a telephone call. The information transfer, or exchange of messages, takes place in the context of a session that persists until one person hangs up.
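A connection-oriented exchange might look like the following sketch, which plays both sides of the "telephone call" inside one process over the loopback interface. The helper name stream_session_loopback is hypothetical, and the use of 127.0.0.1 with a system-chosen port is an assumption made so the example needs no second machine. The session begins with connect() and accept() and ends when the sender shuts down its side.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sets up a TCP session on the loopback interface,
 * sends msg across it, and reads it on the accepting side.
 * Returns the number of bytes read, or -1 on error. */
ssize_t stream_session_loopback(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t addrlen = sizeof addr;
    size_t len = strlen(msg);
    int listener, client, server = -1;
    ssize_t n = -1;

    if (outlen == 0)
        return -1;

    /* SOCK_STREAM selects connection-oriented transfer. */
    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener == -1)
        return -1;
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client == -1) {
        close(listener);
        return -1;
    }

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;              /* let the system pick a free port */

    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(listener, (struct sockaddr *)&addr, &addrlen) == 0 &&
        listen(listener, 1) == 0 &&
        /* "hello": the session is established before any data moves */
        connect(client, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        (server = accept(listener, NULL, NULL)) != -1 &&
        write(client, msg, len) == (ssize_t)len) {
        /* "good-bye": the sender signals that it has nothing more to say */
        shutdown(client, SHUT_WR);
        n = read(server, out, outlen - 1);
        if (n >= 0)
            out[n] = '\0';
    }

    if (server != -1)
        close(server);
    close(client);
    close(listener);
    return n;
}
```

The data transfer is a single write() and read(), but it only works because the session already exists; a longer message could require several read() calls on the receiving side.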
In contrast, connectionless data transfer refers to the transmission of data messages without benefit of dialogue or session. A human example of connectionless data transfer is mailing a letter at the US Post Office. The information transfer consists only of the message itself; there is no session established between the person sending the letter and the recipient. In computer terms this form of communication consists of launching messages onto the network without establishing a session with the target system.
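A connectionless exchange, by comparison, could be sketched like this. The helper name send_datagram_loopback is hypothetical, and the loopback address with a system-chosen port is again an assumption for a single-process demonstration. The sender never calls connect(); each message is addressed individually, much like a letter.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends msg as one UDP datagram to a receiver on the
 * loopback interface, without establishing a session.
 * Returns the number of bytes received, or -1 on error. */
ssize_t send_datagram_loopback(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t addrlen = sizeof addr;
    size_t len = strlen(msg);
    int rx, tx;
    ssize_t n = -1;

    if (outlen == 0)
        return -1;

    /* SOCK_DGRAM selects connectionless transfer. */
    rx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx == -1)
        return -1;
    tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (tx == -1) {
        close(rx);
        return -1;
    }

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;              /* let the system pick a free port */

    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(rx, (struct sockaddr *)&addr, &addrlen) == 0 &&
        /* The destination address travels with the message itself. */
        sendto(tx, msg, len, 0,
               (struct sockaddr *)&addr, sizeof addr) == (ssize_t)len) {
        n = recvfrom(rx, out, outlen - 1, 0, NULL, NULL);
        if (n >= 0)
            out[n] = '\0';
    }

    close(tx);
    close(rx);
    return n;
}
```

There is no listen(), connect(), or accept() step here. On a real network a datagram can also be lost or arrive out of order, so a program that needs delivery guarantees over SOCK_DGRAM has to provide them itself.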
First published by IBM DeveloperWorks
