An electrical socket provides a point of connection for external devices that require electrical power. A network socket provides similar functionality: it gives applications a point of interface for accessing the network. This post is an introduction to socket programming in Python.
- A network socket is an endpoint of the communication across a network.
- A socket is an abstract reference that a local program can pass to the networking application programming interface (API) to use the connection, for example “send this data on this socket”.
- Sockets are internally often simply integers, which identify which connection to use.
- A socket API is an application programming interface (API), usually provided by the operating system, that allows application programs to control and use network sockets.
- Today, most communication between computers is based on the Internet Protocol (IP), therefore most network sockets are Internet sockets.
- Internet socket APIs are usually based on the Berkeley sockets standard. In the Berkeley sockets standard, sockets are a form of file descriptor (a file handle), due to the Unix philosophy that “everything is a file”.
- In inter-process communication, each end will generally have its own socket, but these may use different APIs: they are abstracted by the network protocol.
- A socket address is the combination of an IP address and a port number, much like one end of a telephone connection is the combination of a phone number and a particular extension. Based on this address, internet sockets deliver incoming data packets to the appropriate application process or thread.
- Communicating local and remote sockets are socket pairs. Each socket pair is described by a unique 4-tuple consisting of source and destination IP addresses and port numbers, i.e. of local and remote socket addresses.
- In the TCP case, each unique socket pair 4-tuple is assigned a socket number, while in the UDP case, each unique local socket address is assigned a socket number.
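The points above (integer descriptors, socket addresses, and the socket-pair 4-tuple) can be observed from Python itself. The following is a minimal sketch, assuming a machine with a working loopback interface; the variable names are illustrative only, and port 0 simply asks the operating system to pick any free port.

```python
import socket

# Listening socket bound to the loopback address; port 0 lets the OS pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_addr = listener.getsockname()   # a socket address: (IP address, port)

# Client socket connecting to that address.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)
conn, peer_addr = listener.accept()

# Internally, each socket is identified by an integer descriptor.
descriptors = (listener.fileno(), client.fileno(), conn.fileno())

# The socket pair seen from the client: local IP, local port, remote IP, remote port.
four_tuple = client.getsockname() + client.getpeername()
print("descriptors:", descriptors)
print("socket pair:", four_tuple)

for s in (conn, client, listener):
    s.close()
```

The address the server sees for the client (peer_addr) is the first half of the client's 4-tuple, and the server's own address is the second half.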
Types of Sockets
Several types of Internet socket are available. The most common ones are:
- Datagram sockets, also known as connectionless sockets, which use User Datagram Protocol.
- Stream sockets, also known as connection-oriented sockets, which use Transmission Control Protocol or Stream Control Transmission Protocol.
- Raw sockets (or Raw IP sockets), typically available in routers and other network equipment. Here the transport layer is bypassed, and the packet headers are made accessible to the application.
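To make the connectionless case concrete, here is a small sketch of a datagram (UDP) exchange between two sockets in the same process. It assumes a working loopback interface; the names receiver and sender are illustrative, and the timeout is only there so the example cannot block forever.

```python
import socket

# Receiver: a datagram (UDP) socket bound to loopback; port 0 means any free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)   # give up after 2 seconds if nothing arrives

# Sender: no connect() is needed; each sendto() names its destination.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", receiver.getsockname())

# recvfrom() returns the payload and the address it came from.
data, source = receiver.recvfrom(1024)
print(data, "from", source)

sender.close()
receiver.close()
```

Unlike a stream socket, there is no listen() or accept() step here: every datagram is addressed and delivered independently.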
Python “socket” module
Python’s socket module provides access to the BSD socket interface, which is available on all modern operating systems: Unix, Mac OS X, Linux, Windows, Solaris, OpenBSD and others.
The Python interface is a straightforward transliteration of the Unix system call and library interface for sockets to Python’s object-oriented style: the socket() function returns a socket object whose methods implement the various socket system calls. Parameter types are somewhat higher-level than in the C interface: as with read() and write() operations on Python files, buffer allocation on receive operations is automatic, and buffer length is implicit on send operations.
Without creating a socket object we can obtain some important information using the socket library functions.
1. socket.gethostname()
Returns, as a string, the hostname of the machine where the Python interpreter is currently executing.
>>> socket.gethostname()
'DeepakD-Laptop'
2. socket.gethostbyname(hostname)
Given the hostname, returns its IPv4 address.
>>> socket.gethostbyname('DeepakD-Laptop')
'192.168.56.1'
Of course, you can use gethostname() to return the hostname required for gethostbyname().
>>> socket.gethostbyname(socket.gethostname())
'192.168.56.1'
3. socket.gethostbyname_ex(hostname)
Returns a tuple (hostname, aliaslist, ipaddrlist) where hostname is the primary hostname responding to the given ip_address, aliaslist is a list of alternative host names for the same address, and ipaddrlist is a list of IPv4 addresses for the same interface.
>>> socket.gethostbyname_ex(socket.gethostname())
('DeepakD-Laptop', [], ['192.168.56.1', '192.168.0.2'])
4. socket.gethostbyaddr(IP Address)
Returns a tuple (hostname, aliaslist, ipaddrlist) just like gethostbyname_ex().
>>> socket.gethostbyaddr('192.168.56.1')
('DeepakD-Laptop', [], ['192.168.56.1'])
Refer to the Python documentation for the socket module (“Low-level networking interface”) for the full list of built-in functions.
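Name lookups can fail, for example when a hostname is unknown or no resolver is reachable. The sketch below wraps gethostbyname() in a small helper; the name resolve is hypothetical and not part of the socket module, and the code assumes Python 3, where the lookup errors are subclasses of OSError.

```python
import socket

def resolve(hostname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None

# A string that is already an IPv4 address is returned unchanged.
print(resolve("127.0.0.1"))

# The local hostname may or may not be resolvable, depending on the machine's setup.
print(resolve(socket.gethostname()))
```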
Creating a Socket
# Import the socket module
>>> import socket
# Instantiate the socket object
>>> s = socket.socket()
>>> type(s)
<class 'socket._socketobject'>
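The socket() constructor takes two main parameters that define the nature of the socket. Both are optional: when omitted, as in the call above, the address family defaults to AF_INET and the socket type to SOCK_STREAM.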
1. Address Family
2. Socket Type
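As a brief illustration of these two parameters, the sketch below creates sockets with explicit and with default arguments and inspects their family and type attributes. It assumes Python 3, where socket.socket() with no arguments uses AF_INET and SOCK_STREAM; the variable names are illustrative.

```python
import socket

tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # IPv4, stream (TCP)
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)    # IPv4, datagram (UDP)
default = socket.socket()                                 # no arguments given

# Record the attributes before closing the sockets.
kinds = (tcp.type, udp.type)
default_matches_tcp = (default.family == tcp.family, default.type == tcp.type)

print(tcp.family, tcp.type)
print(udp.family, udp.type)
print(default_matches_tcp)

for s in (tcp, udp, default):
    s.close()
```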
Computer processes that provide application services are referred to as servers, and create sockets on start up that are in listening state. These sockets are waiting for initiatives from client programs.
A server socket involves the following tasks:
1. Create a STREAM(TCP) or DGRAM(UDP) socket.
2. Bind the socket to the IP Address and Port.
3. Listen for connections made to the socket. Set the maximum number of pending connections the operating system should queue (the backlog).
4. Wait for connections in an infinite loop.
5. Establish the connection for incoming client requests.
6. Go back to listening for next client request.
Let’s write a simple server that sends the current time string to the client:
# server.py
import socket
import time

# create a socket object
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# get local machine name
host = socket.gethostname()
port = 9999

# bind to the port
serversocket.bind((host, port))

# queue up to 5 requests
serversocket.listen(5)

while True:
    # establish a connection
    clientsocket, addr = serversocket.accept()
    print("Got a connection from %s" % str(addr))
    currentTime = time.ctime(time.time()) + "\r\n"
    clientsocket.send(currentTime.encode('ascii'))
    clientsocket.close()
A quick summary of socket methods:
- socket.socket(): Create a new socket using the given address family, socket type and protocol number.
- socket.bind(address): Bind the socket to address.
- socket.listen(backlog): Listen for connections made to the socket. The backlog argument specifies the maximum number of queued connections and should be at least 0; the maximum value is system-dependent (usually 5), the minimum value is forced to 0.
- socket.accept(): The return value is a pair (conn, address) where conn is a new socket object usable to send and receive data on the connection, and address is the address bound to the socket on the other end of the connection.
At accept(), a new socket is created that is distinct from the named socket. This new socket is used solely for communication with this particular client.
For TCP servers, the socket object used to receive connections is not the same socket used to perform subsequent communication with the client. In particular, the accept() system call returns a new socket object that’s actually used for the connection. This allows a server to manage connections from a large number of clients simultaneously.
- socket.send(bytes[, flags]): Send data to the socket. The socket must be connected to a remote socket. Returns the number of bytes sent. Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data.
- socket.close(): Mark the socket closed. All future operations on the socket object will fail. The remote end will receive no more data (after queued data is flushed). Sockets are automatically closed when they are garbage-collected, but it is recommended to close() them explicitly.
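Because send() may transmit only part of the data, the "attempt delivery of the remaining data" step is usually written as a loop. The sketch below shows one way to do it; send_all is a hypothetical helper name used for illustration, and the demonstration uses socket.socketpair() to get two already-connected sockets. The standard library also offers socket.sendall(), which performs this retrying for you.

```python
import socket

def send_all(sock, data):
    """Call send() repeatedly until every byte of data has been handed to the OS."""
    total_sent = 0
    while total_sent < len(data):
        sent = sock.send(data[total_sent:])
        if sent == 0:
            raise RuntimeError("socket connection broken")
        total_sent += sent
    return total_sent

# Demonstration on a pair of already-connected sockets.
left, right = socket.socketpair()
message = b"x" * 1000
count = send_all(left, message)
left.close()

received = b""
while True:
    chunk = right.recv(4096)
    if not chunk:   # empty bytes: the peer has closed the connection
        break
    received += chunk
right.close()

print(count, len(received))
```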
Note that the server socket doesn’t receive any data. It just produces client sockets. Each clientsocket is created in response to some other client socket doing a connect() to the host and port we’re bound to. As soon as we’ve created that clientsocket, we go back to listening for more connections.
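The distinction between the listening socket and the sockets it produces can be seen in a short experiment. This is a sketch under the assumption of a working loopback interface; it connects two clients from the same process, and all names are illustrative.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
listener.listen(5)
address = listener.getsockname()

# Two clients connect to the same listening address.
clients = [socket.create_connection(address) for _ in range(2)]
accepted = [listener.accept() for _ in clients]   # list of (conn, peer_address)

for conn, peer in accepted:
    # Same local address as the listener, but a different remote peer each time.
    print(conn.getsockname(), "<->", peer)

same_local = all(conn.getsockname() == address for conn, _ in accepted)
distinct_peers = accepted[0][1] != accepted[1][1]
distinct_objects = (accepted[0][0] is not listener
                    and accepted[0][0] is not accepted[1][0])

for conn, _ in accepted:
    conn.close()
for c in clients:
    c.close()
listener.close()
```

Each accept() call yields a separate socket object tied to one client, while the listener itself stays available for further connections.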
# client.py
import socket

# create a socket object
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# get local machine name
host = socket.gethostname()
port = 9999

# connection to hostname on the port.
s.connect((host, port))

# Receive no more than 1024 bytes
tm = s.recv(1024)
s.close()

print("The time got from the server is %s" % tm.decode('ascii'))
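A single recv(1024) is enough for a short time string, but in general recv() may return only part of what the server sent. The sketch below is one possible variation, not the original author's code: it runs a one-shot time server in a background thread and has the client read until the server closes the connection. The helper names serve_once and recv_until_closed are hypothetical, and the example uses the loopback address with an OS-chosen port instead of port 9999.

```python
import socket
import threading
import time

def serve_once(listener):
    """Accept one client, send it the current time string, then close."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall((time.ctime() + "\r\n").encode("ascii"))

def recv_until_closed(sock):
    """Collect data until recv() returns b'', meaning the peer closed the connection."""
    chunks = []
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # loopback, any free port
listener.listen(1)
worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

with socket.create_connection(listener.getsockname()) as client:
    reply = recv_until_closed(client).decode("ascii")

worker.join()
listener.close()
print("The time got from the server is %s" % reply)
```

Reading until an empty result works here because the server closes the connection after sending; a protocol that keeps connections open would need a length prefix or a delimiter instead.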
The output looks like this :
The overall conversation flow between the server socket and the client socket is shown in the following diagram.