Joey's TCP/IP Socket Primer

A Beginner's Guide To Socket Programming

Written by Jose Mari Reyes
Copyright(C)1999, Jose Mari Reyes

email: jet_reyes@jetemail.net
Manila, Philippines

No part of this document may be used or reproduced in any form or by any means, or stored in
a database or retrieval system, without prior written permission of the author. Making copies
of any part of this document for any purpose other than your own personal use is prohibited.

This document is presented as is, without warranty of any kind, either express or implied,
respecting the contents of this document. The author disclaims all types of liabilities, loss
or damage caused or alleged to be caused directly or indirectly by this document.

Introduction

As the Internet grows in popularity, network-enabled applications have sprung up like wild grass. What most of them have in common is that they use one of the Internet's hardest-working Application Programming Interfaces (APIs), the Socket API. This article gives you enough knowledge to create a network-enabled application using the socket interface. I do not claim to be a guru on this topic; if you think I have erred, please correct me by email, or try to live with it. All the code examples in this article are written in C. I chose C not because I like it but because, in my opinion, it is more flexible than other programming languages. I will try to come up with versions of this article for other popular languages such as Java, Perl, and VB, if I find the time.

If you are browsing the Web (you are probably doing it right now), checking or sending email, or downloading a file, the application you are using was programmed with the Socket API. Web browsers, FTP clients, email programs, Telnet, Gopher, and so on are all built on sockets.

The sample source code was compiled on UNIX operating systems (Linux, Digital, Solaris, etc.) with the good ol' cc compiler. If you don't have cc on your system, you can try GNU's gcc compiler instead, if it runs on your O/S.

Winsock is the Microsoft Windows counterpart of this API. The code presented here will not run on that platform without modification, but most of the function calls are similar.

Disclaimer

As I said, I am not a TCP, socket, or Unix guru; I'm just a plain guy who wants to share what I know. I did my best to keep this document informative, but even the best can have bugs. All the source code is provided as is. I disclaim all liability for any damage that may arise from the use of the source code provided in this document.

Client/Server as a model

Remember going to the local fast-food place to order a burger? The first thing you do is get the attention of a food server. When the server notices you, they ask for your order, and you ask for a burger and a soda. But what if the server replies, "Sir, we're already out of burger buns"? You walk out, right? Most network applications mimic this same transaction, and that is how the Client/Server model was born. In this model you, the purchaser, are the Client, and our good food server is the Server. The Server is always waiting for a connection from a Client. Once the connection has been made, the Server is ready to accept orders from the Client, and a conversation between the two takes place. This is the same model your Web browser follows. And of course, if the server does hand you the burger and soda you ordered, you reply by paying for your food and saying the magic word, "T.Y.".

Examples of Client/Server Systems are:

Telnet
ftp
Finger
NFS
World Wide Web
SMTP

What is TCP/IP?

TCP/IP (Transmission Control Protocol/Internet Protocol) is a set of protocols that lets computers connected to a network (the Internet, WANs, LANs, etc.) communicate with each other. It is also a set of rules for sending and receiving data over a heterogeneous network.

Internet Transport Protocol

The Internet has two main transport protocols: a connection-oriented one, TCP, and a connectionless one, UDP.

TCP was designed to provide a reliable end-to-end connection over an unreliable internetwork. TCP is defined in RFC 793; bug fixes and clarifications are detailed in RFC 1122, and extensions are in RFC 1323. A TCP connection is obtained by having both the sender and the receiver create end points called sockets. Each end point, or socket, has the address (IP) of the host and a 16-bit number called a port.

UDP is a connectionless protocol that allows applications to send encapsulated raw IP datagrams without having to establish a connection. UDP also uses socket end points.

The Socket

Each host must have an end point, and this end point is called a socket. A socket is a way to talk to other programs (interprocess communication) using a standard file descriptor. There are many kinds of sockets, for example DARPA Internet sockets, Unix sockets, and X.25 sockets, and there are probably more implementations depending on the operating system. For simplicity's sake, we are going to deal only with Internet sockets.

There are several types of Internet socket (stream sockets, datagram sockets, raw sockets, sequenced packet sockets, etc.). This document discusses the stream socket, also known as the connection-oriented socket.

Let's Get Started...

The application I am going to create must follow what was discussed above, so we need a client/server application. The simplest one I can think of is the echo server. An echo server is an application that listens on a specific port (usually port 7) for an incoming connection from a client and echoes back the data received from the client. First we must build our server. Our server needs a mechanism to create a pipe for the data stream to flow through, so our end point must be created; this is our socket. Below is the format of the socket function.

int socket(int domain, int type, int protocol);

The function requires us to include types.h and socket.h. The sockaddr_in structure used below is declared in netinet/in.h and exit() in stdlib.h, so our source looks like this:

#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define ECHO_PORT 7

int sockfd;
struct sockaddr_in server;

if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
      exit(-1);

      server.sin_family = AF_INET;
      server.sin_addr.s_addr = INADDR_ANY;
      server.sin_port = htons((u_short) ECHO_PORT);

The source above creates a socket descriptor by calling the socket function. The domain should be AF_INET. The second argument tells the kernel what type of socket this is: SOCK_STREAM in our case; you can replace it with SOCK_DGRAM for a datagram socket or SOCK_RAW for a raw socket. The third argument selects the protocol; 0 lets the system choose the default for the given type (TCP for SOCK_STREAM). The return value is the socket descriptor. If an error occurs, socket returns -1, and you can retrieve a more specific error code from the global variable errno.
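As a small illustration of the errno check, here is a minimal sketch that wraps socket() and prints the reason when it fails. The helper name open_stream_socket is my own invention for this example, not part of the Socket API.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: create a TCP (stream) socket.
 * Returns the descriptor, or -1 after printing the reason. */
int open_stream_socket(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0) {
        /* errno was set by socket(); strerror() turns it into text */
        fprintf(stderr, "socket: %s\n", strerror(errno));
        return -1;
    }
    return fd;
}
```

A caller would write something like sockfd = open_stream_socket(); and give up if the result is negative.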

The network address and port are held in the sockaddr structure. The definition of sockaddr_in ("in" for "Internet") is:

    struct sockaddr_in {
           short int            sin_family;                   /*address family*/
           unsigned short int   sin_port;                     /*port number*/
           struct in_addr       sin_addr;                     /*Internet address*/
           unsigned char        sin_zero[8];                  /*same size as sockaddr*/
    };

The sockaddr structure lets an application specify a local or remote end-point address to which to connect a socket. Notice the htons: sin_port must be in Network Byte Order (big-endian). The host byte order may vary from system to system, so watch out for this (see your man pages).
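To make the byte-order point concrete, here is a short sketch that fills in a sockaddr_in and reads the port back. The helper names fill_any_address and port_of are assumptions made up for this example; htons, htonl, and ntohs are the standard conversion functions.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: fill in an address that means
 * "any local interface, given port". */
void fill_any_address(struct sockaddr_in *sa, unsigned short port)
{
    memset(sa, 0, sizeof(*sa));          /* also clears sin_zero */
    sa->sin_family = AF_INET;
    sa->sin_addr.s_addr = htonl(INADDR_ANY);
    sa->sin_port = htons(port);          /* host order -> network order */
}

/* Hypothetical helper: read the port back in host byte order,
 * for example before printing it. */
unsigned short port_of(const struct sockaddr_in *sa)
{
    return ntohs(sa->sin_port);          /* network order -> host order */
}
```

Always converting on the way in (htons) and on the way out (ntohs) keeps the code correct on both little-endian and big-endian machines.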

Binding and Listening

if (bind(sockfd, (struct sockaddr *) &server, sizeof(server)))
      exit(-1);

listen(sockfd, 5);

The bind function associates a local address with a socket. Calling bind is necessary if the application is going to listen; it is not necessary if the application is only going to connect. After binding the local address, our echo server is ready to listen for an incoming connection, so we call the listen function. The listen function sets up the socket to listen for incoming connections; its backlog argument is the number of pending connections that may wait in the queue. Below are the definitions of the two functions:

int bind(int socket, const struct sockaddr *addr, socklen_t addrlen);
int listen(int socket, int backlog);
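The socket, bind, and listen steps can be gathered into one function. The following is only a sketch under my own assumptions: the helper name listen_on is hypothetical, and passing port 0 (which asks the kernel to pick any free port) is a convenience for experimenting, since ports below 1024 such as the echo port 7 usually require root privileges.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: create a stream socket, bind it to the given
 * port on every local interface, and start listening.
 * Returns the listening descriptor, or -1 on any failure. */
int listen_on(unsigned short port, int backlog)
{
    struct sockaddr_in server;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_ANY);
    server.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *) &server, sizeof(server)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);              /* do not leak the descriptor */
        return -1;
    }
    return fd;
}
```

Unlike the fragment above, this version also checks the return value of listen and closes the socket when a step fails.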

Accepting the Incoming...

After calling listen, our echo server is ready to accept an incoming connection. This is achieved by calling the function:

int accept(int socket, struct sockaddr *addr, socklen_t *addrlen);

Below is the working source that performs our accept:

int       clisock;               /*add these three declarations*/
socklen_t addrlen;               /*at the top of the source file,*/
struct sockaddr_in client;       /*together with the previous ones*/

addrlen = sizeof(client);
clisock = accept(sockfd, (struct sockaddr *) &client, &addrlen);

Notice the sockaddr_in client structure. After accept returns, it holds the network address of the other end point of the connection. On current systems addrlen is declared as socklen_t; older systems used a plain int.
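Here is one way the client structure might be turned into printable text. This is a sketch only: format_peer is a made-up name, and it assumes an IPv4 address such as the one accept() fills in for an AF_INET socket.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: write "a.b.c.d:port" into buf for the address
 * that accept() stored in the client structure.
 * Returns the value of snprintf(), i.e. the length of the text. */
int format_peer(const struct sockaddr_in *client, char *buf, size_t size)
{
    /* inet_ntoa() returns a pointer to a static buffer, so the text is
     * copied out right away; ntohs() converts the port to host order. */
    return snprintf(buf, size, "%s:%u",
                    inet_ntoa(client->sin_addr),
                    (unsigned) ntohs(client->sin_port));
}
```

A server could call this right after accept and log the result before it starts echoing.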

Sending and Receiving...

send() and recv() are the two main functions for communicating over a connected stream socket. If you are going to use a datagram socket, use the related functions sendto() and recvfrom(). send and recv have the format:

ssize_t send(int socket, const void *msg, size_t len, int flags);
ssize_t recv(int socket, void *buf, size_t len, int flags);

Sample code that implements the send:

int msglen, bytes_sent;
const char *msg = "The quick brown fox";

msglen = strlen(msg);
bytes_sent = send(clisock, msg, msglen, 0);

The send function returns the number of bytes sent, or -1 if there is an error. On a stream socket the number sent may be less than msglen. The recv function uses a similar format, except that it returns the number of bytes received; a return value of 0 means the other end has closed the connection.
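Because send on a stream socket may accept fewer bytes than you asked for, a common remedy is to loop until everything has gone out. The sketch below shows the idea; send_all is a name I made up, and it does not try to handle interrupted calls or non-blocking sockets.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: keep calling send() until all len bytes have
 * been handed to the kernel. Returns 0 on success, -1 on error. */
int send_all(int fd, const char *buf, size_t len)
{
    size_t sent = 0;

    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);
        if (n < 0)
            return -1;          /* errno says why */
        sent += (size_t) n;     /* may be less than we asked for */
    }
    return 0;
}
```

In the sample above, the single send call could be replaced by send_all(clisock, msg, msglen).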

Closing the Socket Connection

There are two ways to close an active socket. The first is to use the close() function and the other is to use the shutdown() function. The difference between the two is that close() closes the socket, so that any attempt to read or write to the socket at the other end will get end-of-file or an error. shutdown() provides a little more control over how the socket closes. The synopses for these functions are:

int close(int socket);
int shutdown(int socket, int how);

The how argument of the shutdown function is one of the following:

0 - Further receives are disallowed
1 - Further sends are disallowed
2 - Further sends and receives are disallowed (more like close())

Both functions return 0 on success and -1 if an error occurred.
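A typical use of the extra control is the "half close": you announce that you are done sending but keep reading whatever the other side still has to say. The following is a sketch with invented helper names (finish_sending, drain_until_closed); SHUT_WR is the symbolic constant that corresponds to how = 1.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: tell the peer we have nothing more to send,
 * while leaving our side open for reading its remaining data. */
int finish_sending(int fd)
{
    return shutdown(fd, SHUT_WR);   /* same as how = 1 */
}

/* Hypothetical helper: read and discard whatever the peer still sends.
 * Returns the number of bytes seen before it closed, or -1 on error. */
long drain_until_closed(int fd)
{
    char buf[256];
    long total = 0;
    ssize_t n;

    while ((n = recv(fd, buf, sizeof(buf), 0)) > 0)
        total += n;
    return (n < 0) ? -1 : total;    /* n == 0 means the peer is done */
}
```

Note that shutdown does not release the descriptor; you still call close() when you are finished with the socket.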

The Echo Server...

Below is the completed source for our echo server. This is the simplest implementation I can provide: the application listens on the well-known port and accepts an incoming connection from a client, and if the client disconnects the server terminates. You can modify this. For example, if you want the echo server to accept more than one connection at a time, you can try the good old Unix function fork() (see your man pages), create a separate thread for each socket descriptor, or use the select() function. It's up to you; the possibilities are endless. You can even use the source code as the core of a chat server.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define ECHO_PORT 7     /*ports below 1024 usually require root*/

int main(void)
{
    int    sockfd, clisock;
    struct sockaddr_in server, client;
    socklen_t addrlen;
    char   buffer[1024];
    int    bytes_recv, msglen;

    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        exit(-1);

    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = INADDR_ANY;
    server.sin_port = htons((u_short) ECHO_PORT);

    if (bind(sockfd, (struct sockaddr *) &server, sizeof(server)) < 0)
        exit(-1);

    if (listen(sockfd, 5) < 0)
        exit(-1);

    addrlen = sizeof(client);
    clisock = accept(sockfd, (struct sockaddr *) &client, &addrlen);
    if (clisock < 0)
        exit(-1);

    do {
        /*leave room for a terminating '\0' so printf("%s") is safe*/
        memset(buffer, '\0', sizeof(buffer));
        bytes_recv = recv(clisock, buffer, sizeof(buffer) - 1, 0);
        if (bytes_recv < 1) {       /*client closed or error*/
            close(clisock);
            close(sockfd);
            exit(0);
        }

        printf("\nrecv:%s", buffer);
        msglen = bytes_recv;
        send(clisock, buffer, msglen, 0);

    } while (1 == 1);

    return 0;   /*not reached*/
}/*end of main()*/

Client Connects to the Server

Our server application is now ready to accept connections, so we need a client application to complete our client/server pair. To establish a stream connection, our client follows much the same procedure as our server: it needs to create a socket, fill in the network address, and then connect. Unlike our server, our client doesn't need to bind, listen, or accept. Below is the synopsis for connect():

int connect(int socket, const struct sockaddr *addr, socklen_t addrlen);

Again, this function will return -1 on error.

Below is the full source code for our Client application:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define ECHO_PORT 7
#define ECHOSERVER "203.187.192.84"

int main(void)
{
    int                  sock;
    struct sockaddr_in   sa;
    struct hostent       *hp;
    char                 buffer[1024];
    int                  msglen, bytes_recv;

    hp = gethostbyname(ECHOSERVER);
    if (hp == NULL)                 /*lookup failed*/
        exit(-1);

    memset(&sa, 0, sizeof(sa));
    memcpy(&sa.sin_addr, hp->h_addr_list[0], hp->h_length);
    sa.sin_family = hp->h_addrtype;
    sa.sin_port = htons((u_short) ECHO_PORT);

    if ((sock = socket(hp->h_addrtype, SOCK_STREAM, 0)) < 0)
        exit(-1);                   /*nothing to close yet*/

    if (connect(sock, (struct sockaddr *) &sa, sizeof(sa)) < 0)
    {
        close(sock);
        exit(-1);
    }

    do {
        printf("\n>>");
        fflush(stdout);
        /*fgets() replaces the unsafe gets(); leave room for "\r\n"*/
        if (fgets(buffer, sizeof(buffer) - 2, stdin) == NULL)
            break;
        buffer[strcspn(buffer, "\n")] = '\0';   /*strip the newline*/
        if (strcmp(buffer, "end") == 0)
            break;
        strcat(buffer, "\r\n");
        msglen = strlen(buffer);
        send(sock, buffer, msglen, 0);

        bytes_recv = recv(sock, buffer, sizeof(buffer) - 1, 0);
        if (bytes_recv <= 0)        /*error or server closed*/
            break;
        buffer[bytes_recv] = '\0';  /*terminate before printing*/
        printf("\nrecv:%s", buffer);
    } while (1 == 1);

    close(sock);
    return 0;
}/* end of main*/

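Notice the function gethostbyname. gethostbyname() gets the host information corresponding to a host name. It returns a pointer to a hostent structure that holds the host's official name, aliases, address type, and addresses, or NULL if the lookup fails. The synopsis for this function is:

struct hostent *gethostbyname(const char *name);

Do not confuse it with gethostname(), which returns the name of the computer your program is running on:

int gethostname(char *hostname, size_t size);

You can pass the result of gethostname() to gethostbyname() to determine your local IP address.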
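The lookup step can be isolated in a small function that also checks for failure. This is a sketch under my own assumptions: the name resolve_ipv4 is hypothetical, and only the first IPv4 address is used. On common implementations gethostbyname also accepts a dotted-decimal string such as "127.0.0.1" and converts it without a lookup.

```c
#include <string.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: look up host and store its first IPv4 address
 * in *out. Returns 0 on success, -1 if the lookup fails. */
int resolve_ipv4(const char *host, struct in_addr *out)
{
    struct hostent *hp = gethostbyname(host);

    /* NULL means the name could not be resolved */
    if (hp == NULL || hp->h_addrtype != AF_INET ||
        hp->h_addr_list[0] == NULL)
        return -1;

    memcpy(out, hp->h_addr_list[0], sizeof(*out));
    return 0;
}
```

The address stored in *out is already in network byte order, so it can be copied straight into sin_addr.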

Who's at the other end?

If you need to know the address of the machine connected at the other end, call the function getpeername(). This function tells you who is at the other end of the socket. Synopsis:

int getpeername(int socket, struct sockaddr *addr, socklen_t *addrlen);

Once you have their address, you can use inet_ntoa() to print it or gethostbyaddr() to get more information.
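Put together, getpeername and inet_ntoa might be used like this. It is a sketch only; peer_name is an invented name and the code assumes an IPv4 (AF_INET) socket.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: describe the remote end of a connected IPv4
 * socket as "a.b.c.d:port". Returns 0 on success, -1 on error
 * (for example when the socket is not connected). */
int peer_name(int fd, char *buf, size_t size)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);

    if (getpeername(fd, (struct sockaddr *) &peer, &len) < 0)
        return -1;

    snprintf(buf, size, "%s:%u",
             inet_ntoa(peer.sin_addr), (unsigned) ntohs(peer.sin_port));
    return 0;
}
```

This is handy when a descriptor has been passed around and the address returned by accept is no longer at hand.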

Blocking

Many of the functions mentioned so far are blocking in nature, which means they wait, or sleep, until something happens. A good example is the ANSI function gets, which waits, or blocks, until you press Return. accept, recv, and many others are blocking functions. When you develop a network-enabled application, there will be times when you need these functions not to block. You can make a socket non-blocking by calling fcntl().

fcntl() is declared in fcntl.h, together with the related header unistd.h. You must include these two headers for fcntl to work properly. Below is the source for this method.

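int DoNonBlock(int sockfd)
{
    int flags;

    /*keep the flags already set and add O_NONBLOCK*/
    if ((flags = fcntl(sockfd, F_GETFL, 0)) < 0)
        return -1;
    if (fcntl(sockfd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    return 0; /*no error*/
}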
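Once a socket is non-blocking, recv returns -1 with errno set to EAGAIN or EWOULDBLOCK when no data is waiting, and the caller has to tell that apart from a real error. The sketch below shows one way to do it; recv_nowait, set_nonblocking, and the WOULD_BLOCK code are all names I made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>

#define WOULD_BLOCK (-2)   /* hypothetical return code for this sketch */

/* Same idea as DoNonBlock(), repeated so the sketch stands alone. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: recv() on a non-blocking socket.
 * Returns the byte count, 0 if the peer closed, WOULD_BLOCK if no
 * data is available yet, or -1 on a real error. */
int recv_nowait(int fd, char *buf, size_t size)
{
    ssize_t n = recv(fd, buf, size, 0);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return WOULD_BLOCK;     /* nothing to read right now */
    return (int) n;
}
```

A program that gets WOULD_BLOCK can go do other work and try again later instead of sleeping inside recv.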

Synchronous I/O Multiplexing

We mentioned earlier modifying our echo server to accept multiple connections. This can be achieved in several ways. One good way is the fork() function, which creates a new process. Another is to use threads; a good standard thread package is pthread (see your man pages about pthread). A third is the tried and tested select() function, which lets your application monitor several sockets at the same time. The synopsis of select():

int select(int numfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
struct timeval *timeout);

Don't forget to include the following headers: sys/time.h, sys/types.h, unistd.h.

numfds specifies the range of descriptors to be tested: descriptors 0 through numfds - 1 are examined, so pass the highest descriptor number plus one. readfds lists the descriptors to check for readiness to read, writefds the descriptors to check for readiness to write, and exceptfds the descriptors to check for exceptional conditions. On return, each set contains only the descriptors that are ready. The timeout argument changes the behavior of the select function.

The values for the timeout argument are the following:

NULL : returns only when one of the specified descriptors is ready for I/O.
tv_sec = 0 and tv_usec = 0 : returns immediately after checking the specified descriptors.
tv_sec != 0 or tv_usec != 0 : returns when one of the specified descriptors is ready for I/O, but does not block beyond the given amount of time.

Below are some related macros for managing the descriptor sets:

FD_ZERO(fd_set *fdset);             - clears all bits in the fdset
FD_SET(int fd, fd_set *fdset);      - turns on the bit of fd in the fdset
FD_CLR(int fd, fd_set *fdset);      - turns off the bit of fd in the fdset
FD_ISSET(int fd, fd_set *fdset);    - tests the bit of fd in the fdset

Below is the timeval structure:

struct timeval {
long tv_sec; /*seconds*/
long tv_usec; /*microseconds*/
};
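To tie the pieces together, here is a minimal sketch that waits for a single descriptor to become readable, with a timeout. The helper name wait_readable is my own assumption; a real server would put several descriptors into the set and test each one with FD_ISSET afterwards.

```c
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd has data to read or the timeout
 * expires. Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long seconds, long useconds)
{
    struct timeval tv;
    fd_set readfds;
    int ready;

    tv.tv_sec = seconds;
    tv.tv_usec = useconds;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* first argument is the highest descriptor number plus one */
    ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;
    return (ready > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

Calling wait_readable(clisock, 5, 0) before recv, for example, would give the client five seconds to send something before the server moves on.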

select is a system call, and it can be used for more than sockets. I personally used the select system call to create my microsecond-precision sleep function, which I call jusleep. The source code of jusleep.c is given at the end of this article.

References

Computer Networks by Andrew S. Tanenbaum. Published by Prentice Hall. ISBN 981-307646-1.

TCP/IP Illustrated, Vol 1-3 by W. Richard Stevens and Gary R. Wright. Published by Addison Wesley. ISBNs:0-201-63346-9, 0-201-63354-X, 0-201-63495-3.

Advanced Programming in the UNIX Environment by W. Richard Stevens. Published by Addison-Wesley. ISBN 0-201-88872-6.

RFCs

RFC-791 - The Internet Protocol
RFC-793 - The Transmission Control Protocol
RFC-768 - The User Datagram Protocol

WEB

Beej's Guide to Network Programming

Unix-socket-faq for network programming

An Introductory 4.4BSD Interprocess Communication Tutorial

JUSLEEP.C

Below is my sample use of the system function select(). Please read the disclaimer before using the source.

/********************************************************************

     jusleep.c
     micro sleep implementation
     written by Jose Mari Reyes

     The function will wait up to the number of seconds or
     microseconds, you can override the sleeping process by
     hitting the enter key on line buffered terminal.

     returns -1 if override or 0 if complete

     param: seconds, micro seconds, override (0 - override off,
            1 - override on)

********************************************************************/
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define STDIN 0

int jusleep(int seconds, int u_seconds, int override)
{
    struct timeval tv;
    fd_set fds;
    int ready;

    tv.tv_sec = seconds;
    tv.tv_usec = u_seconds;

    FD_ZERO(&fds);

    if (override == 1) {
        /*watch stdin so that the enter key can end the sleep*/
        FD_SET(STDIN, &fds);
        ready = select(STDIN + 1, &fds, NULL, NULL, &tv);
    } else {
        /*no descriptors to watch: select() acts as a pure timer*/
        ready = select(0, NULL, NULL, NULL, &tv);
    }

    if (ready > 0 && FD_ISSET(STDIN, &fds))
        return -1;
    else
        return 0;
}
