FTP Client Explained (In Java)



The File Transfer Protocol

The File Transfer Protocol (FTP) is one of the oldest TCP/IP-based protocols for transferring files. It is still widely used and is surprisingly advanced; most clients and servers do not implement all of its features. This document explains the internals of the protocol. It grew out of my interest in developing an FTP client library in Java. If I do a good job on this document, the reader should be able to develop a fully usable FTP client on the platform of his/her choice. The various interactions are explained here with excerpts of Java code that can be used to experiment with a live FTP server. This document is not (and will never be) as comprehensive as the FTP specification (RFC 959). If you are an expert on the protocol and need specific details, this may not be the place to be. However, (good) criticism and corrections are always welcome.

FTP Servers

The standard port allotted to the FTP service by IANA (Internet Assigned Numbers Authority) is 21. Unlike other protocols such as telnet and SSH, FTP uses two connections for communication. The socket connection on port 21 is called the control connection. It is used to send commands to the FTP server. Some commands initiate a data transfer, which could be the listing of a directory or the contents of a file. The data transfer always takes place on a different port. The socket connection used to transfer data is called the data connection. The port used for the data connection is dynamic: the server and the client negotiate it whenever a data transfer is required.


    ________                         _______ 
   |        | ctrl connection       |       |
   |Client X|<--------------------->|Server |
   |        |   (port 21)           |       |
   |        |                       |       |
   |        | data connection       |       |
   |       X|<--------------------->|X      |
   |________|                       |_______|


The client issues commands over the control connection, to which the server responds with a specific action and a response. The response contains a response code and a text message. The response code tells the client whether the server performed the intended action successfully. The interaction between the client and the server is therefore synchronous: the client issues a command and waits for the response before issuing the next command. (The client could possibly send commands without waiting for responses, but the effect would be unpredictable. The protocol does not support asynchronous interaction between the client and the server.)

Initiation

The FTP interaction begins with the client making a socket connection to the FTP server on the control port (usually 21). The server responds with a response code and a greeting terminated by \r\n. The client needs to read the entire response (containing the response code and greeting) before it starts sending commands. Depending on the verbosity of the greeting, the response could span multiple lines. Each line ends with \r\n. The first line of a multi-line response begins with the response code followed immediately by a hyphen ('-'). Servers commonly repeat the code and the hyphen on the lines in between, although RFC 959 does not require it. The last line begins with the same response code as the first line, followed by a space character.

For example, a terse server responds to the client's connection with a single-line greeting like the one below:


220 Welcome to a very silent FTP server. You are not so Welcome

While a more chatty FTP server yacks away like below:

220- Welcome to our Ftp server. We are not only chatty but
220- like to send messages that are long and need the client
220- to waste time parsing unnecessary text. We love our
220  job!!!!

These are not actual responses from any server; I just made them up to explain the format. Below is a code excerpt that connects to an FTP server and prints the greeting on the screen. It won't do anything more useful.


import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;

public class VerySimpleFtpConnection
{
    public static void main(String[] args)
    {
        if(args.length != 1)
        {
            System.out.println("Usage: java VerySimpleFtpConnection <host>");
            System.exit(1);
        }
        
        Socket controlSocket = null;
        BufferedReader reader = null;
        String host = args[0];


        try
        {
            // Open the client socket to the server
            controlSocket = new Socket(host, 21);

            //construct the streams required
            reader = new BufferedReader(
                new InputStreamReader(
                    controlSocket.getInputStream()));
            String line = null;
            String responseCode = null;

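            do
            {
                line=reader.readLine();
                if(line == null)
                    break; // the server closed the connection before finishing the greeting
                // the first three characters of the first line are the response code
                if(responseCode == null && line.length() >= 3)
                    responseCode=line.substring(0,3);
                System.out.println(line);
            }
            while( !(responseCode != null && line.length() > 3 &&
                        line.startsWith(responseCode) && 
                        line.charAt(3) == ' '));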
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
        finally
        {
            try
            {
                if(reader != null)
                    reader.close();
                if(controlSocket != null)
                    controlSocket.close();
            }
            catch(Exception e)
            {
            }
        }
    }

}

If you ignore the portions of the code that do the mundane clean-up of sockets and streams, there is really not much going on. The substantive part of the code only opens a socket and reads the input stream obtained from it. The data in the stream is read one line at a time in the do-while loop until the last line. Notice that the last line is recognised by its response code, which is the same as that of the first line, followed by a space character.

Also notice that we didn't try to read the stream until end of file (when BufferedReader.readLine() returns null). That would not happen until the control connection is terminated, for instance with one of the termination commands (more on this later). Here we simply disconnect the socket from the client side instead of sending a command that instructs the server to close the connection.
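The reading loop can also be tried without a live server by feeding it canned text. The sketch below is only an illustration under that assumption: ReplyReaderSketch and readReply are hypothetical names that do not appear in the example code, and the greeting is made up. It reads one complete reply at a time, whether it has one line or several.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original example code.
class ReplyReaderSketch
{
    // Reads one complete FTP reply (single- or multi-line) and returns its lines.
    static List<String> readReply(BufferedReader reader) throws IOException
    {
        List<String> lines = new ArrayList<String>();
        String code = null;
        while (true)
        {
            String line = reader.readLine();
            if (line == null)
                throw new IOException("Connection closed in the middle of a reply");
            lines.add(line);
            if (code == null)
            {
                if (line.length() < 3)
                    throw new IOException("Reply does not start with a code: " + line);
                code = line.substring(0, 3);
            }
            // The reply ends at the line that starts with the code followed by a space
            if (line.length() > 3 && line.startsWith(code) && line.charAt(3) == ' ')
                return lines;
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Made-up server output: a three-line greeting followed by a one-line reply
        String canned = "220-Welcome to our Ftp server.\r\n"
                      + "220-We love our\r\n"
                      + "220 job!!!!\r\n"
                      + "331 password required\r\n";
        BufferedReader reader = new BufferedReader(new StringReader(canned));
        System.out.println(readReply(reader)); // the greeting, three lines
        System.out.println(readReply(reader)); // the next reply, one line
    }
}
```

Because the method stops at the last line of a reply rather than at end of file, calling it again picks up the next reply on the same stream, which is what a control connection needs.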

Client PI

One of the components in any FTP client controls the request-reply mechanism with the FTP server and manages the data transfers. The specification refers to it as the "Client Protocol Interpreter" or "Client PI". The client PI issues the FTP commands to the server, then parses and interprets the responses. It also opens data connections for data transfers. The VerySimpleFtpConnection class in the section above is a client PI in the making. Any client PI implementation needs to be aware of the FTP responses.

FTP Responses

The RFC contains a good deal of specifics on the responses. Below are some of the more significant rules:

  • Each response has a response code and a (helpful) text message.
  • If the text message is in multiple lines, each line that is not the last line will begin with a response code suffixed with a '-'. The last line will contain the same response code as the first line and will not have a '-' suffix. (Refer to the example in the previous section.)
  • The response codes indicate a variety of states of the server PI (Server Protocol Interpreter). The state could be "Preliminary Positive", "Positive Completion", "Positive Intermediate", "Transient Negative" or "Permanent Negative".

    Response Type         | Response Code | What it means
    Preliminary Positive  | 1xx           | The command issued by the client is accepted by the server and is being processed. If the command requires data connections from/to the client, it can be opened now (by the client).
    Positive Completion   | 2xx           | The command is successfully processed by the server.
    Positive Intermediate | 3xx           | The command is accepted by the server, but needs additional information from the client.
    Transient Negative    | 4xx           | The command was not accepted by the server and the requested action could not be performed. But this may be temporary. The client may retry the operation.
    Permanent Negative    | 5xx           | The command was not accepted by the server. Retries from the client may not help.

For a comprehensive list of FTP response codes, refer to the specification.

Simple Client PI

With the rules above, a simple but reasonably functional client PI can be built. We'll assume that a command has succeeded when the response code is 1XX, 2XX or 3XX, and that it has failed when the response code is 4XX or 5XX. The sample code VerySimpleFtpConnection has almost everything that is required by a client PI. The code excerpt below is from a refactored version of the same.


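    import java.io.*;
    import java.net.Socket;

    public class ClientPI 
    {
        private final String hostname;
        private final int port;
        private BufferedWriter writer;
        private BufferedReader reader;

        public ClientPI(String hostname, int port)
        {
            // Remember the server address; open() uses it to connect
            this.hostname = hostname;
            this.port = port;
        }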

        public void open() throws IOException
        {
            Socket socket = new Socket(hostname, port);
            writer = new BufferedWriter(
                      new OutputStreamWriter(socket.getOutputStream()));
            reader = new BufferedReader(
                      new InputStreamReader(socket.getInputStream()));
        }


        public FTPResponse sendCommand(String sendCommand) throws IOException
        {
            writer.write(sendCommand+"\r\n");
            writer.flush();
            return getReply();
        }


        public FTPResponse getReply() throws IOException
        {
            StringBuffer sb = new StringBuffer();
            String line = null;
            String resp = null;

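            do
            {
                line = reader.readLine();
                // null means the server closed the control connection;
                // the first line must hold the code and its separator
                if(line == null ||
                    (resp == null && line.length() < 4))
                    throw new IOException("Illegal FTP response! "+line);
                sb.append(line).append("\r\n");
                if(resp == null)
                    resp = line.substring(0,3);
            }
            while(!(line.length() > 3 && line.startsWith(resp) &&
                line.charAt(3) == ' '));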

            if(resp.startsWith("4") || resp.startsWith("5"))
                throw new FTPException("Ftp error! "+
                    resp+":"+sb);

            return new FTPResponse(resp, sb.toString());

        }

        .
        .
        .
        .
        // Other (more useful methods)

    }

FTPResponse is a value object that holds the FTP response text and response code. FTPException is an IOException. All the code does is open the streams from the socket that connects to the control port of the FTP server (usually 21). The streams are used to write commands and read responses. The sendCommand method writes the command to the output stream. Notice that all commands end with CR-LF as required by the specification. (But most FTP servers are forgiving in this regard.) The first line of a response is expected to carry at least the three-digit response code and the separator character that follows it. Any response that begins with '4' or '5' is assumed to be a failure. The command is not retried. (Most standard FTP clients do nothing more than this.)
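The two supporting classes are not shown in the excerpt. The sketch below is a guess at their shape, based only on how the excerpt uses them (a two-argument constructor, getResponseText(), and an exception built from a message). The getResponseCode, isPositive and isTransientFailure methods are assumptions added for illustration, and the downloadable source may well differ.

```java
import java.io.IOException;

// A guess at the value object used by ClientPI; the real class may differ.
class FTPResponse
{
    private final String responseCode;
    private final String responseText;

    public FTPResponse(String responseCode, String responseText)
    {
        this.responseCode = responseCode;
        this.responseText = responseText;
    }

    public String getResponseCode() { return responseCode; }

    public String getResponseText() { return responseText; }

    // Hypothetical convenience method: 1XX, 2XX and 3XX are the positive replies
    public boolean isPositive()
    {
        char first = responseCode.charAt(0);
        return first == '1' || first == '2' || first == '3';
    }

    // Hypothetical convenience method: 4XX replies may succeed on a retry
    public boolean isTransientFailure()
    {
        return responseCode.charAt(0) == '4';
    }
}

// Extends IOException, so callers that already handle IOException
// need no extra catch block.
class FTPException extends IOException
{
    public FTPException(String message)
    {
        super(message);
    }
}
```

A client that wanted to retry could catch FTPException and look at the code before giving up; the simple client PI above deliberately does not.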

Authentication

User authentication is limited to password verification. The client starts the user session with the command USER <userid>. If the server wants a password, it responds with a 3XX reply. The client then issues the PASS <password> command. If the user id and password are valid, the server responds with a 2XX response code; otherwise it responds with a 4XX / 5XX response code.

        Client                          Server

            |   USER johndoe               |
            |----------------------------->|
            |   331 password required      |
            |<-----------------------------|
            |                              |
            |                              |
            |   PASS nobody                |
            |----------------------------->|
            |   230 Login ok.              |
            |<-----------------------------|
            |                              |

Assuming that "nobody" is a valid credential for the user "johndoe", the above stick diagram shows a typical interaction. Implementing this interaction with the code discussed in the previous section is really easy, as shown below:

     public class FtpClient
     {
         public static void main(String args[]) throws Exception
         {
             if(args.length != 4)
             {
                 System.err.println("Usage: java FtpClient <host> <port> <user> <password>");
                 System.exit(1);
             }

             String host = args[0];
             int port = Integer.parseInt(args[1]);
             String user = args[2];
             String pass = args[3];

             // create a client pi instance

             ClientPI cpi = new ClientPI(host, port);
             cpi.open();
             System.out.println(cpi.getReply().getResponseText());

             FTPResponse resp = null;

             resp = cpi.sendCommand("USER "+user);
             System.out.println(resp.getResponseText());
             resp = cpi.sendCommand("PASS "+pass);
             System.out.println(resp.getResponseText());

             System.out.println("Voila! I am in!");

         }
     }

There is little to explain with the code. It simply implements the interaction explained above for authentication. You can download the code discussed so far from here.
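The example above always sends PASS. A server may also accept the user straight away with a 2XX reply, in which case no password is needed. The sketch below shows one way to branch on the reply code. It assumes a hypothetical CommandChannel interface that stands in for ClientPI.sendCommand and returns only the three-digit code, so the flow can be tried against a fake server.

```java
import java.io.IOException;

// Hypothetical sketch; CommandChannel is not part of the example code.
class LoginSketch
{
    interface CommandChannel
    {
        // Sends one command and returns the three-digit reply code
        String send(String command) throws IOException;
    }

    // Returns true when the server reports a successful login.
    static boolean login(CommandChannel channel, String user, String pass)
            throws IOException
    {
        String code = channel.send("USER " + user);
        if (code.startsWith("2"))
            return true;             // logged in without a password
        if (!code.startsWith("3"))
            return false;            // 4XX / 5XX: the user was rejected
        // 3XX: the server wants the password
        code = channel.send("PASS " + pass);
        return code.startsWith("2");
    }

    public static void main(String[] args) throws IOException
    {
        // A fake server that knows a single user, for trying the flow offline
        CommandChannel fake = new CommandChannel()
        {
            public String send(String command)
            {
                if (command.equals("USER johndoe")) return "331";
                if (command.equals("PASS nobody"))  return "230";
                return "530";
            }
        };
        System.out.println(login(fake, "johndoe", "nobody")); // true
        System.out.println(login(fake, "johndoe", "wrong"));  // false
    }
}
```

The sketch treats anything other than a 2XX reply to PASS as a failed login, which is a simplification.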

More FTP Commands

The most often used FTP commands that don't involve data transfer are
  • CWD : Change working directory.
  • CDUP : Change to parent directory.
  • PWD : Current working directory.
  • TYPE : Data type for transfer (binary / ascii).
  • REST : Restart. Sets the file marker at the server to the offset given in the argument, so that the next transfer starts from there.
  • DELE : Delete the file in the argument.
  • RMD : Delete the directory.
  • MKD : Make directory.
  • NOOP : No operation. Used by the client to check if the server is responsive.
  • SYST : To find out the type of operating system that hosts the server process.

All the above commands have a very simple interaction between the client and the server. The client issues the command along with its arguments to the server over the control socket and waits for the server response. The stick diagram below can represent any of the above list of commands.

        Client                          Server

            |   <command> [args]              |
            |-------------------------------->|
            |   xxx [text message]            |
            |<--------------------------------|
            |                                 |
            |                                 |
            |   PWD                           |
            |-------------------------------->|
            |   257 /pub is current directory |
            |<--------------------------------|
            |                                 |

Below is an excerpt from the FTP client that implements the API to obtain the current working directory of the server session and to change the working directory of the session.

     public class FtpClient
     {
         .
         .
         .
         // Some useful code

         public String pwd() throws IOException
         {
             FTPResponse response = clientPI.sendCommand("PWD");
             return response.getResponseText();
         }


         public FTPResponse cd(String path) throws IOException
         {
             return clientPI.sendCommand("CWD "+path);
         }


     }
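The pwd() method above returns the whole reply text rather than just the directory. RFC 959 describes the 257 reply as carrying the path in double quotes, with any quote inside the path doubled. The sketch below pulls the path out under that assumption. PwdReplySketch and extractPath are hypothetical names, and a server that does not quote the path (as in the stick diagram above) would need different handling.

```java
// Hypothetical helper; assumes the server quotes the path as RFC 959 describes.
class PwdReplySketch
{
    // Extracts the directory from a reply such as: 257 "/pub" is current directory
    // Returns null when the reply carries no quoted path.
    static String extractPath(String replyText)
    {
        int start = replyText.indexOf('"');
        if (start < 0)
            return null;
        StringBuilder path = new StringBuilder();
        int i = start + 1;
        while (i < replyText.length())
        {
            char c = replyText.charAt(i);
            if (c == '"')
            {
                // A doubled quote stands for one literal quote inside the path
                if (i + 1 < replyText.length() && replyText.charAt(i + 1) == '"')
                {
                    path.append('"');
                    i += 2;
                    continue;
                }
                return path.toString();   // closing quote found
            }
            path.append(c);
            i++;
        }
        return null;                      // the opening quote was never closed
    }

    public static void main(String[] args)
    {
        System.out.println(extractPath("257 \"/pub\" is current directory"));
    }
}
```

Returning null for an unquoted reply leaves it to the caller to decide whether to fall back to the raw text.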

As is evident, the commands are simply passed to the server using the API in our (little) client PI. The API issues the command to the server through the output stream opened on the control port and waits for a response on the input stream. The rename command is slightly different from the above commands. It's more like the authentication commands. The client issues RNFR (rename from) with the old name of the file, to which the server responds with a 3XX response code (if the file name is valid and the file exists). The client then issues RNTO (rename to) with the new name.


     public class FtpClient
     {
         .
         .
         .
         // some useful code

         public FTPResponse rename(String oldName, String newName)
                                     throws IOException
         {
             FTPResponse response = clientPI.sendCommand("RNFR "+oldName);
             System.out.println(response.getResponseText());
             response = clientPI.sendCommand("RNTO "+newName);
             System.out.println(response.getResponseText());
             return response;
         }
     }

The TYPE command is used to specify the data type being transferred. If the TYPE is set to "A" (sendCommand("TYPE A")), the server assumes ASCII transfers. All encoding conversions are done by the server and/or client to retain the ASCII nature of the data. If the TYPE is set to "I" (image), no encoding conversion is performed at the client DTP or the server DTP.
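One visible part of an ASCII transfer is the treatment of line endings: text travels over the data connection with CR-LF at the end of each line, and each side converts to and from its local convention. The sketch below illustrates only that part. AsciiModeSketch and its two methods are hypothetical names, and character set conversion is left out entirely.

```java
// Hypothetical helper illustrating the line-ending part of an ASCII transfer.
class AsciiModeSketch
{
    // Converts text with LF (or already CR-LF) line endings to the CR-LF form sent on the wire.
    static String toNetworkForm(String localText)
    {
        // Normalise any existing CR-LF first so it is not turned into CR-CR-LF
        return localText.replace("\r\n", "\n").replace("\n", "\r\n");
    }

    // Converts received CR-LF text to the given local line separator.
    static String toLocalForm(String networkText, String lineSeparator)
    {
        return networkText.replace("\r\n", lineSeparator);
    }

    public static void main(String[] args)
    {
        String wire = toNetworkForm("first line\nsecond line\n");
        // Show the result with the control characters made visible
        System.out.println(wire.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

A binary file pushed through such a conversion would be corrupted, which is why the transfer examples in this document set TYPE I.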

Data Transfer Commands

The following commands initiate data transfer.
  • LIST : List the files / directories in the path specified in the argument.
  • NLST : Similar to list, but uses a shorter listing format. Only file names are listed - no attributes of the files are contained in the listing.
  • RETR : Retrieve a file specified in the argument from the server.
  • STOR : Store a file with the name specified in the argument on the server.

Implementing commands that cause data transfer is more complex than implementing the ones discussed so far. As explained before, the data is transferred through a socket connection separate from the control connection. For any of these commands, the server and the client negotiate the port on which the data transfer takes place. The connection mode determines the role of the client (or server) in the port decision.

In the Passive mode, the client issues the PASV command - requesting the server to go passive. The server starts a ServerSocket on an available port. If the ServerSocket is successfully created, the server responds with a 2XX response code along with the Socket address. The client connects to the server using a client socket. The socket connection is then used for transferring the data.

In the Active mode, the client goes passive by opening a ServerSocket on an available port. It issues a "PORT" command with the socket address to the server. The server connects to the client using a client socket.

Passive Mode Transfers

The response to a PASV command is a socket address on the server where the server DTP (data transfer process) listens. The socket address is enclosed between the first pair of parentheses in the response. It consists of six words delimited by ','. Example:

227 Entering Passive Mode (192,87,102,36,201,85)

The first four words are the IP address of the host. The fifth and sixth words are the port number, with the fifth word holding the high-order 8 bits. In the example above the port is 201 * 256 + 85 = 51541. The stick diagram below shows the interaction for a passive mode download of a file from an FTP server:

        Client                             Server (on h1.h2.h3.h4)

            |   PASV                          |
            |-------------------------------->|
            |                                 |
            |   227-Entering Passive mode     |
            |   227 (h1,h2,h3,h4,p1,p2)       |
            |<--------------------------------|
            |                                 |
            |   RETR abc.txt                  |
            |-------------------------------->|
            |                                 |
            |    Socket = (h1,h2,h3,h4:p1,p2) |
            |================================>|
            |                                 |
            |   226 Transfer Complete.        |
            |<--------------------------------|
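            |                                 |

The gory code below implements the above interaction. The server answers RETR with a preliminary 1XX reply (not shown in the diagram) before the data flows; that is the reply sendCommand returns here, and the final 226 is read with getReply once the transfer is done.

       public class FileTransfer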
       {
           .
           .
           .

            public void downloadFileInPassive(String fileName) throws IOException
            {

                // Set type to binary
                System.out.println(clientPI.sendCommand("TYPE I").getResponseText());

                // Get the port address of the server DTP
                String[] address = getDataSocketAddress(clientPI.sendCommand("PASV")
                        .getResponseText());

                if(address == null)
                {
                    throw new FTPException("Could not obtain socket address");
                }

                String hostName = address[0];
                for(int i=1;i<4;i++)
                {
                    hostName += "."+address[i];
                }

                int port = (Integer.parseInt(address[4]) *256)+
                    Integer.parseInt(address[5]);


                // connect to the server DTP
                Socket dataTransferSocket = new Socket(hostName,port);



                // Issue a RETR command
                System.out.println(clientPI.sendCommand("RETR "+fileName).getResponseText());



                // Download the data from the socket to the 
                // file system (Below is really the client DTP!)
                FileOutputStream outputStream = new FileOutputStream(fileName);
                InputStream inputStream = dataTransferSocket.getInputStream();
                byte[] buffer = new byte[4096];
                int bytesRead = -1;
                while((bytesRead = inputStream.read(buffer)) > -1)
                {
                    outputStream.write(buffer, 0, bytesRead);
                }
                outputStream.flush();
                outputStream.close();
                inputStream.close();

                dataTransferSocket.close();


                // Get the completion reply from the server.
                System.out.println(clientPI.getReply().getResponseText());
            }
           .
           .
           .
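The excerpt above relies on a getDataSocketAddress helper that is not shown. A minimal sketch of what such a helper might look like follows. The names used here (PasvReplySketch, parseAddress, host, port) are made up for illustration and are not taken from the downloadable source. It assumes the six words appear between parentheses, as in the example reply.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the helper that the excerpt does not show.
class PasvReplySketch
{
    // Six comma-separated numbers enclosed in parentheses
    private static final Pattern ADDRESS = Pattern.compile(
        "\\((\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3})\\)");

    // Returns the six words of the socket address, or null if none are found.
    static String[] parseAddress(String replyText)
    {
        Matcher m = ADDRESS.matcher(replyText);
        if (!m.find())
            return null;
        String[] words = new String[6];
        for (int i = 0; i < 6; i++)
            words[i] = m.group(i + 1);
        return words;
    }

    // The first four words joined with dots form the IP address
    static String host(String[] words)
    {
        return words[0] + "." + words[1] + "." + words[2] + "." + words[3];
    }

    // The fifth word is the high-order byte of the port, the sixth the low-order byte
    static int port(String[] words)
    {
        return Integer.parseInt(words[4]) * 256 + Integer.parseInt(words[5]);
    }

    public static void main(String[] args)
    {
        String[] words = parseAddress("227 Entering Passive Mode (192,87,102,36,201,85)");
        System.out.println(host(words) + ":" + port(words)); // 192.87.102.36:51541
    }
}
```

Returning null when nothing matches fits the null check in the excerpt, which turns it into an FTPException.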

Active Mode Transfers

The client goes passive on a port of its choice and sends the socket address to the server using the "PORT" command. The socket address consists of six words delimited by ','. Example:

PORT 192,87,102,36,201,85

The first four words are the IP address of the host (usually the client IP). The fifth and sixth words are the port number, with the fifth word holding the high-order 8 bits. The stick diagram below shows the interaction for an active mode download of a file from an FTP server:

        Client (on h1.h2.h3.h4)             Server 

            |   PORT h1,h2,h3,h4,p1,p2        |
            |-------------------------------->|
            |                                 |
            |   200 Port command successful.  |
            |<--------------------------------|
            |                                 |
            |   RETR abc.txt                  |
            |-------------------------------->|
            |                                 |
            |    Socket = (h1,h2,h3,h4:p1,p2) |
            |<================================|
            |                                 |
            |   226 Transfer Complete.        |
            |<--------------------------------|
            |                                 |

The gory code below implements the above interaction.

    public class FileTransfer
    {
        .
        .
        .
        //some code

        public void downloadFileInActive(String fileName) throws IOException
        {
            ServerSocket serverSocket = new ServerSocket(0);
            String socketAddress = getSocketAddress(serverSocket.getLocalPort());

            // Set type to binary
            System.out.println(clientPI.sendCommand("TYPE I").getResponseText());


            // Send the port command with the local socket
            // address
            System.out.println(clientPI.sendCommand("PORT "+socketAddress).getResponseText());

            // send the RETR command with the file name
            System.out.println(clientPI.sendCommand("RETR "+fileName).getResponseText());

            // Start listening for socket connections
            Socket dataTransferSocket = serverSocket.accept();

            // Client DTP process
            FileOutputStream outputStream = new FileOutputStream(fileName);
            InputStream inputStream = dataTransferSocket.getInputStream();
            byte[] buffer = new byte[4096];
            int bytesRead = -1;
            while((bytesRead = inputStream.read(buffer)) > -1)
            {
                outputStream.write(buffer, 0, bytesRead);
            }
            outputStream.flush();
            outputStream.close();
            inputStream.close();

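            dataTransferSocket.close();
            // The listening socket is no longer needed once the transfer is over
            serverSocket.close();


            // Get the completion reply from the server.
            System.out.println(clientPI.getReply().getResponseText());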

        }


    }
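The excerpt above calls a getSocketAddress helper that is not shown. The sketch below shows one way the argument of the PORT command could be built from an IPv4 address and a port number. PortArgumentSketch and format are hypothetical names, and the address used in main is just the one from the example above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical stand-in for the helper that the excerpt does not show.
class PortArgumentSketch
{
    // Builds h1,h2,h3,h4,p1,p2 from an IPv4 address and a port number.
    static String format(byte[] ipv4, int port)
    {
        if (ipv4.length != 4)
            throw new IllegalArgumentException("PORT needs an IPv4 address");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 4; i++)
            sb.append(ipv4[i] & 0xFF).append(',');   // bytes are signed in Java
        // High-order byte of the port first, then the low-order byte
        sb.append(port / 256).append(',').append(port % 256);
        return sb.toString();
    }

    public static void main(String[] args) throws UnknownHostException
    {
        // A literal IP address is parsed without any name lookup
        byte[] example = InetAddress.getByName("192.87.102.36").getAddress();
        System.out.println("PORT " + format(example, 51541)); // PORT 192,87,102,36,201,85
    }
}
```

In a real client the local address of the control socket (Socket.getLocalAddress()) is probably the better source for the four address words, since it is the address the server already reaches the client on.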


File uploads in active and passive modes are similar. They differ only in the command, STOR filename instead of RETR filename, and in the nature of the stream obtained from the data transfer socket (the client obtains an output stream instead of the input stream used in the above example). The source code here has the examples for upload and download for both modes. The above examples show file transfers in binary. To use ASCII transfers, set the type to A and wrap the input/output stream obtained from the data transfer socket with a reader/writer.
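Since the copy loop is the same in both directions, it can be factored into one helper. The sketch below is an assumption about how that might look; StreamCopySketch and copy are hypothetical names. For an upload the source would be a FileInputStream and the destination dataTransferSocket.getOutputStream(). In main, in-memory streams stand in for both so the loop can be tried without a server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; the same loop serves uploads and downloads.
class StreamCopySketch
{
    // Copies everything from in to out and returns the number of bytes moved.
    static long copy(InputStream in, OutputStream out) throws IOException
    {
        byte[] buffer = new byte[4096];
        long total = 0;
        int bytesRead;
        while ((bytesRead = in.read(buffer)) > -1)
        {
            out.write(buffer, 0, bytesRead);
            total += bytesRead;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException
    {
        // In-memory streams stand in for the local file and the data socket
        InputStream file = new ByteArrayInputStream("hello, server".getBytes("US-ASCII"));
        ByteArrayOutputStream socket = new ByteArrayOutputStream();
        System.out.println(copy(file, socket) + " bytes sent");
    }
}
```

After an upload the client closes the data socket; in the default stream mode that is how the server learns the file is complete, and the 226 reply follows on the control connection.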

Listing Directories

Listing directories is similar to the data transfers. It needs one of the connection modes. Below is a stick diagram showing the interaction in passive mode-

        Client                             Server (on h1.h2.h3.h4)

            |   PASV                          |
            |-------------------------------->|
            |                                 |
            |   227-Entering Passive mode     |
            |   227 (h1,h2,h3,h4,p1,p2)       |
            |<--------------------------------|
            |                                 |
            |   TYPE A                        |
            |-------------------------------->|
            |                                 |
            |   LIST /home/nobody             |
            |-------------------------------->|
            |                                 |
            |    Socket = (h1,h2,h3,h4:p1,p2) |
            |================================>|
            |                                 |
            |   226 Transfer Complete.        |
            |<--------------------------------|
            |                                 |

The example source code is here.

The specification is not very definitive on the output format of the LIST command. Because of this, the format of the list output varies between server implementations and platforms. The popular ones are the 9-column Unix listing and the 5-column DOS listing. The client implementation needs to be aware of the listing format to parse the output for file attributes.

Below are examples of list output.

Unix 9 column
total 52
-rw-rw-r--  1 nobody nobody   2188 Nov  2 22:27 abc.dat
-rw-rw-r--  1 nobody nobody   6542 Nov  5 08:17 index.html

DOS 5 column
total 52
-rw-rw-r--  1 nobody nobody   2188 Nov  2 22:27 abc.dat
-rw-rw-r--  1 nobody nobody   6542 Nov  5 08:17 index.html

Mainframe listing
Volume Unit    Referred Ext Used Recfm Lrecl BlkSz Dsorg Dsname
TST831 3390   2003/05/30  1   15  VB     256  6233  PS  ABCDEF.DEFGHI
TSTC27 3390   2004/10/19  1   15  VB     256  6233  PS  MAINFRAME.FILE
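As an illustration of what parsing involves, the sketch below handles lines in the 9-column Unix format only. UnixListingSketch, Entry and parseLine are hypothetical names. Lines that do not look like a Unix entry (the "total" line, or lines in one of the other formats) are skipped by returning null.

```java
// Hypothetical parser for the nine-column Unix format only.
class UnixListingSketch
{
    static final class Entry
    {
        final String name;
        final long size;
        final boolean directory;

        Entry(String name, long size, boolean directory)
        {
            this.name = name;
            this.size = size;
            this.directory = directory;
        }
    }

    // Returns null for lines that do not describe a file in this format.
    static Entry parseLine(String line)
    {
        // A limit of 9 keeps spaces inside the file name intact
        String[] cols = line.trim().split("\\s+", 9);
        if (cols.length < 9)
            return null;
        // The first column must look like a permission string, e.g. -rw-rw-r--
        if (!cols[0].matches("[-dlbcps][-rwxsStT]{9}.*"))
            return null;
        long size;
        try
        {
            size = Long.parseLong(cols[4]);
        }
        catch (NumberFormatException e)
        {
            return null;   // not the layout this sketch understands
        }
        return new Entry(cols[8], size, cols[0].startsWith("d"));
    }

    public static void main(String[] args)
    {
        Entry e = parseLine("-rw-rw-r--  1 nobody nobody   2188 Nov  2 22:27 abc.dat");
        System.out.println(e.name + " " + e.size + " " + e.directory);
    }
}
```

A real client would need one such parser per listing format, plus some way of choosing between them; the reply to SYST is one possible hint.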

Copyright (C) 2004 Abhilash Koneri