 
 
 
 Communication Protocols
The various client-server communications described in the previous section consisted 
of sending a string of characters ending in a carriage-return and receiving another.
However simple, this communication pattern defines a protocol.
If we wish to communicate more complex values, such as floats, matrices of 
floats, a tree of arithmetic expressions, a closure, or an object, we introduce
the problem of encoding these values. Many solutions exist according to 
the nature of the communicating programs, which can be characterized by 
the implementation language, the machine architecture, and in certain 
cases, the operating system.
Depending on the machine architecture, integers can be 
represented in many different ways (most significant bits on the left, on the 
right, use of tag bits, and size of a machine word).
To communicate a value between different programs, it is necessary 
to have a common representation of values, referred to as the external
representation.
More structured values, such as records, just as integers, must have an external
representation.
Nonetheless, there are problems when certain languages allow 
constructs, such as bit-fields in C, which do not exist in other
languages.
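As an illustration of an external representation, the following sketch encodes a 32-bit integer as four bytes, most significant byte first, whatever the byte order of the machine. The function names encode_int32_be and decode_int32_be are hypothetical, and the choice of a big-endian, 4-byte format is an assumption made for the example, not a format prescribed by the text.

```ocaml
(* Encode a 32-bit integer as 4 bytes, most significant byte first.
   The format (big-endian, 4 bytes) is an assumption for this sketch. *)
let encode_int32_be (n : int32) : string =
  String.init 4 (fun i ->
    let shift = 8 * (3 - i) in
    let byte = Int32.logand (Int32.shift_right_logical n shift) 0xFFl in
    Char.chr (Int32.to_int byte))

(* Decode 4 bytes produced by [encode_int32_be]. *)
let decode_int32_be (s : string) : int32 =
  if String.length s <> 4 then invalid_arg "decode_int32_be";
  let byte i = Int32.of_int (Char.code s.[i]) in
  List.fold_left
    (fun acc i -> Int32.logor (Int32.shift_left acc 8) (byte i))
    0l [0; 1; 2; 3]
```

Both programs must agree on this format; the internal layout of integers on each side then no longer matters.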
Passing functional values or objects, which contain pieces of code, poses 
a new difficulty. Is the code binary-compatible between the sender and receiver, and does 
there exist a mechanism for dynamically loading the code? 
As a general rule, the problem is simplified by supposing that the 
code exists on both sides. It is not the code itself that is transmitted, 
but information that allows it to be retrieved. 
For an object, the instance variables are communicated along with the object's
type, which allows retrieval of the object's methods.
For a closure, the environment is sent along with the address of its code.
This implies that the two communicating programs are actually the same 
executable.
A second difficulty arises from the complexity of linked exchanges 
and the necessity of synchronizing communications involving 
many programs.
We first present text protocols, later discussing acknowledgements 
and time limits between requests and responses. We also mention the 
difficulty of 
communicating internal values, in particular as it relates to 
interoperability between programs written in different languages.
 Text Protocol
 Text protocols, that is, communication in ASCII format, 
are the most common because they are the simplest to implement and the 
most portable.
When a protocol becomes complicated, it may become difficult 
to implement. In this setting, we define a grammar to describe 
the communication format. This grammar may be rich, but it will 
be up to the communicating programs to handle the work of 
coding and interpreting the text strings sent and received.
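To make this concrete, here is a minimal sketch of a text protocol defined by a tiny grammar, with one function to interpret a received line and one to produce a line to send. The grammar (request ::= "ADD" int int | "QUIT"), the type request, and the function names are all hypothetical and chosen only for illustration.

```ocaml
(* Hypothetical grammar:  request ::= "ADD" int int | "QUIT"  *)
type request = Add of int * int | Quit

(* Interpret one received line; [None] means the line is ill-formed. *)
let parse_request (line : string) : request option =
  let words =
    String.split_on_char ' ' (String.trim line)
    |> List.filter (fun w -> w <> "") in
  match words with
  | ["QUIT"] -> Some Quit
  | ["ADD"; a; b] ->
    (match int_of_string_opt a, int_of_string_opt b with
     | Some x, Some y -> Some (Add (x, y))
     | _ -> None)
  | _ -> None

(* Produce the line to send for a request. *)
let print_request : request -> string = function
  | Add (x, y) -> Printf.sprintf "ADD %d %d\n" x y
  | Quit -> "QUIT\n"
```

Returning an option lets the receiver distinguish a well-formed request from a line that does not belong to the grammar.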
As a general rule, a network application does not allow viewing 
the different layers of protocols in use. This is typified 
by the case of the HTTP protocol, which allows a browser 
to communicate with a Web site.
 The HTTP Protocol
The term ``HTTP'' is seen frequently in advertising. 
It corresponds to the communication protocol used by Web applications.
The protocol is completely described on the page of the 
W3 Consortium:
http://www.w3.org
This protocol is used to send requests from browsers (Communicator,
Internet Explorer, Opera, etc.) and to return the contents of 
requested pages. A request made by a browser contains the name 
of the protocol (HTTP), the name of the machine 
(www.ufr-info-p6.jussieu.fr), 
and the path of the requested page (/Public/Localisation/index.html). 
Together these components define a URL (Uniform Resource
Locator):
http://www.ufr-info-p6.jussieu.fr/Public/Localisation/index.html
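The decomposition of a URL into its three components can be sketched as follows. The function split_url is hypothetical, and it deliberately ignores ports, queries and other parts of real URLs; it only handles the protocol://machine/path shape described here.

```ocaml
(* Split "protocol://machine/path" into its three components.
   Simplified sketch: ports, queries and fragments are not handled. *)
let split_url (url : string) : (string * string * string) option =
  let sep = "://" in
  let n = String.length url and k = String.length sep in
  (* Index of the first occurrence of [sep] in [url], if any. *)
  let rec find i =
    if i + k > n then None
    else if String.sub url i k = sep then Some i
    else find (i + 1) in
  match find 0 with
  | None -> None
  | Some i ->
    let protocol = String.sub url 0 i in
    let rest = String.sub url (i + k) (n - i - k) in
    (match String.index_opt rest '/' with
     | None -> Some (protocol, rest, "/")
     | Some j ->
       let machine = String.sub rest 0 j in
       let path = String.sub rest j (String.length rest - j) in
       Some (protocol, machine, path))
```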
When such a URL is requested by a browser, a connection over a socket 
is established between the browser and the server running on the 
indicated machine, by default on port 80. Then the browser sends 
a request in the HTTP format, like the following:
GET /index.html HTTP/1.0
The server responds, also in the HTTP protocol, with a header: 
HTTP/1.1 200 OK
Date: Wed, 14 Jul 1999 22:07:48 GMT
Server: Apache/1.3.4 (Unix) PHP/3.0.6 AuthMySQL/2.20
Last-Modified: Thu, 10 Jun 1999 12:53:46 GMT
 
Accept-Ranges: bytes
Content-Length: 3663
Connection: close
Content-Type: text/html
This header indicates that the request has been accepted (code 200 OK), the kind of server,
the modification date of the page, the length of the page sent, and the type of content that
follows.
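A client can extract this information from the header with very little code. The sketch below splits the status line and a "Name: value" line; the function names parse_status_line and parse_header are hypothetical, and the parsing is intentionally simplified compared with what the full HTTP specification allows.

```ocaml
(* Parse a status line such as "HTTP/1.1 200 OK" into
   (version, code, reason).  Simplified sketch. *)
let parse_status_line (line : string) : (string * int * string) option =
  match String.split_on_char ' ' (String.trim line) with
  | version :: code :: reason ->
    (match int_of_string_opt code with
     | Some c -> Some (version, c, String.concat " " reason)
     | None -> None)
  | _ -> None

(* Parse a header line "Name: value"; the name is lowercased so that
   lookups do not depend on the capitalization used by the server. *)
let parse_header (line : string) : (string * string) option =
  match String.index_opt line ':' with
  | None -> None
  | Some i ->
    let name = String.sub line 0 i in
    let value = String.sub line (i + 1) (String.length line - i - 1) in
    Some (String.lowercase_ascii name, String.trim value)
```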
Using the GET command in the protocol (HTTP/1.0), only the HTML page is transferred. 
The following connection with telnet allows us to see what is actually transmitted:
$ telnet www.ufr-info-p6.jussieu.fr 80
Trying 132.227.68.44...
Connected to triton.ufr-info-p6.jussieu.fr.
Escape character is '^]'.
GET
<!-- index.html -->
<HTML>
<HEAD>
<TITLE>Serveur de l'UFR d'Informatique de Pierre et Marie Curie</TITLE>
</HEAD>
<BODY>
<IMG SRC="/Icons/upmc.gif" ALT="logo-P6" ALIGN=LEFT HSPACE=30>
Unité de Formation et de Recherche 922 - Informatique<BR>
Université Pierre et Marie Curie<BR>
4, place Jussieu<BR>
75252 PARIS Cedex 05, France<BR><P> 
....
</BODY>
</HTML>
<!-- index.html -->
Connection closed by foreign host.
The connection closes once the page has been copied.
The base protocol is in text mode so that the language may 
be interpreted. Note that images are not transmitted with the 
page. It is up to the browser, when analyzing the syntax of the 
HTML page, to detect anchors and images (see the IMG tag
in the transmitted page). The browser then sends a new 
request for each image encountered in the HTML source; there
is a new connection for each image. The images are displayed as 
they are received, which is why they often appear 
in parallel.
The HTTP protocol is simple enough, but it transports information
in the HTML language, which is more complex.
 Protocols with Acknowledgement and Time Limits
When a protocol is complex, it is useful for the receiver 
of a message to tell the sender that it has received the
message and that it is grammatically correct.
The client blocks while waiting for this response before
resuming its other tasks. If the part of the server handling this 
request has a difficulty interpreting the message, the server
must indicate this fact to the client rather than ignoring the 
request. The HTTP protocol has a system of error codes.
A correct request results in the code 200. A badly-formed request
or a request for an unauthorized page results in an error code 4xx or 5xx 
according to the nature of the error. These error codes allow 
the client to know what to do and allow the server to record the 
details of such incidents in its log files.
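A client typically decides what to do from the class of the code alone. The following sketch groups codes by their hundreds digit; the type status_class and the function classify are hypothetical names, and only the classes mentioned above are distinguished.

```ocaml
(* Classes of HTTP response codes used in this sketch. *)
type status_class = Success | Client_error | Server_error | Other

(* 2xx: the request succeeded; 4xx: the request is at fault;
   5xx: the server failed while handling the request. *)
let classify (code : int) : status_class =
  match code / 100 with
  | 2 -> Success
  | 4 -> Client_error
  | 5 -> Server_error
  | _ -> Other
```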
When the server is in an inconsistent state, it may still 
accept a connection from a client, but risks never sending 
it a response over the socket. To avoid such blocking 
waits, it is useful to set a limit on the time allowed for transmission
of the response. After this time has elapsed, the client 
assumes that the server is no longer responding. 
The client can then close the connection and go 
on to its other work. This is 
how WWW browsers work. When a request has received no response 
after a certain time, the browser reports this 
to the user. Objective CAML has input-output with time limits. In 
the Thread library, the functions wait_timed_read and
wait_timed_write suspend execution until a 
character can be read or written, within a certain time limit.
As input, these functions take a file descriptor and a time limit 
in seconds:
Unix.file_descr -> float -> bool. 
If the time limit expires first, the function returns 
false; otherwise it returns true and the I/O can proceed.
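A wait of the same type can also be sketched with Unix.select from the Unix library, which is a different primitive from the Thread functions but offers the same kind of bounded wait. The name wait_readable is hypothetical, and the example assumes the program is linked with the unix library on a system where select works on the descriptor in question.

```ocaml
(* Wait until [fd] is readable or [timeout] seconds have elapsed.
   Returns [true] if data can be read, [false] on time-out.
   Sketch based on Unix.select; requires the unix library. *)
let wait_readable (fd : Unix.file_descr) (timeout : float) : bool =
  match Unix.select [fd] [] [] timeout with
  | [], _, _ -> false   (* nothing became readable in time *)
  | _ -> true
```

A client would call wait_readable on its socket before reading the response, and close the connection when the result is false.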
 Transmitting Values in their Internal Representation
The advantage of transmitting internal values is that it 
simplifies the protocol. There is no longer any need to 
encode and decode data in a textual format. The inherent 
difficulties in sending and receiving values in their 
internal representation are the same as those encountered 
for persistent values (see the Marshal library, 
page ??).
In effect, reading or writing a value in a file is 
equivalent to receiving or sending the same value over a socket.
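The following sketch shows the idea with the Marshal module of the standard library. The type expr and the function names are hypothetical; the channels passed to send_value and receive_value could equally come from a file or from a socket. Note that the receiver must state the expected type itself, since Marshal does not check it.

```ocaml
(* A small tree of arithmetic expressions, used as the value to transmit. *)
type expr = Const of float | Add of expr * expr | Mul of expr * expr

(* In-memory encoding and decoding of the internal representation. *)
let encode (e : expr) : string = Marshal.to_string e []
let decode (s : string) : expr = Marshal.from_string s 0

(* The same operations over channels (file or socket). *)
let send_value (oc : out_channel) (v : 'a) : unit =
  Marshal.to_channel oc v [];
  flush oc

(* Unsafe by nature: the caller must annotate the expected type. *)
let receive_value (ic : in_channel) : 'a = Marshal.from_channel ic
```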
 Functional Values
In the case of transmitting a closure between two Objective CAML programs,
the code in the closure is not sent, only its environment and 
its code pointer (see figure 12.9 page ??).
For this strategy to work, it is necessary that the server 
possess the same code in the same memory location. This implies
that the same program is running on the server as on the client.
Nothing, however, prevents the two programs from running different 
parts of the code at the same time. We adapt the matrix calculation service 
by sending a closure with an environment containing the data for 
calculation. When it is received, the server applies this closure
to () and the calculation begins.
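The mechanism can be sketched inside a single program, which trivially satisfies the "same executable" condition. Here a dot product stands in for the matrix calculation; pack_job and run_job are hypothetical names, and in a real exchange the string would travel over the socket between two copies of the same program.

```ocaml
(* Client side: build a closure whose environment holds the data,
   then marshal it.  The Closures flag is required for functions. *)
let pack_job (a : float array) (b : float array) : string =
  let job () =
    (* Dot product of a and b; assumes both have the same length. *)
    let s = ref 0.0 in
    Array.iteri (fun i x -> s := !s +. x *. b.(i)) a;
    !s in
  Marshal.to_string job [Marshal.Closures]

(* Server side: rebuild the closure and apply it to ().
   This only works if sender and receiver are the same executable. *)
let run_job (data : string) : float =
  let (job : unit -> float) = Marshal.from_string data 0 in
  job ()
```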
 Interoperating with Different Languages
The advantage of text protocols is that they are independent 
of the implementation languages of clients and servers. In effect, 
the ASCII code is always known by programming languages.
It is then up to the client and to the server to parse 
the strings of characters transmitted. 
An example of such an open protocol is the simulation of 
soccer players called RoboCup. 
 Soccer Robots
A soccer team plays against another team. Each member of the team 
is a client of a referee server. The players on the same team cannot communicate
directly with each other. They must send information through the server, which 
retransmits the dialog. The server shows each player a part of the field, 
according to the player's position.
All these communications follow a text protocol. The following Web page describes the protocol, 
the server, and certain clients:
http://www.robocup.org/
The server is written in C. The clients are written in different languages:
C, C++, SmallTalk, Objective CAML, etc.
Nothing prevents a team from fielding players written in different languages.
This protocol responds to the interoperability needs between programs 
in different implementation languages. It is relatively simple, but 
it requires a particular syntax analyzer for each family of languages.
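As an indication of what such a syntax analyzer might look like in Objective CAML, here is a minimal sketch that assumes, purely for illustration, that messages are parenthesized lists of atoms. The type sexp, the function names, and the sample messages in the comments are hypothetical and are not taken from the protocol description.

```ocaml
(* Assumed message shape: atoms and parenthesized lists,
   for example "(dash 100)" (hypothetical message). *)
type sexp = Atom of string | List of sexp list

(* Split a message into "(", ")" and atom tokens. *)
let tokenize (s : string) : string list =
  let buf = Buffer.create 16 and toks = ref [] in
  let flush () =
    if Buffer.length buf > 0 then begin
      toks := Buffer.contents buf :: !toks;
      Buffer.clear buf
    end in
  String.iter (fun c ->
    match c with
    | '(' | ')' -> flush (); toks := String.make 1 c :: !toks
    | ' ' | '\t' | '\n' | '\r' -> flush ()
    | _ -> Buffer.add_char buf c) s;
  flush ();
  List.rev !toks

(* Parse one expression; return it with the remaining tokens. *)
let rec parse_one : string list -> sexp * string list = function
  | [] -> failwith "unexpected end of message"
  | "(" :: rest ->
    let items, rest' = parse_list rest in
    (List items, rest')
  | ")" :: _ -> failwith "unexpected )"
  | atom :: rest -> (Atom atom, rest)

(* Parse the items of a list up to the matching ")". *)
and parse_list : string list -> sexp list * string list = function
  | ")" :: rest -> ([], rest)
  | [] -> failwith "missing )"
  | toks ->
    let item, rest = parse_one toks in
    let items, rest' = parse_list rest in
    (item :: items, rest')

(* Parse a complete message; fail if anything follows it. *)
let parse_message (s : string) : sexp =
  match parse_one (tokenize s) with
  | e, [] -> e
  | _ -> failwith "trailing tokens"
```

A client in another language would need an equivalent analyzer, but the messages themselves stay the same.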
 
 
