Monday, August 31, 2009

THE DESIGN PHILOSOPHY OF THE DARPA INTERNET PROTOCOLS

D.D. Clark published this paper in 1988, almost fifteen years after the first TCP/IP proposal was developed under DARPA. It is interesting to understand why the original designers took certain directions and what evolutionary road the Internet traveled during those fifteen years.

The DARPA internet project was launched to effectively interconnect two distinct networks, the ARPANET and the ARPA packet radio network, and this effective interconnection was the fundamental goal of the project. The ability to interconnect existing networks can be seen as a genetic mutation in the evolutionary pattern of networks: it enabled far more scalable and complex networks to be built than is possible with networks that must be designed and upgraded uniformly!

Two of the Internet's original approaches, datagrams and packet switching, already existed in the networks under investigation and were the default candidates. Furthermore, they had by that time shown themselves (and continued to show themselves in the years that followed) to be valid choices compared to competitors such as the virtual circuit paradigm.

The author also lists secondary, lower-tier goals throughout the paper. Since these networks were originally designed for military applications, reliability and robustness to failures were placed at the top of the secondary list while accountability was placed at its end. Noting that this paper was written in 1988, the author claims that if the Internet had been designed for commercial applications, these two choices would have needed to be swapped! I find this point of view interesting since it suggests that in 1988 the importance of reliability was not as evident in commercial applications. Today's commercial applications count on the reliability of the Internet, and an unreliable Internet can have a destructive effect on many of them. I would assume that if Clark had written this paper in 2009, he would only have ranked accountability higher while keeping reliability near the top of the secondary list.

It is also notable that in a couple of places Clark implicitly points to the end-to-end argument as described by the first paper we read: once in the "fate-sharing" story, where state information is kept at the end entities rather than in the network, and once in the separation of services from the datagram facility (the TCP and IP layers).
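To make the fate-sharing idea concrete, here is a minimal sketch (my own illustration, not from Clark's paper) of a sender that keeps all per-connection state at the endpoint. Since sequence numbers and unacknowledged data live only at the host that cares about them, the connection state can be lost only if that host itself is lost; an intermediate router can crash and reboot without destroying any connection. The FateSharingSender name and the send_packet primitive are hypothetical.

    import time

    class FateSharingSender:
        """Toy reliable sender: all connection state (sequence numbers,
        unacknowledged data) lives at this endpoint. Routers in between
        hold no per-connection state, so the state shares fate only with
        the host that depends on it."""

        def __init__(self, send_packet):
            self.send_packet = send_packet   # stateless network primitive (assumed)
            self.next_seq = 0
            self.unacked = {}                # seq -> (data, last_send_time)

        def send(self, data):
            seq = self.next_seq
            self.next_seq += 1
            self.unacked[seq] = (data, time.monotonic())
            self.send_packet(seq, data)

        def on_ack(self, seq):
            # Only the endpoints decide what counts as delivered.
            self.unacked.pop(seq, None)

        def retransmit_overdue(self, timeout=1.0):
            now = time.monotonic()
            for seq, (data, sent_at) in list(self.unacked.items()):
                if now - sent_at > timeout:  # no sign of delivery from the network
                    self.unacked[seq] = (data, now)
                    self.send_packet(seq, data)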

I would also like to point out that, throughout this paper, scalability does not appear to be considered an important issue. This is likely due to the year the paper was published and might suggest that the number of networks was still not very large. For example, the suggestion at the end of the paper about "soft state" does not consider the possible scalability problems of such a structure.
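For readers who have not seen the term, here is a minimal sketch of the soft-state idea as I understand it (my own illustration, not from the paper): a router caches per-flow hints that expire unless the endpoints periodically refresh them, so a crashed router simply restarts empty and gets repopulated. The catch, and the scalability worry above, is that both the table memory and the refresh traffic grow with the number of concurrent flows.

    import time

    class SoftStateTable:
        """Per-flow state that silently expires unless refreshed. A router
        that crashes restarts with an empty table and is rebuilt by the
        endpoints' periodic refresh messages."""

        def __init__(self, ttl=30.0):
            self.ttl = ttl
            self.flows = {}  # flow_id -> (service_hint, expiry_time)

        def refresh(self, flow_id, service_hint):
            # Called on every periodic refresh message from an endpoint.
            self.flows[flow_id] = (service_hint, time.monotonic() + self.ttl)

        def lookup(self, flow_id):
            entry = self.flows.get(flow_id)
            if entry is None or entry[1] < time.monotonic():
                # Expired: the endpoint stopped refreshing, so forget it.
                self.flows.pop(flow_id, None)
                return None
            return entry[0]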

I would very much like this paper to be kept in the syllabus since it makes students think about the evolutionary path of the Internet and helps them obtain a more fundamental understanding of it.

Friday, August 28, 2009

END-TO-END ARGUMENTS IN SYSTEM DESIGN

In this paper, Saltzer et al. present a design principle called the “end-to-end argument”, which is meant to help designers with the placement of functionality in a distributed computer system. Usually, when the system involves some kind of communication, a modular boundary is drawn between the part responsible for communication (called the communication subsystem in this paper) and the rest of the system. The question is: when it is possible to implement a functionality at different levels, what should the designer choose? One can choose to implement the target functionality in one of the following ways:

1) by the communication subsystem

2) by the end clients

3) as a joint venture

4) all of the above (redundantly)

In brief, the “end-to-end argument” states that when the function under investigation can only be correctly and completely implemented with the help and knowledge of the end-user application, the communication subsystem cannot provide this functionality to the application as a complete feature.

This does not mean that communication-subsystem features and helper functionalities are unimportant, but rather that one should look at them from an engineering-tradeoff point of view, in support of (and not in parallel with) the functionality the application requires. Implementing a functionality at a lower level can become too costly for two reasons. First, since lower-level subsystems are common to many applications, applications that do not need the functionality also pay for it. Second, the lower levels have access to less information and therefore may not be able to do the job efficiently.
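The paper's canonical illustration of both points is careful file transfer: however reliable each hop claims to be, only an end-to-end check by the application can catch corruption introduced at any step along the way. Here is a minimal sketch of that idea (my own, with a hypothetical channel_send primitive, not the paper's code):

    import hashlib

    def send_file(data: bytes, channel_send):
        # The sender computes an application-level checksum and ships it
        # alongside the data; channel_send is any (possibly unreliable)
        # transmission primitive whose guarantees we do not rely on.
        digest = hashlib.sha256(data).digest()
        channel_send(data, digest)

    def receive_file(data: bytes, digest: bytes) -> bytes:
        # The receiver re-verifies end to end: corruption anywhere on the
        # path (disk, memory, network, gateways) is caught here, which is
        # why hop-by-hop guarantees alone can never be sufficient.
        if hashlib.sha256(data).digest() != digest:
            raise IOError("end-to-end check failed; retransmission needed")
        return data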

Furthermore, the authors go over a series of examples to which the “end-to-end argument” applies and which give insight to the designers of such systems, including: delivery guarantees, secure transmission of data (end-to-end encryption), duplicate message suppression, guaranteed FIFO message delivery, and transaction management.
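Duplicate message suppression is a good one to make concrete: the network can suppress duplicates of its own packets, but it cannot recognize that an application retried an entire request, so the application ends up needing its own request identifiers anyway. A minimal sketch (my own illustration, names hypothetical):

    class DuplicateSuppressingServer:
        """Application-level duplicate suppression: each request carries a
        client-chosen request_id, and a replay (whether a network duplicate
        or an application-level retry) gets the cached result back instead
        of re-executing the operation."""

        def __init__(self, handler):
            self.handler = handler
            self.seen = {}  # request_id -> cached result

        def handle(self, request_id, payload):
            if request_id in self.seen:
                return self.seen[request_id]  # duplicate: replay the answer
            result = self.handler(payload)
            self.seen[request_id] = result
            # A real system would eventually age these entries out.
            return result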

The most important point of this paper, in my view, is: do not expect a subsystem to accomplish what is not possible for it! The description of the “end-to-end argument” given in this paper seems trivial and yet is very important. As several examples in the paper show, it is easy to confuse the scope of the required functionality.

I still have a point of disagreement with the authors. The “end-to-end argument” as stated in the paper only applies when it is not possible for the communication subsystem to provide the exact target functionality; in that case it is obvious that the end-to-end application needs to implement it. But if a certain functionality can be implemented in the end clients, in the communication subsystem, or in a mixture of both, then we are entering the world of engineering tradeoffs, and no unique answer can be given for all systems. On a case-by-case basis, according to the class of target applications, functionalities, and system sub-elements, one tries to find the architecture that works close to optimal in some sense!

For example, the authors give an example of automatic recovery which goes as follows: “In telephone exchanges, a failure that could cause a single call to be lost is considered not worth providing explicit recovery for, since the caller will probably replace the call if it matters”, and they claim this is an instance of the end-to-end argument. I would argue that since this functionality could be implemented by the communication subsystem or by the end users, the end-to-end argument as given above does not hold. Rather, the authors are implicitly solving an optimization problem that minimizes a certain cost function. If one were designing this system for very lazy users, one would probably design it differently.

I believe this is a good paper to keep in the syllabus since it addresses a very complex and confusing topic, namely function placement.