SELFMAN Longer Summary


The goal of SELFMAN is to make large-scale distributed applications self-managing by combining the strong points of component models and structured overlay networks. One of the key obstacles to deploying large-scale applications on networks such as the Internet or company intranets is management. Currently, many specialized personnel are needed to keep large Internet applications running. SELFMAN will contribute to removing this obstacle, and thus enable the development of many more Internet applications and Internet-based companies that depend on such applications.

In the context of SELFMAN, we define self management along four axes: self configuration (systems configure themselves according to high-level management policies), self healing (systems automatically handle faults and repair them), self tuning (systems continuously monitor their performance and adjust their behavior to optimize resource usage and meet service level agreements), and self protection (systems protect themselves against security attacks). SELFMAN will provide self management by combining a component model with a structured overlay network. The component model will support dynamic configuration, that is, the ability of one part of the system to reconfigure other parts at run-time. This is the key property that underlies the self-management abilities. Basing the system on a structured overlay network enables SELFMAN to extend the abilities of the component model to large-scale distributed systems. Structured overlay networks have made much progress since their origins in peer-to-peer file-sharing applications. In contrast to those file-sharing applications, structured overlay networks provide guarantees for efficient communication and for reorganization in case of failures. These guarantees are already low-level self-management properties. By combining them with the component model, SELFMAN will build high-level self-management properties on top of these low-level properties.
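As an illustration of the low-level self healing that a structured overlay provides, the following is a minimal sketch and not SELFMAN's design. It assumes a ring-shaped identifier space in which each key is handled by the first node at or after the key's position. The names Ring, ring_id, join, fail, and responsible_node are hypothetical and chosen only for this example, and the whole ring is held in one process for simplicity.

```python
import bisect
import hashlib


def ring_id(name: str) -> int:
    """Map a name to a position on a 160-bit identifier ring (SHA-1)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Toy structured overlay: nodes sit on a ring, keys go to their successor."""

    def __init__(self, node_names):
        self._names = {}  # ring position -> node name
        self._ids = []    # sorted ring positions of live nodes
        for name in node_names:
            self.join(name)

    def join(self, name: str) -> None:
        """Add a node; only keys between its predecessor and itself move to it."""
        nid = ring_id(name)
        if nid not in self._names:
            self._names[nid] = name
            bisect.insort(self._ids, nid)

    def fail(self, name: str) -> None:
        """Remove a failed node; its keys fall to the next node on the ring."""
        nid = ring_id(name)
        del self._names[nid]
        self._ids.remove(nid)

    def responsible_node(self, key: str) -> str:
        """Return the live node currently responsible for the key."""
        if not self._ids:
            raise LookupError("no live nodes in the ring")
        idx = bisect.bisect_left(self._ids, ring_id(key))
        if idx == len(self._ids):
            idx = 0  # wrap around the ring
        return self._names[self._ids[idx]]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"])
    owner = ring.responsible_node("customer-42")
    ring.fail(owner)  # simulate a crash of the responsible node
    print(owner, "->", ring.responsible_node("customer-42"))
```

In this sketch, a failure changes responsibility only for the keys of the failed node. All other keys keep their owner, which is the kind of bounded reorganization that higher-level management components could build on.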

SELFMAN will do both foundational research and applied research. The foundational research will design a distributed service architecture that combines structured overlay networks (for communication and basic self management) with component models (for the higher-level self-management primitives). To make the research concrete, we will target multi-tier applications, and specifically we will build two-tier applications using a self-managing storage (database) service. We will use industrial trace data to measure the effectiveness of our self-managing architecture. We will do implementation work in two directions: first, to explore how an industry-standard platform (J2EE) can be made self-managing, and second, to push self management as far as we can, in terms of fundamental programming language research, without being restrained by existing tools. The second implementation will be based on the Mozart Programming System and the Oz language. The interplay between these two implementations will benefit both. The industrial partners will use the results of SELFMAN to guide their strategic decisions for distributed systems development.