Progress in distributed computing can be framed as an ever-tightening integration and standardization of commodity hardware, followed (not always quickly) by the exploitation of that hardware through software. Viewed through this lens, clustered computing is here now and has been here for a while. Only the attributes that describe it at any given time have been changing, but those attributes determine its practicality. What are some of these attributes? Hardware attributes relate to both the processor and the network. For the processor, the attributes are instruction speed, memory, and IO; the latter two have become problems. For the network, the attributes are latency, bandwidth, and scalability. An additional attribute is uniqueness: the degree to which components depart from commercial off-the-shelf parts, which can be seen as how commercial vendors distinguish their products. Uniqueness affects availability and cost. We are in a time when plug-and-play hardware components provide ever lower-cost computation. Three areas of concern are balance, fault tolerance, and system administration.
Balance refers to how well hardware attributes match the attributes of problem suites. What do problems of interest to industry require in terms of hardware resources? How balanced a system can be built at low, medium, and high cost? What does lack of balance cost US industry? What are the needs of emerging small businesses, and what impact does information technology have on them? How are balance and its impact expected to change over the next decade?
Fault tolerance increases in importance as the time and space of a computation grow: a dispersed or protracted computation is more likely to experience a failure. Failure here refers to hardware faults. How much does fault tolerance in clusters cost in terms of hardware and in terms of productivity? What is its impact on system balance, and how is this expected to change over time?
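The claim that larger and longer computations are more exposed to hardware faults can be made concrete with a simple reliability model. The sketch below is illustrative only: it assumes that nodes fail independently, that each node's failures follow an exponential distribution with the same mean time between failures (MTBF), and that any single node failure counts as a failure of the run. The function names and the numbers in the usage note are hypothetical and are not taken from any measured system.

```python
import math


def failure_probability(nodes, hours, mtbf_hours):
    """Probability that at least one node fails during the run.

    Assumes independent nodes with identical, exponentially
    distributed time to failure (an illustrative assumption).
    """
    # Each node survives the run with probability exp(-hours / mtbf_hours);
    # all nodes survive with that value raised to the power `nodes`.
    return 1.0 - math.exp(-nodes * hours / mtbf_hours)


def system_mtbf(nodes, mtbf_hours):
    """Expected time to the first failure anywhere in the cluster."""
    # Under the same assumptions, failure rates add across nodes.
    return mtbf_hours / nodes
```

Under these assumptions, a 10-hour run on 100 nodes with a per-node MTBF of 1000 hours has the same exposure as a 1000-hour run on a single node: the probability depends only on the product of node count and run time, which is one way to read "time and space" in the paragraph above.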
How can the overhead of system administration be reduced? How far can the administration of parallel and distributed systems be automated? Are there specific utilities that could be developed to simplify it? What are the tradeoffs in terms of user productivity and of the costs of running computer systems, i.e., hardware and personnel costs?
In terms of software, desired attributes include usability, portability, reliability, and scalability. How can parallel and distributed systems be made more usable? What libraries are needed to shorten development and run times? Are there specific classes of libraries that would have the most impact on industry in general, or on specific industries? How can parallel software development be simplified, hastened, and made more reliable? As hardware speeds change, how will this affect software that is scalable today? How can parallel software development be dissociated from particular vendor requirements?
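One way to explore how changing hardware speeds might affect software that scales well today is a toy cost model. The sketch below is an assumption made for illustration, not a model proposed in this text: it treats run time as perfectly divisible compute work plus a fixed communication cost, and it scales the two terms by separate processor and network speed factors. All names and parameters are hypothetical.

```python
def parallel_efficiency(work, comm, procs, cpu_speed=1.0, net_speed=1.0):
    """Parallel efficiency T1 / (P * TP) under a toy model.

    T1 = work / cpu_speed                          (serial run time)
    TP = work / (procs * cpu_speed) + comm / net_speed
    The model ignores load imbalance, contention, and memory effects.
    """
    t_serial = work / cpu_speed
    t_parallel = work / (procs * cpu_speed) + comm / net_speed
    return t_serial / (procs * t_parallel)


# Same program, same network; only the processors get ten times faster.
today = parallel_efficiency(work=1000.0, comm=1.0, procs=100)
faster_cpus = parallel_efficiency(work=1000.0, comm=1.0, procs=100, cpu_speed=10.0)
```

With these made-up numbers, efficiency falls from roughly 0.91 to 0.5 when processor speed improves tenfold and the network does not. In this model, a code's scalability is a property of the code together with the balance of the machine it runs on, which ties the scalability question back to the balance question raised earlier.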
There has been a drift of users from supercomputers to lower-cost, increasingly powerful, higher-volume computers. These computers provide user-friendly environments with a wide range of supporting packages. In addition, code that was written for them a decade ago not only still runs, but runs faster. The lower-cost machines have preserved their users' investment of time.
What will be the ultimate mix of uses of parallel and distributed systems? Will most users prefer to use parallel systems to run multiple sequential programs? Will most parallel computation consist of runs of pre-written packages (as with industry supercomputer users)? Are there specific packages that would be the parallel computing analogues of the popular personal computing packages? One use of supercomputers has been to run turnkey applications, such as crash tests in the automotive industry. As parallel computers drop in price, it becomes cost-effective to pay to have these applications rewritten as turnkey applications for parallel computers. Are these packages going to constitute the bulk of true parallel jobs in the future?