OVERVIEW OF RECENT SUPERCOMPUTERS
In this report we give an overview of high-performance computers which are
currently available or will become available within a short time frame from
vendors; no attempt is made to list all machines that are still in the
development phase. The machines are described according to their
macro-architectural class. Shared- and distributed-memory SIMD and MIMD machines
are distinguished. The information about each machine is kept as compact as
possible. Moreover, no attempt is made to quote price information as this is
often even more elusive than the performance of a system. In addition, some
general information about high-performance computer architectures and the
various processors and communication networks employed in these systems is given
in order to better appreciate the systems information given in this report.
This document reflects the technical state of the supercomputer arena
as accurately as possible. However, neither the author nor NCF takes any
responsibility for errors or mistakes in this document. We encourage
anyone who has comments or remarks on the contents to inform us, so we
can improve this report.
INTRODUCTION AND ACCOUNT
This is the 17th edition of a report in which we attempt to give
an overview of high-performance computer systems that are commercially
available or are expected to become available within a short time frame
(typically a few months to half a year). We choose the expression "attempt"
deliberately because the market of high-performance machines is highly
volatile: the rate at which systems are introduced, and disappear again,
is high (although not as high as a few years ago) and therefore the
information may be only approximately valid. Nevertheless, we think that such
an overview is useful for those who want to obtain a general idea about the
various means by which these systems strive for high performance, especially
when it is updated on a regular basis.
We will try to be as up-to-date and compact as possible and on these grounds we
think there is a place for this report. At this moment the systems that are
disappearing from the market in a certain sense balance the ones that are newly
appearing. This is because the spectrum of integrated systems has lost a few
members but also some (few) new systems have appeared. Besides that, an
enormous growth in the use of clusters, some very large, can be observed. A
larger number of new systems may be expected in the next few years because of
the renewed interest in computer architectures both on the processor level and
the macro-architecture level. This new interest was sparked by the introduction
of the Earth Simulator system in Japan which, by TOP500 standards (see [49]),
was for a long time the most powerful
system in the world. As a result a new discussion emerged in the USA to look at
new (and old) architectures as the gap between Theoretical Peak Performance and
application performance for systems born from the ASCI initiative (see [3]) has
grown steadily and, for many users of such systems, to an unacceptable level.
Programs like the DARPA-funded HPCS program should curb this trend which cannot
be done by staying on the same track of SMP-clustered RISC processor technology
without further enhancements in memory access and intra- and inter-node
bandwidth. Furthermore, it was realised that no one processor type is best
for all possible types of computation. So, a trend is emerging of diversifying
processor types within a single system. A first sign of this is the appearance
of FPGAs, high-end graphics cards and other computation accelerators in systems
with standard processors. We may expect that this trend will continue in the
coming years which will make the high-performance computer landscape more
diverse and interesting.
Still, the majority of systems look like minor variations on the same
theme: clusters of RISC(EPIC)-based Symmetric Multi-Processing (SMP) nodes which
in turn are connected by a fast network.
Culler et al. [9] consider this as
a natural architectural evolution. However, it may also be argued that the
requirements formulated in the ASCI programs have steered these systems in this
direction and that this will change in the coming years for the reasons given
above.
The supercomputer market is a very dynamic one and this is especially
true for the clusters that have emerged at a tremendous rate in
the last few years. The number of vendors that sell pre-configured
clusters has boomed accordingly and, as for the last few issues, we have
decided not to include such configurations in this report: the
speed with which cluster companies and systems appear and disappear
makes this almost impossible. We will briefly comment on cluster
characteristics and their position relative to other supercomputers in
section Clusters though.
For the tightly-coupled or ``integrated'' parallel systems, however,
we can by updating this report at least follow the main trends in
popular and emerging architectures. The details of the systems to be
reported make this a rather bulky document.
As of the 11th issue we decided to introduce a section that
describes the dominant processors in some detail. This seems fitting as the
processors are the heart of the systems. We do that in section Processors. In
addition, from the 13th
issue on we include a section that discusses specific network implementations,
these also being constituents of primary importance, apart from the general
discussion about communication networks.
The rule for including systems is as follows: they should be either available
commercially at the time of appearance of this report, or within 6 months
thereafter. This excludes some interesting cluster systems at the Sandia, Los
Alamos, and Lawrence Livermore National Laboratories in the USA (all with
measured performances in the range of 10-100 Tflop/s) and the Japanese Earth
Simulator system (with a performance around 40 Tflop/s) because they are not
marketed or represent standard cluster technology (be it on a grand scale).
The rule that systems should be available within a time-span of 6
months is to avoid confusion by describing systems that are announced
much too early, just for marketing reasons and that will not be
available to general users within a reasonable time. We also have to
refrain from including all generations of a system that are still in
use. Therefore, for instance, we do not include the earlier IBM SP or the Cray
T90 series anymore although some of these systems are still in use.
Generally speaking, we include machines that are presently marketed or
will be marketed within 6 months. To add to the information given in
this report, we quote the Web addresses of the vendors because the
information found there may be more recent than what can be provided
here. On the other hand, such pages should be read with care because it
will not always be clear what the status is of the products described
there.
Some vendors offer systems that are identical in all respects except in
the clock cycle of the nodes (examples are the SGI Altix series
and the Fujitsu PRIMEQUEST). In these cases we mention only the
models with the fastest clock as it will always be possible to get the
slower systems and we presume that the reader is primarily interested
in the highest possible speeds that can be reached with these systems.
Until the eighth issue of this report we ordered the systems by their
architectural classes as explained in section
architecture. However, this distinction became more and more artificial as
is explained in the same section. Therefore all systems described are simply
listed alphabetically. In the header of each system description the machine
type is provided. The architectural class is referred to in as far as this
is relevant. We omit price information which in most cases is next to useless.
If available, we will give some information about performances of systems based
on user experiences instead of only giving theoretical peak performances. Here
we have adhered to the following policy: We try to quote best measured
performances, if available, thus providing a more realistic upper bound than
the theoretical peak performance. We hardly have to say that the speed range of
supercomputers is enormous, so the best measured performance will not always
reflect the performance of the reader's favorite application. In fact, when the
HPC Linpack test is used to measure the speed it is almost certain that for the
average user the application performance will be significantly lower. When we
give performance information, it is not always possible to quote all sources
and in any case if this information seems (or is) biased, this is entirely the
responsibility of the author of this report. He is quite willing to be
corrected or to receive additional information from anyone who is in the
position to do so.
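The relation between theoretical peak performance and best measured performance described above can be made concrete with a small calculation. The sketch below is only an illustration: the function name `efficiency` and the two figures used (a peak of 1000 Gflop/s and a measured 700 Gflop/s) are assumptions made up for the example and do not describe any system in this report.

```python
def efficiency(measured_gflops: float, peak_gflops: float) -> float:
    """Return the fraction of the theoretical peak that was actually measured."""
    if peak_gflops <= 0:
        raise ValueError("peak performance must be positive")
    if measured_gflops < 0:
        raise ValueError("measured performance cannot be negative")
    return measured_gflops / peak_gflops


# Hypothetical figures, chosen only to illustrate the calculation.
peak = 1000.0      # theoretical peak performance in Gflop/s (assumed)
measured = 700.0   # best measured performance in Gflop/s (assumed)

print(f"Measured performance is {efficiency(measured, peak):.0%} of peak")
```

A best measured figure obtained with a Linpack-type test would give a ratio at the favourable end; for the reasons given above, the ratio for an average application will usually be lower.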
Although for the average user the appearance of new systems in the last years
tended to become rapidly more and more alike, it is still useful to dwell a
little on the architectural classes that underlie this appearance. It gives
some insight into the various ways that high performance is achieved and a
feeling for why machines perform as they do. This is done in section
architecture, which will be referred to repeatedly
in sections that describe the various systems.
Up till the 10th issue we included a section,
Systems disappeared from the list, on
systems that disappeared from the market. We reduced that section
in the printed and PostScript versions because it tends to
take an unreasonable part of the total text. Still, because this
information is of interest to a fair number of readers and it gives
insight into the historical development of supercomputing
over the last 15 years, this information will still be available in
full in the aforementioned section. In
section Systems under development we present some systems that are under development
and have a fair chance to appear on the market. Because of the addition
of the section on processors that introduces many technical terms, a
glossary is also included.
The overview given in this report concentrates on the computational
capabilities of the systems discussed. To do full justice to all assets
of present-day high-performance computers one should list their I/O
performance and their connectivity possibilities as well. However, the
number of possible permutations of configurations, even for one model of a
certain system, is often so large that listing them would multiply the volume
of this report, which we tried to limit for greater clarity. So, not all
features of the systems discussed will be present. Still we think (and
certainly hope) that the impressions obtained from the entries of the
individual machines may be useful to many. We also omitted some systems
that may be characterised as ``high-performance'' in the fields of
database management, real-time computing, or visualisation. Therefore,
as we try to give an overview for the area of general scientific and
technical computing, systems that are primarily meant for database
retrieval like the former AT&T GIS systems or concentrate
exclusively on the real-time user community, like Concurrent Computing
Systems, are not discussed in this report. Furthermore, we have set
a threshold of about 200 Gflop/s for systems to appear in this report as,
at least with regard to theoretical peak performance, single CPUs often
exceed 2.5 Gflop/s although their actual performance may be an entirely
different matter.
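To give a feeling for what the threshold means, one can work out how many processors it takes to reach it. The sketch below assumes, purely for illustration, that the theoretical peak of a system is the number of CPUs multiplied by the peak of one CPU; the function name `cpus_needed` is hypothetical, and the 200 Gflop/s and 2.5 Gflop/s values are the ones quoted above.

```python
import math


def cpus_needed(threshold_gflops: float, per_cpu_peak_gflops: float) -> int:
    """Smallest CPU count whose combined theoretical peak reaches the threshold.

    Assumes the aggregate peak is simply count * per-CPU peak.
    """
    if threshold_gflops <= 0 or per_cpu_peak_gflops <= 0:
        raise ValueError("both values must be positive")
    return math.ceil(threshold_gflops / per_cpu_peak_gflops)


# Values taken from the text: a 200 Gflop/s threshold and 2.5 Gflop/s per CPU.
print(cpus_needed(200.0, 2.5))
```

Under this assumption a system built from 2.5 Gflop/s processors needs 80 of them to reach the threshold; with faster processors correspondingly fewer are required.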
Although most terms will be familiar to many readers, we still think it
is worthwhile to give some of the definitions in the architecture section
because some authors tend to give them a meaning that may slightly
differ from the idea the reader already has acquired.
Lastly, we should point out that the Web version is available at
various places. The URLs are:
Europe:
http://www.arcade-eu.org/overview.
Europe:
www.phys.uu.nl/~steen/web06/overview.html.
Europe:
http://www.euroben.nl/reports/web06/overview.html
General:
www.top500.org/in_focus/orsc/ .
From this issue on we will attempt to keep the web version up-to-date by
refreshing the contents more frequently than once a year. So, the printed
version may lag a little behind the web version over the year.