The Cray Inc. XT4

Machine type: Distributed-memory multi-processor
Models: XT4
Operating system: UNICOS/lc, Cray's microkernel Unix
Connection structure: 3-D torus
Compilers: Fortran 95, C, C++
Vendor's information Web page: www.cray.com/products/xt4/index.html/
Year of introduction: 2006

System parameters:

Model: Cray XT4
Clock cycle: 2.6 GHz
Theor. peak performance
  Per processor: 10.4 Gflop/s
  Per cabinet: 998.4 Gflop/s
  Max. configuration: 319 Tflop/s
Memory
  Per cabinet: ≤ 768 GB
  Max. configuration: 196 TB
No. of processors
  Per cabinet: 96
  Max. configuration: 30,508
Communication bandwidth
  Point-to-point: ≤ 7.6 GB/s
  Bisectional/cabinet: 667 GB/s
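
The per-processor and per-cabinet peak figures above follow from simple arithmetic. The sketch below reproduces them under two assumptions that the table itself does not state: that a "processor" is one dual-core Opteron chip (as described in the Remarks), and that each core completes 2 floating-point operations per clock cycle. The names used are illustrative only.

```python
# Values taken from the parameter table.
CLOCK_GHZ = 2.6
PROCESSORS_PER_CABINET = 96

# Assumptions, not stated in the table:
CORES_PER_PROCESSOR = 2   # one dual-core Opteron per processor
FLOPS_PER_CYCLE = 2       # floating-point results per core per clock cycle


def peak_gflops(processors: int) -> float:
    """Theoretical peak in Gflop/s for the given number of processors."""
    return processors * CORES_PER_PROCESSOR * FLOPS_PER_CYCLE * CLOCK_GHZ


per_processor = peak_gflops(1)
per_cabinet = peak_gflops(PROCESSORS_PER_CABINET)

print(f"Per processor: {per_processor:.1f} Gflop/s")
print(f"Per cabinet:   {per_cabinet:.1f} Gflop/s")
```

Under these assumptions the two printed values match the 10.4 Gflop/s and 998.4 Gflop/s entries in the table.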

Remarks:

Architecturally the Cray XT4 is very similar to its predecessor, the Cray XT3, which is still marketed alongside the XT4 (see the XT3). The main structure of the internode network and of the nodes themselves is unchanged. However, there are some significant improvements over the XT3: the point-to-point network speed has doubled to a 7.6 GB/s bi-directional link, and the nodes now use dual-core Opterons at a clock frequency of 2.6 GHz, which also doubles the peak performance per node. In addition, the memory bandwidth has doubled from 6.4 GB/s to 12.8 GB/s. The speed of the connection to the communication router, the SeaStar2, has nevertheless stayed the same at 6.4 GB/s, the standard HyperTransport speed.
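
One way to read these figures is as bandwidth available per unit of peak performance. The sketch below is a rough illustration, assuming one processor per node (so the 10.4 Gflop/s table entry is the XT4 node peak) and inferring the XT3 node peak as half of that from the "doubling" statement. The XT3 value is therefore an inference, not a quoted figure.

```python
# Per-node figures. XT4 peak is from the parameter table; the XT3 peak is
# inferred from the statement that peak performance per node doubled.
PEAK_GFLOPS = {"XT3": 10.4 / 2, "XT4": 10.4}
MEMORY_BW_GBS = {"XT3": 6.4, "XT4": 12.8}   # memory bandwidth
ROUTER_BW_GBS = {"XT3": 6.4, "XT4": 6.4}    # HyperTransport link to the router


def bytes_per_flop(bandwidth_gbs: float, peak_gflops: float) -> float:
    """Bandwidth available per peak floating-point operation."""
    return bandwidth_gbs / peak_gflops


balance = {
    model: {
        "memory": bytes_per_flop(MEMORY_BW_GBS[model], PEAK_GFLOPS[model]),
        "router": bytes_per_flop(ROUTER_BW_GBS[model], PEAK_GFLOPS[model]),
    }
    for model in ("XT3", "XT4")
}

for model, ratios in balance.items():
    print(f"{model}: memory {ratios['memory']:.2f} B/flop, "
          f"router {ratios['router']:.2f} B/flop")
```

Under these assumptions the memory balance is the same for both models, while the bandwidth to the router per peak flop is halved on the XT4.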

Measured Performances:

As already given in the Performance section of the XT3, ORNL in the USA reports in [49] a speed of 101.4 Tflop/s out of a peak of 127.4 Tflop/s on a 26,544-core system, an efficiency of 80%. ORNL also reports in [49] a performance of 101.7 Tflop/s on a 23,016-processor regular XT3/XT4 mixed system for a linear system of size 2,220,160, an efficiency of 85%. Additional benchmark results for this system are available from Kuehn et al. [26].
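
The efficiency quoted here is the ratio of measured to theoretical peak performance. The short sketch below recomputes it for the 101.4 out of 127.4 Tflop/s figure and, for the 101.7 Tflop/s result, derives the peak implied by the stated 85% efficiency. That implied peak is a back-calculation for illustration, not a number reported in the text.

```python
def efficiency(measured_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak that was actually achieved."""
    return measured_tflops / peak_tflops


def implied_peak(measured_tflops: float, eff: float) -> float:
    """Peak performance implied by a measured speed and an efficiency."""
    return measured_tflops / eff


eff_26544_cores = efficiency(101.4, 127.4)
peak_23016_procs = implied_peak(101.7, 0.85)   # back-calculated, not quoted

print(f"Efficiency on the 26,544-core system: {eff_26544_cores:.1%}")
print(f"Implied peak of the 23,016-processor system: "
      f"{peak_23016_procs:.1f} Tflop/s")
```

The first value comes out just under 80%, consistent with the rounded efficiency given above.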