These days, there are a number of different approaches to high-performance computing, systems usually referred to as supercomputers. Most of these systems use a massive number of Xeon processors, but the most interesting new ones are starting to add accelerators, such as Nvidia’s Tesla or Intel’s Xeon Phi. There’s even some talk that massive ARM-based systems could be effective in the future. But what if you could try all of these architectures in one location?
That’s the challenge and promise of the new MareNostrum 4 computer, which is being readied for installation at the Barcelona Supercomputing Center. The new design includes a main system for general-purpose use based on traditional Xeons, plus three new emerging technology clusters, based on IBM Power and Nvidia, Xeon Phi, and ARM-based computing. While I was in Barcelona for Mobile World Congress, I had a chance to talk to Sergi Girona, Operations Director for the BSC, who explained the reasoning behind the four different clusters.
Girona said the center’s main mission is to provide supercomputing services for Spanish and other European researchers, in addition to
For the general computing cluster, Girona says the center chose a traditional Xeon design because it was easier to migrate applications that run on the current MareNostrum 3, slated to be disconnected next week. The design also had to fit the existing space, within a chapel. (I visited the center a year ago.)
The new design, to be built by Lenovo, will be based on the new Xeon v5 (Skylake), with 3,456 nodes, each with two sockets and a 24-core chip in each socket, for a total theoretical peak performance of 11.14 petaflops. Most cores will have 2GB of memory, but 6 percent will have 8GB, for a total of 331.7TB of RAM. Each node will have a 240GB SSD, though
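For readers who want to check the arithmetic, here is a minimal Python sketch that works out the total core count implied by those figures, and roughly how much of the quoted peak each core would account for. The function and constant names are my own, and the per-core figure assumes the 11.14-petaflop peak is spread evenly across all cores, which is a simplification rather than anything BSC or Lenovo has stated.

```python
# Figures as quoted in the article for the general-purpose cluster.
NODES = 3456
SOCKETS_PER_NODE = 2
CORES_PER_CHIP = 24
PEAK_PETAFLOPS = 11.14


def total_cores(nodes: int, sockets_per_node: int, cores_per_chip: int) -> int:
    """Total cores, assuming one chip per socket and every socket populated."""
    return nodes * sockets_per_node * cores_per_chip


def gigaflops_per_core(peak_petaflops: float, cores: int) -> float:
    """Even split of the theoretical peak across cores (an assumption).

    1 petaflop/s equals 1e6 gigaflop/s.
    """
    return peak_petaflops * 1e6 / cores


cores = total_cores(NODES, SOCKETS_PER_NODE, CORES_PER_CHIP)
per_core = gigaflops_per_core(PEAK_PETAFLOPS, cores)

print(f"Total cores: {cores:,}")  # Total cores: 165,888
print(f"Theoretical peak per core: about {per_core:.0f} gigaflops")
```

Under those assumptions, the cluster comes to 165,888 cores, each contributing a theoretical peak in the high 60s of gigaflops.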
One thing I found interesting here is how clearly the move to the new generation demonstrates the progression of technology. The previous generation had a peak performance of about 1 petaflop, and this system should be more than 10 times
For emerging technology, the site will have three new clusters. One will consist of IBM Power 9 processors and Nvidia GPUs, designed to have a peak processing capability of over 1.5 Petaflop/s. This cluster will be built by
The second cluster will be made up of Intel Xeon Phi processors, with Lenovo building a system that uses the forthcoming Knights Hill (KNH) version and OmniPath, with a peak processing capability over 0.5 Petaflop/s. This also mimics the American CORAL
Finally, a third cluster will be formed of 64-bit ARMv8 processors that Fujitsu will provide in a prototype machine, which is designed to use the same processors that Fujitsu is developing
Overall, the system will cost $34 million, in a contract won by IBM and funded by the Spanish government. One major reason for having all four types of computing on
Girona said that BSC wants to influence the development of new technologies, and is planning on using the new machine to analyze what will happen in the future, in
Another topic researchers are considering is whether or not it would be worth developing a European processor for IT, likely based on the ARM architecture.
Barcelona won’t have the fastest supercomputer in the world; that record is currently held by the Chinese, with the Americans and Japanese trying to catch up. But MareNostrum 4 will be the most diverse, and potentially the most interesting.