Free Republic
Browse · Search
General/Chat
Topics · Post Article

IBM US nuke-lab beast 'Sequoia' is top of the flops (petaflops, that is)
The Register | 18th June 2012 09:14 GMT | Timothy Prickett Morgan

Posted on 06/18/2012 10:09:06 AM PDT by Ernest_at_the_Beach

ISC 2012

For the second time in the past two years, a new supercomputer has taken the top ranking in the Top 500 list of supercomputers – and it does not use a hybrid CPU-GPU architecture. But the question everyone will be asking at the International Supercomputing Conference in Hamburg, Germany today is whether this is the last hurrah for such monolithic parallel machines, and whether the move toward hybrid machines, where GPUs or other kinds of coprocessors do most of the work, is inevitable.

LLNL's Sequoia BlueGene/Q super being assembled by IBM

No one can predict the future, of course, even if they happen to be Lawrence Livermore National Laboratory (LLNL) and even if they happen to have just fired up IBM's "Sequoia" BlueGene/Q beast, which has been put through the Linpack benchmark paces, delivering 16.32 petaflops of sustained performance running across the 1.57 million PowerPC cores inside the box.

Sequoia has a peak theoretical performance of 20.1 petaflops, so 81.1 per cent of the possible clocks in the box that could do work running Linpack did so when the benchmark test was done. LLNL was where the original BlueGene/L super was commercialized, so that particular Department of Energy nuke lab knows how to tune the massively parallel Power machine better than anyone on the planet, meaning the efficiency is not a surprise. And the supercomputing lab was absolutely banking on the Sequoia machine's power efficiency; the super-efficient beast burns only 7.89 megawatts to deliver that 16.32 petaflops of oomph.

The former top flopper on the list – the K massively parallel Sparc64-VIIIfx machine built by Fujitsu for the Japanese government, which shifts down to number two – had a sustained Linpack performance of 10.5 petaflops against a peak of 11.3 petaflops, for an impressive efficiency of 93.2 per cent. But this monster Sparc box sucks down 12.7 megawatts when it is running, or a mere 830 megaflops per watt. Sequoia is 2.5 times as energy efficient as K, at least when running Linpack.
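
The efficiency and energy figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming only the rounded numbers quoted in the article; the names `MACHINES`, `efficiency_pct` and `megaflops_per_watt` are made up for illustration. Because the inputs are rounded, the results can differ from the article's figures in the last digit.

```python
# Linpack figures as quoted in the article: sustained and peak performance
# in petaflops, power draw in megawatts.
MACHINES = {
    "Sequoia": {"sustained_pf": 16.32, "peak_pf": 20.1, "power_mw": 7.89},
    "K": {"sustained_pf": 10.5, "peak_pf": 11.3, "power_mw": 12.7},
}


def efficiency_pct(machine):
    """Share of theoretical peak actually sustained on Linpack, in per cent."""
    return 100.0 * machine["sustained_pf"] / machine["peak_pf"]


def megaflops_per_watt(machine):
    """Sustained megaflops delivered per watt of power drawn."""
    megaflops = machine["sustained_pf"] * 1e9  # 1 petaflops = 1e9 megaflops
    watts = machine["power_mw"] * 1e6          # 1 megawatt = 1e6 watts
    return megaflops / watts


if __name__ == "__main__":
    for name, machine in MACHINES.items():
        print(f"{name}: {efficiency_pct(machine):.1f}% efficient, "
              f"{megaflops_per_watt(machine):.0f} megaflops per watt")
    ratio = (megaflops_per_watt(MACHINES["Sequoia"])
             / megaflops_per_watt(MACHINES["K"]))
    print(f"Sequoia vs K energy efficiency: {ratio:.2f}x")
```

The same two functions apply to any other machine on the list once its sustained, peak and power figures are added to the dictionary.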

Thus far, the largest hybrid CPU-GPU machine on the list, the Tianhe-1A super at the National Supercomputing Center in Tianjin, China – which uses a mix of Intel Xeon X5670 processors and Nvidia Tesla M2050 GPU coprocessors – wastes 45.4 per cent of its aggregate number-crunching capability because of the hybrid programming model and the latencies in talking between the CPU and the GPU. The Tianhe-1A, which delivers 2.57 petaflops of performance, only delivers 635 megaflops per watt. By contrast, the Sequoia machine is 3.25 times as energy efficient per unit of real work done – again, assuming that you consider Linpack indicative of real work.

The Top 500 list does not include the cost of the machines, which is also a very important factor. The BlueGene/Q machine costs millions of dollars per rack – IBM does not say how much precisely, since this is essentially a custom product – and can scale to 512 racks and up to 100 petaflops of aggregate peak performance. The trouble is, who has the estimated $1bn to build such a behemoth? Governments have a hard time coming up with that kind of cash these days, even if they do want to simulate nuclear weapons.

The point is that a real ranking of the world's supercomputers would look at sustained performance, computational efficiency, performance per watt, and bang for the buck. Three out of four ain't bad, but it also ain't enough. (Moreover, the power draw figures are not available for all of the machines on the Top 500 list, so it is more like two-and-a-half out of four.)

LLNL awarded Big Blue the contract to build Sequoia back in February 2009. The massively parallel machine is based on IBM's 18-core PowerPC A2 processor, which is a 64-bit chip that has one core to run the Linux kernel; one spare in case one goes dead; and 16 cores for doing compute tasks. One chip and 16GB of memory are packaged up on a compute card, while 32 cards are plugged into a node card – which has optical modules to link into the 5D torus that allows all the nodes to talk to each other. You put 16 of these node cards in a chassis with eight I/O drawers to make a half-rack midplane, and then stack two of these to make a rack.
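
The packaging hierarchy described above multiplies out to a per-rack node and core count. Here is a rough sketch of that arithmetic, using only the figures in the paragraph; the constant and function names are hypothetical. It assumes that Top 500 core counts include only the 16 compute cores per chip, an inference that fits the 131,072-core and 114,688-core BlueGene/Q machines mentioned later in the article (eight and seven racks respectively) but is not stated there.

```python
# Packaging figures as described in the article.
COMPUTE_CORES_PER_CHIP = 16   # of 18 cores: one runs the kernel, one is a spare
MEMORY_GB_PER_CARD = 16       # one chip plus 16GB of memory per compute card
CARDS_PER_NODE_CARD = 32
NODE_CARDS_PER_MIDPLANE = 16
MIDPLANES_PER_RACK = 2


def compute_cards_per_rack():
    """Compute cards (one chip each) in a full rack."""
    return CARDS_PER_NODE_CARD * NODE_CARDS_PER_MIDPLANE * MIDPLANES_PER_RACK


def compute_cores_per_rack():
    """Cores available for compute tasks in a full rack."""
    return compute_cards_per_rack() * COMPUTE_CORES_PER_CHIP


def memory_gb_per_rack():
    """Main memory in a full rack, in gigabytes."""
    return compute_cards_per_rack() * MEMORY_GB_PER_CARD


def racks_for_cores(core_count):
    """Racks implied by a core count, assuming only compute cores are counted."""
    return core_count / compute_cores_per_rack()


if __name__ == "__main__":
    print("compute cards per rack:", compute_cards_per_rack())
    print("compute cores per rack:", compute_cores_per_rack())
    print("memory per rack (GB):", memory_gb_per_rack())
    # The article rounds Sequoia's core count to 1.57 million.
    print("approximate Sequoia racks:", round(racks_for_cores(1.57e6)))
```

Under these assumptions the 1.57 million cores quoted for Sequoia work out to roughly 96 racks; that rack count is derived here, not taken from the article.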

The BlueGene/Q interconnect runs at 40Gb/sec and has a node-to-node latency hop of 2.5 microseconds. The logic for that 5D torus interconnect is embedded on the PowerPC A2 chips, which run at 1.6GHz, with 11 links running at 2GB/sec. Two of these can be used for PCI-Express 2.0 x8 peripheral slots. The 14-port crossbar switch/router at the center of the chip supports point-to-point, collective, and barrier messages and also implements direct memory access between nodes.

Like K and its "Tofu" 6D torus/mesh interconnect, this flagship BlueGene/Q super is no slouch on any dimension you want to measure. Fujitsu has commercialized the K super as the PrimeHPC FX10 line, which has a 16-core Sparc64-IXfx processor and which scales to 23 petaflops. The only problem is the all-out FX10 machine with 1,024 racks – that's 98,304 compute nodes and 6PB of main memory – burns 23 megawatts and costs $655.4m at list price. That's a big number, even for the HPC racket. (And no, it cannot play Crysis, and neither can BlueGene/Q. A Windows-based ceepie-geepie certainly could, so there is that to consider.)

IBM takes five out of the top 10

IBM is having a very good Top 500 this time around, with five of the top 10 systems bearing its stripey moniker.

Number three on the list behind the K super is another BlueGene/Q machine called "Mira," which is installed at Argonne National Laboratory. This machine is essentially half of Sequoia.

Number four on the list is SuperMUC, another IBM box, but this one is based on Intel's latest Xeon E5-2680 processors, which are plopped into IBM's iDataPlex dx360 M4 rackish-bladish servers. SuperMUC was built by IBM under contract from the Partnership for Advanced Computing in Europe (PRACE) for the Leibniz-Rechenzentrum (LRZ) located in Germany. The SuperMUC contract was awarded in January 2011, and the neat thing about this box is that there are water blocks on processors and main memory on the iDataPlex system boards, and a closed-loop water-cooling system uses relatively warm water (up to 45 degrees Celsius, which is 113 degrees Fahrenheit) to keep these active components from overheating. (We'll be taking a separate look at SuperMUC later.) SuperMUC cost $110.9m to build and operate over five years, according to the contract; the machine currently has 147,456 Xeon cores and delivers just under 2.9 petaflops of sustained performance on the Linpack test with a computational efficiency of 91 per cent. That's pretty good, and is no doubt helped by the 56Gb/sec FDR InfiniBand network linking those iDataPlex nodes together. But the machine does burn 3.42 megawatts, so it only delivers 847 megaflops per watt. LLNL's Sequoia is 2.44 times as energy efficient.

Number five is the Tianhe-1A machine, a ceepie-geepie that weighs in at 2.57 petaflops, was the fastest machine in the world back in November 2010, and signaled the arrival of China as a contender in the exascale supercomputing arms race.

The "Jaguar" massively parallel supercomputer ranks sixth on the list and is installed at Oak Ridge National Laboratory, another nuke lab controlled by the US Department of Energy. Jaguar is in the process of being upgraded to the 20-petaflops "Titan" super ceepie-geepie. But this process is only just beginning, with nodes being upgraded by Cray to the latest Opteron 6274 processors and boosted with the latest "Gemini" XE interconnect and some of the nodes getting Nvidia Tesla M2090 coprocessors.

As it now stands, Jaguar has 298,592 cores and delivers 1.94 petaflops of sustained oomph (at a computational efficiency of 73.9 per cent across those CPUs and GPUs), but Jaguar consumes an incredible 5.14 megawatts of electricity as it runs. That works out to only 377.5 megaflops per watt. At this point in the transformation from Jaguar to Titan, Sequoia is 5.5 times as energy efficient. That gap will close considerably as "Kepler" Tesla K20 GPUs are added to the Titan machine this fall and Oak Ridge takes advantage of GPUDirect and all of the funky innovations that Nvidia has put into these GPU coprocessors.

The rest of the best

Numbers seven and eight on the June 2012 Top 500 supers ranking are both BlueGene/Q machines. Number seven is nicknamed "Fermi" and is installed at CINECA, a consortium of 54 universities in Italy that has a long history of buying IBM and Cray supers. The Fermi super has 163,840 cores and delivers 1.73 petaflops of sustained Linpack performance. Number eight is called "JuQueen" and is at Forschungszentrum Juelich (FZJ) in Germany. This machine is a 131,072-core BlueGene/Q that delivers 1.38 petaflops sustained on Linpack.

Groupe Bull's "Curie" thin node machine – based on the Bullx B510 server nodes with Xeon E5-2680 processors and using 40Gb/sec InfiniBand interconnect – has 77,184 cores and delivers 1.36 petaflops of double-precision matrix math performance. This machine has a computational efficiency of 81.5 per cent, which is not bad, but it only delivers 603.7 megaflops per watt, which is not all that great in terms of energy efficiency. (But, if your code is tuned for x86 and InfiniBand, then maybe this is what matters more.)

Rounding out the top 10 is the "Nebulae" ceepie-geepie built by China's Dawning for the National Supercomputing Center in Shenzhen. Nebulae pairs Xeon X5650 processors from Intel with Tesla M2050 GPU coprocessors from Nvidia. This machine came into the number two position on the June 2010 Top 500 list and is unchanged from that time. This box has 120,640 cores in total and delivers 1.27 petaflops of performance but burns a staggering 2.58 megawatts. Nebulae has a computational efficiency of only 42.6 per cent and delivers only 492.6 megaflops per watt.

Sequoia, as you would expect from a giant redwood tree, is being true to its name and setting the performance and efficiency bars pretty high in the HPC arms race.

Incidentally, the United Kingdom nearly edged into the top 10, with the 114,688-core BlueGene/Q machine nicknamed "Blue Joule" at Daresbury Laboratory, weighing in at 1.21 petaflops and giving the lab the number 13 spot in the world Linpack rankings this time around. That's a little shy of the 1.4 petaflops it was expecting and the number 10 ranking Daresbury was hoping for. The November 2012 list will have the "Blue Waters" XK6 hybrid CPU-GPU super from Cray on it as well as the fully upgraded Titan machine at Oak Ridge, so if nothing else changes, Daresbury needs to add a few hundred teraflops to make the top 10 this autumn.


TOPICS: Computers/Internet
KEYWORDS: exascale; hitech; hpcibmlinpavk; lawrencelivermore; llnl; oakridge; q; sequoia; svs

1 posted on 06/18/2012 10:09:15 AM PDT by Ernest_at_the_Beach

To: Ernest_at_the_Beach

Big jump from my 1410/7010 Autocoder days!


2 posted on 06/18/2012 10:19:20 AM PDT by duckman (Go Newt...)

To: Ernest_at_the_Beach

Impressive but let it be self-powering, or even be able to turn the power on for itself, and that’ll really be a show stopper.

No power == tons of nothing. No programs == big power bill for nothing.

Wonder if the Chinese can turn off any of those chips.


3 posted on 06/18/2012 10:25:43 AM PDT by Secret Agent Man (I can neither confirm or deny that; even if I could, I couldn't - it's classified.)

To: Ernest_at_the_Beach
I assume most, if not all, run *nix.

4 posted on 06/18/2012 10:27:16 AM PDT by Uri’el-2012 (Psalm 119:174 I long for Your salvation, YHvH, Your law is my delight.)

To: duckman

Let’s see, that works out to the computing power of how many Commodore 64’s?


5 posted on 06/18/2012 10:27:28 AM PDT by In Maryland (Liberal logic - the ultimate oxymoron!)

To: duckman
That was a good machine.

The Sequoia is a big jump from my 360/370/3090 with Vector days....

6 posted on 06/18/2012 10:28:03 AM PDT by Ernest_at_the_Beach (The Global Warming Hoax was a Criminal Act....where is Al Gore?)

To: Uri’el-2012; ShadowAce
I was reading an article the other day where 91+% of the top (50 or 100 or 500) ran Linux...and there was only one running Windows....think the rest were various flavors of Unix....

Got to find the article again ....to clear up the number in the list....forget where I read it.

7 posted on 06/18/2012 10:33:06 AM PDT by Ernest_at_the_Beach (The Global Warming Hoax was a Criminal Act....where is Al Gore?)

To: rdb3; Calvinist_Dark_Lord; Salo; JosephW; Only1choice____Freedom; amigatec; stylin_geek; ...

8 posted on 06/18/2012 10:35:05 AM PDT by ShadowAce (Linux -- The Ultimate Windows Service Pack)

To: Uri’el-2012
Found this from last year:

Where Linux crushes Windows like a bug: Supercomputers

***********************************************EXCERPT**************************************

By Steven J. Vaughan-Nichols | November 14, 2011, 10:11am PST

Summary: Linux is tiny on desktops, powerful on servers, mighty on Web servers, and rules over all on supercomputers.


9 posted on 06/18/2012 10:43:51 AM PDT by Ernest_at_the_Beach (The Global Warming Hoax was a Criminal Act....where is Al Gore?)

To: Ernest_at_the_Beach

With ten years in HPC under my belt, I can vouch for that. Linux flat-out is the rule in HPC.


10 posted on 06/18/2012 10:52:14 AM PDT by RightOnline (I am Andrew Breitbart!)

To: RightOnline
I have seen a couple of distros on Distrowatch...for clusters...Like Dragonfly ...

I suppose on the top stuff they roll their own.

Guess it is based on BSD:

DragonFly BSD

The following distributions match your criteria:


1. PelicanHPC GNU Linux
PelicanHPC is a Debian-based live CD image with a goal to make it simple to set up a high performance computing cluster. The front-end node (either a real computer or a virtual machine) boots from the CD image. The compute nodes boot by Pre-Execution Environment (PXE), using the front-end node as the server. All of the nodes of the cluster get their file systems from the same CD image, so it is guaranteed that all nodes run the same software. The CD image is created by running a single script, which makes it possible to customise the live CD image with extra Debian packages.

2. Rocks Cluster Distribution
Rocks is a complete "cluster on a CD" solution for x86 and x86_64 Red Hat Linux clusters. Building a Rocks cluster does not require any experience in clustering, yet a cluster architect will find a flexible and programmatic way to redesign the entire software stack just below the surface (appropriately hidden from the majority of users). Although Rocks includes the tools expected from any clustering software stack (PBS, Maui, GM support, Ganglia, etc), it is unique in its simplicity of installation.

11 posted on 06/18/2012 11:01:52 AM PDT by Ernest_at_the_Beach (The Global Warming Hoax was a Criminal Act....where is Al Gore?)

To: Ernest_at_the_Beach
I had to do a double-take when I saw "Sequoia"... but there are probably only so many names to go around.

Back in the early 90s our SMP lab had machines from many different vendors. Sequoia is one of the companies we had contact with at that time.

12 posted on 06/18/2012 11:29:35 AM PDT by ken in texas (I was taught to respect my elders but it keeps getting harder to find any.)

To: Ernest_at_the_Beach

Petaflops and PETA...

I know there’s a joke in there somewhere!


13 posted on 06/18/2012 11:47:45 AM PDT by Jack Hydrazine (It's the end of the world as we know it and I feel fine!)

To: Jack Hydrazine

petaflops?


14 posted on 06/18/2012 12:23:07 PM PDT by brivette

To: duckman; RightOnline; Jack Hydrazine; NormsRevenge; Fred Nerks; TigersEye
From this article:

IBM's BlueGene/Q super chip grows 18th core

*********************************************EXCERPT**************************************


15 posted on 06/18/2012 12:28:49 PM PDT by Ernest_at_the_Beach (The Global Warming Hoax was a Criminal Act....where is Al Gore?)

To: brivette
FLOPS/flops stands for floating-point operations per second.

A petaflop is 10^15 flops, or a million billion floating-point operations per second.
16 posted on 06/18/2012 12:31:58 PM PDT by Jack Hydrazine (It's the end of the world as we know it and I feel fine!)

To: Jack Hydrazine

thank you for the clarification.


17 posted on 06/18/2012 12:54:33 PM PDT by brivette

To: Ernest_at_the_Beach

Most HPC clusters run Red Hat or SUSE. As for apps., they fall into two categories: roll-your-own or off-the-shelf. I’d say 99.9% of Blue Gene apps. are RYO. In fact, I’m sure of it.


18 posted on 06/18/2012 6:50:11 PM PDT by RightOnline (I am Andrew Breitbart!)

To: ken in texas

There is an election / voting machine company called Sequoia Voting Systems as well.


19 posted on 11/20/2020 9:54:48 PM PST by piasa (Attitude adjustments offered here free of charge)

Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson