This week at the Amazon developers conference in Vegas we got to see the latest in supercomputing. I have been fascinated with this genre for many years. In the olden days, these were mammoth beasts, occupying rooms full of gear and burning up electricity like crazy. They cost millions of dollars and had minions tending to their care and feeding.
Back in 2004, I was fortunate enough to go out to San Francisco where a bunch of random folks were trying to assemble the first “flash mob” computer. You brought your own laptop (or desktop if you were strong enough) and left it for the weekend while they hooked it up to a switching fabric and tried to get every PC in sync.
Both the flash mob and the traditional supercomputer are old style. Today we have supercomputers in the cloud. With a few mouse clicks, you can be running on thousands of virtual cores. It was bound to happen, and this week the folks at CycleComputing showed what they could do. I have to say I was impressed.
In one case, they managed to put together 30,000 cores for a big pharma company to do molecular modeling, at a cost of less than $1,300 an hour to run. For the Amazon show, they had virtual machines running across the globe in all eight of Amazon’s data centers. They were able to provision 150,000 cores and run 264 years of compute time in less than a day of actual elapsed time for a materials modeling application. Wow!
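If you want to see why 264 years can fit inside a single day, here is a quick back-of-envelope check using the round numbers from the announcement (the core count and compute-years are theirs; the rest is just arithmetic):

```python
# Back-of-envelope check on the CycleComputing run, using the rounded
# figures quoted above -- not exact billing or scheduling data.
CORES = 150_000                      # virtual cores provisioned across regions
COMPUTE_YEARS = 264                  # total compute time delivered
HOURS_PER_YEAR = 365 * 24            # roughly 8,760 hours

total_core_hours = COMPUTE_YEARS * HOURS_PER_YEAR   # about 2.3 million core-hours
wall_clock_hours = total_core_hours / CORES          # about 15 hours

print(f"{total_core_hours:,} core-hours spread over {CORES:,} cores "
      f"is about {wall_clock_hours:.1f} hours of wall-clock time")
```

Roughly 15 hours of wall-clock time, which is indeed less than a day.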
CycleComputing worked with Amazon to set this all up, so they could get all their virtual machines running in about the same time frame. If you had to build this computer out of physical hardware, it would cost about $68 million. Cycle’s bill from Amazon was $30,000. While that is a lot of money, for the horsepower they put together it really isn’t. Not too long ago, I remember some high-end PC servers costing that much for a single core.
Think about that for a moment: in the past, you couldn’t have gotten all this hardware set up in a matter of months, let alone moments. Most supercomputers take years to build, and by then they are almost obsolete, because someone else is building a bigger one. On the Top500.org list of the biggest ones, the current champ is a Chinese computer with more than three million cores. On core count alone, the CycleComputing assemblage would rank in the top 20 of this list.
Pretty amazing. Silicon Angle has this video interview from the show floor with Jason Stowe of the company.
If you thought the cloud was just a passing fad, this should convince you otherwise.
This isn’t anything all that “new.” But these tools are becoming more accessible to the masses…
Distributed parallel computing like this has been going on for quite some time. People have been either donating or renting out their compute power so others could run protein folding and other tasks on their “spare” compute cycles. The hard part is the coordination: dividing the problem into a lot of parallel tasks that can be performed independently and then combined back into a single result. Latency can’t be a big issue, either. Heck, even malware writers have been using similar technology to spew spam and mount denial-of-service attacks forever.
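To make that pattern concrete, here is a toy sketch in Python, not anything CycleComputing or the protein-folding projects actually run. The problem is split into independent work items, each worker handles its own slice, and the results are merged at the end. The `score_candidate` function and the candidate list are invented purely for illustration:

```python
from multiprocessing import Pool

def score_candidate(candidate):
    """Stand-in for one independent unit of work (say, scoring one
    molecule or one set of material parameters). Each call needs no
    data from any other call, which is what makes the job easy to
    spread across thousands of cores."""
    return candidate, sum(ord(c) for c in candidate) % 100  # dummy "score"

if __name__ == "__main__":
    candidates = [f"compound-{i}" for i in range(10_000)]  # the work to divide up

    # Fan the independent tasks out across local cores; on a cloud cluster
    # the same split happens across machines instead of processes.
    with Pool() as pool:
        results = pool.map(score_candidate, candidates)

    # The "combine" step: gather the independent results into one answer.
    best = max(results, key=lambda pair: pair[1])
    print("best candidate:", best)
```

The whole trick is that the tasks never talk to each other mid-flight; problems that need constant cross-talk between tasks don’t scale out this way nearly as well.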
Some applications and problems run exceptionally well on massively parallel clusters of GPUs, which are designed to run heavy-duty math in parallel. nVidia has been producing such clusters for years as well. One such cluster with the right software can do the work of thousands of general-purpose computers.
Storage and operating systems that can spread themselves across multiple physical machines and behave as a single, much larger system have also existed for some time.
In short, we have had the ability to perform distributed parallel processing for quite some time. But it only works if you can divide up the problem and apportion the workload “reasonably.” That isn’t easy, and not all problems fit this compute model.