‘Cloud-bursting’ system allows researchers to scale up quickly when needed
Rice University is preparing to offer its researchers who deal in “big data” the opportunity to compute in the cloud with fewer barriers.
Rice is installing the Big Research Data Cloud (BiRD Cloud), which will allow for cloud bursting. That means data-intensive tasks can spill over into outside cloud-computing systems when necessary, essentially providing unlimited computing capacity.
Big data refers to collections of information too large for anything but a supercomputer to process. Scientists who construct computer models of genomes, weather patterns, astrophysics and biological systems – to name just a few examples – rely on processing power to get their results quickly.
“(BiRD Cloud) will be a great resource for fitting statistical and machine-learning models to big data with massive numbers of variables,” said neuroscientist Genevera Allen, Rice’s Dobelman Family Junior Chair of Statistics and an assistant professor of statistics and electrical and computer engineering. In one of her projects, Allen is developing statistical tools to make the best use of the wealth of cancer data collected by hospitals. “Such tasks require large amounts of computer memory and typically run on single nodes.”
Rather than one node, or access point, BiRD Cloud will incorporate 88 Hewlett-Packard SL230 nodes, each a computer on a card with two Intel eight-core Ivy Bridge processors. The nodes will be interconnected via 10-gigabit Ethernet. With a total of 1,408 computational cores, the system’s peak computing power will be 29.3 teraflops.
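The core count and peak figure above can be checked with a little arithmetic. The sketch below is only an illustration: the article does not state a clock speed, so the 2.6 GHz clock and the eight double-precision floating-point operations per cycle per core (what AVX on Ivy Bridge can sustain) are assumptions, chosen because they are consistent with the quoted 29.3 teraflops.

```python
# Back-of-the-envelope check of the hardware figures quoted in the article.

NODES = 88             # from the article
SOCKETS_PER_NODE = 2   # from the article: two processors per node
CORES_PER_SOCKET = 8   # from the article: eight-core processors

# Assumptions, NOT stated in the article:
ASSUMED_CLOCK_HZ = 2.6e9      # assumed processor clock speed
ASSUMED_FLOPS_PER_CYCLE = 8   # assumed double-precision flops per cycle per core


def total_cores(nodes=NODES, sockets=SOCKETS_PER_NODE, cores=CORES_PER_SOCKET):
    """Total computational cores across all nodes."""
    return nodes * sockets * cores


def peak_teraflops(cores, clock_hz=ASSUMED_CLOCK_HZ,
                   flops_per_cycle=ASSUMED_FLOPS_PER_CYCLE):
    """Theoretical peak in teraflops: cores x clock x flops per cycle."""
    return cores * clock_hz * flops_per_cycle / 1e12


if __name__ == "__main__":
    cores = total_cores()
    print(cores)                             # 1408
    print(round(peak_teraflops(cores), 1))   # 29.3
```

Under these assumptions, 88 nodes with 16 cores each give 1,408 cores, and the theoretical peak rounds to 29.3 teraflops. Peak figures of this kind are upper bounds; real applications typically achieve less.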
Rice and the National Science Foundation’s Major Research Instrumentation Program funded the system, which officials expect to be available to all Rice researchers by April 2015. Rice’s Ken Kennedy Institute for Information Technology and the Research Computing Support Group, in collaboration with the Office of the Vice Provost for Information Technology, will administer BiRD Cloud.
The university’s science, engineering, bioinformatics and educational technology researchers already use a large amount of computing power for their multidisciplinary projects, and their needs are growing, said project leader Moshe Vardi.
“BiRD Cloud will serve as our ‘on-ramp’ to external cloud infrastructures, thus enabling cloud bursting,” said Vardi, Rice’s Karen Ostrum George Professor in Computational Engineering and director of the Ken Kennedy Institute. “Rice’s stake in the cloud is one small step when compared with the benefits it will realize from this capital investment.”
Vardi said BiRD Cloud enables the university to experiment with operating a local cloud infrastructure as well as take seamless advantage of commercial providers. Cloud bursting occurs when an application exceeds local computing capacity and is allowed to “burst” — with the user’s permission — so that it can share the load with remote servers.
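The bursting decision described above can be sketched in a few lines. This is a simplified illustration, not how BiRD Cloud or HP Helion actually schedules work: the names `Job`, `allow_burst` and `place_job` are hypothetical, and a real system would also weigh data transfer, cost and queue policy.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Job:
    """Hypothetical job description used only for this illustration."""
    name: str
    cores_needed: int
    allow_burst: bool  # the user's permission to use remote servers


def place_job(job: Job, free_local_cores: int) -> Optional[Tuple[int, int]]:
    """Return (local_cores, remote_cores), or None if the job must wait.

    The job runs locally when it fits. If it exceeds local capacity and the
    user has allowed bursting, the overflow is sent to remote servers.
    """
    if free_local_cores < 0:
        raise ValueError("free_local_cores must be non-negative")
    if job.cores_needed <= free_local_cores:
        return (job.cores_needed, 0)          # fits locally, no burst
    if not job.allow_burst:
        return None                           # no permission: wait locally
    overflow = job.cores_needed - free_local_cores
    return (free_local_cores, overflow)       # share the load with the cloud


if __name__ == "__main__":
    print(place_job(Job("small", 100, False), 1408))
    print(place_job(Job("large", 2000, True), 1408))
    print(place_job(Job("large-no-burst", 2000, False), 1408))
```

In this sketch, a 2,000-core job submitted with bursting allowed while 1,408 local cores are free would be split into 1,408 local cores and 592 remote ones. The same job without permission would simply wait.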
“Research will no longer be confined to the size of Rice’s shared computing infrastructure,” said Jan Odegard, executive director of the Ken Kennedy Institute. “Instead, BiRD Cloud will supply access to boundless computing resources from commercial providers like Google and Amazon, and access to such national resources as XSEDE’s cloud resources.”
The collaborative effort and expertise of more than 30 Rice faculty members contributed to the creation of BiRD Cloud, including the project’s co-principal investigators: Allen; Stephen Bradshaw, an assistant professor of physics and astronomy and the William V. Vietti Junior Chair of Space Physics; Lydia Kavraki, the Noah Harding Professor of Computer Science and a professor of bioengineering; and Ashok Veeraraghavan, an assistant professor of electrical and computer engineering and of computer science.
BiRD Cloud will support the campus user community while helping Rice’s IT professionals learn to support on-premises cloud computing and develop an understanding of how it can become part of an expansive, powerful and sustainable computing resource pool, said Kamran Khan, vice provost for information technology at Rice. “Our IT department has always been innovative in its efforts to provide the best solutions and applications to meet the varied needs of its user community,” Khan said. BiRD Cloud “will bolster our current and future research cloud and big-data strategy and hybrid initiatives at Rice.”
As a single system, BiRD Cloud will by no means be the largest computing system at Rice, but it will significantly accelerate research with the use of HP Helion, a portfolio of cloud products and services that enable organizations to manage workloads in hybrid IT environments, Odegard said.
He expects the higher bandwidth between BiRD Cloud and users’ desktops will enhance code development and interactive data analysis using tools such as Matlab, R and Hadoop, popular software packages often used to analyze massive amounts of data. BiRD Cloud will be expandable, he said, and will leverage Rice’s condominium computing platform by offering researchers the option to buy additional computational resources.
More broadly, he said, BiRD Cloud will enhance the training of hundreds of undergraduate and graduate students and postdoctoral researchers in science and engineering, making a significant and direct impact on the educational experience offered by Rice.
Administrators expect to work with a select group of users early next year to help Rice optimize BiRD Cloud’s potential.