WHAT DID THE 2010 EARLY CAREER AWARD ALLOW YOU TO DO?
The objectives of this research were to enable and optimize reversible computing as a way to overcome the formidable challenges of computing at the exascale and beyond. Reversible computing allows extreme-scale computations to proceed while retaining the ability to dynamically detect and recover from transient hardware faults and other errors that are likely to occur in computing systems of extreme complexity. The research directly addressed the challenges of the memory wall, concurrency, resilience, and emerging architectures.
This study was the first to identify several new uses of reversible computing in supercomputing: parallel synchronization, debugging, and fault detection.
The research supported by the award helped address fundamental computational-science questions about the limits of energy and computation time. It gave us valuable insights into how to make a massive number of processors work together effectively without waiting on each other.
While meeting the immediate goals of supercomputing, it also helped us revisit and reinterpret the very concept of computation. Moreover, since quantum computing is essentially reversible in nature, the reversible computing research carried out under this award uniquely equips us to tackle the next-generation challenge of quantum programming.
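As a rough illustration of the reversible execution idea described above, the sketch below undoes work by running the inverse of each operation in reverse order, rather than by restoring a saved copy of the state. It is a minimal sketch under stated assumptions, not the project's software: the class name `ReversibleState` and its methods are hypothetical and chosen only for this example, and the state is a single integer.

```python
# Illustrative sketch only. ReversibleState, add, xor, and rollback are
# hypothetical names for this example, not part of any project software.

class ReversibleState:
    """Integer state updated only by operations that have exact inverses."""

    def __init__(self, value):
        self.value = value
        # Records which operations ran (and their arguments),
        # not copies of the state itself.
        self.history = []

    def add(self, k):
        """Forward step: add k. Its inverse is subtracting k."""
        self.value += k
        self.history.append(("add", k))

    def xor(self, mask):
        """Forward step: XOR with mask. XOR is its own inverse."""
        self.value ^= mask
        self.history.append(("xor", mask))

    def rollback(self, steps):
        """Undo the last `steps` operations by applying inverses in reverse order."""
        for _ in range(steps):
            op, arg = self.history.pop()
            if op == "add":
                self.value -= arg
            elif op == "xor":
                self.value ^= arg


def run_demo():
    state = ReversibleState(10)
    state.add(5)
    state.xor(0b0110)
    state.add(-3)
    after_forward = state.value
    state.rollback(2)   # undo add(-3), then undo the XOR
    after_partial = state.value
    state.rollback(1)   # undo add(5), recovering the initial value
    return after_forward, after_partial, state.value
```

In this toy setting, a fault detected after the third step could be handled by rolling back to the last state known to be good and re-executing from there, without ever having stored a checkpoint of that state.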
ABOUT:
Kalyan R S Perumalla is a Distinguished Research and Development Staff Member and Manager in the Computer Science and Mathematics Division at the U.S. Department of Energy’s Oak Ridge National Laboratory.
SUPPORTING THE DOE SC MISSION:
The Early Career Research Program provides financial support that is foundational to early career investigators, enabling them to define and direct independent research in areas important to DOE missions. The development of outstanding scientists and research leaders is of paramount importance to the Department of Energy Office of Science. By investing in the next generation of researchers, the Office of Science champions lifelong careers in discovery science.
For more information, please visit the Early Career Research Program page.
THE 2010 PROJECT ABSTRACT:
ReveR‐SES: Reversible Software Execution Systems
The objective of this project is to develop novel software tools for increasing the efficiency and usability of codes on extreme‐scale computing systems. The research will pursue new approaches for high‐performance computing on extreme‐scale systems. Challenges to high performance include the high costs of coordination and synchronization on systems with potentially one million or more processor cores. This research effort is based on the development and use of a so‐called reversible software execution paradigm that enables extreme‐scale computations to proceed, but with the ability to dynamically detect and recover from transient hardware faults and other errors that are likely to occur in computing systems of extreme complexity.
RESOURCES:
Perumalla, K.S. and Yoginath, S.B., “Towards reversible basic linear algebra subprograms: A performance study.” United States: N. p., 2014. Web. [DOI: 10.1007/978-3-662-45711-5_4].
Perumalla, K.S. and Park, A.J., “Reverse computation for rollback-based fault tolerance in large parallel systems.” Cluster Comput 17, 303 (2014). [DOI: 10.1007/s10586-013-0277-4].
Perumalla, K.S. and Protopopescu, V.A., “Reversible simulations of elastic collisions.” ACM Transactions on Modeling and Computer Simulation 23, 2, Article 12 (2013). [DOI: 10.1145/2457459.2457461].
Additional profiles of the 2010 Early Career Award winners can be found at: https://www.energy.gov/science/listings/early-career-program.
The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit www.energy.gov/science.