As we approach the exascale computing era, performance and power usage are increasingly determined by two factors: how quickly and how efficiently we can move data between main memory, processors, accelerators, compute nodes, and disks.
To keep millions of compute cores active, we must feed them with data fast enough. That requires mitigating the bottleneck associated with data movement.
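This project set out to investigate three complementary approaches to reducing data movement in high-performance computing (HPC): reordering data elements to maximize data locality, using parallel stream processing to improve compute locality, and integrating high-speed data compression.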
The DOE Early Career Research Award allowed us to make progress on all three fronts. The highlight of this project was the development of zfp, the first random-accessible compressed representation of numerical floating-point arrays. Such arrays constitute the bulk of data generated in science and engineering simulations, observations, and experiments. zfp allows data volumes to be significantly reduced over conventional number representations, often by one to two orders of magnitude.
zfp is currently being developed on DOE’s Exascale Computing Project for use in science applications. It will effectively expand available memory, reduce communication between compute nodes, and minimize input/output time and offline storage.
zfp has also seen widespread adoption in HPC support tools and libraries such as ADIOS, HDF5®, Intel® Integrated Performance Primitives, and VTK-m. It underpins new open file formats that were developed in the oil and gas industry for storing and accessing massive seismic data sets.
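To make the idea of random access into compressed data concrete, here is a minimal sketch in Python. It is not zfp's algorithm or API. It assumes a deliberately simple toy scheme, with hypothetical names such as compress_block and ToyCompressedArray, in which every block of four values is stored in the same number of bytes. Because each block has a fixed size, the position of any element can be computed directly, and only one small block has to be decoded to read it.

```python
import math
import struct

BLOCK = 4                               # values per block (1D toy example)
BLOCK_FMT = "<f4b"                      # one float32 scale + four signed bytes
BLOCK_BYTES = struct.calcsize(BLOCK_FMT)  # 8 bytes per block, always


def compress_block(values):
    """Lossy toy encoding: a shared scale plus 8-bit quantized values."""
    scale = max(abs(v) for v in values) or 1.0   # avoid dividing by zero
    quantized = [round(v / scale * 127) for v in values]
    return struct.pack(BLOCK_FMT, scale, *quantized)


def decompress_block(blob):
    """Invert compress_block, up to quantization error."""
    scale, *quantized = struct.unpack(BLOCK_FMT, blob)
    return [scale * q / 127 for q in quantized]


class ToyCompressedArray:
    """Hypothetical fixed-rate array: every block occupies BLOCK_BYTES."""

    def __init__(self, data):
        if len(data) % BLOCK != 0:
            raise ValueError("length must be a multiple of BLOCK")
        self.n = len(data)
        self.buf = b"".join(
            compress_block(data[i:i + BLOCK]) for i in range(0, self.n, BLOCK)
        )

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        block, offset = divmod(i, BLOCK)
        # Fixed block size means the byte offset is a simple multiplication.
        start = block * BLOCK_BYTES
        return decompress_block(self.buf[start:start + BLOCK_BYTES])[offset]


data = [math.sin(0.1 * i) for i in range(64)]
arr = ToyCompressedArray(data)
# 16 blocks * 8 bytes = 128 bytes, versus 512 bytes as 64-bit doubles.
print(len(arr.buf), arr[37], data[37])
```

The design choice that matters here is the fixed number of bytes per block. A variable-rate scheme would typically compress better for a given accuracy, but finding element i would then require an index or a scan of the preceding data.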
Peter Lindstrom is a computer scientist and project leader in the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory, where he is currently a member of CASC’s Data Science and Analytics group.
The Early Career Research Program provides financial support that is foundational to early career investigators, enabling them to define and direct independent research in areas important to DOE missions. The development of outstanding scientists and research leaders is of paramount importance to the Department of Energy Office of Science. By investing in the next generation of researchers, the Office of Science champions lifelong careers in discovery science.
For more information, please go to the Early Career Research Program.
Combating the Data Movement Bottleneck
On next‐generation supercomputers, the power cost of moving data is the critical metric for software while limited bandwidth further constrains the amount of data that can be moved. The objective of this project is to alleviate the data‐movement bottleneck in extreme‐scale computing to accelerate numerical simulation and data analysis.
The research will focus on software solutions to reduce the amount of data transferred between memory banks, across distributed compute nodes, and between main memory and secondary storage.
This project will take a three‐pronged approach based on maximizing data locality by optimally reordering data elements, improving compute locality using parallel stream processing, and integrating high‐speed data compression.
These complementary techniques will limit the total size of data moved, minimize data accesses, and make effective use of multilevel caches to keep data as close to the processor as possible.
This effort will lead to new tools for greatly reducing data movement with commensurate increases in performance and reductions in power consumption on next‐generation, massively multi‐core, computer architectures.
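As an illustration of what reordering data elements for locality can look like, the sketch below lays out a small 2D grid along a Morton (Z-order) curve. The abstract does not say which ordering the project uses, so Morton order is an assumption chosen because it is a common example; the function name morton2d is hypothetical.

```python
def morton2d(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Z-order index."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b)       # x bits go to even positions
        code |= ((y >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return code


# Visit a 4 x 4 grid in Z-order instead of row by row.
order = sorted(
    ((x, y) for y in range(4) for x in range(4)),
    key=lambda p: morton2d(*p),
)
print(order[:8])
```

In this layout the first four cells visited form a 2 x 2 tile, so grid points that are neighbors in both dimensions tend to sit close together in memory, which can help multilevel caches serve stencil-like access patterns.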
D. Laney, S. Langer, C. Weber, P. Lindstrom, and A. Wegener, “Assessing the Effects of Data Compression in Simulations Using Physically Motivated Metrics.” International Conference on High Performance Computing, Networking, Storage and Analysis (SC ’13) 76, 1 (2013). [DOI: 10.1145/2503210.2503283]
P. Lindstrom, “Fixed-Rate Compressed Floating-Point Arrays.” IEEE Transactions on Visualization and Computer Graphics 20, 2674 (2014). [DOI: 10.1109/TVCG.2014.2346458]
P. Lindstrom, P. Chen, and E-J. Lee, “Reducing Disk Storage of Full-3D Seismic Waveform Tomography (F3DT) Through Lossy Online Compression.” Computers & Geosciences 93, 45 (2016). [DOI: 10.1016/j.cageo.2016.04.009]
DOE Explains… offers straightforward explanations of key words and concepts in fundamental science. It also describes how these concepts apply to the work that the Department of Energy’s Office of Science conducts as it helps the United States excel in research across the scientific spectrum. For more information on exascale computing and DOE’s research in this area, please go to “DOE Explains… exascale computing.”
Additional profiles of the Early Career Research Program award recipients can be found on the Early Career Program Page.
The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit www.energy.gov/science.