January 6, 2020
Fully exploiting the potential of supercomputers
An EU initiative has designed and developed a computing platform based on a new memory technology. It will help improve the input/output (I/O) performance of high-performance computing (HPC) systems.
Thanks to their ability to run highly accurate predictive simulations, HPC systems are increasingly used across virtually all industries and sectors. But HPC, which involves thousands of processors working in parallel to analyze billions of pieces of data in real time, isn't without challenges.
Increasingly complex demands on scientific modeling and simulation call for systems faster than today's HPC machines, which can perform over a hundred quadrillion floating-point operations per second (FLOPS). The next stage is exascale computing, which could deliver at least one exaFLOPS, or a billion billion operations per second, a level expected to be reached by 2021. Exascale technology will enable far more accurate, detailed and larger-scale modeling and simulation than existing systems provide, but several challenges remain. A major one is the I/O bottleneck, where a system cannot read and write data fast enough to keep up with its processors. The EU-funded NEXTGenIO project addressed exactly this issue.
The project website states: "Current systems are capable of processing data quickly, but speeds are limited by how fast the system is able to read and write data. This represents a significant loss of time and energy in the system. Being able to widen, and ultimately eliminate, this bottleneck would majorly increase the performance and efficiency of HPC systems." NEXTGenIO designed and built a prototype hardware platform to achieve massive gains in I/O capabilities in supercomputing, using a new non-volatile random access memory (NVRAM) technology. NVRAM retains stored data even after a power outage. In addition to the hardware, the project developed a full software stack that is deployed on the prototype. The new system is seen as game-changing because it bridges the gap between memory and storage.
Dr. Michèle Weiland from EPCC, the supercomputing center at The University of Edinburgh, which coordinated the NEXTGenIO project, summarizes the project's objectives in a "Primeur Magazine" interview: "The project goals were to remove the I/O bottleneck as much as possible from HPC simulations, and not just traditional HPC simulations but also the upcoming data intensive and data analytics type of applications. The aim was to try and use this new memory technology to get rid of the performance gap that you have between DRAM [dynamic random access memory] and the power advances and put a layer in between."
Up and running
Dr. Weiland adds that the system developed by the NEXTGenIO (Next Generation I/O for Exascale) project will continue running for three years. In the same interview, EPCC's Adrian Jackson says: "We now have a nice stable usable system and we now have some two or three years to make good use of this. It is going to be a lot of work taking applications and optimize them, seeing how users will use them and how the industry will interact."
A news item on "HPCwire" highlights several HPC use cases for the project. One of them comes from project partner the European Centre for Medium-Range Weather Forecasts (ECMWF). "Using the NEXTGenIO platform, ECMWF demonstrated the ability to output the data to the new class of memory and significantly increase performance."