NCSA Facilitates Performance Comparisons with China’s #1 Supercomputer

William M. Tang

Paper describes ‘porting’ and running of a discovery-science-capable code from the plasma physics domain onto Sunway TaihuLight’s ‘home-grown’ architecture

China has held the top spot on the past eight editions of the twice-yearly international TOP500 list of the world’s fastest supercomputers. It has maintained this status with its newest supercomputer, Sunway TaihuLight, which is built entirely from Chinese-designed processors.

While China’s hardware has “come into its own,” as Foreign Affairs wrote in August, no one can yet say objectively how fast this hardware can solve scientific problems compared to other leading systems around the world. This is because the computer is new, having made its debut in June 2016.

Researchers used seed funding provided through the Global Initiative to Enhance @scale and Distributed Computing and Analysis Technologies (GECAT) project, administered by the National Center for Supercomputing Applications’ (NCSA) Blue Waters Project, to port and run codes on leading computers around the world. GECAT is funded by the National Science Foundation’s Science Across Virtual Institutes (SAVI) program, which focuses on fostering and strengthening interaction among scientists, engineers and educators around the globe. Shanghai Jiao Tong University and its NVIDIA Center of Excellence matched the NSF support for this seed project and helped give the collaboration unprecedented full access to Sunway TaihuLight and its system experts.

It takes time to transfer, or “port,” scientific codes built to run on other supercomputer architectures, but an international, collaborative project has already started porting one major code used in plasma particle-in-cell simulations, GTC-P. The accomplishments to date and the road toward completion were laid out in a recent paper that won “best application paper” at the HPC China 2016 Conference in October.

“While LINPACK is a well-established measure of supercomputing performance based on a linear algebra calculation, real world scientific application problems are really the only way to show how well a computer produces scientific discoveries,” said Bill Tang, lead co-author of the study and head of the Intel Parallel Computing Center at Princeton University. “Real @scale scientific applications are much more difficult to deploy than LINPACK for the purpose of comparing how different supercomputers perform, but it’s worth the effort.”

The GTC-P code chosen for porting to TaihuLight is a well-traveled code in supercomputing: it has already been ported to seven leading systems around the world, a process that ran from 2011 to 2014, when Tang served as the U.S. principal investigator for the G8 Research Council’s “Exascale Computing for Global Scale Issues” Project in Fusion Energy, or “NuFuSE.” That project was an international high-performance computing collaboration among the U.S., U.K., France, Germany, Japan and Russia.

A major challenge that the collaborative team from Shanghai Jiao Tong and Princeton Universities has already overcome is adapting the modern directive-based programming standard (OpenACC-2) in which GTC-P was written so that it is compatible with TaihuLight’s “homegrown” compiler, SWACC. An early result from the adaptation is that the new TaihuLight processors were found to be about three times faster than a standard CPU. Tang said the next step is to make the code work with a larger group of processors.

“If GTC-P can build on this promising start to engage a large fraction of the huge number of TaihuLight processors, we’ll be able to move forward to show objectively how this impressive, new, number-one-ranking supercomputer stacks up to the rest of the supercomputing world,” Tang said, adding that metrics like time to solution and associated energy to solution are key to the comparison.

“These are important metrics for policy makers engaged in deciding which kinds of architectures and associated hardware best merit significant investments,” Tang added.
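The two metrics Tang names can be illustrated with a minimal sketch. It assumes that each machine runs the same problem, that wall-clock time is the time to solution, and that energy to solution is approximated as average power multiplied by elapsed time. The `Run` class, the `compare` function, the machine names and all figures are hypothetical placeholders, not measurements from GTC-P or any real system.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One run of the same problem on one machine (all values hypothetical)."""
    machine: str
    wall_seconds: float      # time to solution
    mean_power_watts: float  # average power drawn by the nodes used

    @property
    def energy_kwh(self) -> float:
        # Energy to solution = average power x elapsed time (joules -> kWh).
        return self.mean_power_watts * self.wall_seconds / 3.6e6


def compare(runs, baseline):
    """Return (machine, speedup, energy ratio) for each run relative to baseline."""
    return [
        (r.machine,
         baseline.wall_seconds / r.wall_seconds,
         r.energy_kwh / baseline.energy_kwh)
        for r in runs
    ]


# Placeholder figures for illustration only; not data from any real system.
baseline = Run("machine_a", wall_seconds=3600.0, mean_power_watts=500_000.0)
other = Run("machine_b", wall_seconds=2400.0, mean_power_watts=900_000.0)

for name, speedup, energy_ratio in compare([baseline, other], baseline):
    print(f"{name}: speedup {speedup:.2f}x, energy ratio {energy_ratio:.2f}")
```

With these placeholder inputs, the second machine finishes sooner but uses more energy, which is why the two metrics are reported together rather than either one alone.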

The seven leading supercomputers worldwide on which GTC-P runs well reflect diverse hardware investments. For example, NCSA’s Blue Waters has more memory bandwidth than other U.S. systems, while TaihuLight has clearly invested most heavily in powerful new processors.

As Tang said recently in a technical program presentation at the SC16 conference in Salt Lake City, improvements in the GTC-P code have for the first time enabled delivery of new scientific insights. These insights show complex electron dynamics at the scale of the upcoming ITER device, the largest fusion energy facility ever constructed.

“In the process of producing these new findings, we focused on realistic cross-machine comparison metrics, time and energy to solution,” Tang said. “Moving into the future, it would be most interesting to be able to include TaihuLight in such studies.”

About the National Center for Supercomputing Applications

The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign provides supercomputing and advanced digital resources for the nation’s science enterprise. At NCSA, University of Illinois faculty, staff, students, and collaborators from around the globe use advanced digital resources to address research grand challenges for the benefit of science and society. NCSA has been advancing one third of the Fortune 50 for more than 30 years by bringing industry, researchers, and students together to solve grand challenges at rapid speed and scale.