A Chinese supercomputer was said to be millions of times faster than its US rival, but it turned out to be just a trick – LLODO


China has recently become a rising name in the supercomputer community, in both hardware and software. As of October this year, at least two Chinese supercomputers have crossed the exascale threshold, making them many times faster than the most powerful supercomputers in use today.

More recently, Chinese researchers published test results for one of those two machines, Sunway Oceanlite of the National Research Center of Parallel Computer Engineering and Technology (NRCPC). The work won the Gordon Bell Prize, an award for outstanding achievement in supercomputing, and appeared to show the system running millions of times faster than a famous American supercomputer, Summit.

Specifically, the Gordon Bell Prize-winning work simulated the 53-qubit Sycamore circuit, the quantum computer architecture introduced by Google a few years ago. The Sunway Oceanlite supercomputer did this in just 304 seconds. Meanwhile, according to estimates by a research team from the US Oak Ridge National Laboratory (ORNL), the US supercomputer may take up to 10,000 years to perform this simulation, thousands of millions of times slower than its competitor from China.

But as it turns out, the faster result does not mean that the Chinese supercomputer is actually more powerful than the US one. The difference lies in the precision with which the calculation was carried out.

The truth behind being millions of times faster than the competition
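To put the two runtimes side by side, here is a minimal arithmetic sketch that uses only the figures quoted above. The length of a year (365.25 days) is an assumption made for the conversion, and the variable names are purely illustrative.

```python
# Runtimes quoted in the article; the year length is an assumption.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

summit_estimate_s = 10_000 * SECONDS_PER_YEAR  # estimated Summit runtime
oceanlite_s = 304                              # reported Sunway Oceanlite runtime

# How many times longer the estimated Summit run is.
speedup = summit_estimate_s / oceanlite_s

print(f"Summit estimate : {summit_estimate_s:.3e} seconds")
print(f"Sunway Oceanlite: {oceanlite_s} seconds")
print(f"Implied speedup : {speedup:.2e}x")
```

On these numbers the implied gap is on the order of a billion, which is why the headline figure looks so dramatic before precision is taken into account.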

Unlike conventional processors, which are usually compared by clock speed, supercomputers are rated by how many double-precision (64-bit) floating-point operations they can perform per second. This figure is called FP64 FLOPS (FLOPS stands for floating-point operations per second) and is measured with the LINPACK benchmark.

If a processor performs its floating-point operations at lower precision, the computation time drops significantly. That is why the common standard for measuring supercomputer performance is the FP64 FLOPS figure achieved on the LINPACK benchmark.

That is how the Chinese supercomputer managed to perform the simulation in such a short time compared with its rival. According to the news site The Next Platform, the Chinese engineers reduced the precision of the calculation from double precision (64-bit) to single precision (32-bit). This let the Chinese supercomputer finish the calculation much sooner, much like the tricks sometimes used when benchmarking PCs.
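What is given up when moving from 64-bit to 32-bit arithmetic can be illustrated with a small sketch. It is not the method used in the prize-winning work and it says nothing about speed. It only shows how rounding error grows when the same running sum is kept in single precision. The helper names `to_fp32` and `accumulate` are hypothetical, and FP32 is emulated in pure Python by rounding every intermediate result with the standard `struct` module.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step: float, n: int, single: bool) -> float:
    """Add `step` to a running total n times.

    With single=True the total is rounded to FP32 after every addition,
    which emulates a single-precision accumulator.
    """
    total = 0.0
    for _ in range(n):
        total += step
        if single:
            total = to_fp32(total)
    return total


n = 1_000_000
exact = n / 10  # mathematically exact result of adding 0.1 a million times

sum64 = accumulate(0.1, n, single=False)
sum32 = accumulate(to_fp32(0.1), n, single=True)

err64 = abs(sum64 - exact)
err32 = abs(sum32 - exact)

print(f"FP64 sum: {sum64!r}  (error {err64:.3e})")
print(f"FP32 sum: {sum32!r}  (error {err32:.3e})")
```

The single-precision total drifts much further from the exact value than the double-precision one, which is the accuracy traded away for a shorter runtime.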

Dmitry Liakh, a developer from ORNL, said: “In their Gordon Bell Prize-winning work, the Chinese researchers introduce a systematic design process, including the algorithms, the parallel computing capabilities, and the architecture required for the simulation… Their simulation achieved performance of 1.2 EFLOPS (one EFLOPS equals one billion billion floating-point operations per second) in single precision, or 4.4 EFLOPS in mixed precision, using 41.9 million Sunway cores.”

According to an estimate by the Asian Technology Information Program (ATIP), the sustained performance of the Sunway Oceanlite supercomputer is about 1,050 PFLOPS (1.05 EFLOPS). At this level, Sunway Oceanlite is currently the second most powerful supercomputer in China, behind Tianhe-3, located at the National Supercomputing Center in Guangzhou, China. ATIP estimates the sustained performance of Tianhe-3 at 1,300 PFLOPS (1.3 EFLOPS).

Both scores are much higher than that of the US Summit supercomputer, which reached only 200 PFLOPS on the LINPACK benchmark. However, it is worth noting that these new Chinese supercomputers have not posted their benchmark scores on specialized sites such as Top500.org. Instead, their results come from the Gordon Bell submission, where the reduced-precision trick was used to obtain higher performance.
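As a rough sketch of how these figures compare, the snippet below converts the numbers quoted in this article to plain FLOPS and expresses each as a multiple of Summit. The labels and the dictionary layout are illustrative only, and the inputs are estimates, not official Top500 entries.

```python
PFLOPS = 1e15  # one petaFLOPS in floating-point operations per second
EFLOPS = 1e18  # one exaFLOPS in floating-point operations per second

# Performance figures as quoted in the article.
systems = {
    "Sunway Oceanlite (ATIP estimate)": 1050 * PFLOPS,
    "Tianhe-3 (ATIP estimate)": 1300 * PFLOPS,
    "Summit (LINPACK, as quoted)": 200 * PFLOPS,
}

baseline = systems["Summit (LINPACK, as quoted)"]

# Each system's performance relative to Summit.
ratios = {name: flops / baseline for name, flops in systems.items()}

for name, flops in systems.items():
    print(f"{name}: {flops / EFLOPS:.2f} EFLOPS, {ratios[name]:.2f}x Summit")
```

Taken at face value, the gap is a single-digit multiple, far from the millions suggested by the Sycamore simulation result.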

China’s supercomputer ambitions

While the trick used in the Sycamore simulation deserves criticism, the result also shows that the Sunway Oceanlite system can reach up to 1.2 EFLOPS in FP32 on this particular algorithm. This raises another question about its performance: why does a supercomputer that scored 1.05 EFLOPS in FP64 on one benchmark score only 1.2 EFLOPS in FP32 on another?

The inconsistent performance figures for Sunway Oceanlite also cast doubt on whether the LINPACK figure for Tianhe-3, currently China's top supercomputer, is accurate.

Even if Chinese companies can design supercomputer hardware with petascale performance, exascale systems with reasonable power consumption seem unlikely. Still, even if China's processors and accelerators are not as fast as the competition's, the country can mass-produce them and build more powerful supercomputing systems regardless of energy consumption.

The biggest obstacle to this ambition right now is that the supercomputer processor makers Sunway and Phytium are both on the US blacklist, which makes developing and building new processors much more difficult than before.
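The puzzle can be made concrete with a short sketch. It compares the two quoted figures against a naive expectation that halving the precision would roughly double throughput. That factor of two is only an illustrative assumption here, not a published property of the Sunway hardware, and the two figures come from different workloads, so the comparison is rough at best.

```python
# Figures quoted in the article, in EFLOPS.
fp64_eflops = 1.05  # ATIP estimate of sustained FP64 performance
fp32_eflops = 1.2   # single-precision rate reported for the Sycamore simulation

# Hypothetical rule of thumb: FP32 runs about twice as fast as FP64.
ASSUMED_FP32_TO_FP64 = 2.0

observed_ratio = fp32_eflops / fp64_eflops
naive_fp32_eflops = fp64_eflops * ASSUMED_FP32_TO_FP64

print(f"Observed FP32/FP64 ratio: {observed_ratio:.2f}")
print(f"Naive FP32 expectation  : {naive_fp32_eflops:.2f} EFLOPS")
print(f"Reported FP32 figure    : {fp32_eflops:.2f} EFLOPS")
```

Under that assumption the reported single-precision figure falls well short of what the FP64 estimate would suggest, which is the inconsistency the article points to.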

Source: Tom's Hardware
