To put this in perspective, if a large-capacity language model proposed by Google is trained on a cluster of 8 accelerators, using GPU accelerators equipped with HBM-PIM can save 2,100 GWh of energy per year and cut carbon emissions by 960 thousand tons. In terms of environmental impact, this is equivalent to the amount of carbon absorbed by 16 million urban trees over 10 years.
Furthermore, with software integration, pairing commercially available GPUs with HBM-PIM can dramatically reduce the bottlenecks caused by memory capacity and bandwidth limitations in hyperscale AI data centers.
In support of these efforts, and to show confidence in the promise of HBM-PIM solutions, Samsung has prepared software based on SYCL, an open software standard, that defines specifications for utilizing GPU accelerators. With this software, customers will be able to use PIM memory solutions in an integrated software environment. We plan to make a full announcement in mid-October, with software releases following in November.
"Codeplay is proud to have been deeply involved in defining the SYCL standard and to have played a role in creating the first conformant product," said Charles Macfarlane, Chief Business Officer at Codeplay Software, who is in charge of the collaboration on SYCL standardization. "Our work with Samsung in simplifying software development via Samsung's PIM systems opens up a much greater ecosystem of tools for scientists, allowing them to focus on algorithm development rather than hardware-level details."
Development of CXL-based PNM Solution for High-capacity AI Models
The other part of our solution involves CXL™ (Compute Express Link™), an open standard for high-speed processor-to-device and processor-to-memory interfaces that allows for more efficient use of the memory and accelerators paired with processors. CXL™ can be used in conjunction with other technologies, such as Processing-near-Memory (PNM), to help facilitate memory capacity expansion.
PNM, like PIM, is a technology that incorporates memory and logic chips into an advanced integrated circuit package, reducing data movement between the CPU and memory by using the memory itself for computation. In the case of PNM, as the name implies, computation is performed closer to the memory in order to reduce the bottleneck in data transfer between the CPU and memory.