
Samsung Electronics Semiconductor Unveils Cutting-edge Memory Technology to Accelerate Next-generation AI

Samsung Leading the Industry with HBM-PIM that supports Hyperscale AI and PNM technology based on CXL™

One of the most important trends in the field of artificial intelligence (AI), Hyperscale AI is AI that learns on its own to replicate human thinking and decision-making. In doing so, it can accomplish remarkable tasks, such as generating images from human language prompts. Pushing AI to this extent, however, requires training and computation on infrastructure capable of handling enormous volumes of data – infrastructure that requires next-generation technology like Samsung's HBM-PIM.

Within these data volumes are the large-capacity recommendation systems and language models that improve accuracy for Hyperscale AI. For these models, accuracy tends to correlate directly with model size, which points to a major hurdle: with existing memory solutions, computation can be bottlenecked when DRAM capacity and data-transfer bandwidth cannot keep pace with Hyperscale AI models. To overcome this problem, Samsung Electronics has been preemptively developing memory technology that approaches these challenges with PIM (Processing-in-Memory) and PNM (Processing-near-Memory). We have already put memory solutions utilizing these technologies into development and have standardized the software required to implement them.

Implementation of AI Accelerator Equipped with HBM-PIM and GPU
The first of these solutions is PIM technology, which improves performance and energy efficiency by offloading part of the data computation from the processor into the memory itself. In systems without PIM, the processor fetches commands and data from memory, executes them, and writes the results back to memory. Moving this much data back and forth consumes significantly more energy than the computation itself. PIM optimizes this process by reducing data movement between the CPU and memory, improving the performance and energy efficiency of the AI accelerator system. For use cases that require high-bandwidth memory, it is easy to see why putting HBM-PIM into practice is a compelling solution.

* Shortcut to Samsung Electronics HBM-PIM technology description↗

In cooperation with AMD, Samsung installed HBM-PIM memory in the AMD Instinct™ MI100 accelerator. Samsung then built an HBM-PIM cluster on this basis and applied it to various large-scale AI and High-Performance Computing (HPC) applications.
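The data-movement saving that PIM targets can be illustrated with a toy simulation. This is a conceptual sketch only – the `Memory` class and transfer counts below are illustrative, not Samsung's PIM interface:

```python
# Conceptual sketch: count off-chip transfers for a reduction performed
# conventionally vs. with an in-memory compute unit. Illustrative only;
# this is not Samsung's PIM API.

class Memory:
    def __init__(self, data):
        self.data = list(data)
        self.transfers = 0  # words moved across the memory bus

    def read(self, i):
        # Conventional path: every word read crosses the bus.
        self.transfers += 1
        return self.data[i]

    def pim_sum(self):
        # PIM-style path: the reduction runs inside the memory device,
        # so only the single result crosses the bus.
        self.transfers += 1
        return sum(self.data)

def host_sum(mem, n):
    # Conventional: the processor reads every word over the bus.
    total = 0
    for i in range(n):
        total += mem.read(i)
    return total

data = list(range(1024))

m1 = Memory(data)
total_host = host_sum(m1, 1024)
print("host transfers:", m1.transfers)  # 1024 words moved

m2 = Memory(data)
total_pim = m2.pim_sum()
print("PIM transfers:", m2.transfers)   # 1 word moved
```

The point of the sketch is only the ratio: the same result is produced either way, but the conventional path moves every operand over the bus while the in-memory path moves one.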
Compared to existing GPU accelerators, our testing confirmed that, on average, adding HBM-PIM more than doubled performance while reducing energy consumption by more than 50%.
Put in perspective, if a large-capacity language model proposed by Google is trained on a cluster of eight accelerators, using GPU accelerators equipped with HBM-PIM can save 2,100 GWh of energy per year and cut 960 thousand tons of carbon emissions. In terms of environmental impact, this is equivalent to the amount of carbon absorbed by 16 million urban trees over 10 years.

Furthermore, with software integration, pairing commercially available GPUs with HBM-PIM can dramatically reduce the bottleneck caused by memory capacity and bandwidth limitations in Hyperscale AI data centers. In support of these efforts, and to show confidence in the promise of HBM-PIM solutions, Samsung has prepared software using SYCL, an open software standard, to define specifications that can utilize GPU accelerators. With this software, customers will be able to use PIM memory solutions in an integrated software environment. We plan to make a full announcement in mid-October, with software releases following in November.

"Codeplay is proud to have been deeply involved in defining the SYCL standard and to have played a role in creating the first conformant product," said Charles Macfarlane, Chief Business Officer at Codeplay Software, who oversees the collaboration on SYCL standardization. "Our work with Samsung in simplifying software development via Samsung's PIM systems opens up a much greater ecosystem of tools for scientists, allowing them to focus on algorithm development rather than hardware-level details."

Development of CXL-based PNM Solution for High-capacity AI Models

The other part of our solution involves CXL™ (Compute Express Link™), an open standard for high-speed processor-to-device and processor-to-memory interconnects that allows more efficient use of the memory and accelerators attached to processors.
CXL™ can be used in conjunction with other technologies, such as Processing-near-Memory (PNM), to help facilitate memory capacity expansion.

* Shortcut to Samsung Electronics CXL memory technology description↗

PNM, like PIM, incorporates memory and logic chips into an advanced integrated-circuit package, reducing data movement between the CPU and memory by using the memory side for computation. In the case of PNM, as the name implies, computation is performed close to the memory in order to reduce the bottleneck in CPU-memory data transfer.
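The near-memory idea can be sketched with a recommendation-style embedding lookup, where the gather-and-pool step runs next to the memory so that only the pooled vector crosses the CPU-memory link instead of every row. This is a conceptual illustration only – the `NearMemoryTable` class and its counters are hypothetical, not Samsung's CXL-PNM interface:

```python
# Conceptual sketch of near-memory processing (PNM). Illustrative only;
# this is not Samsung's CXL-PNM interface.

class NearMemoryTable:
    def __init__(self, rows):
        self.rows = rows          # e.g. an embedding table
        self.link_words = 0       # words sent over the CPU-memory link

    def fetch_row(self, i):
        # Conventional path: the whole row crosses the link.
        row = self.rows[i]
        self.link_words += len(row)
        return row

    def pnm_pool(self, indices):
        # PNM path: gather and sum run next to the memory; only the
        # pooled result crosses the link.
        dim = len(self.rows[0])
        pooled = [0.0] * dim
        for i in indices:
            for d in range(dim):
                pooled[d] += self.rows[i][d]
        self.link_words += dim
        return pooled

rows = [[float(i)] * 64 for i in range(1000)]  # 1000 rows, 64-dim
hot = [3, 17, 256, 511]                        # rows touched by one query

t1 = NearMemoryTable(rows)
pooled_host = [sum(col) for col in zip(*(t1.fetch_row(i) for i in hot))]
print("host-side pooling, link words:", t1.link_words)   # 256

t2 = NearMemoryTable(rows)
pooled_pnm = t2.pnm_pool(hot)
print("near-memory pooling, link words:", t2.link_words)  # 64
```

Both paths compute the same pooled vector; the near-memory path shrinks the link traffic from one full row per lookup to a single result, which is exactly the kind of bandwidth-bound access pattern recommendation systems produce.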
Samsung's industry-first CXL™ interface-based PNM technology, unveiled at Memory Tech Day on October 5, has proven to be an excellent solution for high-capacity AI model processing. In testing, CXL™ interface-based PNM solutions more than doubled performance in applications such as recommendation systems and in-memory databases that require high memory bandwidth.


Memory Solutions Catered to the Characteristics of AI Models

With data-rich and complex AI models, data bandwidth and data-transfer solutions also require customization to maximize their benefits. Data used in AI models is classified as dense or sparse according to its characteristics: dense data has a high ratio of valid entries within the overall data set, while sparse data has a low ratio of valid entries. AI applications such as autonomous driving and voice recognition fall into the dense data category, while user-based recommendation algorithms (for example, Facebook friend recommendations) are examples of sparse data. Each model requires a use-specific memory solution matched to the application. As such, we have applied PIM technology to AI models based on dense data and PNM technology to AI models based on sparse data, responding to diverse customer needs. By integrating our latest PIM and PNM technologies, we can provide sustainable memory solutions that meet the needs of our customers today as well as in the future, as AI computing becomes more demanding.

"HBM-PIM Cluster technology is the industry's first customized memory solution for large-scale artificial intelligence," said Cheolmin Park, head of the New Business Planning Team at Samsung Electronics Memory Business Division. "By integrating CXL-PNM solutions with HBM-PIM through a comprehensive software standardization process, we can offer a new standard of high-efficiency, high-performance memory solutions that contribute to eco-conscious data management by reducing and optimizing the movement of the massive data volumes needed for AI applications."

Other experts in the field have also weighed in on Samsung's new memory solutions and the possibilities they represent for the industry.
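The dense/sparse distinction described above can be made concrete as a ratio of valid (non-zero) entries. The helper below and its 0.5 cutoff are purely illustrative assumptions for this sketch, not Samsung's classification criterion:

```python
# Classify an input as dense or sparse by its ratio of valid
# (non-zero) entries. The 0.5 threshold is illustrative only.

def density(values):
    valid = sum(1 for v in values if v != 0)
    return valid / len(values)

def classify(values, threshold=0.5):
    return "dense" if density(values) >= threshold else "sparse"

# Voice-recognition-style feature vector: nearly every entry is valid.
audio_features = [0.3, 1.2, -0.7, 0.9, 2.1, -1.4, 0.5, 0.8]

# Recommendation-style interaction vector: most entries are zero,
# since a user has interacted with only a handful of items.
user_interactions = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

print(classify(audio_features))     # dense  -> suited to PIM
print(classify(user_interactions))  # sparse -> suited to PNM
```

In this framing, the dense audio vector maps to the PIM path and the sparse interaction vector to the PNM path, mirroring the pairing the section describes.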
"At AMD we work with the industry to continue to look for new and innovative technologies that can benefit our customers," said Josh Friedrich, Corporate Vice President, Accelerated Data Center Business, AMD. "We applaud the continued efforts from Samsung to research and create innovative memory technologies, such as Processing-in-Memory, that are targeted to improve performance in data-intensive workloads."

"We are very interested in applying computational memory techniques to address the memory bandwidth and power efficiency challenges common in many of our high-performance computing and AI applications," added Jeffrey Vetter, Corporate Fellow and Section Head of the Computer Science and Mathematics Division at Oak Ridge National Laboratory. "We look forward to working with Samsung to evaluate how these developing technologies can be applied to Oak Ridge National Laboratory systems to enhance efficiency."

Samsung Electronics plans to actively communicate with the IT industry and academic institutions to promote PIM/PNM technology in the future. The integrated software that supports HBM-PIM and CXL-based PNM solutions will be exhibited as demonstrations at SC22, the industry's largest supercomputing conference.